Epidemiology With An Open Source WebGIS Platform

Claudia Dolci (Fondazione Bruno Kessler) with Gabriele Franch (Fondazione Bruno Kessler), Shamar Droghetti (Fondazione Bruno Kessler)

14:00 on Thursday 19th September (in Session 13, starting at 2 p.m., EMCC: Room 1)


Description: We present a statistical WebGIS platform that integrates visualization tools and statistical functions for epidemiological studies and is based entirely on Open Source technologies. One application, for cancer mapping and environmental cancer studies, is the Cancer Atlas (CA-TN), the GeoICT platform of the Cancer Registry of Trentino (Italy).

We developed a new web platform supporting the visualization of epidemiological indicators on spatial geometries and the exploration of the spatial distribution of data patterns (e.g. cancer sites, age classes, gender). Together with a rich web interface for epidemiological data, CA-TN includes standard statistical tools that provide complementary information and facilitate periodic reporting and cancer surveillance activities. Incidence data from the Cancer Registry in Trentino, provided by the Azienda Sanitaria per i Servizi Sanitari (Servizio Epidemiologia Clinica e Valutativa), were aggregated to the municipality level to address privacy concerns. Furthermore, to maintain compliance with the European and regional privacy frameworks, we built a distributed database infrastructure in which record-level patient data are kept only in the local Sanitary Datacenter.

The system is built entirely on Open Source technologies: most components comply with the public standards defined by the Open Geospatial Consortium (OGC). Spatial and statistical data are stored in a PostgreSQL/PostGIS geodatabase. Maps are published using GeoServer, with an interface implemented in OpenLayers and ExtJS, and Django as middleware.

The central element of the system is a reconfigurable multi-level database (GeoTree), which allows high flexibility in defining the cells of spatial and temporal analysis. The GeoTree is structured as a directed acyclic graph, in which each node represents an entity in space and time (usually an administrative division) and outgoing arcs point to the node's sub-elements. Statistical and geographical datasets are associated with the nodes and can be queried and aggregated directly using standard or custom aggregation functions (e.g. sum, mean, intersection, collection).
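
The graph-and-aggregation idea can be illustrated with a minimal Python sketch. It is not the CA-TN implementation: the class name GeoNode, its fields, the area names, and the case counts are all hypothetical, and the sketch assumes that values are stored on leaf nodes and aggregated upwards on request.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class GeoNode:
    """One entity in space and time, e.g. a municipality (hypothetical class)."""
    name: str
    level: str
    children: List["GeoNode"] = field(default_factory=list)
    data: Dict[str, float] = field(default_factory=dict)

    def leaves(self) -> Iterator["GeoNode"]:
        """Yield each descendant leaf once, even if several paths reach it."""
        seen = set()
        stack = [self]
        while stack:
            node = stack.pop()
            if id(node) in seen:
                continue
            seen.add(id(node))
            if node.children:
                stack.extend(node.children)
            else:
                yield node

    def aggregate(self, dataset: str,
                  func: Callable[[Iterable[float]], float] = sum) -> float:
        """Apply an aggregation function to a dataset over all leaves below."""
        values = [leaf.data[dataset] for leaf in self.leaves()
                  if dataset in leaf.data]
        return func(values)


# Illustrative figures only, not Cancer Registry data.
m1 = GeoNode("Municipality 1", "municipality", data={"cases": 12})
m2 = GeoNode("Municipality 2", "municipality", data={"cases": 5})
m3 = GeoNode("Municipality 3", "municipality", data={"cases": 8})

district_a = GeoNode("District A", "district", children=[m1, m2])
district_b = GeoNode("District B", "district", children=[m3])
# A second grouping that shares municipalities with the districts:
# this is what makes the structure a graph rather than a tree.
community = GeoNode("Community X", "community", children=[m2, m3])
province = GeoNode("Province", "province",
                   children=[district_a, district_b, community])

print(province.aggregate("cases"))          # 25, shared leaves counted once
print(district_a.aggregate("cases", mean))  # 8.5
```

Tracking visited nodes matters here because a municipality reachable through two parents would otherwise be counted twice when aggregating at the province level.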
Furthermore, using these data aggregation functions, the GeoTree structure allows complex indicators involving multiple datasets to be defined and computed on the fly; results are saved in a transparent cache that speeds up data presentation and further processing. The GeoTree structure is a systematic approach to (1) managing relations among geometrical entities at different temporal and spatial scales, (2) linking datasets and geometries, (3) keeping track of original data sources and associated metadata, (4) integrating external data sources, (5) aggregating data, and (6) calculating complex indicators. The main application of this multi-level database in this project is the computation of each statistical variable at the census unit, municipality, health district, and province levels.

The second technical improvement in this project is the use of the R statistical environment for both GeoTree indicator calculation and graph generation directly in the database, through the PL/R procedural language for PostgreSQL. This integration allows both the creation of dynamic graphs and the deployment of more complex analyses, such as correction algorithms for small area estimation (SAE) of epidemiological indicators.

The target users of CA-TN are medical staff, whether from the public administration or practising professionals.

Authors: Dolci C., Droghetti S., Franch G., Riccadonna S., Furlanello C.