Data And Visualization Integration Via Web Based Resources

Massimo Di Stefano (Rensselaer Polytechnic Institute)

11:30 on Friday 20th September (in Session 36, starting at 11:30 a.m., Sir Clive Granger Building: A31)

Description: We are developing infrastructure for collaboration and knowledge sharing for marine Integrated Ecosystem Assessments (IEAs), using IPython Notebooks as a tool for collaborative data processing, workflow provenance and product publishing.
Abstract:

We are developing cyberinfrastructure to facilitate collaboration and knowledge sharing for marine Integrated Ecosystem Assessments (IEAs). The main tool is a web application, the IPython Notebook, which lets users work with diverse and heterogeneous data and information sources. It provides an effective way to share the source code used to generate data products and their associated metadata, and to track workflow provenance so that a data product can be reproduced, from the source dataset through to the final product for an Ecosystem Status Report. A key feature is that the metadata embedded in the final product are acquired during the processing and plotting of the data. In this way we are able to record the provenance needed to reproduce the data products. We use the IPython Notebook as a tool for collaborative data processing, workflow provenance and product publishing. IPython (Interactive Python) can be run interactively over the web, giving the user an effective way to work on local or shared data. Here is an example session showing the IPython Notebook interface used to run, interactively, the code that produces some figures for the Northeast Shelf (NES) Large Marine Ecosystem (LME) Ecosystem Status Report. The session includes geospatial data analysis with GRASS GIS (Geographic Information System) and statistical analysis with R, in combination with other free and open source software tools.
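The idea of acquiring metadata during processing and embedding it in the final product can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the `ProvenanceLog` class, the `parse` and `anomaly` steps, the file name `example.csv` and the toy values are all hypothetical, chosen only to show how a notebook cell might log each step as it runs and attach the resulting record to the product.

```python
import hashlib
import json
from datetime import datetime, timezone


class ProvenanceLog:
    """Hypothetical helper: collects metadata about each processing step as it runs."""

    def __init__(self, source_name, source_bytes):
        # Fingerprint the source dataset so the exact input can be identified later.
        self.record = {
            "source": source_name,
            "source_sha256": hashlib.sha256(source_bytes).hexdigest(),
            "steps": [],
        }

    def step(self, func):
        """Decorator: run func, then append its name, keyword arguments and time."""
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            self.record["steps"].append({
                "name": func.__name__,
                "kwargs": kwargs,
                "finished": datetime.now(timezone.utc).isoformat(),
            })
            return result
        return wrapper

    def embed(self, product):
        """Return the product together with the provenance gathered so far."""
        return {"product": product, "provenance": self.record}


# Toy input (made-up values, not real survey data).
raw = b"year,value\n2010,10\n2011,12\n2012,14\n"
log = ProvenanceLog("example.csv", raw)


@log.step
def parse(data):
    # Skip the header row and keep the second column as floats.
    rows = data.decode().strip().splitlines()[1:]
    return [float(row.split(",")[1]) for row in rows]


@log.step
def anomaly(values, baseline):
    # Difference of each value from a fixed baseline.
    return [v - baseline for v in values]


values = parse(raw)
product = log.embed(anomaly(values, baseline=12.0))
print(json.dumps(product, indent=2))
```

Because the record is built as a side effect of running the steps, the provenance cannot drift from the code that produced the product. In a real workflow the same record could be written into the metadata fields of the output file rather than a plain dictionary.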