The GCW/SLF Open Source Software Package
In the period 2015-2017, GCW has been working with the WSL Institute for Snow and Avalanche Research (SLF) to set up interoperability between the WSL/SLF data centre being responsible for one of the CryoNet stations. WSL/SLF has kindly agreed to make the software stack they have developed available for a wider community. All projects are now available under open source licenses. The provided software tool allow to processes and manage data at various stages of the “datacycle” from sensors to published dataset.
MeteoIO
The core element in the software package is the data preprocessor MeteoIO that takes data from the sensor, through a quality control procedure into standardised NetCDF/CF files which can be published. MeteoIO was originally developed to provide robust meteorological forcing data to an operational model that forms part of the avalanche forecast at the SLF. However, it also happens to be very good at reading diverse data sources and producing a standardised output. It has a modular architecture which makes it flexible and fast to develop new use cases. It can handle both gridded and time series data and has various functions for cleaning/ processing data to various quality standards and produces QA reports. MeteoIO is a C++ library.
MeteoIO goes through several steps for preparing the data, aiming to offer within a single package all the tools that are required to bring raw data to an end data consumer: first, the data are read by one of the more than twenty available plugins supporting that many different formats or protocols (such as CSV files, NetCDF files, databases or web services). Then some basic data editing can be performed (such as merging stations that are next to each other or renaming sensors). The data can then be filtered, by applying a stack of user selected generic filters. These filters can either remove invalid data (such as despiking or low and high pass filters) or correct the data (such as precipitation undercatch correction, debiasing, Kalman filtering). Once this is done, the data are resampled to the requested time steps by various temporal interpolations methods. It is important to keep in mind that during this whole process, MeteoIO works with any sampling rates, including variable sampling rate and can resample to any point in time. If there are still missing data points at the requested time steps, it is possible to rely on data generators to produce some data out of either parametrizations (such as converting a specific humidity into a relative humidity) or very basic strategies (such as generating null precipitation to fill gaps). Finally, either the data are forwarded to the data consuming application or written back by a user-selected plugin.
For the MeteoIO git, please click here.
EnviDat
In order to publish discovery metadata for the data prepared through MeteoIO, software developed through the EnviDat project is used. EnviDat is the WSL/SLF main CKAN based dataportal and metadata repository. Core CKAN has been extended to cover specific requirements of research data management. These include an OAI-PMH server, DOI publishing and supporting metadata standards. The advantage of CKAN is that it provides a robust and intuitive UI for structured metadata submission. This enables large parts of the data
management process to be decentralised to the submitter.
For EnviDat extensions please click here.
For further information on the CKAN project, please click here.