2022-03-21 ET-W2AT Meeting

 Date

Mar 21, 2022

 Participants

  • @Rémy Giraud

  • @Jeremy Tandy

  • @Tom Kralidis

  • @peter.silva

  • @Baudouin Raoult

  • @Kai Wirt

  • @Henning Weber

  • @thorsten.buesselberg

Other Experts

  • @Kari Sheets

WMO Secretariat

  • @Peiliang Shi

  • @Enrico Fucile

  • @HADDOUCH Hassan

  • @Xiaoxia Chen

  • @Timo Proescholdt

  • @David Berry

Apologies

  • @Pablo Loyber

  • @Kenji Tsunoda

  • @Dana Ostrenga

  • @sabai.fatima

 Goals

  • How to deal with monitoring

 Discussion topics

Time

Item

Presenter

Notes

1

WIS monitoring

Kai, Thorsten

Kai presents his slides on WIS Monitoring; same as for the joint meeting with ET-WDQMS:

  • two types of monitoring:

    • data monitoring (WDQMS and GBON)

    • services monitoring for checking performance of WIS2 nodes [and shared services]

  • proposal to use Prometheus for this purpose

  • need to define the metrics that we capture for each type

  • [example message from current system]

  • suggest moving from [bespoke] JSON format to a GeoJSON dialect

  • GeoJSON is an intermediate format

  • need some representation/definition of the expected schedule of GBON data

  • openmetrics

  • recommend aggregations for the metrics; base_count, data_count, percentage:

    • 1) by country

    • 2) by station ID - can include lat-lon position

  • metrics can be automatically displayed on a map in Grafana; e.g. red/green dots, heatmaps etc.

  • UX provided [in Grafana] to allow people to dig into the data

  • Questions:

    • 1) how to define the baseline for GBON - which stations are included; what about hourly observations - do we expect one observation per hour per station?

    • 2) what data should we add? Currently works with OSCAR/Surface; what about satellite data?

    • 3) Is aggregation by country sufficient? If not - then how should we aggregate?

  • Recommendations

  1. use GeoJSON at sensor centres - still need to agree on Properties within the GeoJSON doc

  2. Sensor centres expose metrics [as openmetrics] - still need to agree on metrics

  3. (service) metrics exposed directly - or have sensor centres provide the aggregate via openmetrics
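A minimal sketch of the proposed aggregation (base_count, data_count, percentage by country) and its exposure in the style of the OpenMetrics text format. The record fields (`country`, `station_id`, `expected`, `received`), the station IDs and the `wis_` metric prefix are assumptions made for illustration only; the metrics themselves were still to be agreed.

```python
from collections import defaultdict

# Hypothetical per-station records; field names and values are illustrative only.
RECORDS = [
    {"country": "DEU", "station_id": "STN001", "expected": 24, "received": 24},
    {"country": "DEU", "station_id": "STN002", "expected": 24, "received": 18},
    {"country": "FRA", "station_id": "STN003", "expected": 24, "received": 12},
]


def aggregate_by_country(records):
    """Sum expected (base_count) and received (data_count) reports per country."""
    totals = defaultdict(lambda: {"base_count": 0, "data_count": 0})
    for rec in records:
        totals[rec["country"]]["base_count"] += rec["expected"]
        totals[rec["country"]]["data_count"] += rec["received"]
    for agg in totals.values():
        base = agg["base_count"]
        # Avoid division by zero when nothing was expected from a country.
        agg["percentage"] = 100.0 * agg["data_count"] / base if base else 0.0
    return dict(totals)


def to_openmetrics(totals, prefix="wis"):
    """Render the aggregates as gauge samples, one labelled sample per country."""
    lines = []
    for name in ("base_count", "data_count", "percentage"):
        lines.append(f"# TYPE {prefix}_{name} gauge")
        for country in sorted(totals):
            value = totals[country][name]
            lines.append(f'{prefix}_{name}{{country="{country}"}} {value}')
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_openmetrics(aggregate_by_country(RECORDS)), end="")
```

Aggregating by station ID instead of country would only change the grouping key and the label; the lat-lon position could be carried as additional labels for display on a map.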

2


All


Discussions

Remy > asks about deployment: is Prometheus deployed at the monitoring centre, with the Python code and exporters running at the sensor centres? [yes]

Who operates the sensor centres and publishes the openmetrics? Are we talking about the 4 NWP centres running the WDQMS?

 Timo > a sensor centre includes all the centres in WDQMS. You could think about sensor centres as additional centres in WDQMS; only providing information on data availability and timeliness, not reporting data quality

 Remy > so a sensor centre is where the metrics are produced, this is a generic term. WDQMS centres are a subset of these

 Enrico > we won't replace what the NWP centres are already doing for WDQMS - they're using data assimilation to assess data quality

 Remy > but the GeoJSON data is what WDQMS produces?

 Kai > we have an existing system: WDQMS … it uses CSV to provide metrics to the monitoring centre. We could ask the WDQMS to adopt GeoJSON - but is there a requirement to make them change?

 Timo > we have explored the technology and these are good options, but what are the requirements for WIS monitoring? We haven't done this work yet. We need to get on with the work to define the requirements

 Remy > should we aim for WDQMS to migrate to the common solution? Should this be our ambition? (even though it may take some time to achieve this)

 Jeremy > [checks understanding of how WDQMS works]: NWP centres report (in CSV format) detailed information about data quality, they don't produce openmetrics

 Remy > can we translate the existing CSV format into GeoJSON?

 Timo > we could harmonise reporting across WIS2 and WDQMS, but we need to be clear about the monitoring requirements to see what the most appropriate technical solution is. This might be two independent solutions for WIS2 and WDQMS

 Enrico > we also need to think how stable the metrics will be. If we generate metrics from the sensor centres, then any changes will need to be adopted by all sensor centres. If the metrics are aggregated from "information" at the monitoring centre, then the changes can be managed centrally

 Enrico > supports the need to get the requirements sorted first

 Timo > we are working on this; should have something in approximately one month. Generate the proposal in Kai's team first (TT-Monitoring?), then bring it to this group

 Kai > requirements for GTS2WIS are pretty clear. For service metrics, we need to know the GBON requirements; looking at, for example, geographic distribution and what we want to monitor for GBON

 Timo > suggests fine-grained reporting requirements are needed - e.g. did this station report; country-level aggregation is not sufficient 

Jeremy > we have a number of good options for the technology, with ability to handle geographic information, but what we need to do is determine _how_ we can use these technologies to meet the GBON requirements

Kai > can't give a timescale; need to see the requirements. The requirements for GTS2WIS are well understood, can probably do the global broker, global cache, but need to work with others (e.g. TT-GBON) to define the other requirements

Remy > this is similar to the work of TT-Protocols: GeoJSON [extracting information from the BUFR message] and openmetrics define the "protocol". Kai and team have evaluated that these are fit for purpose

 Jeremy > what level of detail do we want/need to put in the Technical Regulations?

 Enrico > be generic, with details in the Guide to WIS, e.g. define the role of Sensor Centre, use of GeoJSON (with a bit of structure), use of openmetrics. Note that specifics of both GeoJSON properties and openmetrics can be added in the Guide

 Jeremy > can define the Technical Regulations to support extensibility - e.g. that GeoJSON properties and openmetrics will change over time. Updates can be done using something like the fast-track process. Clearly signal that these GeoJSON properties and openmetrics specs will be updated

 Timo > suggests that we have a couple of templates for GeoJSON, sufficiently axiomatic, e.g. low level, to support a number of monitoring queries
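A low-level GeoJSON template of the kind suggested here might look like the sketch below. The property names (`station_id`, `report_time`, `received_time`, `source`) are purely illustrative assumptions; the GeoJSON properties had not been agreed. The helper shows one example monitoring query (timeliness) derived from such a record.

```python
import json
from datetime import datetime, timezone


def make_feature(station_id, lon, lat, report_time, received_time, source):
    """Build one GeoJSON Feature describing a single received report."""
    return {
        "type": "Feature",
        # GeoJSON positions are ordered longitude, latitude.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "station_id": station_id,        # hypothetical property
            "report_time": report_time.isoformat(),
            "received_time": received_time.isoformat(),
            "source": source,                # e.g. which service delivered the data
        },
    }


def timeliness_seconds(feature):
    """Delay in seconds between the nominal report time and reception."""
    props = feature["properties"]
    sent = datetime.fromisoformat(props["report_time"])
    got = datetime.fromisoformat(props["received_time"])
    return (got - sent).total_seconds()


if __name__ == "__main__":
    feature = make_feature(
        "STN001", 8.5, 50.0,
        datetime(2022, 3, 21, 12, 0, tzinfo=timezone.utc),
        datetime(2022, 3, 21, 12, 4, 30, tzinfo=timezone.utc),
        "global-cache",
    )
    print(json.dumps(feature, indent=2))
    print(timeliness_seconds(feature))
```

Keeping the record this low level (one feature per received report) leaves availability, timeliness and per-country aggregates to be computed later, rather than fixing them at the sensor centre.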

Looking at Kai's questions

  1. how to define the baseline for GBON - which stations are included; what about hourly observations - do we expect one observation per hour per station?

 >> TT-GBON … meeting in 10 days' time; present the proposal to them; indicate that it is their responsibility to define the required metrics [for future iterations of the WIS2 monitoring]

 Timo > this isn't urgent; initially GBON is content to use the WDQMS monitoring based on NWP assimilation

Remy > so we're asking TT-GBON whether WDQMS is sufficient?

Enrico > there are issues with the WDQMS approach based on data assimilation - because it might take some months before data from a new station is being assimilated; and also, different centres see different things based on their assimilation; better to see things in terms of data availability.

Jeremy > so what are the urgent things? Where should Kai and team focus effort on developing GeoJSON and openmetrics templates, plus who is capturing the metrics, who is exposing:

  • GTS2WIS2: looking at which data is available from GTS, which from WIS2; probably don't need to get to station-level because we won't migrate GTS to WIS2 on a station-by-station basis; migration done centre-by-centre

  • Data availability; perception from sensor centres of data available from Global Broker, Global Cache and originating centre

  • Global Broker - service performance

  • Global Cache - service performance

  • Global Catalogue - service performance

  • WIS2 node - service performance

  • Metadata quality (KPIs)

 Enrico > what about monitoring services provided by the WIS2 nodes?

Jeremy > is this urgent? Capturing these metrics will be optional

Remy > not urgent - and should be able to leverage pre-defined metrics (standard "exporters") for WIS2 node components like apache, nginx etc.

  2. (Kai questions) what data should we add? Currently works with OSCAR/Surface; what about satellite data?

  3. (Kai questions) Is aggregation by country sufficient? If not - then how should we aggregate?

 >> not ready to answer these yet; will vary by type/requirement.

 Kai's recommendations?

  1. use GeoJSON at sensor centres - still need to agree on Properties within the GeoJSON doc [agreed]

  2. Sensor centres expose metrics [as openmetrics] (as opposed to putting the geojson on pub/sub) - still need to agree on metrics [agreed - noting we will have the "information" aggregation to develop metrics distributed among sensor centres]

  3. (service) metrics exposed directly from WIS2 nodes / shared services - or have sensor centres test availability / performance of services as "blackbox test" [agreed]
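The "blackbox test" in recommendation 3 could be sketched as below: a sensor centre runs a check against a service from the outside and reports success and duration. This is an illustration under stated assumptions, not an agreed design. The metric names `probe_success` and `probe_duration_seconds` are borrowed from the Prometheus blackbox exporter; the `check` callable and the target name are hypothetical.

```python
import time
import urllib.request


def http_check(url, timeout=5):
    """Example check: True if the URL answers with a 2xx status."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return 200 <= resp.status < 300


def blackbox_probe(target, check, clock=time.monotonic):
    """Run check(target) and return (up, duration_seconds).

    Any exception raised by the check counts as the service being down.
    """
    start = clock()
    try:
        up = bool(check(target))
    except Exception:
        up = False
    return up, clock() - start


def render_probe(target, up, duration):
    """Render the probe result as two gauges in OpenMetrics-style text."""
    return (
        "# TYPE probe_success gauge\n"
        f'probe_success{{target="{target}"}} {int(up)}\n'
        "# TYPE probe_duration_seconds gauge\n"
        f'probe_duration_seconds{{target="{target}"}} {duration:.3f}\n'
        "# EOF\n"
    )


if __name__ == "__main__":
    # A stand-in check keeps the demo offline; pass http_check for a real URL.
    up, duration = blackbox_probe("global-cache-demo", lambda target: True)
    print(render_probe("global-cache-demo", up, duration), end="")
```

The probe only observes the service from outside, so it needs no cooperation from the WIS2 node or shared service being tested.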

 Action items

Next weekly meeting to resume discussions on (i) topic tree, (ii) filenames and browsable WAF end-points at Global Caches

 Decisions