🗓 Date
13:00-15:00 UTC
👥 Participants
ET-W2AT
Rémy GIRAUD
Jeremy TANDY
Tom KRALIDIS
Experts
Kai Wirt
Chems eddine ELGARRAI
Max Marno (Synoptic)
WMO Secretariat
Hassan Haddouch
Maaike Limper
Anna Milan
🗣 Discussion topics
No | Notes |
---|---|
1 | Jeremy: At the stress test in Japan in May, will all the GS operators have implemented the metrics? Rémy: Currently, GB (Brazil, France) and GC (Germany) already provide the metrics. Jeremy: Max from Synoptic is now preparing the metrics (GC-USA & UK).
Notice board of the global monitor, not to ask WIS2 users.
Jeremy proposed discussing the connection between alerts and the ticketing system at the stress test in Japan in May. However, Rémy emphasised that the objective of the stress test in Japan is to check whether the Global Services are operating at a decent level and doing their jobs properly.
(Kai) Who raises the alert? Preference: automatic alerting using Alert Manager. We need to define what the GM should alert on. Other GS should not create such alerts.
(Rémy) The GM raises the alert. GS may raise alerts, but this is not a must.
Criteria for the GM to raise an alert. Key question: what is the threshold of the metrics at which the GM raises an alert? Kai proposed to go through all the metrics offline to define the
(Kai) If the GM and a GS raise the same alert, that is good: both have detected the same thing.
Involving China (CMA) in the GM: (Jeremy) In May, we may do the performance test. We should get China on board to do the GM. (Action) Xiaoxia to contact CMA colleagues about their GM plan.
(Maaike) What should be communicated to the GM (China)?
(Rémy) We will first finalize 1) what the potential alerts are; the next step is 2) for the GM to implement the alerts. The meeting is to define the overall architecture to keep track of the alerts we want to raise. |
2 | Metrics hierarchy: https://github.com/wmo-im/wis2-metric-hierarchy/tree/main/metric-hierarchy
Tom proposed starting with an extensible data schema and then refining it. To look at: the metrics, the levels of the metrics, and the thresholds at which alerts are raised (triggered by the GM). (Max) Reset period for the metrics, for example: total download errors.
Levels of metrics/thresholds for alerting
(Max) To share an existing standard that we can follow: https://docs.python.org/3/library/logging.html#levels Level and duration.
Alert Manager rules: (Kai) Do we agree to use Alert Manager? To install the rules. For each listed metric, we create rules with severity levels, grouped by type of Global Service rather than by metric. JMA may have some available. (Secretariat) No metrics received from JMA. (Action) GB: Rémy; GC: Kai and Max; GDC: Tom |
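The "level and duration" idea from topic 2 could be sketched roughly as follows, reusing the Python logging levels Max pointed to. This is only an illustration and not an agreed design: the `Rule` class, the `evaluate` function, and all threshold and duration values are hypothetical. The thresholds themselves are still to be defined offline, and the sketch is independent of whichever alerting tool is finally used.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    level: int        # a Python logging level, e.g. logging.WARNING
    threshold: float  # metric value at or above which the rule matches
    duration_s: int   # how long the value must stay at or above the threshold


# Hypothetical rules for a hypothetical "download errors" metric.
# The numbers are placeholders, not values agreed in the meeting.
RULES = [
    Rule(logging.CRITICAL, threshold=100, duration_s=300),
    Rule(logging.ERROR, threshold=50, duration_s=300),
    Rule(logging.WARNING, threshold=10, duration_s=600),
]


def evaluate(samples, rules=RULES):
    """Return the highest logging level whose rule holds, or None.

    samples: list of (timestamp_s, value) pairs ordered by time.
    A rule holds when the history covers its full duration and every
    sample inside that window is at or above the rule's threshold.
    """
    if not samples:
        return None
    now = samples[-1][0]
    for rule in sorted(rules, key=lambda r: r.level, reverse=True):
        window_start = now - rule.duration_s
        covers_window = samples[0][0] <= window_start
        window = [value for t, value in samples if t >= window_start]
        if covers_window and all(v >= rule.threshold for v in window):
            return rule.level
    return None
```

For example, with one sample per minute over ten minutes and a constant value of 60, `evaluate` returns `logging.ERROR`: the value exceeds the hypothetical ERROR threshold for the full 300 seconds but never reaches the CRITICAL one. Requiring a duration as well as a level keeps a single short spike from raising an alert.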
✅ Action items
- Xiaoxia to contact XUE Lei (NFP on WIS matters for China) for their GM plan