...

  1. [lines 75, 76] Confusion on centre-id for Sint Maarten (sx-met or sx-metservice)

Action: Secretariat to check with Sint Maarten; GBs to remove the subscription to sx-met. (centre-id: sx-metservice is the correct one.)

Sheet #2: wmo_wis2_gb_msg_received_total

...

  1. [line 3, 4] JP-GC reports no metrics for Antigua or ai-metservice (?) even though a small number of messages are sent

  2. [line 5] DE-GC reports no metric for Argentina even though messages are being sent according to GB metrics; from looking at Grafana, DE-GC appears to have gaps in its provision of metrics.

  3. [lines 17, 28, etc.] DE-GC and JP-GC often don’t report a metric (no value available, e.g., Cameroon, Guinea) whereas other GCs are able to connect; GB metrics indicate that Cameroon and Guinea are connected but (excepting US-GB) not sending any messages. How are CN-GC, UK/USA-GC and KR-GC reporting “connected” when there probably wasn’t anything to download?

  4. [line 21] UK/USA-GC can’t connect to Cyprus - confirm?

Note: “last-download” timestamp - if nothing has ever been downloaded for a given data server, the value will be null (not reported)

… Generally, metrics are only set once something has been tried. They are (mostly?) not initialised. CN-GC appears to initialise _download_total to zero even when there’s been no connection. Confirm?

Decision: we need to agree on consistent metrics, given that the current DWD and UK/USA implementations differ.

...

… so recommend that we prioritise getting metrics for GB and GC consistent first.

3-validation of discovery metadata at Global Discovery Catalogue; implementation of Global Broker “discard”

Tom presented the GitHub issue “Find the best time to set properties.metadata_id as required” (#119).

(Rémy) Currently, properties.metadata_id is not mandatory. During the ET-WISOP kick-off meeting, we announced that metadata validation will be enabled by 1st September 2025. He highlighted that if we are going to change the regulations, we need to present the change to INFCOM in 2026 for approval and then to EC, which means it would be done in 2027, which is too long. He proposed another approach: using the channel in the metadata record.

(Kai) We should enforce the metadata in WIS2 exchange. The issue is that if there's an error in a GDC, it could disrupt data exchange.

(Jeremy) On the first part: because the GB will need to look things up in the GDC, if a metadata record is broken (or if the metadata fails validation by a GDC), the channel will be blocked by the GB. Then the data will be missing.

(Rémy) We cannot use the metadata ID starting from 1st September 2025. There's a risk that issues with one GDC could impact data sharing. To improve reliability, we need to be more flexible rather than too strict.

Rémy proposed to combine the truth from all GDCs and create a reference based on that on GitHub.

Find the best time to set properties.metadata_id as required (wmo-im/wis2-notification-message/#119) https://github.com/wmo-im/wis2-notification-message/issues/119

  • Current situation: from 1-Sep-2025 Global Brokers will check that a valid discovery metadata record exists for data relating to any WIS2 Notification Messages published via the GB. This is done by comparing the MQTT “channel” on which the WNM is published against a list of all channels harvested from discovery metadata published in the Global Discovery Catalogue.

  • This isn’t a fool-proof check; the GB cannot distinguish between datasets if a WIS2 Node is publishing notifications about more than one dataset on the same channel, i.e., the presence of one discovery metadata record would be sufficient for the GB to approve/republish notifications from all those datasets. That said, this is a rare edge-case.

  • Proposal: make inclusion of properties.metadata_id MANDATORY in the WNM specification.

  • The proposal is a breaking change, therefore needs an appropriate level of visibility and approval. Consequently, the proposal will be submitted to INFCOM-4.
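The channel check described under “Current situation” can be sketched as follows. This is only an illustration, not any Global Broker's actual implementation: the record layout (a dict with "id" and "channels"), the function names, the "xx-example" centre-id and the rule of treating "origin/" and "cache/" prefixes as equivalent are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set


def normalise(channel: str) -> str:
    """Drop a leading 'origin/' or 'cache/' so both forms compare equal (assumption)."""
    for prefix in ("origin/", "cache/"):
        if channel.startswith(prefix):
            return channel[len(prefix):]
    return channel


def build_allow_list(records: Iterable[Dict]) -> Set[str]:
    """Collect every channel advertised by the harvested discovery metadata records."""
    allowed: Set[str] = set()
    for record in records:
        for channel in record.get("channels", []):
            allowed.add(normalise(channel))
    return allowed


def should_republish(wnm_channel: str, allowed: Set[str]) -> bool:
    """True if the channel a WNM arrived on matches a channel seen in discovery metadata."""
    return normalise(wnm_channel) in allowed


# Hypothetical harvested records: one dataset, one channel.
records: List[Dict] = [
    {"id": "urn:wmo:md:xx-example:synop",
     "channels": ["origin/a/wis2/xx-example/data/core/weather/surface-based-observations/synop"]},
]
allowed = build_allow_list(records)
```

Because the lookup is keyed on the channel alone, a second dataset published on the same channel would pass the check without its own metadata record, which is the edge-case noted above.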

Concerns with the current situation:

1/ The GDC is being used to configure whitelists for real-time data exchange. This raises the expectations of resilience for GDCs to that of an operational component.

  • First, a GDC may corrupt a record through bad processing.

  • Second, a GDC may not be available when the GBs request the list of valid channels (FR-GB caches the list for 48 hours, so GDC availability isn’t a big problem, except that a cached list would not include any entries added since it was retrieved).

Recommendation:

  1. GBs will use a “composite” list of valid channels compiled from all three GDCs, i.e., the superset of valid channels - if a channel is reported by one or more GDCs it will appear in the list used by GBs. Implementation details to be agreed. 
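As the implementation details are still to be agreed, the following is only a minimal sketch of the “superset” idea: a channel is valid if at least one GDC reports it, and a GDC that could not be reached is skipped. The function name, the use of None to mean “unavailable”, and the GDC labels other than CA-GDC are placeholders.

```python
from typing import Dict, Optional, Set


def composite_channels(per_gdc: Dict[str, Optional[Set[str]]]) -> Set[str]:
    """Union of the channel lists from every GDC that answered.

    A value of None means that GDC was unavailable (assumption); it is
    skipped so that one failing GDC cannot shrink the composite list.
    """
    composite: Set[str] = set()
    for channels in per_gdc.values():
        if channels is None:
            continue
        composite |= channels
    return composite


# Hypothetical harvest results from three GDCs.
harvest = {
    "ca-gdc": {"a/wis2/xx-example/data/core/weather", "a/wis2/yy-example/data/core/ocean"},
    "gdc-2": {"a/wis2/xx-example/data/core/weather"},  # lost one record
    "gdc-3": None,                                     # unavailable at harvest time
}
valid = composite_channels(harvest)
```

With a union, a record corrupted or missing at one GDC does not remove the channel as long as another GDC still reports it.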

2/ A WIS2 Node may not be aware of a broken linkage between discovery metadata (and the MQTT channel described therein) and the approved list of channels used by GBs, which would result in real-time notification messages and data exchange being blocked.

Recommendations:

  1. A WIS2 Node may choose to validate their discovery metadata prior to publication to ensure that it will be correctly parsed (CA-GDC provides a validator service).

  2. A WIS2 Node should publish discovery metadata at least 24 hours prior to starting real-time data exchange _and_ subscribe to “monitor” messages published by the GDCs (e.g., “wmo_wis2_gdc_kpi_percentage_total”) to determine that the discovery metadata has been successfully published.

  3. WIS2 Node IT Operations should include monitoring the “wmo_wis2_gb_messages_no_metadata_total” metric for their “centre-id”; GBs will increment this when the linkage is not found. Any increases in this metric should be investigated and the causes resolved.
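The monitoring in recommendation 3 amounts to comparing two readings of the “wmo_wis2_gb_messages_no_metadata_total” counter per centre-id. The sketch below assumes the readings have already been collected into plain dicts; the function name and the sample values are hypothetical, and a drop in the value is treated as a counter restart.

```python
from typing import Dict


def no_metadata_increases(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Return the centre-ids whose counter grew between two readings, with the growth.

    A centre-id absent from `previous` is treated as starting from zero, since
    metrics are generally not reported until something has happened. If the
    value went down, the counter is assumed to have restarted from zero.
    """
    increases: Dict[str, int] = {}
    for centre_id, value in current.items():
        before = previous.get(centre_id, 0)
        delta = value - before if value >= before else value
        if delta > 0:
            increases[centre_id] = delta
    return increases


# Hypothetical readings taken some time apart.
earlier = {"sx-metservice": 3, "xx-example": 10}
later = {"sx-metservice": 3, "xx-example": 12, "yy-example": 1}
to_investigate = no_metadata_increases(earlier, later)
```

Operators running their own Prometheus instance could obtain the same information with the PromQL increase() function over a chosen time window.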

Actions:

  1. GM (Morocco) and GM (China) to share the GM broker endpoint with the Secretariat.

  2. Kari and Steve to follow up to ensure more consistent GB (US) metrics reporting.

  3. All to go through the list shared in the email in order to arrive at consistent metrics.

WIS2-Recommendations:

  1. Each Global Service operator should run their own Prometheus instance so that they can monitor service performance using a time-series to compare with prior periods.

  2. Each Global Service operator should have IT-Ops procedures in place to identify issues arising (e.g., failure to connect to upstream Node), diagnose faults and remedy the issue.

Next meeting

10 March