2022-02-02 ET-W2AT Meeting
Date
Feb 2, 2022 13:00-15:00 UTC
Participants
ET-W2AT
@Jeremy Tandy
@Rémy Giraud
@Dana Ostrenga
@thorsten.buesselberg
@Kai Wirt
@Henning Weber
@Tom Kralidis
@peter.silva
@Ken Tsunoda
@Li Xiang
@Baudouin Raoult
WMO Secretariat
@Peiliang Shi
@Enrico Fucile
@HADDOUCH Hassan
@Timo Proescholdt
@Anna Milan
@Xiaoxia Chen
Goals
To discuss the MQP protocol and the list of core shared services, and to reach agreement on both
Discussion topics
Item | Presenter | Notes |
---|---|---|
| | Jeremy, Rémy | Jeremy recalled the key decisions on shared services taken in the weekly meetings and highlighted the need to approve them in this meeting.<br><br>MQP protocol<br>Rémy> Enrico created a page for decisions on protocols.<br>Jeremy> We specify two protocols: MQTT 3.1 and MQTT 5.<br>Henning> Is MQTT only for small messages? Is there any limitation in MQTT?<br>Rémy> No limitation in MQTT 5.<br>Baudouin> Are the two MQTT versions interoperable?<br>Tom> Peter, Enrico and I are working on the MQP topic structure. We expect to have a first draft in the coming months. It would be good to make it a Monday meeting topic, starting with the alignment of granularity between topics and metadata/datasets.<br>Rémy> Topic to be discussed at the Monday meetings.<br>Action: Align the topic structure with the metadata structure for datasets; put topics and metadata on the agenda for discussion.<br>Kai> We need to consider the headers in the filenames and link the filenames to some of the metadata.<br>Rémy> The global file naming convention was defined for the GTS; we need to review it in the context of WIS2.<br>Tom> Discovery metadata, pub/sub, providing notification of new files.<br>Enrico> The WIS2 topic principles are under discussion.<br><br>Decisions:<br>Pending discussions: |
2. Discussion on Shared Services | ALL | Shared services approach<br>A description of shared services is available at https://wmo-teams.atlassian.net/wiki/spaces/WIS2/pages/306970677<br>Rémy> The previous meetings discussed the concept of shared services. This meeting aims to reach agreement on the concept and to discuss the list of core shared services.<br>The team agreed on the concept of shared services in WIS2.0:<br>Baudouin> The abbreviation GC is ambiguous: it is used for both Global Catalogue and Global Cache.<br>Li Xiang> Suggest using Global Discovery Catalogue.<br>Baudouin> If it is the GISC republishing messages, then the GISC will use its own URLs, so it is the responsibility of the GISC to make sure the URLs resolve.<br>Rémy> We should avoid using the term GISC. We are talking about a shared service; who will run it is not yet decided. The main point is to provide a point of aggregation so that people can subscribe in one place, and so that NC/DCPC sources are not "hammered" by 193 WMO Members and everyone else.<br>Tom> TT-WIS-Metadata is working on the draft WCMP 2.0 document, to be available in 2-3 weeks. OGC API - Records serves as the baseline for the catalogue protocol and metadata standard.<br>Action: Discuss the metadata protocol at next Monday's meeting.<br>Action: Tom to share the links for OGC API - Records and WCMP 2.0.<br>Hassan> Test them out in the pilot projects and add deployments of the WIS2 node in a box. Involve TT-GISC from the beginning, and involve the ET and TT in developing the details of each shared-service component to complete the architecture.<br><br>Summary:<br>Decision: Shared services approach approved by the team (Global Cache, Global Broker and Global Discovery Catalogue).<br>Pending discussion: |
3. Discussion on NC/DCPC connection with the shared services | ALL | How many instances are there of each shared service? Do you agree that NC/DCPC will connect to more than one instance of a shared service?<br>Jeremy> What about connectivity between shared-service instances, and with NC/DCPC? Should an NC/DCPC connect to more than one instance of a shared service? What is the minimum number of connections between shared-service instances?<br>Rémy> We have not yet agreed how many instances of the shared services we will have, so let's pause on that. We do know that NC/DCPC will want their data to be available through the shared services. We learned from WIS1 that NC/DCPC should be able to publish messages/data at least twice so that they do not get lost, while avoiding tight coupling between an NC/DCPC and a GISC. NC/DCPC depend on "shared services", which will most likely be provided by GISCs.<br>Timo> We can make synchronisation issues go away if NC/DCPC publish to all instances.<br>Rémy> No cache synchronization. Brokers should see all the messages. The Global Cache (GC) will just download data made available by the brokers.<br>Henning> No hard limit on the number of instances of shared services.<br>Kai> Agrees that communication between shared-service instances is needed; this will mitigate system failures.<br>Jeremy> Politically, not everyone wants to "talk" directly to each other, so we need intermediaries.<br>Decision: No requirement for NC/DCPC to connect to ALL instances of a shared-service.<br>Kai> Are there cases where an NC may not have the capability (or capacity) to publish to two instances?<br>Decision: NC/DCPC MUST connect to at least one instance of a shared-service, and SHOULD connect to two or more instances of a shared-service.<br>Timo> If there are more than 3 GBs and an NC connects to at most 2, the question is how the notifications get to the other GBs. One obvious way is for GBs to re-publish notifications. This makes the system more complicated (and likely requires non-standard components in the GB). We need to avoid infinite loops.<br>Decision: NC/DCPC connects directly to the shared-service instance(s), not via their GISC.<br>Jeremy> Inter-connectivity between shared-service instances … All-2-All, fully meshed, G=3 etc.<br>Baudouin> Unidata IDD has been avoiding circular re-publication for years: https://www.unidata.ucar.edu/projects/idd/ldmfaq.html ; the topology is here: https://rtstats.unidata.ucar.edu/rtstats/<br>Henning> Do not make explicit restrictions on the number of instances, and be clear on the expectation that we will have a small number of high-quality instances. We therefore need to work out the process for selecting those that host a shared service, e.g. quality gates, performance.<br>Rémy> Not all shared services are equal. An NC connecting to multiple caches to get the data gets around the problem of poor-quality cache instances. The main concern is the Global Broker, which needs to be highly performant. We need to avoid "prestige" being the reason to offer a Global Broker. We have audits, but we know from experience that it is difficult to "kick out" underperforming GISCs.<br>Jeremy> Service performance will be publicly shared; "red blobs" on maps are a motivator for Members to improve performance (or at least to resolve the performance issues).<br>Hassan> ET-AC can use the Audit and Certification process.<br>Peiliang> Monitoring will be more effective than audits.<br>Jeremy> Audit is important.<br>Peter> Audit effectiveness is important.<br>Henning> Rather than taking political issues into account, we should use a technical solution for data usage.<br>Kai> A service registry (global control centre): the solution is to establish the connections automatically.<br>Peiliang> An algorithm to optimize the connectivity sounds great. What we need to consider is having a monitoring system in place to monitor daily performance. A report can then be presented at the end of the year showing the state of the global infrastructure.<br>Kai> Data monitoring or service monitoring? The two should be distinguished.<br>Rémy> We need to define the various metrics using the same approach.<br>Timo> The use of standards and a service-oriented architecture makes it technically easy to replace one component with another. For example, an NC plugs into another GC and GB, whose addresses it obtains from a registry. Only components that are working (as per monitoring) are listed in the registry. Since we have standardized the MQP, the message schema and possibly the download schema, GBs and GCs are interchangeable.<br><br>Decisions approved:<br>Pending discussions: |
4. Discussion on Global Cache connectivity | ALL | Rémy> No need for Global Caches to be connected to each other. Synchronization is needed only between Global Brokers, not between Global Caches.<br>Kai> We should not forget about the structure of the connections (not only their number). We need to avoid having two halves of the brokers connected by only one link.<br>Peiliang> We need to ensure that centres are connected by at least a "spanning tree".<br>Pending discussion: Define the minimum number of connections ("G") for each type of shared service (broker, catalogue, cache, monitor). |
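
The two structural concerns raised under item 4 (every centre reachable through at least a spanning tree, and no two halves joined by a single link) can be checked mechanically once a topology is proposed. The Python sketch below is only an illustration under stated assumptions: the broker names and links are hypothetical, the meeting did not define any topology, and the brute-force bridge search is chosen for readability because only a small number of instances is expected.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected adjacency map from (broker, broker) pairs."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def is_connected(nodes, graph):
    """True if every node is reachable from the first one, i.e. a spanning tree exists."""
    nodes = list(nodes)
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for peer in graph[stack.pop()]:
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen == set(nodes)


def find_bridges(nodes, links):
    """Return the links whose removal would split the brokers into disconnected parts."""
    bridges = []
    for link in links:
        remaining = [other for other in links if other != link]
        if not is_connected(nodes, build_graph(remaining)):
            bridges.append(link)
    return bridges


# Hypothetical topology: three meshed brokers plus one attached by a single link.
brokers = ["gb-a", "gb-b", "gb-c", "gb-d"]
links = [("gb-a", "gb-b"), ("gb-b", "gb-c"), ("gb-c", "gb-a"), ("gb-c", "gb-d")]

print(is_connected(brokers, build_graph(links)))  # True
print(find_bridges(brokers, links))               # [('gb-c', 'gb-d')]
```

In this hypothetical layout every broker is reachable, but `gb-d` depends on one link, which is the situation Kai asked to avoid. Adding a second link for `gb-d` would leave the bridge list empty.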
Action items
Decisions
- Data will always be made available via an MQ message that points to where the data can be downloaded, with an optimisation to embed small data in the message [approved]
- MQTT 3.1 and MQTT 5 are adopted for WIS2.0
- Shared services approach approved by the team (Global Cache, Global Broker and Global Discovery Catalogue)
- No requirement for NC/DCPC to connect to ALL instances of a shared-service
- NC/DCPC MUST connect to at least one instance of a shared service, and SHOULD connect to two or more instances of a shared-service
- NC/DCPC connects directly to the shared-service instance(s) - not via their GISC
- No requirement for an all-to-all, fully meshed connection
- There must be more than one instance of each shared service
- Global Monitoring approved as shared service
- There needs to be automated service monitoring of the WIS2 system from day 1
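
The first decision above (a message that always points to the data, with small data optionally embedded) could look roughly like the Python sketch below. This is an assumption-laden illustration, not an agreed format: the field names (`url`, `size`, `sha256`, `content`), the 4096-byte threshold and the example URL are all hypothetical, since the meeting fixed neither a message schema nor a size limit.

```python
import base64
import hashlib
import json

# Hypothetical cut-off below which data is embedded; the meeting set no value.
EMBED_THRESHOLD_BYTES = 4096


def build_notification(data: bytes, download_url: str) -> str:
    """Return a JSON notification that always links to the data and embeds it when small."""
    message = {
        "url": download_url,  # the download link is always present
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # lets subscribers verify the download
    }
    if len(data) <= EMBED_THRESHOLD_BYTES:
        # Optimisation: small payloads travel inside the message itself.
        message["content"] = base64.b64encode(data).decode("ascii")
    return json.dumps(message)


# Hypothetical usage: a small observation is embedded, a large file is only linked.
small = json.loads(build_notification(b"short report", "https://example.org/data/obs-1"))
large = json.loads(build_notification(b"\x00" * 100_000, "https://example.org/data/grid-1"))

print("content" in small)  # True
print("content" in large)  # False
```

Keeping the link in every message means a subscriber that ignores the embedded copy still works, which matches the decision that embedding is an optimisation rather than a separate delivery path.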