#William# Information Management - Lifecycle

All data/information (D/I) should be subject to a defined lifecycle, extending from before the information is first created, through its use as operational information, to its latency date, at which point it is either retained or disposed of. The stages are described in more detail in the following sections; what follows is a brief description of the D/I lifecycle, framed as a number of basic questions and comments about each stage.

It should be noted that the scope of these guidelines specifically excludes discussion of underlying observations or, in the case of (for instance) NWP data and forecasts, the underlying methodologies for generating the information, except to say that these have their own documentation requirements. The discussion here refers to D/I at the aggregate level: datasets rather than individual values or measurements, and collections of information such as sets of reports or assessments (one might, for instance, think of the aggregate of Stewardship Maturity Matrix assessments as such a collection). [WW1]

 

 

Aspects of the Data and Information lifecycle process:

Creation: What data/information is to be generated, and how? In the case of acquired D/I, who is the provider, and what understandings, contractual agreements, licensing provisions, etc. are in place for the provision of the data/information? Is there an agreed Service Level Agreement? At the operational level: what procedures are needed to ensure the correct receipt of the D/I, including checks that the D/I have been received and are not corrupted? What are the feedback processes for communicating problems with the D/I? Further detail is given in the chapter on “Creation”. Note also the recommendation, under “Data/Information Plans” below, that a D/I Management Plan be developed or in place before the D/I is created or acquired.

(It should also be noted that the rationale for the creation/acquisition of the D/I should be communicated; in fact, many entities will require a case to be presented before investing in new infrastructure to acquire, manage and disseminate the D/I.)
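As an illustration of the receipt checks mentioned under Creation, the following is a minimal sketch, assuming the provider publishes a SHA-256 checksum alongside each delivered file. That arrangement, and the names `sha256_of` and `check_receipt`, are assumptions made for this example and are not prescribed by this guidance.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_receipt(path: Path, expected_sha256: str) -> str:
    """Classify a delivery as 'missing', 'corrupted' or 'ok'.

    expected_sha256 is the checksum supplied by the provider
    (an assumed arrangement for this sketch).
    """
    if not path.is_file():
        return "missing"
    if sha256_of(path) != expected_sha256.lower():
        return "corrupted"
    return "ok"
```

A result of "missing" or "corrupted" would then be reported to the provider through whatever feedback process has been agreed.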

Representation and Metadata: What metadata is required with the D/I to ensure users are appropriately informed about its suitability for their purposes? This applies to both the original D/I and any post-processed products derived from the D/I. In what formats will the D/I be received by the D/I management system, and are there programs in place to ingest, decode and format the data into the system? It is recommended that, as far as possible, data formats conform to generally accepted (WMO, ISO, community, etc.) formats. A system for versioning the D/I is also important.
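To make the points about metadata and versioning concrete, the sketch below shows one possible dataset-level record. The class `DatasetRecord`, its fields and the "major.minor" numbering scheme are all hypothetical choices for illustration; they are not taken from any WMO or ISO metadata standard.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical dataset-level metadata; field names are illustrative only."""
    title: str
    provider: str
    data_format: str      # e.g. "BUFR" or "NetCDF"
    licence: str
    version: str          # assumed "major.minor" scheme
    issued: date
    change_note: str = ""


def new_version(record: DatasetRecord, issued: date, change_note: str,
                major: bool = False) -> DatasetRecord:
    """Return a new record with the version bumped; the original is unchanged."""
    major_n, minor_n = (int(part) for part in record.version.split("."))
    version = f"{major_n + 1}.0" if major else f"{major_n}.{minor_n + 1}"
    return replace(record, version=version, issued=issued, change_note=change_note)
```

Keeping each record immutable means earlier versions remain available, so users can be told exactly what changed between releases.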

Processing: The methodology used to process the D/I, including how data are transformed into information, should be documented and retained so that, if required, the information can be reproduced from the underlying D/I.

Publication/Exchange: Any project aiming to acquire or create D/I needs a clear plan for how the D/I will be accessed. This includes consideration of how the D/I will be stored (operational D/I via prompt transmission and temporary storage on servers; less urgent D/I via tape, etc.). Will there be open publication to the Internet, or will access be restricted, and why? Will a specific front-end GUI be developed, or will Cloud-based tools be employed? In what formats will the D/I be made available (as above: WMO, ISO, community, etc. formats preferred)? What are the applicable licensing and on-forwarding restrictions? It is also important to consider the visibility of the D/I via catalogues and discovery metadata. These considerations are covered in much more detail in Section….

Archival/Caching and Disposal: Once the D/I have passed their immediate operational usefulness, decisions and policies are required as to how long to retain copies of the D/I. These may not be straightforward decisions. For instance, general forecast data may not need to be retained beyond weeks or months; however, in the event of a catastrophic weather/climate event causing death or destruction, or a “historical” event, the relevant documentation may be required for legal, research and/or historical purposes, and much longer retention periods may therefore be required.

It is recommended that NMHSs establish a retention policy for all meteorological/climate/hydrological D/I. Such a policy should be based on input from Government, operational and user stakeholders, and should form part of a D/I Plan as described below. Specifically, the retention policy should take into consideration:

  • Operational or likely operational needs;

  • Government retention guidelines or policies, or other legally-binding constraints;

  • Reproducibility of the D/I (irreproducible data such as original weather/climate observations should be retained in perpetuity, whereas NMHSs should consider whether to retain D/I created by software that is obsolete);

  • Cost of storage, extraction/reproduction;

  • User requirements (preferably based on consultation with stakeholder groups);

  • Arrangements with partners (including reciprocal arrangements with other NMHSs or Cloud service providers).

Finally, when a decision is made to dispose of D/I, the specific methodology should be defined, and a process should be in place to ensure that the disposal is signed off and carried out.
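The retention considerations listed above could be encoded as a simple rule check, as in the minimal sketch below. The class `HeldItem`, its fields and the order in which the rules are applied are assumptions made for illustration; an actual policy would be set by the NMHS and its stakeholders. Note that the function only flags items for review: it does not dispose of anything, since disposal requires sign-off.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class HeldItem:
    """Hypothetical description of one held dataset or product."""
    name: str
    created: date
    reproducible: bool                  # False for e.g. original observations
    retention_days: int                 # period set by the retention policy
    legal_or_event_hold: bool = False   # e.g. linked to a significant event


def retention_decision(item: HeldItem, today: date) -> str:
    """Return 'retain' or 'review for disposal' for one item."""
    if not item.reproducible:
        return "retain"                 # irreproducible D/I are kept in perpetuity
    if item.legal_or_event_hold:
        return "retain"                 # legal, research or historical need
    if today - item.created <= timedelta(days=item.retention_days):
        return "retain"                 # still within the agreed period
    return "review for disposal"
```

For example, a routine forecast product created more than `retention_days` ago, with no hold on it, would be returned as "review for disposal", while original observations would always be retained.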

Usage: It has been said that D/I have no value unless they inform users and/or can influence decisions. Ensuring that the end user can make effective use of D/I is therefore an important final step in the Lifecycle chain.

Firstly, apart from ensuring that users have ready access to the D/I, it is important that guidance is provided on how to access and use the D/I, along with any caveats or limitations. As mentioned above, information on costing (where appropriate), licensing, etc. needs to be provided. Similarly, there needs to be a clearly communicated mechanism for advising users about changes to the D/I, whether these arise from Quality Control updates or from newer versions of the D/I. It is also strongly recommended that users be given the opportunity to pose questions to the D/I providers and to provide feedback where, for instance, data veracity is queried[1]. This may result in earlier stages of the Lifecycle being revisited, possibly extending back as far as the observation process that created the data on which the D/I creation process depended.

Finally, there are a number of overarching requirements that apply to some or all phases of the D/I lifecycle, including governance, documentation, and competencies:

  • Governance may be defined as a set of procedures, policies and approval processes, along with accountabilities and compliance mechanisms, for ensuring the D/I are secure, accessible and usable. It is strongly recommended that NMHSs establish a board or leadership group to develop such a governance structure, and ensure compliance with its requirements.

  • Documentation. Providing detail on how D/I was developed or acquired was mentioned earlier, but such documentation should apply to all aspects of operations along the D/I lifecycle. It should be clear, well communicated and easy to find. Experience has shown that documentation of procedures, etc. is frequently not prioritised; however, the time invested in this very important step can avoid major problems in future, should questions arise about the provenance or integrity of the D/I, or should key staff leave the NMHS.

A persistent identifier, such as a digital object identifier (DOI), should be issued for the D/I. [WW2]

 

 

  • Competencies. D/I management is, or should be, to some extent part of the role of anyone who creates or acquires D/I. However, the overall task of managing D/I across an organisation is a specialist role, and staff employed to undertake it need to be able to demonstrate a core set of competencies. These range from a basic knowledge of the D/I to be managed, through knowledge of the various stages of the D/I lifecycle and of the organisation’s ingest and storage systems, to knowledge of D/I-related policies, including relevant national legislation and international standards on, for instance, retention and D/I exchange. An ability to liaise with both the providers and the users of D/I is also required. The overall set of skills would typically be spread across the members of a D/I management section or team. Further details on competencies will be provided at a later stage in this Guidance.

 

Data/Information Plans

It is strongly recommended that, before any additional D/I is created or accepted for curation by the NMHS, a well-defined Data/Information Plan be created, documented and agreed by all stakeholders[2]. Such a plan should cover what the D/I is and how and why it is created, and should include details of resourcing, accountability and management responsibilities at the various stages of the D/I Lifecycle. It should cover storage requirements (including whether information will be cloud-based), and should reference the other documents that describe the policies and procedures for managing the information at all stages, including D/I provision and licensing. Such a plan should have the express approval of the NMHS leadership, so that authorisation and resourcing for the D/I are guaranteed.

The plan could take the form of a survey for potential data/information creators or acquirers, who may need to consult with other relevant personnel within the NMHS. Such a plan should be simple for, e.g., basic data that has no security, commercial or other restrictions (such as D/I available under WMO No 62), but more complex for D/I that may be (a) expensive to create or manage, or (b) constrained by security, privacy or other legislative requirements. It is thus essential that NMHS D/I managers be familiar with their country’s legislative and other requirements for D/I access, privacy, retention policy, etc.

An example of a survey for such a Data/Information Plan is provided in Appendix … This survey is designed to be used for all types of data/information, with more sensitive D/I subject to additional questions.

 

  • The granularity of D/I: for data, the appropriate level is probably a dataset or data product, rather than an individual data value or measurement. What does this mean for information? A dataset-level metadata record? Can SMM-CD assessments be treated as one type of information that requires a lifecycle approach to management?

 

Attached file: Information Management Life Cycle_vs2_200721.docx (Microsoft Word Document), modified Jul 27, 2021 by Xiaoxia Chen

[1] A caution here is that some users may have malevolent intentions. In Australia there was a case of a voluntary observer who queried the Quality Control processes that resulted in some of his data being flagged as suspect or wrong, claiming that this hurt his credibility within the community. This led to an extensive investigation, which revealed that the data provided were indeed sometimes in error, while other measurements could be neither confirmed nor shown to be erroneous.

[2] Stakeholders include the providers and managers of the D/I, and representatives of whoever is tasked with supplying the D/I to users.


[WW1] Note that we will need to be clear and consistent on this question of granularity, sets and collections in our definitions section.

[WW2] Discussion point: at what point?