Use Cases to CRUD
The process of documenting scientific datasets has changed over the last several decades with the emergence of metadata standards, and that change has not been entirely positive. Metadata experts have tried to explain documentation needs to scientists and data providers, who have their own ideas about what documentation is needed to understand and use their data. Scientists argued that the standards did not “fit” their data or their needs and, in many cases, they were right.
The resulting contentious environment was not conducive to creating high-quality documentation or independently understandable data and products. Some of this tension arose because metadata standards focused, to a large extent, on data discovery, whereas the scientists and data providers were interested in using and understanding their data and products.
The metadata community has learned from this frustrating situation. A more productive approach to metadata creation that links Use Cases with systems for Creating, Reading, Updating and Deleting (CRUD) metadata is illustrated as a series of steps in Figure 1.
Step 1. Documentation Questions and Requirements
The process starts with the documentation needs of the scientists and data providers, expressed as simple questions (the first box on the left of Figure 1). Examples of these questions:
- Do you need different documentation for different parts of your data?
- Do you need different documentation for different temporal and spatial subsets?
- Do you have datasets with multiple sources?
- Do you need to reference On-Line Resources?
- Do you need to describe a series of related granules?
- Do you need to describe many kinds of aggregations?
- Does data quality vary within the dataset?
- Do you need to track processing for multiple data sources?
- Do you need to track compliance with standards?
- Do you need to use spatial features to describe quality, like grids of quality flags?
- Do you need to explain why you did things to the data?
- Do you have datasets in multiple locations?
- Do you need to describe instruments used to make observations?
- Do you need to unambiguously identify things using your own namespace?
- Do you want to manage metadata using a relational or XML database?
- Do you want to serve metadata using a REST web service?
- Do you need to identify people in different roles?
- Do you need to keep track of user problems?
- Do you need to track requirements and plans?
- Do you need to share data with international partners?
- Do you need to describe data formats and structures?
- Do you need to track data transformations and processing?
Step 2. Metadata Content
Once the questions are known for a particular dataset, the information required to answer them can be determined. For example, if different documentation is required for different parts of a dataset, we must be able to specify which parts of the dataset particular documentation applies to. If different kinds of aggregations are required, we must have a flexible mechanism for describing aggregations. The information required to answer these questions forms the foundation for the components in the documentation system. These components can be considered atomic units that provide answers to documentation questions like those listed above. Importantly, these components (the second box in Figure 1) are information elements, independent of any metadata standard.
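As a minimal sketch of what such an atomic, standard-independent component might look like, the Python below pairs a documentation question with its answer and with the part of the dataset the answer applies to. The class name `Component`, the field names, and the part label `variable:sst` are hypothetical and chosen only for illustration; they are not taken from any metadata standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One atomic unit of documentation, independent of any metadata standard."""
    question: str                 # the documentation question this component answers
    content: str                  # the answer, stored as plain information
    applies_to: str = "dataset"   # which part of the dataset the answer covers


def components_for(components, part):
    """Return the components that apply to `part` or to the whole dataset."""
    return [c for c in components if c.applies_to in (part, "dataset")]


# Hypothetical content for an imaginary dataset.
components = [
    Component(
        "Does data quality vary within the dataset?",
        "Quality flags are stored per grid cell.",
        applies_to="variable:sst",
    ),
    Component(
        "Do you have datasets with multiple sources?",
        "Two upstream sources are merged.",
    ),
]
```

Calling `components_for(components, "variable:sst")` returns both components, while any other part receives only the dataset-wide one. This is one simple way to let different documentation apply to different parts of a dataset.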
Step 3. Standard Implementations and Guidance
Once the information components are known, we can consider how the information in those components might be represented using various standards (the third box in Figure 1). Answering this question requires 1) detailed knowledge of the array of standards that are used in documenting scientific data, 2) knowledge of how those standards are being used with a variety of datasets, analysis tools, and user groups, and 3) knowledge of the practices being used in those communities and the motivations for those practices. This is where the metadata community contributes to the process. Its members work with many standards and so develop broad knowledge and expertise. They help develop the practices that mold the standards to specific needs while extending the reach of practices that have already proven successful in other communities.
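The relationship between components and standards can be pictured as a lookup from a component to its representation in each standard. The sketch below assumes a hand-maintained table; the component names, the standard names `standard_a` and `standard_b`, and the locations are placeholders invented for illustration and do not correspond to real schema paths.

```python
# Hypothetical mapping from information components to their representation in
# each standard. All names and locations here are illustrative placeholders.
REPRESENTATIONS = {
    "lineage": {
        "standard_a": "/record/history/processStep",
        "standard_b": "global attribute 'history'",
    },
    "contact": {
        "standard_a": "/record/contact",
    },
}


def represent(component, standard):
    """Return where `component` lives in `standard`, or None if it is unmapped."""
    return REPRESENTATIONS.get(component, {}).get(standard)


def unmapped(standard):
    """List the components that have no known representation in `standard`."""
    return sorted(name for name, reps in REPRESENTATIONS.items() if standard not in reps)
```

A gap reported by `unmapped` marks a place where a community practice or extension may be needed, which is the kind of guidance this step is meant to supply.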
Step 4. Presentation
Many of the documentation questions from Step 1 can be considered use cases for the documentation and the documentation system. Use cases express user needs or requirements and describe how the system addresses those needs. They document how questions are asked and, more importantly, how the system presents answers to those questions. Once the standard representations and best practices are known, technology experts and developers create presentations of that standard content, i.e., answers to the scientific questions, using various technologies (the fourth box in Figure 1). These interfaces are used to access and present the contents of the document repository. They make asking and answering specific questions easy and intuitive. The “atomic” nature of the component approach simplifies this process because the questions and answers are specific and, if the system is designed correctly, the answers are provided just as the user needs them. This is termed “just in time” documentation.
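To illustrate how one answer can be presented through more than one technology, the sketch below renders the same question and answer either as plain text for a person or as JSON for a program. The function name `present` and the two format labels are assumptions made for this example, not part of any described system.

```python
import json


def present(question, answer, fmt="text"):
    """Render one question/answer pair for a human ("text") or a machine ("json")."""
    if fmt == "text":
        return f"Q: {question}\nA: {answer}"
    if fmt == "json":
        return json.dumps({"question": question, "answer": answer}, sort_keys=True)
    raise ValueError(f"unsupported format: {fmt}")
```

Because each answer is a small, specific unit, adding another presentation (for example HTML) would only require another branch, without changing the stored content.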
Step 5. Metadata Creation and Management
The interfaces developed for answering questions can also be used to create and update content and to delete obsolete or incorrect content, again using a variety of technical approaches designed for human and machine interaction. These interfaces therefore support all of the fundamental operations of an information system: Create, Read, Update, and Delete (CRUD). More importantly, they drive the documentation content with specific scientific questions and use standards and best practices developed and used across multiple communities to answer those questions.
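The four operations can be sketched with a small in-memory repository keyed by documentation question. This is an illustrative assumption rather than a description of any particular system: the class name `DocumentationRepository` and its methods are hypothetical, and a real system would more likely sit on a relational or XML database behind a web service.

```python
class DocumentationRepository:
    """In-memory store of answers keyed by documentation question (illustrative only)."""

    def __init__(self):
        self._answers = {}

    def create(self, question, answer):
        """Add a new answer; refuse to overwrite an existing one."""
        if question in self._answers:
            raise KeyError(f"already documented: {question}")
        self._answers[question] = answer

    def read(self, question):
        """Return the stored answer, or None if the question is undocumented."""
        return self._answers.get(question)

    def update(self, question, answer):
        """Replace an existing answer; the question must already be documented."""
        if question not in self._answers:
            raise KeyError(f"not documented: {question}")
        self._answers[question] = answer

    def delete(self, question):
        """Remove obsolete or incorrect content; a missing question is ignored."""
        self._answers.pop(question, None)
```

Keeping `create` and `update` separate makes accidental overwrites visible, which matters when several people maintain the same documentation.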