Documentation Spirals

From NOAA Environmental Data Management Wiki
Potential Spirals
Creating documentation using a series of spirals is useful in many situations. The content and order of the documentation spirals is related to the specific scientific needs and requirements of particular groups. A helpful first step for any group is to identify content that exists in all records (see ISO_Boilerplate for some suggestions). This Figure suggests some other possibilities. The first four spirals support data discovery and are similar to the NetCDF Attribute Convention for Dataset Discovery. The others are related to data understanding. Together these spirals build a strong foundation for high-quality documentation. The ISO Standard includes a number of options for building on that foundation by addressing specific scientific needs. See Use_Cases_to_CRUD for some Use Case ideas.

Reports that score existing records using these spirals (Sample) are available for many ISO records from the NGDC Metadata Home. Select a record set, then click Rubric next to a record to see the report for that record. Other views are also available. The reports are produced using this stylesheet.

All of these records and views are improving as we learn more about the ISO Standard. Please contact Ted Habermann if you have questions or suggestions.

Identification

The Identification Spiral sets the stage for discovery using text search engines. It includes a unique identifier for the metadata, a title, an abstract, theme keywords and contact information for the metadata and the dataset.

Attribute (Count) Description Best Practice Path
Metadata Identifier
O, UDD(id)
A phrase or string that uniquely identifies the metadata file. Each metadata record shall have a unique identifier, such as a universally unique identifier (UUID), to distinguish it from other resources. At present these identifiers are simple character strings. To help ensure uniqueness, they should include a namespace and a code guaranteed to be unique in that namespace. For example: <gmd:fileIdentifier><gco:CharacterString>gov.noaa.class:AERO100</gco:CharacterString></gmd:fileIdentifier>. In this case gov.noaa.class is a namespace and AERO100 is a code guaranteed to be unique in that namespace. It seems likely that the upcoming revision of ISO 19115 will support MD_Identifiers as metadata identifiers. More... /*/gmd:fileIdentifier and /*/gmd:contact

MI_Metadata
Metadata Contact
M, UDD(creator_name, URL, email)
The responsible party for the metadata content. The person/organization directly responsible for metadata creation and maintenance.
Resource Title
M, UDD(title)
Name by which the dataset or resource is known More... /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString and /gmd:date/gmd:CI_Date/gmd:date/gco:Date

CI_Citation
Resource Date
M, UDD(title)
Date associated with the resource (publication/creation/revision). Whenever possible, include both creation date and revision date. More...
Abstract
M, UDD(summary)
Brief narrative summary of the resource contents. Abstract narrative should include information on general content and features; dataset application: GIS, CAD, image, database; geographic coverage: county/city name; time period of content: begin and end date or single date; and special data characteristics or limitations. Note: Many applications limit preliminary display to the first 150-200 characters of this field so critical distinguishing characteristics should be listed first. More... /*/gmd:identificationInfo/*/gmd:abstract/gco:CharacterString and /gmd:topicCategory/gmd:MD_TopicCategoryCode and /gmd:pointOfContact/gmd:CI_ResponsibleParty

MD_DataIdentification
Topic Category
M, UDD(keywords)
The main theme(s) of the dataset. Select topicCategory from MD_TopicCategoryCode. Usually climatologyMeteorologyAtmosphere and/or oceans (keep this capitalization and spacing).
Resource Contact
O, UDD(creator_name, URL, email)
Identification and means to contact people/organizations associated with the resource. The person/organization directly responsible for answering questions about a resource. This could be a person at an archive rather than the originator of the resource (described in the citation).
Theme Keywords
M, UDD(keywords)
Keywords that describe the general theme of the resource. The NASA Global Change Master Directory and the Climate-Forecast Standard Names are good choices for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = theme

/*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='theme'] and /gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='theme']/ancestor::node()/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Theme Keyword Thesaurus
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.
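
The namespace-plus-code convention for the Metadata Identifier can be checked mechanically. The following is a minimal sketch in Python using only the standard library. The function name split_identifier and the abbreviated SAMPLE record are illustrative assumptions, not part of any NOAA tool, and splitting at the first colon is an assumption based on the gov.noaa.class:AERO100 example above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Abbreviated, hypothetical record: only the element under test is present.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>gov.noaa.class:AERO100</gco:CharacterString>
  </gmd:fileIdentifier>
</gmd:MD_Metadata>
"""


def split_identifier(record_xml):
    """Return (namespace, code) from gmd:fileIdentifier, or None.

    None is returned when the identifier is missing or has no colon,
    i.e. when it does not follow the namespace:code convention.
    """
    root = ET.fromstring(record_xml)
    text = root.findtext("gmd:fileIdentifier/gco:CharacterString", namespaces=NS)
    if not text or ":" not in text:
        return None
    namespace, code = text.strip().split(":", 1)
    return namespace, code


print(split_identifier(SAMPLE))
```

An identifier without a namespace prefix makes the function return None, which a report could flag as a candidate for improvement.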

Service metadata contains SV_ServiceIdentification, which adds the following sub-spiral to the Identification Spiral:

Attribute (Count) Description Best Practice Path
Service Type (1) A service type name from a registry of services. The values of the nameSpace and name attributes of GeneralName may be "OGC" and "catalogue", for example. /*/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:serviceType
Operation Name (1) A unique identifier for SV_OperationMetadata /*/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:containsOperations/srv:SV_OperationMetadata/srv:operationName
Connect Point (1) Handle for accessing the service interface. Each srv:operationName must have at least one srv:connectPoint, and may have more. /*/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:containsOperations/srv:SV_OperationMetadata/srv:connectPoint
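
The rule that every operation needs at least one connect point lends itself to a simple check. This is a sketch only, assuming Python's standard library and a hypothetical, abbreviated service record. The function name operations_without_connect_point and the example.org URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "srv": "http://www.isotc211.org/2005/srv",
}

# Hypothetical record: the second operation has no srv:connectPoint.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 xmlns:srv="http://www.isotc211.org/2005/srv">
  <gmd:identificationInfo>
    <srv:SV_ServiceIdentification>
      <srv:containsOperations>
        <srv:SV_OperationMetadata>
          <srv:operationName>
            <gco:CharacterString>GetCapabilities</gco:CharacterString>
          </srv:operationName>
          <srv:connectPoint>
            <gmd:CI_OnlineResource>
              <gmd:linkage><gmd:URL>https://example.org/csw</gmd:URL></gmd:linkage>
            </gmd:CI_OnlineResource>
          </srv:connectPoint>
        </srv:SV_OperationMetadata>
      </srv:containsOperations>
      <srv:containsOperations>
        <srv:SV_OperationMetadata>
          <srv:operationName>
            <gco:CharacterString>GetRecords</gco:CharacterString>
          </srv:operationName>
        </srv:SV_OperationMetadata>
      </srv:containsOperations>
    </srv:SV_ServiceIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""

OPERATIONS = (
    "gmd:identificationInfo/srv:SV_ServiceIdentification/"
    "srv:containsOperations/srv:SV_OperationMetadata"
)


def operations_without_connect_point(record_xml):
    """Return the names of operations that lack any srv:connectPoint."""
    root = ET.fromstring(record_xml)
    missing = []
    for op in root.iterfind(OPERATIONS, NS):
        name = op.findtext(
            "srv:operationName/gco:CharacterString", default="", namespaces=NS
        ).strip()
        if op.find("srv:connectPoint", NS) is None:
            missing.append(name)
    return missing


print(operations_without_connect_point(SAMPLE))
```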

Note: When a record contains SV_ServiceIdentification, the element that is unique to MD_DataIdentification (gmd:topicCategory) is separated into an MD_DataIdentification sub-spiral. The main Identification Spiral then has two sub-spirals.

Connection

The ISO Standards for describing onlineResources make it possible to display meaningful titles and descriptions for URLs. This spiral checks that all of the URL names and descriptions exist.

Attribute (Count) Description Best Practice Path
Online Resource URLs URLs for online resources. Information for Online Resources More... //gmd:CI_OnlineResource/gmd:linkage/gmd:URL and /gmd:name/gco:CharacterString and /gmd:description/gco:CharacterString

CI_OnlineResource
Online Resource Names Title for online resources, usually displayed as the link.
Online Resource Descriptions A short paragraph describing an online resource, usually displayed with a link.
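
A check of this kind can be sketched in a few lines. The example below assumes Python's standard library and a hypothetical, abbreviated record. The function name online_resource_report and the example.org URLs are illustrative, and the logic is not taken from the actual rubric stylesheet.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record: the second online resource has a URL only.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:distributionInfo>
    <gmd:MD_Distribution>
      <gmd:transferOptions>
        <gmd:MD_DigitalTransferOptions>
          <gmd:onLine>
            <gmd:CI_OnlineResource>
              <gmd:linkage><gmd:URL>https://example.org/data</gmd:URL></gmd:linkage>
              <gmd:name><gco:CharacterString>Data access</gco:CharacterString></gmd:name>
              <gmd:description>
                <gco:CharacterString>Download page for the dataset.</gco:CharacterString>
              </gmd:description>
            </gmd:CI_OnlineResource>
          </gmd:onLine>
          <gmd:onLine>
            <gmd:CI_OnlineResource>
              <gmd:linkage><gmd:URL>https://example.org/docs</gmd:URL></gmd:linkage>
            </gmd:CI_OnlineResource>
          </gmd:onLine>
        </gmd:MD_DigitalTransferOptions>
      </gmd:transferOptions>
    </gmd:MD_Distribution>
  </gmd:distributionInfo>
</gmd:MD_Metadata>
"""


def _text(element, path):
    """Return stripped text at path, or an empty string when absent."""
    return element.findtext(path, default="", namespaces=NS).strip()


def online_resource_report(record_xml):
    """List every CI_OnlineResource with flags for name and description."""
    root = ET.fromstring(record_xml)
    report = []
    # './/' mirrors the '//gmd:CI_OnlineResource' path: match at any depth.
    for res in root.iterfind(".//gmd:CI_OnlineResource", NS):
        report.append({
            "url": _text(res, "gmd:linkage/gmd:URL"),
            "has_name": bool(_text(res, "gmd:name/gco:CharacterString")),
            "has_description": bool(_text(res, "gmd:description/gco:CharacterString")),
        })
    return report


for row in online_resource_report(SAMPLE):
    print(row)
```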

Extent

The Extent Spiral defines the spatial and temporal extent of the dataset. This information can be displayed on maps, profiles and timelines and used in spatial searches. The ISO standard supports the definition of multiple extents for each dataset. To simplify the process of identifying the bounding extent, it is recommended that the id attribute be set to "boundingExtent" (see ISO_Extents). This attribute must be set in order for the extent to be identified using this tool.

Attribute (Count) Description Best Practice Path
Resource Spatial Extent
C, UDD(geospatial_lat_min max, geospatial_lon_min max)
Describes the spatial, horizontal and/or vertical, and the temporal coverage in the resource. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent"> /*/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent[@id='boundingExtent']/gmd:geographicElement or /gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent or /gmd:verticalElement/gmd:EX_VerticalExtent

EX_Extent
Temporal Extent
UDD(time_coverage_start end)
Describes the temporal coverage in the resource. A temporal element could be used to describe either the time period covered by the content of the dataset (e.g. during the Jurassic) or the date and time when the data has been collected (e.g. the date on which the geological study was completed). If both are needed, then two temporal extents should be provided. The use of multiple temporal extents should be explained in the attribute description of the extent. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent">
Vertical Extent The elements which give the minimum and maximum of the vertical extent of the dataset. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent">
Place Keywords (1)
UDD(keywords)
Keywords that describe the location of the resource. The NASA Global Change Master Directory is a good choice for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = place

/*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='place'] and /gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Place Keyword Thesaurus (1)
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.
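
To illustrate why the id="boundingExtent" convention helps, here is a minimal sketch that selects the bounding extent by its id and reads the geographic bounding box. It assumes Python's standard library. The function name bounding_box, the coordinate values, and the abbreviated SAMPLE record are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record with one extent tagged as the bounding extent.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:extent>
        <gmd:EX_Extent id="boundingExtent">
          <gmd:geographicElement>
            <gmd:EX_GeographicBoundingBox>
              <gmd:westBoundLongitude><gco:Decimal>-130.0</gco:Decimal></gmd:westBoundLongitude>
              <gmd:eastBoundLongitude><gco:Decimal>-60.0</gco:Decimal></gmd:eastBoundLongitude>
              <gmd:southBoundLatitude><gco:Decimal>20.0</gco:Decimal></gmd:southBoundLatitude>
              <gmd:northBoundLatitude><gco:Decimal>55.0</gco:Decimal></gmd:northBoundLatitude>
            </gmd:EX_GeographicBoundingBox>
          </gmd:geographicElement>
        </gmd:EX_Extent>
      </gmd:extent>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""

BOUNDING_EXTENT = (
    "gmd:identificationInfo/*/gmd:extent/gmd:EX_Extent[@id='boundingExtent']"
)
EDGES = (
    "westBoundLongitude",
    "eastBoundLongitude",
    "southBoundLatitude",
    "northBoundLatitude",
)


def bounding_box(record_xml):
    """Return the bounding box edges as floats, or None if not found."""
    root = ET.fromstring(record_xml)
    extent = root.find(BOUNDING_EXTENT, NS)
    if extent is None:
        return None  # no extent carries id="boundingExtent"
    box = extent.find("gmd:geographicElement/gmd:EX_GeographicBoundingBox", NS)
    if box is None:
        return None
    edges = {}
    for edge in EDGES:
        text = box.findtext(f"gmd:{edge}/gco:Decimal", namespaces=NS)
        edges[edge] = float(text) if text else None
    return edges


print(bounding_box(SAMPLE))
```

If the id attribute is missing or has another value, the lookup returns None, which matches the statement above that the attribute must be set for the extent to be identified.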

Distribution

Discovering that a dataset exists is not helpful unless you can also discover where the dataset is available. The Distribution Spiral provides that information.

Attribute (Count) Description Best Practice Path
Distributor Contact (2) The contact for distribution of the resource. The organization directly responsible for distribution of the resource. /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact

MD_Distributor
Resource Distribution Format (2)
O
Description of distribution format. Such as: ASCII, HTML, WMS, KML
Online Resource (2)
O
Information about Internet hosted resources: availability; URL; protocol used; resource name; resource description, and resource function. Sites such as homepage, pages that display location maps, etc. /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorTransferOptions/gmd:MD_DigitalTransferOptions/gmd:onLine/gmd:CI_OnlineResource/gmd:linkage/gmd:URL

CI_OnlineResource
Data Center Keywords (0)
UDD(keywords)
Keywords that describe a Data Center related to the resource. The NASA Global Change Master Directory is a good choice for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = dataCenter

/*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='dataCenter'] and /gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Data Center Keyword Thesaurus (0)
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.
Browse Graphic The name of, description of, and file type of an illustration of the dataset. Pictures of (or illustrations describing) platform or observing system being recorded in dataset in JPG, PNG, GIF, etc. format. /*/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:graphicOverview/gmd:MD_BrowseGraphic/gmd:fileName/gco:CharacterString

MD_Keywords

Description

The Description Spiral provides additional information that may be searched in some text searches. It includes brief textual descriptions of items that are also described quantitatively in other spirals.

Attribute (Count) Description Best Practice Path
Purpose Summary of the intentions for which the dataset was developed. Purpose includes objectives for creating the dataset and what the dataset is to support. /*/gmd:identificationInfo/*/gmd:purpose/gco:CharacterString

MD_DataIdentification
Resource Extent Description Text which describes the spatial and temporal extent of the dataset. When referring to a named location this can be also listed as a keyword with type = "place". /*/gmd:identificationInfo/*/gmd:extent/gmd:EX_Extent[@id='boundingExtent']/gmd:description/gco:CharacterString

EX_Extent
Lineage Statement
O, UDD(history)
General explanation of the data producer's knowledge of the resource sources and processing. /*/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:statement/gco:CharacterString

LI_Lineage
Project Keywords
UDD(keywords)
Keywords that describe a Project related to the resource. The NASA Global Change Master Directory and the Climate-Forecast Standard Names are good choices for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = project

/*/gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='project'] and /gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Project Keyword Thesaurus
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.
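
Theme, place, dataCenter and project keywords all share the MD_Keywords structure and differ only in the codeListValue of MD_KeywordTypeCode. The sketch below, assuming Python's standard library, groups keywords by that value and reports the thesaurus title when one is given. The function name keywords_by_type and the sample keyword and thesaurus strings are hypothetical, and the sample omits the codeList attribute for brevity, so it is not schema-valid.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record: a theme block with a thesaurus, a project block without.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:descriptiveKeywords>
        <gmd:MD_Keywords>
          <gmd:keyword><gco:CharacterString>OCEANS</gco:CharacterString></gmd:keyword>
          <gmd:type>
            <gmd:MD_KeywordTypeCode codeListValue="theme">theme</gmd:MD_KeywordTypeCode>
          </gmd:type>
          <gmd:thesaurusName>
            <gmd:CI_Citation>
              <gmd:title><gco:CharacterString>Example Thesaurus</gco:CharacterString></gmd:title>
            </gmd:CI_Citation>
          </gmd:thesaurusName>
        </gmd:MD_Keywords>
      </gmd:descriptiveKeywords>
      <gmd:descriptiveKeywords>
        <gmd:MD_Keywords>
          <gmd:keyword><gco:CharacterString>Example Project</gco:CharacterString></gmd:keyword>
          <gmd:type>
            <gmd:MD_KeywordTypeCode codeListValue="project">project</gmd:MD_KeywordTypeCode>
          </gmd:type>
        </gmd:MD_Keywords>
      </gmd:descriptiveKeywords>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""

KEYWORD_BLOCKS = "gmd:identificationInfo/*/gmd:descriptiveKeywords/gmd:MD_Keywords"
THESAURUS_TITLE = "gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString"


def keywords_by_type(record_xml, keyword_type):
    """Return (keyword, thesaurus_title) pairs for one keyword type.

    thesaurus_title is None when the block names no thesaurus.
    """
    root = ET.fromstring(record_xml)
    found = []
    for block in root.iterfind(KEYWORD_BLOCKS, NS):
        code = block.find("gmd:type/gmd:MD_KeywordTypeCode", NS)
        if code is None or code.get("codeListValue") != keyword_type:
            continue
        thesaurus = block.findtext(THESAURUS_TITLE, namespaces=NS)
        for keyword in block.iterfind("gmd:keyword/gco:CharacterString", NS):
            found.append((keyword.text, thesaurus))
    return found


print(keywords_by_type(SAMPLE, "theme"))
print(keywords_by_type(SAMPLE, "project"))
```

A keyword block with no type, or with a different type, is simply skipped, which is consistent with the note that a keyword must carry the matching MD_KeywordTypeCode to be identified.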

Content

The Content Spiral includes information about the parameters that are included in a dataset.

Attribute (Count) Description Best Practice Path
Content Type (0) Type of the content in the cell. Select contentType from MD_CoverageContentTypeCode. /*/gmd:contentInfo/*/gmd:contentType/gmd:MD_CoverageContentTypeCode
Attribute / Band Name (0) Name of the attribute / parameter in the band. This name must uniquely identify a parameter in the attributeDescription /*/gmd:contentInfo/*/gmd:dimension/gmd:MD_Band/gmd:sequenceIdentifier/gco:MemberName/gco:aName/gco:CharacterString
Attribute / Band Definition (0) Definition of the attribute / parameter in the band. /*/gmd:contentInfo/*/gmd:dimension/gmd:MD_Band/gmd:descriptor/gco:CharacterString
Attribute / Band Units (0) Definition of the units of the attribute / parameter in the band. /*/gmd:contentInfo/*/gmd:dimension/gmd:MD_Band/gmd:units
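
As an illustration of the band paths above, the following sketch lists each band name with its definition. It assumes Python's standard library and a hypothetical record that uses gmd:MD_CoverageDescription. The function name list_bands and the band names are invented for the example, and units are left out because their encoding varies.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record: the second band has a name but no definition.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:contentInfo>
    <gmd:MD_CoverageDescription>
      <gmd:dimension>
        <gmd:MD_Band>
          <gmd:sequenceIdentifier>
            <gco:MemberName>
              <gco:aName><gco:CharacterString>sst</gco:CharacterString></gco:aName>
            </gco:MemberName>
          </gmd:sequenceIdentifier>
          <gmd:descriptor>
            <gco:CharacterString>Sea surface temperature</gco:CharacterString>
          </gmd:descriptor>
        </gmd:MD_Band>
      </gmd:dimension>
      <gmd:dimension>
        <gmd:MD_Band>
          <gmd:sequenceIdentifier>
            <gco:MemberName>
              <gco:aName><gco:CharacterString>ice</gco:CharacterString></gco:aName>
            </gco:MemberName>
          </gmd:sequenceIdentifier>
        </gmd:MD_Band>
      </gmd:dimension>
    </gmd:MD_CoverageDescription>
  </gmd:contentInfo>
</gmd:MD_Metadata>
"""

BANDS = "gmd:contentInfo/*/gmd:dimension/gmd:MD_Band"
BAND_NAME = "gmd:sequenceIdentifier/gco:MemberName/gco:aName/gco:CharacterString"
BAND_DEFINITION = "gmd:descriptor/gco:CharacterString"


def list_bands(record_xml):
    """Return (name, definition) for each MD_Band; missing parts are None."""
    root = ET.fromstring(record_xml)
    return [
        (
            band.findtext(BAND_NAME, namespaces=NS),
            band.findtext(BAND_DEFINITION, namespaces=NS),
        )
        for band in root.iterfind(BANDS, NS)
    ]


print(list_bands(SAMPLE))
```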

Lineage

The Lineage Spiral begins the description of how the data have been measured and processed.

Attribute (Count) Description Best Practice Path
Source Information on the sources used in the development of the dataset. Using xml ids for sources and processSteps makes it possible to refer to them from one another: <gmd:LI_Source id="src_AVHRR_GAC.1074123.65352"> /*/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:source/gmd:LI_Source and /gmd:processStep/gmd:LI_ProcessStep

LI_Lineage
Process Step
UDD(history)
The events in the development of the dataset.
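
One way sources and process steps can refer to one another through XML ids is an xlink:href of the form "#id" on a process step's gmd:source element. The sketch below, assuming Python's standard library, reports references that point at no LI_Source id in the same record. The function name unresolved_source_refs and the "#src_missing" reference are hypothetical, and the sample is reduced to a bare gmd:LI_Lineage element.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {"gmd": "http://www.isotc211.org/2005/gmd"}
# ElementTree exposes namespaced attributes in {uri}name form.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical lineage: one reference resolves, one does not.
SAMPLE = """\
<gmd:LI_Lineage xmlns:gmd="http://www.isotc211.org/2005/gmd"
                xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:processStep>
    <gmd:LI_ProcessStep id="step_1">
      <gmd:source xlink:href="#src_AVHRR_GAC.1074123.65352"/>
    </gmd:LI_ProcessStep>
  </gmd:processStep>
  <gmd:processStep>
    <gmd:LI_ProcessStep id="step_2">
      <gmd:source xlink:href="#src_missing"/>
    </gmd:LI_ProcessStep>
  </gmd:processStep>
  <gmd:source>
    <gmd:LI_Source id="src_AVHRR_GAC.1074123.65352"/>
  </gmd:source>
</gmd:LI_Lineage>
"""


def unresolved_source_refs(lineage_xml):
    """Return local '#id' references that match no LI_Source id."""
    root = ET.fromstring(lineage_xml)
    source_ids = {
        source.get("id")
        for source in root.iterfind(".//gmd:LI_Source", NS)
        if source.get("id")
    }
    missing = []
    for ref in root.iterfind(".//gmd:LI_ProcessStep/gmd:source", NS):
        href = ref.get(XLINK_HREF, "")
        # Only local references ('#...') are checked; external URLs are skipped.
        if href.startswith("#") and href[1:] not in source_ids:
            missing.append(href)
    return missing


print(unresolved_source_refs(SAMPLE))
```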

Acquisition Information

The Acquisition Information Spiral provides information about the instruments used to make observations and the platforms on which they are mounted.

Attribute (Count) Description Best Practice Path
Instrument
The instrument used to collect the observations. /gmi:MI_Metadata/gmi:acquisitionInformation/gmi:MI_AcquisitionInformation/gmi:instrument/gmi:MI_Instrument

MI_Instrument
Platform
The platform used to collect the observations. /gmi:MI_Metadata/gmi:acquisitionInformation/gmi:MI_AcquisitionInformation/gmi:platform/gmi:MI_Platform

MI_Platform
Instrument Keywords (0)
UDD(keywords)
Keywords that describe the instrument used to collect the resource. The NASA Global Change Master Directory is a good choice for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = instrument

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='instrument'] and /gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Instrument Keyword Thesaurus (0)
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.
Platform Keywords (0)
UDD(keywords)
Keywords that describe the platform used to collect the resource. The NASA Global Change Master Directory is a good choice for a keyword thesaurus.

In order to be identified by SpiralTracker, the keyword must have MD_KeywordTypeCode = platform

/gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords/gmd:type/gmd:MD_KeywordTypeCode[@codeListValue='platform'] and /gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString

MD_Keywords
Platform Keyword Thesaurus (0)
UDD(keywords_vocabulary)
The name of a registered authoritative keyword resource.

Mandatory ISO Core

Attribute (Count) Description Best Practice Path
Resource Title Name by which the dataset or resource is known /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString
Abstract Brief narrative summary of the resource contents. Abstract narrative should include information on general content and features; dataset application: GIS, CAD, image, database; geographic coverage: county/city name; time period of content: begin and end date or single date; and special data characteristics or limitations. Note: Many applications limit preliminary display to the first 150-200 characters of this field so critical distinguishing characteristics should be listed first. /*/gmd:identificationInfo/*/gmd:abstract/gco:CharacterString
Creation Date Reference date for the cited resource; reference date and event used to describe it. Whenever possible, include both creation date and revision date. /*/gmd:identificationInfo/*/gmd:citation/gmd:CI_Citation/gmd:date/gmd:CI_Date/gmd:date/gco:Date
Resource Language Languages of the resource using standard ISO three letter codes. Three letter language code followed by an optional three letter country code: <ISO639-2/T three letter language code>{<blank space><ISO3166-1 three letter country code>} Language code is given in lowercase. Country code is given in uppercase. e.g. eng fra; CAN This attribute constitutes the default languages of the dataset. see http://www.loc.gov/standards/iso639-2/php/English_list.php for ISO639-2/T language codes; see http://userpage.chemie.fu-berlin.de/diverse/doc/ISO_3166.html for ISO3166-1 country codes. /*/gmd:identificationInfo/*/gmd:language/gmd:LanguageCode
Topic Category (1)
M, UDD(keywords)
The main theme(s) of the dataset. Select topicCategory from MD_TopicCategoryCode. Usually climatologyMeteorologyAtmosphere and/or oceans (keep this capitalization and spacing). /*/gmd:identificationInfo/*/gmd:topicCategory/gmd:MD_TopicCategoryCode
Metadata Contact
M, UDD(creator_name, URL, email)
The responsible party for the metadata content. The organization directly responsible for metadata maintenance. /*/gmd:contact
Metadata Creation Date Metadata creation date. Date of metadata creation or the last metadata update. /*/gmd:dateStamp
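
The paths in this table can be turned into a presence check. The sketch below assumes Python's standard library. The MANDATORY mapping copies the paths above with the leading /*/ dropped, because they are evaluated relative to the record's root element. The function name missing_mandatory and the sample values are hypothetical. Only the presence of each element is tested, not its content, so this is not a reproduction of the rubric reports.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

IDENT = "gmd:identificationInfo/*/"
CITATION = IDENT + "gmd:citation/gmd:CI_Citation/"

# Attribute name -> path relative to the root element of the record.
MANDATORY = {
    "Resource Title": CITATION + "gmd:title/gco:CharacterString",
    "Abstract": IDENT + "gmd:abstract/gco:CharacterString",
    "Creation Date": CITATION + "gmd:date/gmd:CI_Date/gmd:date/gco:Date",
    "Resource Language": IDENT + "gmd:language/gmd:LanguageCode",
    "Topic Category": IDENT + "gmd:topicCategory/gmd:MD_TopicCategoryCode",
    "Metadata Contact": "gmd:contact",
    "Metadata Creation Date": "gmd:dateStamp",
}

# Hypothetical record that omits the date, language and metadata contact.
SAMPLE = """\
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:dateStamp><gco:Date>2011-01-01</gco:Date></gmd:dateStamp>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title><gco:CharacterString>Example dataset</gco:CharacterString></gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract><gco:CharacterString>Example abstract.</gco:CharacterString></gmd:abstract>
      <gmd:topicCategory>
        <gmd:MD_TopicCategoryCode>oceans</gmd:MD_TopicCategoryCode>
      </gmd:topicCategory>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""


def missing_mandatory(record_xml):
    """Return the names of mandatory attributes whose element is absent."""
    root = ET.fromstring(record_xml)
    return [name for name, path in MANDATORY.items() if root.find(path, NS) is None]


print(missing_mandatory(SAMPLE))
```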

Conditional ISO Core

Attribute (Count) Description Best Practice Path
Resource Spatial Extent Describes the spatial, horizontal and/or vertical, and the temporal coverage in the resource. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent"> /*/gmd:identificationInfo/*/gmd:extent/gmd:EX_Extent[@id='boundingExtent']/gmd:geographicElement
Metadata Language Language of the metadata composed of an ISO639-2/T three letter language code and an ISO3166-1 three letter country code. Three letter language code followed by an optional three letter country code: <ISO639-2/T three letter language code>{<;><blank space><ISO3166-1 three letter country code>} Language code is given in lowercase. Country code is given in uppercase. e.g. eng fra; CAN This attribute constitutes the default languages of the dataset. see http://www.loc.gov/standards/iso639-2/php/English_list.php for ISO639-2/T language codes; see http://userpage.chemie.fu-berlin.de/diverse/doc/ISO_3166.html for ISO3166-1 country codes. /*/gmd:language
Metadata Character Set Character coding standard in the metadata. The character set for the metadata representation is restricted to "utf8", as used for ISO/TS19139:2007 compliant XML encoding. /*/gmd:characterSet
Resource Character Set Character coding standard in the resource. The default value of the character set for the resource representation is "utf8." The character set should be reported for any resource that uses characters for its representation. Resources such as image and video for instance might not make use of character set. When dataset includes North American aboriginal languages, the character set will not usually be "utf8." /*/gmd:identificationInfo/*/gmd:characterSet
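
The language pattern described for Metadata Language can be expressed as a regular expression. This sketch assumes the form given in that row: a lowercase three letter language code, optionally followed by a semicolon, one space and an uppercase three letter country code. The Resource Language row in the Mandatory ISO Core table omits the semicolon, so treat the separator as an assumption. The function name is_language_code is hypothetical, and the check covers only the shape of the value, not whether the codes appear in the ISO639-2/T or ISO3166-1 lists.

```python
import re

# lowercase language code, then an optional "; " + uppercase country code
LANGUAGE_PATTERN = re.compile(r"[a-z]{3}(; [A-Z]{3})?")


def is_language_code(value):
    """Return True when value has the shape 'eng' or 'fra; CAN'."""
    return LANGUAGE_PATTERN.fullmatch(value) is not None


for candidate in ("eng", "fra; CAN", "ENG", "eng;USA"):
    print(candidate, is_language_code(candidate))
```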

Optional ISO Core

Attribute (Count) Description Best Practice Path
Temporal Extent Describes the temporal coverage in the resource. A temporal element could be used to describe either the time period covered by the content of the dataset (e.g. during the Jurassic) or the date and time when the data has been collected (e.g. the date on which the geological study was completed). If both are needed, then two temporal extents should be provided. The use of multiple temporal extents should be explained in the attribute description of the extent. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent"> /*/gmd:identificationInfo/*/gmd:extent/gmd:EX_Extent[@id='boundingExtent']/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent
Vertical Extent The elements which give the minimum and maximum of the vertical extent of the dataset. The bounding extent for the resource should be identified with id="boundingExtent": <gmd:EX_Extent id="boundingExtent"> /*/gmd:identificationInfo/*/gmd:extent/gmd:EX_Extent[@id='boundingExtent']/gmd:verticalElement/gmd:EX_VerticalExtent
Resource Contact Identification and means to contact people/organizations associated with the resource. Many times this contact is at a Data Center rather than the originator of the resource. /*/gmd:identificationInfo/*/gmd:pointOfContact/gmd:CI_ResponsibleParty
Resource Lineage Information or lack of information on the events and source data used to construct the resource. /*/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage
Metadata Identifier A phrase or string that uniquely identifies the metadata file. Each metadata record shall have a unique identifier, such as a universally unique identifier (UUID), to distinguish it from other resources. /*/gmd:fileIdentifier
Online Resource (2)
O
Information about Internet hosted resources: availability; URL; protocol used; resource name; resource description, and resource function. /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorTransferOptions/gmd:MD_DigitalTransferOptions/gmd:onLine/gmd:CI_OnlineResource/gmd:linkage/gmd:URL
Spatial Representation Type Object(s) used to represent the geographic information. Select spatialRepresentationType from MD_SpatialRepresentationTypeCode. /*/gmd:identificationInfo/*/gmd:spatialRepresentationType
Resource Distribution Format (2)
O
Description of distribution format. /*/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorFormat/gmd:MD_Format
Metadata Standard Name of the metadata standard/profile used. ISO 19115-2 Geographic Information - Metadata Part 2 Extensions for imagery and gridded data /*/gmd:metadataStandardName/gco:CharacterString
Metadata Version Version of the metadata standard/profile used. ISO 19115-2:2009(E) /*/gmd:metadataStandardVersion/gco:CharacterString
Resource Reference System Description of the spatial and/or temporal reference systems used in the dataset. Multiple instances of Reference System Information are authorized to describe the coordinate systems being used for coordinate representation (horizontal, vertical and/or temporal). /*/gmd:referenceSystemInfo/gmd:MD_ReferenceSystem
Resource Spatial Resolution The level of detail of the dataset expressed as equivalent scale or ground distance. /*/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:spatialResolution/gmd:MD_Resolution