IOOS DIF DATA PROVIDER GUIDE

From NOAA Environmental Data Management Wiki

Document purpose and history

The IOOS has enormous potential to present geospatial data on open oceans, coastal waters, and the Great Lakes in the formats, rates, and scales required by scientists, managers, businesses, governments, and the general public to support research and inform decision-making. However, because there are so many incompatible standards in the geo-information technology area, sharing data between systems and between user communities requires considerable time and expertise. The goal of the DIF is to facilitate data sharing by establishing standards for data formats, encoding, and transport, and to let any data provider participate in the IOOS program.

The present document has been developed in addition to and in elaboration of the Guide for IOOS Data Providers [12]. The Guide for IOOS Data Providers was compiled in 2006, and has not been updated since then; it offers high-level recommendations for selection, development and implementation of DMAC-compliant services and data formats. Most of these recommendations are of limited practical importance; some of them are completely obsolete.

In contrast to the previous version of the Guide, the present document focuses on the practical needs of prospective data providers. It draws on the practical experience of the IOOS Data Integration Framework (DIF) in developing and implementing standardized data encoding methods and transport mechanisms for a limited number of core IOOS data variables. The recommendations offered here are based entirely on real projects; the goal of the document is to facilitate the implementation of trusted solutions.

However, since DIF is an ongoing project, the current version of the Guide is not fully developed. The ultimate goal of the Wiki version of the document is to provide a vehicle for community members to share their experience and knowledge by adding and editing the content.


Basic Web Services Concepts

IOOS DIF relies on standard protocols and Web service technologies, e.g. HTTP, Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL), and Extensible Markup Language (XML); standards bodies such as the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC) develop and maintain these standards.

Some of the standards and specifications that informed the development of the IOOS DIF recommendations are briefly described here.

SWE [15] - Sensor Web Enablement Framework. SWE is an OGC topic area that comprises several standards (and draft standards). The word "sensor" is used loosely and could refer to an actual sensor, or a multi-sensor platform, or a network of sensors, or any measurement procedure.

SOS [1] - Sensor Observation Service. SOS is one of the SWE standards. It has three main functions: (1) GetCapabilities: give me the list of sensors and their locations; (2) GetObservation: give me data from sensor #42; (3) DescribeSensor: give me metadata about sensor #42.
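The three operations can be illustrated by building the corresponding request URLs. This is a minimal sketch, assuming the server accepts key-value-pair GET requests (not every SOS 1.0.0 implementation does). The endpoint, the offering, property, and sensor identifiers, and the helper name `sos_url` are hypothetical placeholders, not values from any real provider.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the URL of a real SOS server.
SOS_ENDPOINT = "https://sos.example.org/sos"


def sos_url(request, **params):
    """Build a key-value-pair GET URL for an SOS 1.0.0 operation."""
    query = {"service": "SOS", "request": request}
    # GetCapabilities is sent without a version; the other operations state it.
    if request != "GetCapabilities":
        query["version"] = "1.0.0"
    query.update(params)
    return SOS_ENDPOINT + "?" + urlencode(query)


# (1) List the available sensors and their locations.
capabilities_url = sos_url("GetCapabilities")

# (2) Ask for data from one sensor offering.
observation_url = sos_url(
    "GetObservation",
    offering="urn:example:station:42",          # hypothetical offering ID
    observedProperty="sea_water_temperature",   # hypothetical property name
    responseFormat='text/xml;subtype="om/1.0.0"',
)

# (3) Ask for metadata about one sensor.
sensor_url = sos_url(
    "DescribeSensor",
    procedure="urn:example:sensor:42",          # hypothetical sensor ID
    outputFormat='text/xml;subtype="sensorML/1.0.1"',
)
```

The identifiers that a given server actually accepts for `offering`, `observedProperty`, and `procedure` are listed in its GetCapabilities response, so that request is normally issued first.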

SensorML [16] - Sensor Model Language. SensorML is the output format for metadata (not data) from the SOS DescribeSensor operation. It has many possible fields for detailed metadata about an in-situ or remote sensor. Not all fields are necessary for every type of hardware (e.g., buoys don't need the satellite orbit fields), so in practice you need to choose an appropriate subset.

O&M [17] - Observations and Measurements. O&M constrains the output format for data from the SOS GetObservation function. However, O&M is rather abstract and high-level and does not define a precise format. It's a bit like saying the output format will be "binary" and that each file will contain a header and then a data section. O&M needs additional specificity (the equivalent of saying "NetCDF + CF conventions" instead of "binary"). O&M-compliant formats include: IOOS DIF XML (see below), SWE Common (see below), WXXS (Weather Information Exchange Schema), CSML (Climate Science Modeling Language), GWML (Groundwater Markup Language), WaterML 2.0 (in development), etc.

SWE Common [16] - one possible O&M-compliant output format for the SOS GetObservation function. The schema is described in Annex B of the OpenGIS® Sensor Model Language (SensorML) Implementation Specification; work is now in progress to establish a separate SWE Common encoding specification.

IOOS DIF GML [5] - another possible O&M-compliant output format for the SOS GetObservation function. It is based on both O&M and OGC Geography Markup Language (GML).

OOSTethys [18] - an open source software project for facilitating the integration of ocean observing systems. It includes some of the groups, mostly from the US, that use SOS, O&M, SWE Common, and SensorML, and it is mostly a volunteer effort. Software for SOS with SWE Common output has been developed. Several OOSTethys participants are also members of the IOOS Web Services group.

Recommended solutions for individual data types

IOOS by no means urges data providers to change their internal data storage formats. To join, a data provider only has to agree to provide access to its data through one or more of the specifications adopted by the IOOS DIF.

In-situ data (sensors, groups of sensors, platforms, etc.)

Gridded data (coverages)

Images of Data (maps)

Data conversion and Interoperability

Data format conversion and aggregation

Since not all datasets can be obtained from a single large scale provider like NDBC in DIF formats, it is desirable to have a capability to aggregate data from a variety of small to medium scale sources, and provide the resulting output in DIF format as well.

In addition, in many cases the data needs to be converted from one XML format to another, or even from XML into a completely different format for the following reasons:

  • customer's application does not accept XML data at all;
  • customer's application accepts XML data but conversion to a native application format will greatly facilitate data processing, e.g. applications like MS Office;
  • data is required in a "human-readable" format, e.g. for presentation, applications like Web browser, etc.
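As an illustration of the second case above, the following sketch flattens a small XML document into CSV, which spreadsheet applications can open directly. The XML layout, its element and attribute names, and the function name `xml_to_csv` are hypothetical; real DIF encodings use different, namespaced elements, and this is not the conversion tooling adopted by the DIF.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML record used only for illustration.
SAMPLE_XML = """
<observations>
  <observation station="42" time="2009-10-01T12:00:00Z" value="14.2"/>
  <observation station="42" time="2009-10-01T13:00:00Z" value="14.5"/>
</observations>
"""


def xml_to_csv(xml_text):
    """Flatten each <observation> element into one CSV row."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["station", "time", "value"])  # header row
    for obs in root.iter("observation"):
        writer.writerow([obs.get("station"), obs.get("time"), obs.get("value")])
    return buffer.getvalue()


csv_text = xml_to_csv(SAMPLE_XML)
print(csv_text)
```

The same pattern (parse, select the fields of interest, write rows) applies when the target is another XML format or an HTML page for a Web browser; only the writer changes.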

IOOS DIF adopted two prominent technologies for the data format conversion and aggregation:

Model Data Interoperability

In order to address IOOS goals in a comprehensive and integrated fashion, an efficient flow of data is required from observations and modeling to analysis. Along with optimizing the data services of the observing subsystem, this calls for improving the existing operational models and developing modeling capabilities in areas where none exist.

For successful operation it is very important that the models demonstrate a certain level of portability and standardization. Users do not want to depend on one specific application and tool configuration; on the contrary, they want to be able to reuse models across various applications and tools. Existing models, however, frequently generate unique, non-standard output, which makes reusing or comparing model results a very complex task. It is therefore in the best interest of users to implement technology that makes information from currently differing models widely usable by various applications and platforms.

The solution appears to be a transformation of the models that would provide an information template for any application and make it accessible to analysts via standards-based tools. The difficulty is that there are already as many ways to present model output as there are models, and it is not feasible to rebuild all of them. The DIF approach, on the other hand, is to overlay services on the existing infrastructure rather than rebuild it. Thus, the DIF way to reach interoperability of the existing models is to let model providers continue to serve their non-compliant output files as-is while implementing transformation tools in conjunction with standardized Web services.

The IOOS DIF Model Data Interoperability project was undertaken to enable interoperability between a variety of heterogeneous structured grid model outputs generated by IOOS partner organizations, and to make them accessible to the scientific and professional communities via standards-based tools.

Metadata

Metadata is required for discovering, evaluating, and accessing geospatial resources. Metadata is applicable to services, datasets, geospatial features, and sensors. This section provides just a brief introduction to metadata with regard to the IOOS DIF. Metadata is discussed in detail in the Metadata section of this Wiki.

IOOS DIF Data Providers should submit metadata in accordance with the Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC CSDGM). At the moment, the new international standard ISO 19115 is being rolled out, and it will possibly replace CSDGM in the form of a US National Profile. In that case CSDGM metadata records should be converted to ISO. That could be done using cross-walks from CSDGM to ISO 19115 similar to the one developed for the IOOS DIF provider of ocean color data.
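The idea of a cross-walk can be sketched as a lookup table from CSDGM element paths to ISO 19115 element paths. The three mappings below are an illustrative assumption based on the CSDGM XML short names and the ISO 19115 class names; they are not the cross-walk developed for the ocean color provider and should be verified against both standards. The sample record and the function name `apply_crosswalk` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Partial, illustrative cross-walk: CSDGM element path -> ISO 19115 element path.
CROSSWALK = {
    "idinfo/citation/citeinfo/title":
        "identificationInfo/MD_DataIdentification/citation/CI_Citation/title",
    "idinfo/descript/abstract":
        "identificationInfo/MD_DataIdentification/abstract",
    "idinfo/descript/purpose":
        "identificationInfo/MD_DataIdentification/purpose",
}

# Hypothetical CSDGM record reduced to the three mapped elements.
CSDGM_RECORD = """
<metadata>
  <idinfo>
    <citation><citeinfo><title>Example ocean color dataset</title></citeinfo></citation>
    <descript>
      <abstract>Illustrative abstract text.</abstract>
      <purpose>Illustrative purpose text.</purpose>
    </descript>
  </idinfo>
</metadata>
"""


def apply_crosswalk(csdgm_xml):
    """Return {ISO path: value} for every mapped CSDGM element that is present."""
    root = ET.fromstring(csdgm_xml)
    converted = {}
    for csdgm_path, iso_path in CROSSWALK.items():
        element = root.find(csdgm_path)
        if element is not None and element.text:
            converted[iso_path] = element.text.strip()
    return converted


iso_values = apply_crosswalk(CSDGM_RECORD)
for path, value in iso_values.items():
    print(path, "=", value)
```

The sketch produces a flat dictionary rather than a valid ISO XML record; a production conversion would also have to supply the elements that ISO requires and CSDGM does not.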

It should be taken into consideration that the CSDGM and ISO standards describe different sets of mandatory and optional metadata elements, as shown in the table below [13]:

Mandatory metadata elements                 FGDC    ISO 19115
metadata language                            -         +
metadata character set                       -         +
standard name                                +         +
standard version                             +         +
data set name                                +         +
abstract                                     +         +
data set language                            +         +
data set character set                       +         +
spatial schema                               -         -
date of metadata creation                    +         +
date of metadata update                      -         -
date of metadata revision                    -         -
spatial extent                               +         -
temporal extent                              +         -
quality elements                             -         -
organisation                                 +         +
point of contact                             +         +
category                                     +         +
purpose of production                        +         -
frequency of updates                         +         -
restriction of metadata access and usage     +         -

Service Metadata

Service metadata is the descriptive part of the metadata associated with a Web service. It includes the information needed to determine whether a service is of interest to a user and how the service may be invoked. Service metadata is made available through the GetCapabilities function.
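A client can read the service metadata by parsing the GetCapabilities response. The sketch below assumes an SOS 1.0.0 Capabilities document that uses the OWS 1.1 namespace; the sample document is a heavily trimmed, hypothetical response, and the function name `summarize_service` is illustrative.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed for an SOS 1.0.0 Capabilities document.
NS = {
    "sos": "http://www.opengis.net/sos/1.0",
    "ows": "http://www.opengis.net/ows/1.1",
}

# Hypothetical, heavily trimmed Capabilities document.
SAMPLE_CAPABILITIES = """
<sos:Capabilities xmlns:sos="http://www.opengis.net/sos/1.0"
                  xmlns:ows="http://www.opengis.net/ows/1.1" version="1.0.0">
  <ows:ServiceIdentification>
    <ows:Title>Example SOS</ows:Title>
    <ows:ServiceType>OGC:SOS</ows:ServiceType>
  </ows:ServiceIdentification>
  <ows:OperationsMetadata>
    <ows:Operation name="GetCapabilities"/>
    <ows:Operation name="DescribeSensor"/>
    <ows:Operation name="GetObservation"/>
  </ows:OperationsMetadata>
</sos:Capabilities>
"""


def summarize_service(capabilities_xml):
    """Extract the service title and the names of the advertised operations."""
    root = ET.fromstring(capabilities_xml)
    title = root.findtext("ows:ServiceIdentification/ows:Title", namespaces=NS)
    operations = [
        op.get("name")
        for op in root.iterfind("ows:OperationsMetadata/ows:Operation", NS)
    ]
    return {"title": title, "operations": operations}


summary = summarize_service(SAMPLE_CAPABILITIES)
print(summary)
```

The title tells a user whether the service is of interest, and the operation list tells a client how the service may be invoked.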

Hardware (Sensor) Metadata

Sensor metadata describes individual sensors, platforms, and sensor networks. It informs the user about a sensor's characteristics, e.g. location, range, and accuracy, and helps to determine the value of the data collected by the sensor.

For sensor description, the OGC Sensor Model Language (SensorML) specification is recommended. IOOS DIF SensorML templates are currently under development for the DIF core variables, and are recommended for implementation as soon as they become available.

Discovery Metadata

Discovery metadata is a portion of the metadata that is used to locate the required geospatial data.

Data QA/QC Metadata

QA/QC metadata covers any information pertaining to data quality and reliability, as well as any other details of data quality assessment. Some data quality information should be made publicly available so that data users are aware of the level of data validity.

Registry

As soon as an IOOS Data Provider has established a data sharing service and is ready to share the data, the service should be registered with the existing Registries. Registries are used to identify, locate, and describe individual instances. Two types of Registries can be used, depending on the information to be registered:

  • Service Registries : a high-level catalogue of available data providers and associated services; a typical example is GEOSS Registry System - a repository of all the Earth observation systems, data sets, models and other services and tools that together constitute the Global Earth Observation System of Systems (GEOSS).
  • Data Registries : a catalogue of individual datasets available from various data providers; a typical example is IOOS Catalog – a database that defines the operational status and distribution of in situ ocean observation activity among the IOOS data providers who share their resources to the public through Web services.

Web Services Testing

OGC Testing

OGC has started a Compliance & Interoperability Testing & Evaluation (CITE) initiative to provide a process for testing the conformance of products to OpenGIS Implementation Specifications. CITE develops tests for OGC standards and makes those tests available online via the CITE Web Portal.

The OGC Web Services Compliance Test platform is based on the Test Evaluation And Measurement (TEAM) engine. It is a cross-platform application that runs in a Java Virtual Machine as a test script interpreter. It executes test scripts written in Compliance Test Language (CTL) to verify that an implementation of an OGC specification complies with the specification. The test scripts are executed against static datasets provided by OGC so that test results are comparable. The engine is an open source software product, available for download from the TEAM Engine page.

Compliance Test Language is an XML grammar for documenting and scripting suites of compliance tests for verifying that an implementation of a specification complies with the specification. A suite of CTL files is installed in the compliance test engine, which executes the scripts and determines whether the implementation being tested passes or fails. A detailed description of the CTL was released as a Discussion paper (OGC document 06-126).

The tests may be run online, using the OGC's online compliance tester, which is an instance of the TEAM Engine. It is also possible to download all elements of the Test Suite from CITE Web portal, and install them on some other server. In the latter case the test may be adjusted to the needs of a specific Data Provider.

At the moment the CITE portal is able to provide certified tests for the following OGC specifications:

  • Catalogue Services/Web (CSW), Version 2.0.1
  • Web Coverage Service (WCS), Version 1.0
  • Web Map Service (WMS), Versions 1.3 & 1.1.1
  • Web Feature Service (WFS), Versions 1.1 & 1.0.0
  • GeoRSS validation
  • GML Validation
  • Web Map Context (WMC), Version 1.1.0

In addition, it offers a number of beta tests for the following specifications:

  • Catalog Services for Web (CSW), Version 2.0.2
  • Sensor Observation Service (SOS), Version 1.0
  • Sensor Planning Service (SPS), Version 1.0
  • Web Coverage Service (WCS), Version 1.0.0 port to the TEAM engine
  • Web Coverage Service (WCS), Version 1.1.1
  • Web Feature Service (WFS), Version 1.1.0 with XLink support
  • Web Registry Service (WRS), Version 1.0

Following the implementation of SOS services at NDBC, CO-OPS, and a number of the IOOS Regional Associations, IOOS conducted SOS service compliance testing using an OGC SOS v1.0 CITE TEAM Engine. The test results revealed that many of the implementations had compliance issues, varying from trivial to serious. The issues found can be grouped into three categories based on ease of resolution:

  1. Errors that can be easily addressed in the near-term (typos, incorrect references, etc.).
  2. Errors that do not need to be corrected in the near-term (wrong or inconsistent exception code messages, etc.).
  3. Errors due to the unnecessary strictness of the OGC test (truncated seconds in timeDate value, non-OGC URNs, HTTP instead of HTTPS in URLs, etc.)
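The truncated-seconds case in the third category can be made concrete from the client side. The sketch below shows a lenient parser that accepts a UTC timestamp with or without the seconds field; the two accepted formats and the function name `parse_utc_timestamp` are assumptions chosen for illustration and do not describe how the OGC test or any provider handles time values.

```python
from datetime import datetime, timezone

# Assumed timestamp layouts: full form first, then the form without seconds.
ACCEPTED_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ")


def parse_utc_timestamp(text):
    """Parse a UTC timestamp, tolerating a missing seconds field."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next accepted layout
    raise ValueError(f"unrecognized timestamp: {text!r}")


full = parse_utc_timestamp("2009-10-01T12:00:00Z")
truncated = parse_utc_timestamp("2009-10-01T12:00Z")
print(full == truncated)
```

A strict validator would reject the second value, while a tolerant client treats both as the same instant.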

IOOS is working with the data providers to highlight the compliance issues and get them corrected; NDBC is considered a primary target, as the majority of the regions have NDBC SOS software installed. IOOS is also working with the OGC Compliance Test development group to update the test scripts. Work is also ongoing to align SOS Web service testing with the IOOS Data Catalog development. Full OGC specification compliance is a reasonable goal for FY 2011.

NOAA IOOS Testing

The NOAA Coastal Services Center Data Transport Laboratory (CSC DTL) collaborates with local and regional observing systems, federal offices, and the IOOS expert teams to support data-sharing technologies. As part of the DIF project CSC DTL has developed test plans, and conducted a number of tests to ensure the compliance of NDBC, CO-OPS and CoastWatch Web service implementations with the OGC specifications.

Reference Implementations

Reference Implementation (RI) is an Open Source implementation of an OGC Web Service which is 100% compliant with the associated compliance tests. OGC has defined a reference implementation as an open source, fully functional implementation of a specification in reference to which other implementations can be evaluated. This is to ensure maximum transparency of its specifications for both vendors and customers.

At the moment, the OGC's CITE portal demonstrates the following Reference Implementations:

The most up to date list of available sample software can be found on the IOOS DIF Web portal.

Regional DIF Implementation

AOOS

AOOS SOS

GCOOS

GCOOS DP: Advance Data Access

SECOORA

Generic diagram showing obskml->xenia->obskml and additional formats/services

Some examples of generating ObsKML

Some instrumentation examples as ObsKML

Xenia home page

Xenia VMware home page

SOS Implementation

SOS Implementation Matrix

TDS Implementation Matrix

Regional DAP Implementation Matrix (Word document)

NDBC SOS Implementation

NDBC SOS Implementation Software Beta Version 0.6.1

This PHP script and related files are used by the NDBC to provide an SOS interface to their existing MySQL database back-end. This beta software is still under active development and testing.

12 DMAC Functional Components List

Regional DMAC Implementation Conference Call Notes

  1. October 01, 2009
  2. October 19, 2009

Reference documents

  1. OpenGIS® Sensor Observation Service, v1.0.0
  2. OpenGIS® Web Coverage Service, v1.1.2
  3. OpenGIS® Web Map Service, v1.3.0
  4. OpenGIS® Geography Markup Language (GML) Encoding, v3.2.1
  5. Summary Description of DIF XML Encodings, v0.6.0, Initial Release (2008-05-10) and update to v0.6.1 (2008-08-06)
  6. Abstract Data Content Standard for IOOS Data Integration Framework, v0.4, July 7, 2008
  7. Data Content Standard for Remotely Sensed Ocean Color Data, IOOS Data Integration Framework, v1.1, July 25, 2008
  8. Content Standard for Digital Geospatial Metadata: Extensions for Remote
  9. IOOS Documentation/Metadata Evaluation by T. Habermann, 2008
  10. XSL Transformations (XSLT), W3C Recommendation, v1.0
  11. Guide to Distributing Your Data Products Via WMS 1.1.1. A Tutorial for Data Providers by Rob Raskin, Ocean ESIP, Jet Propulsion Lab, 2004
  12. Guide for IOOS Data Providers, ver. 1.0, June 2, 2006
  13. Comparison of CEN, FGDC and ISO standards for metadata by Jan Ruzicka, Institute of Economics and Control Systems at Technical University Ostrava
  14. Model Data Interoperability for IOOS in 12 months or less: Simple procedures for serving standardized data and for obtaining scientific access. Seminar by Rich Signell, USGS / NOAA IOOS Program Office, Silver Spring, MD
  15. OGC Sensor Web Enablement (SWE) Framework
  16. OpenGIS® SensorML Encoding, v1.0.1
  17. OpenGIS® Observations and Measurements Encoding (O&M), v1.0.1
  18. OOSTethys Community Web Site