NOAA Data Catalog

From NOAA Environmental Data Management Wiki

Improved solutions for metadata publication to the NOAA Data Catalog


Scope

This covers publication of metadata collections to the NOAA Data Catalog.

Goal

The goal of this session is to discuss the current process for publishing NOAA metadata and potential improvements to it. Notes from the discussion will be published to this Wiki, and this Wiki will be used to document the publishing process going forward.

Agenda

  1. Overview of current metadata publishing process - Chris MacDermaid 20 minutes
  2. Proposed Improvements - Jason Shapiro 20 minutes
  3. Collection level decision tree design - Anna Milan 20 minutes
  4. Discussion - Everyone 30 minutes

Overview of Current Publication Workflow

  1. Do I need to publish a metadata record for my data?
    • NOAA Data Documentation Directive
    • All NOAA datasets, whether on-line or off-line, have a publicly-accessible ISO Geographic Metadata record
    • Metadata records shall include, at minimum, the information necessary to enable potential users to discover, access, evaluate, and use the dataset
    • Metadata records shall be made publicly accessible on-line in a Web Accessible Folder (WAF)
    • Metadata WAFs shall be registered with the NOAA Data Catalog
  2. Determine scope of collection level metadata records
  3. Author ISO 19115-2 collection level metadata records
  4. Verify individual records
    • Use Docucomp record services to validate records, assess completeness with the rubric, and check for broken links.
  5. Set up Web Accessible Folder(s) (WAF) in your location
  6. Post the metadata records to the WAF
  7. Register WAF with EMMA for assessment of entire WAF
    • What is a WAF? How do I create a WAF? - varies depending on IT resources at different locations
  8. Create NOAA catalog account
    • email noaa catalog working group
  9. Ask for sysadmin privilege to register the WAF in the NOAA catalog
    • email Chris??
  10. Register WAF with the NOAA Catalog
    • Select daily, weekly, or monthly synchronization
  11. NOAA Catalog harvests records
  12. Check for any errors on the status page
    • How do I troubleshoot errors?
  13. Data.gov harvests from NOAA Catalog weekly on Mondays (RIGHT?)
  14. Check for errors via NOAA catalog email list
    • How do I troubleshoot errors?
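
Before posting records to a WAF (steps 4 to 6), a quick local pre-check can catch problems that would otherwise only surface at harvest time. The sketch below is illustrative only and does not replace the Docucomp record services: it checks that a record is well-formed XML, that its root element is an ISO metadata element, and that it carries a non-empty gmd:fileIdentifier. The function name `precheck_record` and the choice of checks are assumptions made for this example, not part of any NOAA tool.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the ISO 19139 XML encoding of ISO 19115 / 19115-2.
GMD = "http://www.isotc211.org/2005/gmd"
GMI = "http://www.isotc211.org/2005/gmi"
GCO = "http://www.isotc211.org/2005/gco"

# Root elements accepted by this sketch (an assumption of the example).
ACCEPTED_ROOTS = {f"{{{GMI}}}MI_Metadata", f"{{{GMD}}}MD_Metadata"}


def precheck_record(xml_text):
    """Return a list of problems found in one metadata record (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Nothing else can be checked if the record does not parse.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag not in ACCEPTED_ROOTS:
        problems.append(f"unexpected root element: {root.tag}")

    # The file identifier is a direct child of the root element.
    ident = root.find(f"{{{GMD}}}fileIdentifier/{{{GCO}}}CharacterString")
    if ident is None or not (ident.text or "").strip():
        problems.append("missing or empty gmd:fileIdentifier")
    return problems
```

An empty list means the record passed these three checks only; schema validation, rubric scoring, and link checking still need the record services named in step 4.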

Authoritative Collection-level WAF

  • Clustered file system
  • Regularly backed up
  • Authentication
  • Security
  • Revision Control
  • Consistent naming of records
  • Consistent directory structure
  • Enforcement of verification
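
The naming and directory-structure items above lend themselves to an automated check. As a minimal sketch, assume a purely hypothetical convention of `<line_office>/<collection>/<record>.xml` with lowercase path segments; neither the convention nor the function name `check_waf_layout` comes from NOAA guidance, and both would need to be replaced by whatever the group agrees on.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical convention: <line_office>/<collection>/<record>.xml, where each
# segment uses only lowercase letters, digits, '.', '_' and '-'.
SEGMENT = re.compile(r"^[a-z0-9][a-z0-9._-]*$")
EXPECTED_DEPTH = 3  # line office / collection / file


def check_waf_layout(relative_paths):
    """Return (path, problem) pairs for records that break the convention."""
    findings = []
    # Count file names case-insensitively so duplicates across folders show up.
    names = Counter(PurePosixPath(p).name.lower() for p in relative_paths)
    for p in relative_paths:
        path = PurePosixPath(p)
        if path.suffix != ".xml":
            findings.append((p, "not an .xml file"))
        if len(path.parts) != EXPECTED_DEPTH:
            findings.append((p, f"expected {EXPECTED_DEPTH} path segments"))
        if not all(SEGMENT.match(part) for part in path.parts):
            findings.append((p, "segment has disallowed characters"))
        if names[path.name.lower()] > 1:
            findings.append((p, "file name is used more than once"))
    return findings
```

Run against a listing of the WAF's relative paths, a check like this could serve as one piece of the "enforcement of verification" item, for example as a step that runs before records are copied into the authoritative folder.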

Other Things to consider?

  • Line Office contacts:
    • NESDIS
    • NMFS
    • NWS
    • OAR
  • How often will the metadata be updated?
  • Who will maintain the metadata?
  • How can I find the status of the harvesting of my metadata records to the NOAA Data Catalog?

How is metadata harvested by Data.gov? How can I determine the status of the harvest of my metadata records to Data.gov?
Service metadata?

Proposed pipeline process

Collection Decision Tree

- for reference so far: Data Citation Granularity, http://intranet.ngdc.noaa.gov/wiki/index.php/Data_Citation_Scopes