Topic outline

  • General

    Basic Marine Data Management

    This course provides an introduction to the steps required to manage and archive marine data, including the guiding principles of data management and the typical responsibilities of a data manager.

    Aims and Objectives:

    • Provide an introduction to the management of oceanographic data, with a focus on the participants’ national and regional environments
    • Provide guidance on the establishment of national facilities for the management of oceanographic data
    • Demonstrate the use of a software package for the analysis and visualization of oceanographic data to develop a national collection
    • Introduce the concept of metadata for managing data
    • Highlight some publicly available marine datasets

    Learning Outcomes:

    • Knowledge and understanding of the management of oceanographic data from data assembly through to product generation
    • Understanding of the core tasks required for the processing, manipulation and analysis of oceanographic data
    • Awareness of the various formats used to store marine data
    • Understanding of the importance of metadata for describing and managing data
    • Ability to use software for the interactive analysis and visualization of oceanographic profile data

    Participants are requested to bring their own data for analysis in class.

    Prerequisites:
    Students should have good computer skills and experience in the use of spreadsheets to manipulate data.

    Lecturers:

    • Greg Reed
    • Reiner Schlitzer
  • Topic 1

    Preliminaries

    International Oceanographic Data and Information Exchange (IODE). IODE is the programme of the Intergovernmental Oceanographic Commission (IOC) of UNESCO established in 1961 to enhance marine research, exploitation and development by facilitating the exchange of oceanographic data and information between participating Member States, and by meeting the needs of users for data and information products.

    OceanTeacher. OceanTeacher provides training tools for oceanographic data and information management. It is used extensively during IODE training courses but can also be used for self-training and continuous professional development. OceanTeacher comprises two components:

    • OceanTeacher Digital Library. An online resource about marine data management and marine information management, designed for training advanced students and mid-level professionals. OceanTeacher consists of a large number of integrated articles on information and data topics, paired with a set of course manuals that "point" to selected articles and exercises, in a highly organized way. OceanTeacher is designed to disseminate knowledge to a professional audience of marine data and information managers working in marine data and information centres worldwide. 
    • OceanTeacher Classroom. A comprehensive web-based training system developed for ocean data managers (working in ocean data centres), marine information managers (marine librarians) and marine researchers who wish to acquire knowledge of data and/or information management.
    • Topic 2

      Introduction to Oceanographic Data Management

      Marine Data Format Types (from OceanTeacher Digital Library). A description of some of the major data formats used to manage ocean data. Browse through the subsections describing the different data formats. Of particular interest for this training course are:

      • Archive Formats. These formats include the World Ocean Database format which will be used in this course.
      • Metadata Standards. Metadata must be described in a consistent manner, and a metadata standard uses a common set of terms and definitions to describe data. This course will use the ISO 19115 metadata standard to describe oceanographic datasets.
      • Spreadsheet Formats. Spreadsheets are one of the more popular formats to describe marine data. The Ocean Data View spreadsheet format will be used in the ODV exercises.
      • Vector Formats. These formats are used in Geographical Information Systems (GIS) to describe point and line data.

      Integration Schematic for Data, Formats and Software

      Physical Oceanographic Measurements (from OceanTeacher Digital Library). A brief description of physical oceanographic parameters and instruments to provide a conceptual framework and terminology for marine data managers.

      Marine Parameter Value Ranges (from OceanTeacher Digital Library). A simplified global range table for some common parameters.

      Quality Control (from OceanTeacher Digital Library). Quality control of data is an essential component of oceanographic data management. Data quality control information tells users of the data how it was gathered, how it was checked and processed, what algorithms were used, what errors were found, and how the errors were corrected or flagged. Without this information, data from different sources cannot be combined or re-used to gain the advantages of integration, synthesis, and the development of long time series. Read through the sections on:

      • General Marine Data Quality Control. References to various QC manuals and procedures
      • Marine Data Quality Flags. A summary of quality control flags used by different programmes and data management offices.
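      The idea of checking values against a global range table and recording the result as a quality flag can be illustrated with a short Python sketch. The ranges and flag values below are assumptions chosen for illustration only; they are not taken from the Marine Parameter Value Ranges table or from any particular programme's flag scheme, so substitute the values used by your own data centre.

```python
from typing import Optional

# Assumed flag values for illustration; real flag schemes differ between programmes.
GOOD, BAD, MISSING = 1, 4, 9

# Assumed, deliberately broad global ranges (not taken from any QC manual).
ASSUMED_RANGES = {
    "temperature": (-2.5, 40.0),  # degrees Celsius
    "salinity": (0.0, 41.0),      # practical salinity
}


def range_flag(parameter: str, value: Optional[float]) -> int:
    """Return a quality flag for one observation using a simple range check."""
    if value is None:
        return MISSING
    low, high = ASSUMED_RANGES[parameter]
    return GOOD if low <= value <= high else BAD


# A made-up temperature profile with one missing and one implausible value.
profile = [12.4, 11.9, None, 99.9]
flags = [range_flag("temperature", v) for v in profile]
print(flags)
```

      Note that the suspect value is flagged rather than deleted, so later users can still see what was originally reported.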

      Major Global Data Collection Projects. There are a few important programmes that collect and manage global oceanographic data and make these data available in standard formats. They include:

      • World Ocean Database (WOD). WOD is a project established by the Intergovernmental Oceanographic Commission (IOC) of UNESCO. It represents the world’s largest collection of ocean profile-plankton data available internationally without restriction.
      • Global Temperature and Salinity Profile Program (GTSPP). GTSPP is a cooperative international program to develop and maintain a global ocean Temperature-Salinity resource with data that are both up-to-date and of the highest quality. It is a joint World Meteorological Organization (WMO) and Intergovernmental Oceanographic Commission (IOC) program.
      • Argo. Argo is a global array of 3,000 free-drifting profiling floats that measures the temperature and salinity of the upper 2000 m of the ocean.

      Folder Structure for Data

      An important component of good data management is the ability to store all data files in an efficient manner. A database management system (DBMS) can be used to store and manipulate data; however, most datasets will be received in a variety of formats that may not be easily imported into a DBMS. To manage the many files used by a data manager, as well as the products that are created, some sort of "standard" folder structure is required. This is an example of a "basic" folder structure, consisting of the top levels only (2 or 3 levels within the main folder). You can of course add more subfolders as appropriate. Long, descriptive file names are also important to uniquely identify datasets.
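      A standard folder structure and descriptive file names can also be set up with a short script, so that every project starts from the same layout. The sketch below is only an illustration: the folder names, the `create_layout` and `dataset_filename` helpers, and the file-naming pattern are all hypothetical and should be adapted to local practice.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical top-level layout; adjust names and depth to local practice.
ASSUMED_LAYOUT = [
    "data/incoming",
    "data/processed",
    "metadata",
    "products/maps",
    "documents",
]


def create_layout(root: Path, layout: List[str] = ASSUMED_LAYOUT) -> List[Path]:
    """Create each folder under root (no error if it already exists)."""
    created = []
    for relative in layout:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def dataset_filename(country: str, platform: str, parameter: str,
                     start_date: str, extension: str) -> str:
    """Build a long, descriptive file name from its parts (assumed pattern)."""
    return f"{country}_{platform}_{parameter}_{start_date}.{extension}"


# Demonstrate in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as scratch:
    for folder in create_layout(Path(scratch) / "marine_data"):
        print(folder.relative_to(scratch))

print(dataset_filename("AU", "ctd", "temperature", "2011-05-01", "csv"))
```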

      Establish a National Oceanographic Data Centre. IOC Manuals and Guides No. 5 (2nd revision) is intended as a tool for policy makers at the national level to assist them with the decision-making related to the establishment of national facilities for the management of oceanographic data (and information). It is also intended to be a reference document for national organizations involved in, or planning to be involved in, oceanographic data and information management.

      Online Marine Books (from OceanTeacher Digital Library). A list of freely available online textbooks covering topics of interest to marine data and information managers.

      • Topic 3

        Case Study: Argo Programme

        Argo is an international ocean-observing programme that has deployed more than 3,000 drifting floats that gather temperature and salinity profiles in the upper 2,000 metres of the world’s oceans. In conjunction with satellite observations, the profiles gathered by these floats have allowed scientists to make significant advances in their quest to better understand the role of the oceans in world climate.

        Video on the importance of the Argo array produced for Scripps Institution of Oceanography.

        Argo float distribution - May 2011

      • Topic 4

        Ocean Data View

        Ocean Data View (ODV) is a software package for the interactive exploration, analysis and visualization of oceanographic and other geo-referenced profile or sequence data. ODV runs on Windows (7, Vista, XP, 9x, Me, NT, 2000), Mac OS X, Linux, and UNIX (Solaris, Irix, AIX) systems. ODV data and configuration files are platform-independent and can be exchanged between different systems.

        Download and install Ocean Data View
        • Go to the ODV software page
        • You will need to register to download the software.
        • Go to the Software>Latest Version>Versionx.x>Windows folder and download the latest version.
        • Go to Optional Packages>Versionx.x>Windows and download odvOP_ETOPO1_6min_W32.zip.exe.
        • Install ODV. Accept the default destination folders.
        • Install the optional packages file to <install>\Ocean Data View (mp)\coast where <install> is the directory where ODV has been installed.
      • Topic 5

        Adding CTD and Argo data to ODV

      • Introduction to Google Earth

        Google Earth is a software program that combines satellite imagery, maps and other spatial information in a desktop application. Google Earth is free to download from http://www.google.com/earth/index.html.

        Google Earth uses the KML file format to display geographic data. KML uses a tag-based structure with nested elements and attributes and is based on the XML standard.

        Example KML document

        The KML Tutorial provides more information together with samples of KML code. KML files are often compressed to KMZ format.
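        As a rough illustration of the tag-based structure and of KMZ compression, the Python sketch below builds a KML document containing a single placemark and packages it as a KMZ archive in memory. The helper names `placemark_kml` and `to_kmz`, the station name and the coordinates are made up for the example. It assumes the usual convention that a KMZ file is a ZIP archive whose main file is named `doc.kml`.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def placemark_kml(name: str, longitude: float, latitude: float) -> str:
    """Return a minimal KML document with one point placemark."""
    # KML coordinates are written as longitude,latitude,altitude.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        '  <Placemark>\n'
        f'    <name>{escape(name)}</name>\n'
        '    <Point>\n'
        f'      <coordinates>{longitude},{latitude},0</coordinates>\n'
        '    </Point>\n'
        '  </Placemark>\n'
        '</kml>\n'
    )


def to_kmz(kml_text: str) -> bytes:
    """Compress KML text into KMZ (a ZIP archive holding doc.kml)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("doc.kml", kml_text)
    return buffer.getvalue()


# Made-up station used only to show the structure.
kml_text = placemark_kml("Example station", 151.25, -33.85)
kmz_bytes = to_kmz(kml_text)
print(kml_text)
```

        Writing `kmz_bytes` to a file with a `.kmz` extension should give a file that Google Earth can open.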

        Google Earth and Oceanographic Data Management

        Many ocean data management projects are now representing their data using KML, thereby making these data available through Google Earth. Google Earth is also a useful tool for estimating the "Area of Interest" for a national data collection.

        The following are some important global oceanographic datasets that can be overlaid in Google Earth.

        Converting data for use in Google Earth

        There are procedures to convert data in .CSV format to KML using GIS software. For example, ArcGIS has the ability to convert maps and layers to KML files using the Layer To KML and Map To KML tools in ArcToolbox. Using the pop-up functionality in ArcMap, you can specify pop-ups for KML features containing attributes. 
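        Where GIS software is not available, a simple station list can also be converted with a few lines of Python. The sketch below is not the ArcGIS procedure described above; it is an alternative that uses only the Python standard library. The `csv_to_kml` function and the column names `station`, `latitude` and `longitude` are assumptions about how the CSV file is laid out.

```python
import csv
import io
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def csv_to_kml(csv_text: str) -> str:
    """Convert CSV rows (station, latitude, longitude) to KML placemarks."""
    ET.register_namespace("", KML_NS)  # write KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for row in csv.DictReader(io.StringIO(csv_text)):
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = row["station"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude, then altitude.
        longitude = float(row["longitude"])
        latitude = float(row["latitude"])
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f"{longitude},{latitude},0"
        )
    return ET.tostring(kml, encoding="unicode")


# Made-up stations used only to demonstrate the conversion.
sample_csv = (
    "station,latitude,longitude\n"
    "ST01,-33.85,151.25\n"
    "ST02,-34.10,151.60\n"
)
kml_output = csv_to_kml(sample_csv)
print(kml_output)
```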

        • Topic 7

          Metadata for Marine Data Managers

          To facilitate access to ocean data and information, it is important to consistently describe and classify data through the implementation of metadata. Metadata is structured information that describes information or services. The information recorded in the metadata enables people and applications to find, manage, control, understand and preserve their data assets.

          Metadata is an important component of any ocean data resource. Metadata describes the “who, what, where, when, why, and how” of the data and can answer a wide range of questions about the dataset, such as:

          • Who created and maintains the data?
          • What is the content of the data?
          • Where is the geographic location?
          • Where is the data stored?
          • When was the data collected?
          • How was it produced?
          • How can it be accessed?
          • What data quality can you expect?
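          One way to make these questions concrete is to treat each one as a field in a simple record and check which fields are still empty. The Python sketch below is illustrative only: the `DatasetMetadata` class and its field names are hypothetical and are not element names from ISO 19115 or any other metadata standard, and the example values are made up.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical record with one field per question in the list above."""
    creator: Optional[str] = None            # who created and maintains the data
    content: Optional[str] = None            # what the data contains
    geographic_extent: Optional[str] = None  # where the data was collected
    storage_location: Optional[str] = None   # where the data is stored
    collection_period: Optional[str] = None  # when the data was collected
    production_method: Optional[str] = None  # how it was produced
    access: Optional[str] = None             # how it can be accessed
    quality: Optional[str] = None            # what data quality to expect


def missing_fields(record: DatasetMetadata) -> List[str]:
    """List the questions that the record does not yet answer."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Made-up, partially completed record.
record = DatasetMetadata(
    creator="Example National Data Centre",
    content="CTD temperature and salinity profiles",
    collection_period="2011-05",
)
print(missing_fields(record))
```

          A check like this does not replace a metadata standard, but it shows why a fixed set of terms makes incomplete descriptions easy to detect.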

          Metadata provides benefits to both the data producer and the data user. Metadata helps people to locate data and services, mainly through the use of metadata catalogues available on the internet. Metadata can provide information that will assist in determining the suitability of data. Organizations can also benefit from metadata. The use of metadata within an organization is part of overall good data management practice. It also provides a permanent inventory of data assets and services and can be used to manage an organization's investment in its data assets. Accessible metadata can reduce the administrative costs associated with responding to enquiries about data.

          The concept of metadata is not new: a library catalogue contains metadata about the books held in the library. Creating metadata is similar to library cataloguing, except that the metadata creator needs to understand the scientific information behind the data in order to properly document the datasets. Most ocean data has a spatial component, that is, a geographic location, and spatial metadata standards are used to describe spatial datasets in order to provide a consistent approach to the storage and retrieval of spatial data.