Contributing datasets to EMODnet Biology
- 1) Introduction
Learning outcomes of the course:
- Understand the general data flow for EMODnet Biology and how it interacts with other biodiversity data systems
- Understand the different biological data standards used in EMODnet Biology
- Be able to process a dataset from scratch in order to meet the EMODnet Biology data standards
- Be able to publish your standardized data in the IPT
- Understand the full data management cycle for EMODnet Biology, including data processing and data publication
- Be able to perform advanced data quality checks on your data
- Be able to access your data through the EMODnet Biology portal
EMODnet Biology and overview of data management:
In this lesson we will learn a bit more about the EMODnet project and the Biology portal. More specifically, we will have a look at how the data flows to EMODnet Biology and its relation with other data systems with a focus on (marine) biodiversity.
In this lesson we will have a quick look at the data management cycle of EMODnet Biology, including an introduction to the data format (data structure and standards used).
- 2) Metadata
In this lesson we will learn about EMODnet Biology Catalogue records and the procedure to create and maintain them. You will learn which fields in the catalogue are mandatory, which are optional, and what information each contains.
You will learn about the importance of technical metadata and what to keep in mind upon receiving a dataset you mean to process for EMODnet Biology.
- 3) Formatting your Dataset - DwC terms
In this section you will learn what Darwin Core (DwC) terms are. You will look at the most relevant terms used in EMODnet Biology and learn which terms are mandatory in each of the three tables (Event, Occurrence and Extended Measurements or Facts). You will use a demo exercise to practise mapping existing column names to DwC terms. Let's go!
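As an illustration of the mapping step, here is a minimal sketch in Python. The source column names (`Station`, `Date`, `Lat`, `Lon`, `Species`) and the helper `rename_columns` are hypothetical and are not taken from the course demo exercise; only the DwC terms on the right-hand side are real Darwin Core terms.

```python
import csv
import io

# Hypothetical source column names (left) mapped to Darwin Core terms (right).
COLUMN_TO_DWC = {
    "Station": "eventID",
    "Date": "eventDate",
    "Lat": "decimalLatitude",
    "Lon": "decimalLongitude",
    "Species": "scientificName",
}


def rename_columns(rows, mapping):
    """Return new rows with keys renamed; columns not in the mapping are kept unchanged."""
    return [
        {mapping.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


# A one-row example file, invented for illustration.
raw = "Station,Date,Lat,Lon,Species\nST01,2019-05-14,51.35,2.95,Abra alba\n"
rows = list(csv.DictReader(io.StringIO(raw)))
mapped = rename_columns(rows, COLUMN_TO_DWC)
print(mapped[0])
```

Keeping the mapping in a single dictionary makes it easy to review against the list of mandatory terms before any data is changed.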
- 4) Formatting your dataset - data standardization (I)
This lesson explains in detail how to format several important fields in your dataset: eventID, occurrenceID, eventDate, occurrenceStatus and basisOfRecord.
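The sketch below shows one possible way to populate some of these fields in Python. The identifier pattern (cruise, station and date joined by underscores) and the assumed input date format (`DD/MM/YYYY`) are illustrative assumptions, not EMODnet Biology rules; the lesson itself gives the actual guidelines. The ISO 8601 date format for eventDate and the values `present` and `absent` for occurrenceStatus follow the Darwin Core recommendations.

```python
from datetime import datetime


def to_iso_date(text, fmt="%d/%m/%Y"):
    """Convert a date string in an assumed source format to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(text, fmt).date().isoformat()


def make_event_id(cruise, station, iso_date):
    """Build an eventID; this naming pattern is a hypothetical example."""
    return f"{cruise}_{station}_{iso_date}"


def make_occurrence_id(event_id, index):
    """Build an occurrenceID that is unique within the dataset (hypothetical pattern)."""
    return f"{event_id}_occ{index:03d}"


def occurrence_status(count):
    """Return 'present' for a positive count and 'absent' for zero."""
    return "present" if count > 0 else "absent"


iso_date = to_iso_date("14/05/2019")
event_id = make_event_id("CR1", "ST01", iso_date)
occurrence_id = make_occurrence_id(event_id, 1)
print(event_id, occurrence_id, occurrence_status(0))
```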
After this lesson you will be able to get the LSID to populate the field scientificNameID and know the EMODnet Biology guidelines for handling difficult taxon names.
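For orientation, a WoRMS LSID has the form `urn:lsid:marinespecies.org:taxname:<AphiaID>`. The sketch below builds and checks such strings in Python; the helper names and the AphiaID `12345` are placeholders chosen for illustration, and the code does not replace the taxon match itself.

```python
import re

# Pattern for a WoRMS taxon-name LSID; the trailing number is the AphiaID.
LSID_PATTERN = re.compile(r"^urn:lsid:marinespecies\.org:taxname:(\d+)$")


def aphia_id_to_lsid(aphia_id):
    """Build the scientificNameID value from an AphiaID."""
    return f"urn:lsid:marinespecies.org:taxname:{int(aphia_id)}"


def lsid_to_aphia_id(lsid):
    """Return the AphiaID from a well-formed LSID, or None if the string is not an LSID."""
    match = LSID_PATTERN.match(lsid.strip())
    return int(match.group(1)) if match else None


# 12345 is a placeholder AphiaID, not a real taxon lookup.
lsid = aphia_id_to_lsid(12345)
print(lsid, lsid_to_aphia_id(lsid))
```

A check like `lsid_to_aphia_id` can help spot rows where a taxon name was pasted into scientificNameID by mistake.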
This is the solution to the taxon match. The remarks column explains what to do for the cases that did not match easily. The LSID column (marked in green) is the only output of the taxon match that you need to add to your data file.
A lesson on how to store the exact positions, start and stop positions and general defined areas or regions.
In this topic we will guide you through solving the standardisation (I) assignment.
Handy Excel functions to create linestrings and to calculate centroid coordinates and coordinateUncertaintyInMeters.
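The same calculations can be sketched outside Excel. The Python example below is a rough illustration for a sampling track with a start and a stop position. It is not the course's own method: it assumes a simple arithmetic midpoint as the centroid (reasonable only for short tracks that do not cross the antimeridian) and takes half the great-circle track length as coordinateUncertaintyInMeters. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation


def linestring_wkt(lat1, lon1, lat2, lon2):
    """Return a WKT LINESTRING; WKT uses x y order, so longitude comes first."""
    return f"LINESTRING ({lon1} {lat1}, {lon2} {lat2})"


def midpoint(lat1, lon1, lat2, lon2):
    """Arithmetic midpoint (latitude, longitude); assumes a short track."""
    return ((lat1 + lat2) / 2, (lon1 + lon2) / 2)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two positions in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def uncertainty_m(lat1, lon1, lat2, lon2):
    """Half the track length: the distance from the midpoint to either end."""
    return haversine_m(lat1, lon1, lat2, lon2) / 2


print(linestring_wkt(51.0, 2.0, 51.5, 2.5))
print(midpoint(51.0, 2.0, 51.5, 2.5))
print(round(uncertainty_m(51.0, 2.0, 51.5, 2.5)))
```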
- 5) Formatting your dataset - data structure
In this part of the lesson you will learn when to use Event core or Occurrence core. We will also see how to split your data from a flat table into the three required tables of the Event core schema: Event, Occurrence and Extended Measurements or Facts (eMoF). This is necessary to integrate your data in the EurOBIS/EMODnet relational database.
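To make the split concrete, here is a minimal Python sketch that divides a small flat table into Event, Occurrence and eMoF rows. The sample data, the `abundance` column and the choice of which field goes into which table are simplified assumptions for illustration; a real dataset needs the full set of mandatory terms covered in the earlier lessons.

```python
# Invented flat table: one row per observed taxon, event details repeated on each row.
flat = [
    {"eventID": "ST01", "eventDate": "2019-05-14", "occurrenceID": "ST01_1",
     "scientificName": "Abra alba", "abundance": "12"},
    {"eventID": "ST01", "eventDate": "2019-05-14", "occurrenceID": "ST01_2",
     "scientificName": "Nephtys cirrosa", "abundance": "3"},
    {"eventID": "ST02", "eventDate": "2019-05-15", "occurrenceID": "ST02_1",
     "scientificName": "Abra alba", "abundance": "7"},
]

EVENT_FIELDS = ("eventID", "eventDate")
OCCURRENCE_FIELDS = ("eventID", "occurrenceID", "scientificName")
MEASUREMENT_COLUMNS = ("abundance",)  # hypothetical measurement column


def split_flat_table(rows):
    """Split flat rows into (events, occurrences, emof) lists of dictionaries."""
    events = {}
    occurrences = []
    emof = []
    for row in rows:
        # Keep one Event row per eventID, even if it appears on many flat rows.
        events.setdefault(row["eventID"], {f: row[f] for f in EVENT_FIELDS})
        occurrences.append({f: row[f] for f in OCCURRENCE_FIELDS})
        # Each measurement column becomes one eMoF row linked by the two IDs.
        for column in MEASUREMENT_COLUMNS:
            emof.append({
                "eventID": row["eventID"],
                "occurrenceID": row["occurrenceID"],
                "measurementType": column,
                "measurementValue": row[column],
            })
    return list(events.values()), occurrences, emof


events, occurrences, emof = split_flat_table(flat)
print(len(events), len(occurrences), len(emof))
```

The eventID and occurrenceID columns are what link the three tables together again, which is why they are repeated in the Occurrence and eMoF rows.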
- 6) Formatting your dataset - data standardization (II)
In this lesson we will learn how to use controlled vocabularies (managed by BODC) to populate the Extended Measurements or Facts (eMoF) table.
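As a rough illustration, the Python sketch below attaches vocabulary URIs to an eMoF record. It assumes the URI pattern of the NERC Vocabulary Server hosted by BODC (`http://vocab.nerc.ac.uk/collection/<collection>/current/<code>/`) and the common practice of taking measurementTypeID from the P01 collection and measurementUnitID from the P06 collection. The concept codes `CODE0001` and `UNIT` are placeholders, not real vocabulary entries; look up the correct codes in the vocabulary search tools presented in the lesson.

```python
NVS_BASE = "http://vocab.nerc.ac.uk/collection"


def nvs_uri(collection, code):
    """Build a vocabulary URI from a collection name and a concept code."""
    return f"{NVS_BASE}/{collection}/current/{code}/"


def add_vocab_ids(record, type_code, unit_code):
    """Return a copy of an eMoF record with measurementTypeID and measurementUnitID added."""
    enriched = dict(record)
    enriched["measurementTypeID"] = nvs_uri("P01", type_code)
    enriched["measurementUnitID"] = nvs_uri("P06", unit_code)
    return enriched


# Invented eMoF record; CODE0001 and UNIT are placeholder codes.
record = {
    "occurrenceID": "ST01_1",
    "measurementType": "abundance",
    "measurementValue": "12",
    "measurementUnit": "individuals per square metre",
}
enriched = add_vocab_ids(record, "CODE0001", "UNIT")
print(enriched["measurementTypeID"])
```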
- 7) Publish your dataset on IPT
- 8) Quality Control your dataset
- 9) Harvesting by EMODnet Biology, OBIS, GBIF and DOIs