#03 2021

SAEON launches data catalogue

By Bryan McAlister, Systems Development Lead, SAEON uLwazi Node

The first quarter of 2021 marked the launch of the new SAEON data catalogue, as well as the SAEON open data platform core services which drive this new feature. This is a significant milestone in the uLwazi Node’s contribution to SAEON’s data management systems.

The development of the data catalogue drew together the expertise in SAEON uLwazi – from data curation and data science to systems development and IT infrastructure. The launch embodies the vision of SAEON uLwazi, which is to support open data access and decision support for the South African environment and society.

The SAEON data catalogue is available at https://catalogue.saeon.ac.za/. Its core features are shown in the figure below.

The data catalogue enables users to discover and access environmental datasets and products for South Africa. The datasets are the work of environmental and data scientists involved in a wide range of government-funded and internationally funded research projects and scientific missions. The data providers are shown next.

Catalogue features

Data providers

In addition to the above, datasets are also sourced from a variety of scientific missions or programmes, including the following:

  • South African Risk and Vulnerability Atlas (SARVA)
  • Bio-Energy Atlas
  • Carbon Sinks Atlas
  • SAEON shallow marine coastal research infrastructure (SMCRI)
  • SAEON expanded freshwater and terrestrial environmental observation network (EFTEON)
  • The Department of Forestry, Fisheries and the Environment’s Marine Information Management System (MIMS)

SARVA has categorised datasets according to 15 of the 17 sustainable development goals (SDGs), as shown in the following figure. Users can visit https://sarva.saeon.ac.za/sdgs/ and select an SDG to access the relevant datasets in the SAEON data catalogue.

The SAEON data catalogue is powered by a host of supporting services operating behind the scenes. Collectively known as the SAEON open data platform core services, they drive a fully fledged data management process and lifecycle, as shown below. Data is managed through several stages to ensure the following:

  1. Production of high-quality metadata by curators.
  2. Long-term storage and archiving of datasets.
  3. Provision of online access to data and metadata.

The online data and metadata services comply with open data access standards to ensure interoperability and re-use of data. By open data we mean data that is freely available and that anyone can access, use, change and republish.

All metadata comply with the DataCite standard, which, among other things, enables digital object identifiers (DOIs) to be issued for datasets. Many of the datasets also comply with relevant ISO and SANS standards where applicable.
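To illustrate what DataCite compliance involves in practice, here is a minimal sketch of a completeness check on a metadata record. It is not SAEON's curation tooling. The six properties listed are the mandatory ones in the DataCite Metadata Schema 4.x, but the simplified key names, the `missing_fields` helper and the example record (including its placeholder DOI) are assumptions made for illustration.

```python
# Mandatory properties of the DataCite Metadata Schema 4.x.
# The key spellings are a simplification assumed for this sketch.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_fields(record: dict) -> list:
    """Return the mandatory properties that are absent or empty in record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record: every value below is a placeholder, not real SAEON data.
example_record = {
    "identifier": {"identifierType": "DOI", "identifier": "10.5555/example-doi"},
    "creators": [{"name": "Example, Author"}],
    "titles": [{"title": "Example environmental dataset"}],
    "publisher": "Example Publisher",
    "publicationYear": "2021",
    "resourceType": {"resourceTypeGeneral": "Dataset"},
}

print(missing_fields(example_record))
```

A record that passes such a check has the minimum information needed for a DOI to be registered. Real validation would also check the structure of each property against the full schema.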

Lastly, data services are delivered through widely adopted standards such as those of the Open Geospatial Consortium (OGC) for spatial data and OPeNDAP for multidimensional datasets. The figure below shows these metadata and data service standards, together with the various data sources and types supported by the open data platform.
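As an example of what an OGC standard offers a data user, the sketch below builds a Web Map Service (WMS) GetCapabilities request, the standard call a client makes to discover which map layers a server publishes. The `service`, `version` and `request` parameters come from the WMS specification. The endpoint URL and the `wms_capabilities_url` helper are hypothetical and do not refer to an actual SAEON service address.

```python
from urllib.parse import urlencode


def wms_capabilities_url(base_url: str, version: str = "1.3.0") -> str:
    """Build a WMS GetCapabilities request URL for the given endpoint."""
    params = {
        "service": "WMS",
        "version": version,
        "request": "GetCapabilities",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint used purely for illustration.
HYPOTHETICAL_ENDPOINT = "https://example.org/geoserver/wms"

print(wms_capabilities_url(HYPOTHETICAL_ENDPOINT))
```

Because the request format is standardised, the same URL pattern works against any compliant WMS server, which is what makes such services interoperable with common GIS tools.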

Currently, the catalogue makes 2 249 records available for viewing and download. This marks a significant milestone in achieving SAEON’s objective of providing open access to earth and environmental observation data for South Africa.

It also demonstrates the maturing of SAEON's data systems and infrastructure. As SAEON's data holdings begin to grow exponentially, this catalogue will make them more visible and available to a global audience.