#06 2020

The SAEON Data Portal is getting better and better!

By Zach Smith, Web Developer, SAEON uLwazi Node

The end of 2020 marks the general availability of the SAEON Data Portal as a beta release. Hundreds of users are searching for SAEON’s datasets online and are (hopefully) finding what they are looking for!

The current release of this tool allows for searching SAEON’s data, generating citations in any conceivable format, visualising GIS data on a map, and of course, downloading datasets for further analysis.

At its core, the SAEON Data Portal is a search engine. Metadata records, a description of the who, why, how and what of the data, are sliced, diced and transformed in myriad ways to support full-text analysis across SAEON’s catalogue – all from a single text box entry.
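For readers curious about what "a single text box" implies, the idea can be sketched with a toy inverted index. This is only an illustration under stated assumptions, not the portal's actual implementation: the records, field names and the `tokenize`, `build_index` and `search` helpers below are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical metadata records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "title": "Rainfall observations", "creator": "Station A",
     "description": "Daily rainfall totals"},
    {"id": 2, "title": "Sea surface temperature", "creator": "Station B",
     "description": "Hourly temperature readings"},
    {"id": 3, "title": "Vegetation survey", "creator": "Station A",
     "description": "Plant cover and rainfall response"},
]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map every word found in any text field to the ids of records containing it."""
    index = defaultdict(set)
    for record in records:
        for field, value in record.items():
            if field == "id":
                continue
            for token in tokenize(value):
                index[token].add(record["id"])
    return index


def search(index, query):
    """Return the sorted ids of records that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)


INDEX = build_index(RECORDS)
print(search(INDEX, "rainfall"))     # ids of records mentioning rainfall
print(search(INDEX, "temperature"))  # ids of records mentioning temperature
```

A production search engine adds much more (ranking, stemming, synonyms, spatial and temporal filters), but the principle of transforming records once so that one free-text query can reach every field is the same.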

Much of the work that goes into building a search engine is about understanding the information that needs to be curated and searched. The steady progress of this software therefore reflects growing cross-team communication, which is the bread and butter of any modern work environment.

With the year winding down, it is useful to look beyond the (hopefully) pleasing and useful search interface, take a step back, and assess where the SAEON Data Portal fits into the larger context at uLwazi and, more generally, at SAEON. The SAEON Data Portal is a small part of many larger systems, including the Open Data Platform that is still under development.

It even represents a small part of SAEON’s vision for the future: a tiny cog in a larger system.

Does ‘open-access’ mean digital? 

To send a message around the earth – some 40 000 km – takes less than a second. Today, a lack of instant communication, in South Africa or elsewhere, is either the product of systemic inequality or of nostalgia. To revisit a world without it, one must place oneself in the shoes of the 50% of South Africa’s citizens who are unconnected, read a romanticised book about the past, or explore the archives of the Internet Engineering Task Force (IETF).

Among the various technical formats that the IETF oversees is a specification of the Internet implemented via carrier pigeons instead of cables. This specification, first proposed as an April Fools’ joke in 1990, has been extended twice – in 1999 and 2011 – on subsequent April Fools’ days. It now forms a fairly convincing, and humorous, analogue alternative to digitally hosted websites.

Such an idea is unlikely to ‘take off’ since it is completely impractical. But the humorous juxtaposition of ‘tech’ stereotypes and pigeons raises the question: could the SAEON Data Portal have been achieved by carrier pigeon, and what are the benefits of having developed a website instead?

If technical systems can be described in terms of pigeons, then surely it is worth reassessing what digital systems achieve compared to their analogue alternatives. What would an analogue open access platform look like, and who would build it?

The language describing the Internet is highly influenced by the analogue systems that it replaces. ‘Packets’ is an intuitive term for items of transport with strong boundaries, and ‘dropping’ packets is an equally intuitive term for failed transmissions. (One of the problems with the Internet via avian carriers is the high number of dropped packets.)

The vocabulary on which all digital systems rest is a mishmash of terms, ideas and metaphors that have been repurposed in ways that hint at the thought processes of early innovators. Or, in the case of certain metaphors – the ‘slave/master’ database terminology, for example – we see a general lack of thought or historical sensitivity.

The similarities in language between analogue systems and their digital counterparts are significant because they at once diminish the functional differences between the two and allow for effortless uptake of systems that are vastly different from how they appear. Filing cabinets, for example, fulfil the same role as databases.

But the ability to rapidly transact with a datastore, and the differences between handling information stored digitally vs physically, make these two technologies not directly comparable. And yet, the differences between filing cabinets and digital databases might go unnoticed in many environments where digital work has replaced analogue work one for one. It is up to information workers – such as those of us at uLwazi – to be discerning and understand the philosophical differences offered by a digital world.

Looking forward to the new year 

One of the features of the SAEON Data Portal that would not be found in analogue data storage is that more than one person can make use of the service at a time.

Implementing the SAEON Data Portal as an analogue system in the era of books, journals and microfiches in library holdings would entail clear trade-offs. Data access would not be real-time, concurrency would be limited to the number of copies of a resource, and errors (such as mis-shelved resources) would result in labour-intensive curation and significant system downtime. It would be difficult to position an open access platform as competitive without investment in the technology that underlies websites such as the SAEON Data Portal.

We are confident that certain tools belong on the web – including the SAEON Data Portal. It is with careful consideration that we have opted to build on a digital, web-based platform rather than the South African carrier pigeon network, and the results speak for themselves.

With the coming new year, we look forward to extending our offering with data exploration features that include advanced analytics and visualisations. Watch this space!