Gazette digitisation in 48 hours

Adi Eyal
Published in OpenUp
4 min read, Oct 18, 2019

Saving South Africa’s history one gazette at a time

OpenUp developed Open Gazettes in 2016 as part of a larger project to collect information about corporations. At the time of writing, the website contains 36,611 national and provincial gazettes. Recently, the University of Cape Town library donated approximately 800 volumes of national gazettes, the earliest of which were published in the mid-1950s. An anxious librarian asked us to rescue them before they were pulped.

Collecting the gazettes required three trips with a bakkie (mini pick-up) filled to capacity. They are currently being stored in the boardroom at Codebridge.

Gazettes are important. Current gazettes can be thought of as a government-wide newspaper, keeping the public informed about policies, laws, trade agreements and more. Old gazettes are fascinating, and record society at the time they were published. Here are some examples from 8 January 1960:

Minimum wage for various professions (note the gender disparity).
Books banned by the Apartheid government
Illegal punchings in boxing
An anonymous donation to the taxman!

Call me a nerd, but I love this stuff.

A little more relevant to the current debates around land ownership in South Africa — a list of plots confiscated by the Apartheid government.

Digitisation campaign

Gazettes in book format are great, but largely inaccessible to the majority of potential users. Not only are they hard to get, but knowing which gazette contains the information you need is akin to finding the proverbial needle in a haystack. You can use Open Gazettes to search across the entire corpus in seconds. You can also create alerts which will send you an email when new gazettes are uploaded that match your search string.

Digitisation requires scanning, running optical character recognition (OCR), indexing, and storing. Book scanning is often prohibitively expensive. Scanning books commercially costs over R2.00 / page. It is also extremely laborious, as it requires an operator to turn each page by hand.

A cheaper approach is to chop the spines off the books using a guillotine and run the pages through a feeder. Even though this approach is an order of magnitude cheaper than preserving the integrity of the book, it can still cost 30c / page, i.e. approximately R225,000 for the entire collection gifted to us. It is also not pain-free: pages get stuck in the feeder (the gazette pages are thinner than normal), so the process requires constant supervision.
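As a rough sanity check on these figures, the sketch below works backwards from the R225,000 quote to the page count it implies, then prices the same collection at the book-scanning rate. The page count is inferred from the two numbers above rather than measured, "over R2.00" is treated as exactly R2.00, and the function names are made up for illustration.

```python
# Back-of-the-envelope cost comparison using the per-page rates quoted in the text.
# Amounts are kept in integer cents to avoid floating-point rounding surprises.

RATE_BOOK_SCAN_CENTS = 200   # "over R2.00 / page", treated here as exactly R2.00
RATE_SHEET_FEED_CENTS = 30   # 30c / page with guillotine and sheet feeder


def pages_implied_by_quote(total_rand: int, rate_cents: int) -> int:
    """Page count implied by a total quote at a given per-page rate."""
    return (total_rand * 100) // rate_cents


def cost_in_rand(pages: int, rate_cents: int) -> float:
    """Total cost in rand for scanning `pages` pages at `rate_cents` per page."""
    return pages * rate_cents / 100


# Inferred, not measured: R225,000 at 30c / page.
pages = pages_implied_by_quote(225_000, RATE_SHEET_FEED_CENTS)

print(f"Implied page count:       {pages:,}")
print(f"Sheet-fed cost:           R{cost_in_rand(pages, RATE_SHEET_FEED_CENTS):,.0f}")
print(f"Book-scanning cost (min): R{cost_in_rand(pages, RATE_BOOK_SCAN_CENTS):,.0f}")
```

Under these assumptions the quote implies roughly 750,000 pages, and scanning the same pages without cutting the spines would cost at least R1.5 million.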

Commercial digitisation also misses an opportunity to create a community project and to use the exposure it attracts to highlight the value of the civic tech movement. I strongly believe that paying for a service is often the result of lazy thinking. Value exchange is a much better proposition, which can yield far more benefit than the clinical transfer of cash. People want to help but don’t know how to. Digitisation gives them an opportunity to contribute a few hours of their time to save an important part of our history.

Volunteer-driven

For this batch of gazettes, I propose a 48-hour scan-athon. I would like to digitise approximately 1 million pages over a weekend. This is a mammoth task and I’m not sure whether it is possible. It will likely require custom software, commodity hardware, a large amount of coordination, and a lot of Red Bull. It should also be owned by a community — people who would like to contribute to civic tech but don’t know how.
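To get a feel for the scale of that target, the sketch below converts 1 million pages in 48 hours into a required scanning rate and a rough scanner count. The per-scanner speed and the utilisation factor are hypothetical placeholders, not figures from this project. Swap in real numbers once the hardware is known.

```python
import math

# Targets taken from the text.
TARGET_PAGES = 1_000_000
EVENT_HOURS = 48

# Hypothetical placeholders: replace with measured values for real hardware.
ASSUMED_PAGES_PER_MINUTE = 25   # feeder speed of one commodity scanner (assumption)
ASSUMED_UTILISATION = 0.5       # fraction of time a scanner is actually feeding,
                                # allowing for jams, reloading and breaks (assumption)


def required_pages_per_hour(pages: int, hours: float) -> float:
    """Average throughput needed across all scanners combined."""
    return pages / hours


def scanners_needed(pages: int, hours: float, ppm: float, utilisation: float) -> int:
    """Minimum number of scanners, rounded up, to finish within `hours`."""
    pages_per_scanner = hours * 60 * ppm * utilisation
    return math.ceil(pages / pages_per_scanner)


rate = required_pages_per_hour(TARGET_PAGES, EVENT_HOURS)
print(f"Required throughput: {rate:,.0f} pages/hour ({rate / 3600:.1f} pages/second)")
print("Scanners needed under the assumed figures:",
      scanners_needed(TARGET_PAGES, EVENT_HOURS,
                      ASSUMED_PAGES_PER_MINUTE, ASSUMED_UTILISATION))
```

The required throughput, roughly 21,000 pages an hour sustained for two days, follows directly from the target. The scanner count is only as good as the two assumed inputs.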

We are looking for volunteers, engineers, project managers, quality checkers, librarians, programmers, and maybe a DJ to keep us going late into the night. Donations for equipment and people with ideas about how we can turn this into reality are also welcome.

I will record the progress of this project on this blog, so follow along if you want to stay informed. Please get in touch if you want to help out.

Adi is the founder of OpenUp (formerly Code for South Africa), a civic tech organisation that uses data and technology to promote social change.