From print-ready PDF to WCAG 2.0-compliant HTML with BG Publishers

We work with government agencies and the private sector to transition print publications to digital formats.

This is what Sean said about our services:

Our work on document conversion with BGP has had a substantial benefit in our mission of disseminating valuable information on Carbon Capture and Storage. This work allows us to present complex reports in a much more accessible way to draw in greater traffic from referral sites and organic search, improve on-site accessibility and to organise our content in new ways.
Sean McClowry | General Manager, Knowledge Management/ICT Services, Global CCS Institute

BG Publishers (BGP) is an Australian digital consultancy that specialises in converting and producing documents in alternative digital formats. BGP makes digital content accessible on the web and on iPads and smartphones.

Our team of digital experts has skills in document conversion, HTML and PDF accessibility, information design, HTML classification, content analysis and search engine optimisation. BGP has extensive experience with web content management systems such as Drupal, WordPress and Sitecore.

We specialise in producing still and animated content in ePUB format.

We convert print-ready PDFs of signature reports – Annual Reports and other significant publications – into WCAG 2.0-compliant HTML. These are complex reports, typically printed in full colour and containing images, graphs, charts, tables, formulae, financial statements and several heading levels.

A typical conversion would include the following:

✓ Images resized for optimal web delivery

✓ Tables converted to text

✓ Metadata assigned

✓ Alt texts assigned (if supplied)

✓ Text alternatives assigned (if supplied)

✓ Web accessibility check run
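
As an illustration of what an automated accessibility check can look for, here is a minimal sketch in Python. It is not BGP's tooling: the class and function names are hypothetical, and it covers only two of the many WCAG 2.0 checks (images with no alt attribute, and skipped heading levels). A full audit also needs manual review.

```python
from html.parser import HTMLParser


class AccessibilityChecker(HTMLParser):
    """Hypothetical checker: flags images without alt and skipped heading levels."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self.last_heading_level = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # An empty alt="" is allowed (decorative image); a missing alt is not.
        if tag == "img" and "alt" not in attributes:
            source = attributes.get("src", "unknown source")
            self.problems.append(f"img without alt attribute: {source}")
        # Headings h1..h6 should not skip a level on the way down.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level > self.last_heading_level + 1:
                self.problems.append(
                    f"heading level skipped: h{self.last_heading_level} "
                    f"followed by h{level}"
                )
            self.last_heading_level = level


def check_accessibility(html_text):
    """Return a list of human-readable problem descriptions."""
    checker = AccessibilityChecker()
    checker.feed(html_text)
    return checker.problems
```

Calling `check_accessibility` on a converted page returns an empty list when neither problem is found, so it can act as a quick gate before the files are zipped for delivery.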

We add value by writing alt texts and text alternatives if required. We can provide print styles if the client’s website requires these. We generally deliver the finished files to the client as a zip archive.

We offer this digital document service in three ‘phases’. Clients can select any one of them, depending on their needs and in-house capability. If required, we also offer a one-on-one technical liaison service to facilitate the document conversion and publishing.

Phase 1

Phase 1 delivers HTML that meets WCAG requirements. The HTML does not, however, carry the client’s web style (colour scheme, navigation, logo, etc.). The client’s own in-house resources are needed to integrate the HTML with their existing website and to ensure the integration doesn’t adversely affect accessibility. In short, Phase 1 provides an HTML alternative to the PDF with the essential accessibility treatment applied.

Clients who have in-house web resources can use the output from Phase 1 to meet the minimum WCAG requirements. One such client is the Torres Strait Regional Authority (see references).

Phase 2

In Phase 2, our staff take the output of Phase 1 and manually:

1. break the single essential HTML file at the level desired by the client (section, chapter or paragraph) and construct a structured HTML ‘book’ with navigation

2. apply the client’s web styles so that the HTML ‘book’ blends into the client’s existing website.
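
The first step is done by hand in Phase 2, but the idea of breaking one HTML file into a navigable ‘book’ can be sketched in a few lines of Python. This is an illustration only, under the assumption that the break level is a heading such as h2; the function names and the `page-N.html` file naming are hypothetical.

```python
import re


def split_into_book(html_body, level=2):
    """Split one HTML body into pages, starting a new page at each <hN> heading."""
    # Zero-width lookahead keeps the heading tag at the start of its page.
    heading = re.compile(rf"(?=<h{level}[\s>])", re.IGNORECASE)
    chunks = [chunk for chunk in heading.split(html_body) if chunk.strip()]
    return [
        {"filename": f"page-{number}.html", "body": chunk}
        for number, chunk in enumerate(chunks, start=1)
    ]


def add_navigation(pages):
    """Append previous/next links to every page of the book."""
    for index, page in enumerate(pages):
        links = []
        if index > 0:
            previous_file = pages[index - 1]["filename"]
            links.append(f'<a href="{previous_file}" rel="prev">Previous</a>')
        if index < len(pages) - 1:
            next_file = pages[index + 1]["filename"]
            links.append(f'<a href="{next_file}" rel="next">Next</a>')
        page["body"] += "<nav>" + " ".join(links) + "</nav>"
    return pages
```

Any content before the first heading (a preface, for example) becomes its own first page. Applying the client’s styles would then be a matter of wrapping each page body in the client’s template and linking their CSS.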

Although the output of Phase 2 will have the client’s look and feel, it is ‘static’. This means the HTML will look good to human eyes but offers no significant advantage for search engine optimisation. Clients who need that should consider Phase 3.

One client that took the Phase 2 approach is the Department of the Prime Minister and Cabinet (see references). For most clients with Annual Report publications, Phase 2 is likely to be the preferred choice.

Phase 3

Phase 3 is the approach we took for the Global CCS Institute and for an Australian Institute of Health and Welfare prototype. This phase automates almost everything in Phase 2 and also:

1. automatically classifies the HTML pages using the client’s defined dictionary and highlights the key words

2. automatically submits a sitemap to the major search engines to promote the HTML pages.

This choice is most valuable to organisations that publish a large volume of material.

This automated HTML import is available to clients using Drupal. Some integration is required for WordPress and Sitecore.

The benefit of Phase 3 is that the client can use our ‘HTML Import’ to bulk upload content, which speeds up publishing and automatically applies the classification and sitemap services that improve search engine optimisation.


References

Phase 1: Torres Strait Regional Authority: Annual Report–2014

This is an example of a publication delivered in WCAG 2.0-compliant HTML. We converted the print-ready PDF and provided the essential HTML. The TSRA elected to publish the HTML themselves to their own website.

Phase 2: Prime Minister and Cabinet: Annual Report–14/html/

This is an example of a publication delivered as a ‘book’. We produced the HTML files on behalf of PM&C, using their web CSS to deliver the content in their ‘look and feel’.

Phase 3: Global CCS Institute

This particular publication is produced as a standalone ‘book’. We produced the HTML files that were automatically uploaded to the client’s website. If you search across the whole Global CCS Institute website, the system will return information from all the publications that have been published in this space. This is particularly valuable for researchers. Note that this example also includes alternative digital formats – ePUB and .mobi – attached to the publication landing page as downloadable files.

Phase 3: Australian Institute of Health and Welfare prototype

For this example, we converted three large publications from the individual print-ready PDFs to HTML. We combined the HTML into a single searchable online product. This example might appeal to research organisations that wish to share information held in discrete publications; the results deliver information from all three publications. This is similar to the Global CCS Institute example although the data sets here are smaller.

I’m Bobby Graham, digital publisher and consultant
