Gotenberg with Java: Unlocking the possibilities of dynamic PDF generation for your tax forms and crypto reports

Thomas Peltre
Waltio Tech Team

As professionals in the tech and crypto tax industry, we understand the importance of providing our clients with accurate and well-presented accounting documents. Whether it’s a bookkeeping ledger, a tax certificate, a stock file, the annual capital gains declaration annex, or a tax audit, these documents need to be generated with the client’s data and made available to them upon request, once an “Assessment” has been completed.

By “Assessment,” we mean a thorough analysis of the client’s files and accounts and the generation of relevant reports. This process involves reviewing the client’s transactions, categorizing them appropriately, and preparing financial statements that accurately reflect their tax situation.

However, creating these documents can be a challenging task, especially when it comes to ensuring their readability and aesthetic appeal. After all, these documents are not only intended for our clients, but may also be reviewed by tax authorities, which means they need to reflect clarity and professionalism.

In this article, we’ll explore how we can use Gotenberg with Java to generate high-quality documents that meet the needs of both our clients and tax authorities. We’ll also share tips for getting the most out of Gotenberg’s features, and provide an implementation example to demonstrate how it all works in practice. Let’s dive in!

So, what exactly is Gotenberg, and how can it help us generate high-quality documents using Java and Thymeleaf?

Gotenberg is an open-source, Docker-based API, originally created at The Coding Machine, that converts HTML, Markdown, and Office documents into PDFs. Its Chromium module drives a headless Chromium browser, and it can be integrated into Java applications using Jotenberg, a Gotenberg Java client. In our case, though, we preferred to run Gotenberg as a service in our Docker Compose setup and call its HTTP API directly, for easier deployment and management.

One of the key benefits of using Gotenberg is its ability to generate PDFs that are both elegant and highly customizable. Initially, we used Flying Saucer PDF to generate our documents, as Thymeleaf does not natively support PDF creation. However, we found that it had several limitations, such as poor performance, limited CSS support (CSS 2.1 only), and difficulties with complex layouts. This became a major issue when we wanted to implement a new design for our documents, which required advanced CSS features that Flying Saucer simply couldn’t handle. Despite our best efforts, we were unable to make the design look the way we wanted. That’s where the power of Gotenberg comes in…

A Tour of Our Gotenberg Setup: Tips and Best Practices

Enough theory: let’s see Gotenberg in action! In the following section, we’ll walk through the process of generating a PDF using Gotenberg’s Chromium module and Thymeleaf. Using a simple HTML template and some sample data, we’ll demonstrate how easy it is to convert it into a high-quality PDF.

While Thymeleaf configuration and usage is an important topic, we won’t be covering it in detail in this article. Instead, we’ll assume that you have a basic understanding of Thymeleaf and focus on how to use Gotenberg.

The first step before generating a PDF is to create an HTML file and a CSS file, just as you would for a small website. The HTML file will contain the content that you want to include in the PDF, while the CSS file will define the styling and layout of the document. One of the benefits of using Gotenberg is that you can take full advantage of CSS 3, allowing you to create documents that fit your exact specifications and design requirements.

One of the challenges of generating dynamic PDFs is the time it takes to preview the results. Unlike a static website, where changes can be seen immediately, generating a PDF requires Thymeleaf to rebuild the HTML file, load the data, and then generate the PDF. This process can be time-consuming, especially when working with complex CSS styles.

My best tip here is to use paper-css to check the design and layout in the browser before running the full generation process.

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <link rel="stylesheet" href="style.css" />
    <link
      rel="stylesheet"
      href="https://cdnjs.cloudflare.com/ajax/libs/paper-css/0.3.0/paper.css"
    />
  </head>

  <!-- Set "A5", "A4" or "A3" for class name -->
  <!-- Set also "landscape" if you need -->
  <body class="A4">

    <!-- Each sheet element should have the class "sheet" -->
    <!-- You can add an additional class "padding-**mm" and set mm to 10, 15, 20 or 25.-->
    <section class="sheet">

      <!-- Write HTML just like a web page -->
      <article>This is an A4 document.</article>

    </section>

  </body>
</html>

With this template, you get a faithful preview of your future PDF and can see how your content fits on an A4 page. You just have to run live-server (refer to the paper-css documentation for details). I can guarantee that you will get an extremely similar result using Gotenberg’s API.

Check out the preview we managed to get in the browser, making it a breeze to customize our PDF document with ease!

When you are satisfied with the result, you’ll need to add your HTML, CSS, and any assets (such as images or logos) to your project.

Once you’ve added your files, as mentioned previously, you can use Thymeleaf to enrich your HTML with dynamic data from your Java application. It allows you to use placeholders, expressions, and loops to generate HTML content on the fly. Once your HTML has been enriched with data, you can use Gotenberg to convert it to a PDF, complete with your custom CSS styling and assets.

That’s exactly what we are going to do now!

No matter which approach you choose for using Gotenberg, whether it’s creating your own Docker image or using a client, you’ll get the same results. In our case, we chose to host our own Docker image to better adapt it to our needs. By hosting our own image, we have greater control over the configuration and can make changes as needed. Additionally, it allows us to address any security concerns we may have, as we can ensure that the image is up-to-date and free from vulnerabilities. While hosting your own image may require more setup and maintenance, the benefits in terms of customization and security can make it a worthwhile investment.

Using it with Java

Here is a Java method you can use to take advantage of Gotenberg’s Chromium module.

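// Imports assume Unirest 1.x (com.mashape.unirest) and Apache Commons IO.
import com.mashape.unirest.http.HttpResponse;
import com.mashape.unirest.http.Unirest;
import com.mashape.unirest.http.exceptions.UnirestException;
import com.mashape.unirest.request.body.MultipartBody;
import org.apache.commons.io.IOUtils;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// protocol, hostname, port and LOGGER are fields of the enclosing class.
public InputStream convertHTMLToPDF(Map<String, byte[]> assets, String htmlContent) throws IOException, UnirestException {
    if (assets == null || assets.isEmpty() || htmlContent == null || htmlContent.isEmpty()) {
        throw new IllegalArgumentException("Input parameters cannot be null or empty");
    }

    String url = String.format("%s://%s:%d/forms/chromium/convert/html", protocol, hostname, port);
    // The entry point must be uploaded under the file name "index.html".
    // Encode it as UTF-8 to match the <meta charset="UTF-8" /> of the template,
    // instead of relying on the platform default charset.
    MultipartBody request = Unirest.post(url)
            .field("files", IOUtils.toInputStream(htmlContent, StandardCharsets.UTF_8), "index.html")
            .field("marginTop", 0)
            .field("marginBottom", 0)
            .field("marginLeft", 0)
            .field("marginRight", 0);

    for (Map.Entry<String, byte[]> asset : assets.entrySet()) {
        // Wrap each byte array in a fresh stream so the stored asset can be reused by later calls
        request = request.field("files", new ByteArrayInputStream(asset.getValue()), asset.getKey());
    }

    try {
        HttpResponse<InputStream> response = request.asBinary();
        return response.getBody();
    } catch (UnirestException e) {
        // asBinary() only declares UnirestException: log the error and re-throw it
        LOGGER.error("Error converting HTML to PDF", e);
        throw e;
    }
}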

convertHTMLToPDF takes in two parameters: a Map of assets (such as images or stylesheets) and the HTML content to be converted to PDF. It returns an InputStream that contains the generated PDF.
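To make the assets parameter more concrete, here is a minimal sketch of how such a map could be built from a folder of static files. AssetLoader and its load method are hypothetical names that do not come from our code base, and the sketch assumes a flat directory whose files are referenced from the HTML by their bare file name, like href="style.css" in the template above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: collects the files of one flat directory into an assets map. */
final class AssetLoader {

    private AssetLoader() {
    }

    /**
     * Reads every regular file directly under {@code dir}, keyed by its bare
     * file name (for example "style.css"). Sub-directories are skipped.
     */
    static Map<String, byte[]> load(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> files = entries
                    .filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
            Map<String, byte[]> assets = new LinkedHashMap<>();
            for (Path file : files) {
                // The key is later used as the uploaded file name
                assets.put(file.getFileName().toString(), Files.readAllBytes(file));
            }
            return assets;
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read assets from " + dir, e);
        }
    }
}
```

The resulting map can be passed as the first argument of convertHTMLToPDF. Keeping the keys free of directory components matters because, as far as we know, Gotenberg stores all uploaded files side by side, so relative paths with sub-folders would not resolve.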

This method first constructs the URL of the Gotenberg API endpoint from the specified protocol, hostname, and port. It then creates a MultipartBody request using Unirest, a lightweight HTTP client library. The request carries the HTML content as a form field named "files" with the file name "index.html", which is the name Gotenberg’s Chromium module expects for the entry point, as well as the four margin settings, all set to 0.
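One detail worth checking alongside the margins is the paper size. To our knowledge, the Chromium route reads numeric sizes in inches and defaults to a Letter-sized page, so an A4 design may need explicit paperWidth and paperHeight fields. The sketch below is an assumption-laden illustration rather than our production code: PaperSettings is a hypothetical helper, and the paperWidth and paperHeight field names should be verified against the documentation of the Gotenberg version you run.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: form fields for an A4 page without margins. */
final class PaperSettings {

    private static final double MM_PER_INCH = 25.4;

    private PaperSettings() {
    }

    /** Converts millimetres to inches, rounded to two decimals, with a dot as separator. */
    static String mmToInches(double millimetres) {
        return String.format(Locale.ROOT, "%.2f", millimetres / MM_PER_INCH);
    }

    /** Field names are assumptions; check them against your Gotenberg version. */
    static Map<String, String> a4WithoutMargins() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("paperWidth", mmToInches(210));   // A4 width: 210 mm
        fields.put("paperHeight", mmToInches(297));  // A4 height: 297 mm
        fields.put("marginTop", "0");
        fields.put("marginBottom", "0");
        fields.put("marginLeft", "0");
        fields.put("marginRight", "0");
        return fields;
    }
}
```

Each entry could then be added to the request with request.field(name, value), in the same way the margins are added in convertHTMLToPDF.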

The method then loops through the map of assets and adds each one to the request as another "files" field, using the asset’s key as the file name. This allows Gotenberg to resolve the assets referenced by the HTML and include them in the generated PDF.
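To see what this request looks like on the wire, here is a dependency-free sketch that builds an equivalent multipart/form-data body by hand. It is not what Unirest does internally, only an illustration of the shape of the payload: every file is a part whose form field is named "files", and the file name distinguishes index.html from the assets. MultipartSketch and its methods are hypothetical names, and the margin fields are left out for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical illustration of the multipart payload sent to Gotenberg. */
final class MultipartSketch {

    private static final String CRLF = "\r\n";

    private MultipartSketch() {
    }

    /** Builds the body: index.html first, then one "files" part per asset. */
    static byte[] buildBody(String boundary, String html, Map<String, byte[]> assets) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        writeFilePart(body, boundary, "index.html", html.getBytes(StandardCharsets.UTF_8));
        for (Map.Entry<String, byte[]> asset : assets.entrySet()) {
            writeFilePart(body, boundary, asset.getKey(), asset.getValue());
        }
        // Closing delimiter that ends the multipart body
        writeText(body, "--" + boundary + "--" + CRLF);
        return body.toByteArray();
    }

    private static void writeFilePart(ByteArrayOutputStream out, String boundary,
                                      String fileName, byte[] content) {
        writeText(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"files\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: application/octet-stream" + CRLF
                + CRLF);
        out.write(content, 0, content.length);
        writeText(out, CRLF);
    }

    private static void writeText(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

Such a body could be posted with the JDK’s java.net.http.HttpClient, using HttpRequest.BodyPublishers.ofByteArray and a Content-Type header of multipart/form-data; boundary=<your boundary>. In practice, a client library such as Unirest handles these details for you, which is why we use one.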

Finally, the method sends the request to the Gotenberg API and retrieves the response as a binary stream. The response body is returned as an InputStream, which can be used to save the PDF to disk or send it to a client for download.
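Because convertHTMLToPDF returns the response body without looking at the HTTP status, an error response would be handed to the caller as if it were a PDF. A simple safeguard, sketched below under that assumption, is to check that the stream starts with the PDF signature "%PDF-" before writing it to disk. PdfSaver is a hypothetical helper, not part of our code base, and it needs Java 11 or later for readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

/** Hypothetical helper: saves a stream to disk only if it looks like a PDF. */
final class PdfSaver {

    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSaver() {
    }

    /** Writes the stream to {@code target} and returns the number of bytes written. */
    static long save(InputStream pdf, Path target) {
        try (PushbackInputStream in = new PushbackInputStream(pdf, PDF_MAGIC.length)) {
            // Peek at the first bytes without losing them
            byte[] head = in.readNBytes(PDF_MAGIC.length);
            if (!Arrays.equals(head, PDF_MAGIC)) {
                throw new IllegalStateException("Response does not look like a PDF");
            }
            in.unread(head);
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not write " + target, e);
        }
    }
}
```

A call such as PdfSaver.save(convertHTMLToPDF(assets, html), Path.of("report.pdf")) would then fail fast when the service returns something other than a PDF. Checking the HTTP status of the response inside convertHTMLToPDF would be an equally valid, and arguably more direct, approach.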

Et voilà 🙌

After processing by Gotenberg, your final PDF document is ready to be downloaded directly from the app!

Feast your eyes on your stunning documents that you can now proudly share with your clients, and let them bask in your incredible design skills!

In conclusion

Gotenberg is a valuable addition to any developer’s toolkit, and we hope this article has given you a better understanding of how to use it to create dynamic and customizable PDFs. With Gotenberg, you can take your documents to the next level and provide your clients with the high-quality, professional-looking documents they deserve.

Thomas Peltre
Waltio Tech Team

Software developer dedicated to mastering the intricacies of both Web2 and Web3 development.