Converting HTML into PDF in Java

Using Open-Source Libraries: Jsoup and Flying Saucer

Lynn Zheng
Javarevisited
Jul 24, 2020


I recently needed to convert an HTML file into a PDF file in Java using free, open-source libraries. In this post, I will walk you through my setup process:

  • Installing Maven using Homebrew and configuring $JAVA_HOME
  • Setting up a Maven project in IntelliJ and installing jars needed for our code
  • Writing the code to convert HTML into PDF

Installing and Configuring Maven

If you haven’t already, install Homebrew. Then we will install Maven with Homebrew.

Add the following line to ~/.bash_profile

Then in the Terminal, run:

To check if $JAVA_HOME is set correctly:

This should give you something like /Library/Java/JavaVirtualMachines/adoptopenjdk-13.0.1.jdk/Contents/Home

Configuring a Maven project in IntelliJ

Create a new project in IntelliJ and select Maven. From this step on, there are two common errors you may encounter, and I will show you how to resolve them.

Error: release version not supported

Create a Java file in src/main/java and have it print out Hello World!
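A minimal sketch of such a file. The class name HelloWorld and the helper method greeting are my own choices for illustration, not something the setup requires.

```java
// Hypothetical class for the smoke test; any class name works.
class HelloWorld {

    // Kept in its own method so the message is easy to check.
    static String greeting() {
        return "Hello World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```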

Run the program. If you are using JDK 8+, you may run into the error Error:java: error: release version 5 not supported. There is a post on Dev.to about resolving this error.

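In the pom.xml file, add the following lines, using the version that matches your JDK (1.8 for JDK 8, for example). I am using JDK 13 and wrote 1.13 here, which turns out to cause the invalid target release error covered in the next section.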

Press Shift + Cmd + A (on Mac) or choose Help > Find Actions to bring up the Actions menu, then type Reimport All Maven Projects.

Now rerun the Java program to ensure that Hello World is printed correctly.

Next, let’s add the jar files our HTML to PDF code depends on to pom.xml. Add the following lines.

Maven Error: invalid target release

In IntelliJ’s Terminal, run mvn install to install the dependencies.

This may fail with the error message error: invalid target release: 1.13. This blog post has a solution for this.

Add the following plugin to pom.xml and run Reimport All Maven Projects. Note that the source and target should be 1.8 for JDK 8 and 1.10 for JDK 10, but 11 for JDK 11, 12 for JDK 12, 13 for JDK 13, etc.

Then mvn install should finish without errors.
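If you are unsure which value your JDK expects, the java.specification.version system property is a handy reference: it reports 1.8 on JDK 8 and the plain number (11, 13, ...) on newer JDKs. Below is a small sketch that prints it, along with a helper that turns a mistaken 1.13 into 13. The class name CompilerLevel and the method normalize are hypothetical names I made up for this example.

```java
// Hypothetical helper; not part of Maven or the JDK.
class CompilerLevel {

    // "1.8" stays "1.8", "13" stays "13", and a mistaken "1.13" becomes "13".
    static String normalize(String version) {
        if (version.startsWith("1.")) {
            int minor = Integer.parseInt(version.substring(2));
            // Only releases up to Java 8 are conventionally written as 1.x.
            return minor <= 8 ? version : Integer.toString(minor);
        }
        return version;
    }

    public static void main(String[] args) {
        // The specification version of the JDK running this program.
        String running = System.getProperty("java.specification.version");
        System.out.println("source/target value: " + normalize(running));
    }
}
```

Run it with the same JDK that IntelliJ and Maven use, otherwise the printed value may not match your build.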

Converting HTML to PDF

We need two steps: First, convert HTML to XHTML with Jsoup. Second, convert XHTML to PDF with Flying Saucer. XHTML is a syntactically stricter version of HTML. For instance, XHTML doesn’t allow unclosed tags like <img src=''>; every tag must be closed, as in <img src='' />.
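To see the strictness in action, here is a small sketch that uses only the JDK's built-in XML parser to check whether a fragment is well-formed XML, which XHTML has to be. The class name WellFormedCheck and the method isWellFormed are my own for illustration; this is not how Jsoup or Flying Saucer validate input.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so parse errors are not printed to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup, e.g. because a tag was never closed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p><img src=''></p>"));  // prints false: <img> is never closed
        System.out.println(isWellFormed("<p><img src=''/></p>")); // prints true
    }
}
```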

HTML to XHTML

Add Jsoup as a dependency to pom.xml:

Import Jsoup:

Create a method to convert HTML to XHTML:
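A minimal sketch of such a method, assuming the HTML is already in a String and Jsoup is on the classpath. The class name HtmlToXhtml and the method name toXhtml are my own for illustration. To read from a file instead, Jsoup also offers Jsoup.parse(File, String charsetName).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

// Hypothetical wrapper class for illustration.
class HtmlToXhtml {

    // Parses lenient HTML and serializes it back using XML syntax rules.
    static String toXhtml(String html) {
        Document document = Jsoup.parse(html);
        document.outputSettings()
                .syntax(Document.OutputSettings.Syntax.xml)  // emit closed tags
                .escapeMode(Entities.EscapeMode.xhtml)       // avoid HTML-only named entities
                .charset("UTF-8");
        return document.html();
    }

    public static void main(String[] args) {
        System.out.println(toXhtml("<p>Hello<br><img src=\"logo.png\"></p>"));
    }
}
```

Jsoup also fills in the html, head, and body elements if the input lacks them, so the result is a complete document.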

XHTML to PDF

Add Flying Saucer as a dependency to pom.xml:

Import Flying Saucer:

Create a method to convert XHTML to PDF:
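A minimal sketch of such a method, assuming the flying-saucer-pdf artifact is on the classpath. The class name XhtmlToPdf, the method name writePdf, and its parameters are hypothetical. The method declares throws Exception because the checked exceptions thrown by createPDF differ between Flying Saucer versions.

```java
import java.io.FileOutputStream;
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;

// Hypothetical wrapper class for illustration.
class XhtmlToPdf {

    // Renders an XHTML string to a PDF file.
    // baseUrl is used to resolve relative links such as images and stylesheets; it may be null.
    static void writePdf(String xhtml, String baseUrl, String outputPath) throws Exception {
        try (OutputStream out = new FileOutputStream(outputPath)) {
            ITextRenderer iTextRenderer = new ITextRenderer();
            iTextRenderer.setDocumentFromString(xhtml, baseUrl);
            iTextRenderer.layout();       // compute the page layout
            iTextRenderer.createPDF(out); // write the PDF bytes
        }
    }
}
```

The input has to be well-formed XHTML, which is why the Jsoup step comes first.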

Optionally, you can register custom fonts used in the HTML. Right after the line ITextRenderer iTextRenderer = new ITextRenderer();, add:

Putting Both Methods Together

I downloaded a font called Butterfly.ttf and used it in my HTML.

The output from the HTML above (by: Lynn Zheng)

The Complete Code
