Using Weasyprint and Jinja2 to create PDFs from HTML and CSS

Holistic AI Engineering
10 min read · Jan 23, 2023


To start

This guide assumes that you are familiar with Python and HTML/CSS, so there is little explanation of how the basics work. The focus is instead on how to use Jinja to create HTML templates that can be styled using CSS and printed to PDF using WeasyPrint. The final output we are building towards is a multi-page report with a cover page, a table of contents, a data table, a conclusion and a back cover.

Installation

Run pip3 install Jinja2 weasyprint to install the required libraries. If you run into installation issues, the weasyprint installation page is helpful.

Initial files

Create an HTML file, templates/report.html, to lay out the structure of the document.

<!DOCTYPE html>
<html lang="en">
<head> </head>
<body>
<section id="cover">
<h1>Cover page</h1>
<h2>Descriptive subtitle</h2>
</section>
<section>
<h1>PDF Generation example</h1>
<p>PDF report generated using weasyprint</p>
</section>
</body>
</html>

Create a Python file, main.py, to take that template and write it to PDF:

from jinja2 import Environment, FileSystemLoader
from weasyprint import HTML

environment = Environment(loader=FileSystemLoader("templates"))
report = environment.get_template("report.html")

html = HTML(string=report.render())

html.write_pdf('report.pdf')

Run python3 main.py to create your first PDF (and subsequently to view any of the changes)
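As the script grows, it can help to separate rendering the HTML from writing the PDF, so that the HTML can be inspected on its own. The sketch below is one possible arrangement, not part of the original script: render_report and write_report are names made up for illustration, and passing base_url (a real WeasyPrint argument) assumes you want relative URLs in the template to resolve against the templates folder.

```python
from jinja2 import Environment, FileSystemLoader


def render_report(template_dir, template_name="report.html", **context):
    """Render a Jinja template from template_dir to an HTML string."""
    environment = Environment(loader=FileSystemLoader(str(template_dir)))
    return environment.get_template(template_name).render(**context)


def write_report(html_string, output_path, base_url="templates"):
    """Write an HTML string to a PDF file with WeasyPrint."""
    # Imported here so the HTML can still be rendered and inspected
    # on a machine where WeasyPrint is not installed.
    from weasyprint import HTML

    # base_url lets relative links (images, stylesheets) resolve.
    HTML(string=html_string, base_url=str(base_url)).write_pdf(str(output_path))


# Usage (hypothetical paths):
# html_string = render_report("templates")
# write_report(html_string, "report.pdf")
```

Keeping the two steps apart also makes it easy to dump the intermediate HTML to a file and open it in a browser when a layout looks wrong.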

The beginning of our generated PDF

Adding some styles

Now that we have our basic templates in place we can use CSS to add some pagination and formatting.

Create a CSS file to start styling the PDF report. I created mine, report.css, in the templates folder.

h1 {
background: lightgrey;
}

Import CSS and include the file in main.py when generating the PDF:

  • Import CSS alongside HTML: from weasyprint import HTML, CSS
  • Add css = CSS('templates/report.css') to create a stylesheet for WeasyPrint
  • Modify the write_pdf line to include the stylesheet: html.write_pdf('report.pdf', stylesheets=[css])

from weasyprint import HTML, CSS
...
css = CSS("templates/report.css")

html.write_pdf('report.pdf', stylesheets=[css])

With some minimal styles

Paged media

Paged media is what allows us to format PDF (and other) documents using CSS. It all starts pretty simply with the @page at-rule to set up pagination.

@page {
size: A4;
}

We can also get each section to always start on a new page.

section {
break-before: always;
}

From there we can continue and set up a cover page in CSS (I also removed the h1 style rule now that we know that the two work together)

@page :first {
border: 1px solid black;
margin-top: 33%;
}

section#cover h1,
section#cover h2 {
text-align: center;
}

section#cover #author {
position: absolute;
bottom: 10%;
right: 10%;
}

Finally, we can add the author details to the cover section in report.html so that the #author rule above has something to position.

<body>
<section id="cover">
<h1>Cover page</h1>
<h2>Descriptive subtitle</h2>
<p id="author">
Company name<br />
Firstname Lastname<br />
Address1<br />
Address2<br />
City<br />
Country<br />
</p>
</section>
<section>
<h1 id="first-page">PDF Generation example</h1>
<p>PDF report generated using weasyprint</p>
</section>
</body>

Pagination

We can add a page counter to the top right corner of every page by using the margin at-rules. To achieve this, add @top-right { content: counter(page) } to the @page rule we created earlier.

@page {
size: A4;
@top-right {
content: counter(page);
}
}

We can also exclude the page numbering from the cover by adding @top-right{content: none} to the @page :first rule. If you want the first page after the cover to be numbered 1 (rather than 2) add counter-reset: page 0 to the @page :first rule.

@page :first {
counter-reset: page 0;
border: 1px solid black;
@top-right {
content: none;
}
}

We then get page numbers in the top right corner:

The page number appears on the top right of the report (note: it does not appear on the cover)
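If the page setup needs to vary between reports (for example, a different paper size), the same @page rules can be assembled in Python and handed to WeasyPrint as a string. This is a sketch under assumptions rather than the approach used in this guide: build_page_css is a hypothetical helper, while CSS(string=...) is WeasyPrint's way of creating a stylesheet from text.

```python
def build_page_css(size="A4", number_cover=False):
    """Return @page rules with a page counter in the top-right margin box."""
    rules = [
        "@page {",
        f"  size: {size};",
        "  @top-right { content: counter(page); }",
        "}",
    ]
    if not number_cover:
        # Hide the counter on the first page (the cover).
        rules += [
            "@page :first {",
            "  @top-right { content: none; }",
            "}",
        ]
    return "\n".join(rules)


page_css = build_page_css(size="A4")
print(page_css)

# With WeasyPrint installed, the string can be used as a stylesheet:
# from weasyprint import CSS
# html.write_pdf("report.pdf", stylesheets=[CSS(string=page_css)])
```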

Changing the font

For our report we needed a font that was not available locally, so we turned to Google Fonts, which gives us a much larger variety of fonts. To add a font you first need to add link tags to the HTML head (in our example we are using the Lato font):

<head>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Lato:ital,wght@0,100;0,300;0,400;0,700;0,900;1,100;1,300;1,400;1,700;1,900&display=swap" rel="stylesheet">
</head>

You can then use the font in your CSS file

font-family: Lato;

Make sure to add a font-family to the @page at-rule if you want the page numbers to also use this font.

Creating a dynamic template

Jinja allows us to add placeholders to our HTML to allow for the creation of dynamic templates that enable us to create personalised reports.

We can start by simply adding a placeholder in report.html. Jinja uses the {{ }} syntax to denote a placeholder.

<p>PDF report generated using weasyprint for {{report_recipient}}</p>

We then need to add this variable to our render function in main.py

html = HTML(string=report.render(report_recipient='Recipient User'))

Let’s add a second placeholder for the author fields on the cover. As is shown, the placeholder can be a dictionary and each key can be accessed in the template.

<p id="author">
{{author.company_name}}<br />
{{author.name}}<br />
{{author.address1}}<br />
{{author.address2}}<br />
{{author.city}}<br />
{{author.country}}<br />
</p>

We can then add the dictionary to main.py and pass it to the Jinja template's render function:

author = {
"company_name": 'Example Company',
"name": 'Joe Person',
"address1": "Address 1",
"address2": "Address 2",
"city": "City",
"country": "Country",
}

html = HTML(string=report.render(report_recipient='Recipient User', author=author))

The ‘author’ information for the cover page has been rendered
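By default Jinja renders a missing placeholder as an empty string, so a forgotten argument to render can go unnoticed in the PDF. One option, sketched here as a suggestion rather than something the original script does, is to create the environment with Jinja's StrictUndefined so that rendering fails loudly. The in-memory DictLoader and the template text are stand-ins for the files in the templates folder.

```python
from jinja2 import DictLoader, Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

# In-memory stand-in for templates/report.html.
templates = {"report.html": "<p>Report for {{ report_recipient }}</p>"}

lenient = Environment(loader=DictLoader(templates))
strict = Environment(loader=DictLoader(templates), undefined=StrictUndefined)

# Default behaviour: the missing variable silently renders as nothing.
print(lenient.get_template("report.html").render())

# Strict behaviour: the same call raises an error instead.
try:
    strict.get_template("report.html").render()
except UndefinedError as error:
    print(f"Missing template variable: {error}")
```

In main.py this would amount to adding undefined=StrictUndefined to the existing Environment(...) call.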

We now want to add a new section to the report with a table with some info.

We need a way to call an API to get the data. To do this we need to install requests:

pip3 install requests

We can then import requests and fetch our data in main.py

import requests

countries = requests.get("https://restcountries.com/v3.1/all").json()

And pass our data to the template

html = HTML(string=report.render(report_recipient='Recipient User', author=author, countries=countries))
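The API response is passed to the template as-is above. A small clean-up step before rendering can make the table more predictable, for example sorting by name and making sure every entry has a capital list. The following is a sketch under assumptions: prepare_countries is a hypothetical helper, the sample data is invented, and the field names (name.common, capital) are the ones read by the template in this guide.

```python
def prepare_countries(raw_countries):
    """Sort countries by common name and default a missing capital to []."""
    prepared = []
    for country in raw_countries:
        entry = dict(country)  # shallow copy so the input is not modified
        entry.setdefault("capital", [])
        prepared.append(entry)
    return sorted(prepared, key=lambda c: c["name"]["common"])


# Invented sample data in the same shape as the fields the template reads.
sample = [
    {"name": {"common": "Peru"}, "capital": ["Lima"]},
    {"name": {"common": "Antarctica"}},
    {"name": {"common": "Bolivia"}, "capital": ["Sucre", "La Paz"]},
]

for country in prepare_countries(sample):
    print(country["name"]["common"], country["capital"])
```

In main.py the helper would wrap the parsed response before it is passed to render, for example countries = prepare_countries(countries).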

We can now turn to our HTML template to decide how we will display this information. In our case, we want to display it in a table with each country having its own row. Jinja has the {% %} syntax to help us here. We can initiate a loop with the following snippet:

<section>
<table>
{% for country in countries %}
{% endfor %}
</table>
</section>

Let’s suppose we only want to render this section if the countries variable is defined. We can do that by placing an if statement around the whole section:

{% if countries %}
<section>
<table>
{% for country in countries %}
{% endfor %}
</table>
</section>
{% endif %}

Finally, let’s loop through the data to create our table

{% if countries %}
<table>
<thead>
<tr>
<th>Common Name</th>
<th>Official Name</th>
<th>Capital</th>
<th>Region</th>
<th>Subregion</th>
<th>Flag</th>
</tr>
</thead>
{% for country in countries %}
<tr>
<td>{{country.name.common}}</td>
<td>{{country.name.official}}</td>
<td>{{", ".join(country.capital)}}</td>
<td>{{country.region}}</td>
<td>{{country.subregion}}</td>
<td>{{country.flag}}</td>
</tr>
{% endfor %}
</table>
{% endif %}

The capitals are returned by the API as a list of city names, so we join them with ", " to print them as plain text rather than as a Python list.
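Jinja also ships a built-in join filter that does the same job without calling a Python string method inside the template. A minimal sketch, using an inline template and invented data in place of report.html and the API response:

```python
from jinja2 import Environment

environment = Environment()
row = environment.from_string("<td>{{ country.capital | join(', ') }}</td>")

# Invented data shaped like one entry of the countries list.
country = {"capital": ["Sucre", "La Paz"]}
print(row.render(country=country))
```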

After running python3 main.py the first 6 pages look like the following:

The first six pages of the report

The {% %} notation is quite powerful and can be used to create if statements and for loops, and even to create and set variables. You can explore the full extent of this in the Jinja docs.
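As a small illustration, the sketch below combines a for loop, an if statement, a set statement and Jinja's built-in loop.index in one inline template. The template and the data are invented for this example and are not part of report.html.

```python
from jinja2 import Environment

# trim_blocks / lstrip_blocks stop the {% %} tags from leaving blank lines.
environment = Environment(trim_blocks=True, lstrip_blocks=True)

template = environment.from_string(
    "{% set label = 'Region' %}\n"
    "{% for country in countries %}\n"
    "{% if country.region %}\n"
    "{{ loop.index }}. {{ country.name }} ({{ label }}: {{ country.region }})\n"
    "{% endif %}\n"
    "{% endfor %}\n"
)

# The second entry has an empty region, so the if statement skips it.
countries = [
    {"name": "Peru", "region": "Americas"},
    {"name": "Nowhere", "region": ""},
    {"name": "Kenya", "region": "Africa"},
]
print(template.render(countries=countries))
```

Note that loop.index counts every iteration, including the ones the if statement skips.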

Including a separate template

In order to keep our codebase organised we split our pages among different templates (to avoid having a monster report.html that spans 1000s of lines). We can create another HTML file and include it in the report that is being rendered by weasyprint.

Start by creating a new HTML file (we’ll call ours conclusion.html):

<section>
<h1>Conclusion</h1>
<p>Concluding paragraph of the report</p>
</section>

We can then include this HTML file in our report.html template using the include functionality provided by Jinja:

<body>

...

</section>

{% include 'conclusion.html' %}
</body>

The section is added to the end of the document as if the HTML was written directly in report.html. Any Jinja syntax or templating functionality is also available within included HTML files without any further configuration.
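To see that an included file shares the parent template's variables, here is a self-contained sketch that uses Jinja's DictLoader to hold two small templates in memory instead of on disk. The template contents and the closing_note variable are invented for the example.

```python
from jinja2 import DictLoader, Environment

# In-memory stand-ins for templates/report.html and templates/conclusion.html.
environment = Environment(
    loader=DictLoader(
        {
            "report.html": "<body>{% include 'conclusion.html' %}</body>",
            "conclusion.html": "<section><p>{{ closing_note }}</p></section>",
        }
    )
)

# closing_note is passed to report.html but used inside conclusion.html.
html_string = environment.get_template("report.html").render(closing_note="Thanks!")
print(html_string)
```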

Adding a table of contents

We now have three distinct sections and it would be nice to link to each of them from a table of contents. In order for the table of contents to link to a section we need to give it an anchor. We can do this by giving each section a title with a unique ID.

(in report.html)
<section>
<h1 id="first-page">PDF Generation example</h1>
...
<section>
<h1 id="countries-table">Countries table</h1>
...
(in conclusion.html)
<section>
<h1 id="conclusion">Conclusion</h1>

We can then create a contents.html file with links to each section:

<section>
<h1>Table of contents</h1>
<ul>
<li><a href="#first-page">First Page</a></li>
<li><a href="#countries-table">Countries Table</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
</section>

Include the contents section between the cover and the first page:

...
{{author.country}}<br />
</p>
</section>

{% include 'contents.html' %}

<section>
<h1 id="first-page">PDF Generation example</h1>
...

This now works but could use a little styling to improve how it looks:

ul {
list-style-type: none;
}

ul li a {
text-decoration: none;
color: black;
}

And we can add page numbers with a little more CSS:

ul li a::after {
content: target-counter(attr(href), page);
float: right;
}

Table of contents with dynamic pagination based on the data
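The anchors and the contents list have to be kept in sync by hand. One way to avoid that, offered as a sketch rather than what this guide does, is to describe the sections once in Python and loop over them in the template. The sections list and the inline template below are hypothetical; the ids match the ones used above.

```python
from jinja2 import Environment

environment = Environment(trim_blocks=True, lstrip_blocks=True)

# Inline stand-in for contents.html.
contents = environment.from_string(
    "<ul>\n"
    "{% for section in sections %}\n"
    '<li><a href="#{{ section.id }}">{{ section.title }}</a></li>\n'
    "{% endfor %}\n"
    "</ul>\n"
)

sections = [
    {"id": "first-page", "title": "First Page"},
    {"id": "countries-table", "title": "Countries Table"},
    {"id": "conclusion", "title": "Conclusion"},
]
print(contents.render(sections=sections))
```

The same sections list could be passed to report.render so that the headings and the contents entries are generated from one source.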

Creating a back cover

The back cover is created a little bit differently from the first page (as we do not have a @page :last rule). Instead we have to create a named page.

Start by creating a new section with a class of last-page in report.html:

...
</table>
</section>

{% include 'conclusion.html' %}

<section class="last-page">
This is the back cover
</section>
</body>

Create a named page called last-page in report.css (make sure to add the @top-right rule and set content: none if you want to exclude page numbers from the back cover). I opted to also give it a border like the cover page.

section.last-page {
page: last-page;
}

@page last-page {
border: 1px solid black;
@top-right {
content: none;
}
}

TL;DR

The final report.html file:

<!DOCTYPE html>
<html lang="en">
<head> </head>
<body>
<section id="cover">
<h1>Cover page</h1>
<h2>Descriptive subtitle</h2>
<p id="author">
{{author.company_name}}<br />
{{author.name}}<br />
{{author.address1}}<br />
{{author.address2}}<br />
{{author.city}}<br />
{{author.country}}<br />
</p>
</section>

{% include 'contents.html' %}

<section>
<h1 id="first-page">PDF Generation example</h1>
<p>PDF report generated using weasyprint for {{report_recipient}}</p>
</section>
<section>
<h1 id="countries-table">Countries table</h1>
<table>
<thead>
<tr>
<th>Common Name</th>
<th>Official Name</th>
<th>Capital</th>
<th>Region</th>
<th>Subregion</th>
<th>Flag</th>
</tr>
</thead>
{% for country in countries %}
<tr>
<td>{{country.name.common}}</td>
<td>{{country.name.official}}</td>
<td>{{", ".join(country.capital)}}</td>
<td>{{country.region}}</td>
<td>{{country.subregion}}</td>
<td>{{country.flag}}</td>
</tr>
{% endfor %}
</table>
</section>

{% include 'conclusion.html' %}

<section class="last-page">
This is the back cover
</section>
</body>
</html>

contents.html:

<section>
<h1>Table of contents</h1>
<ul>
<li><a href="#first-page">First Page</a></li>
<li><a href="#countries-table">Countries Table</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
</section>

conclusion.html:

<section>
<h1 id="conclusion">Conclusion</h1>
<p>Concluding paragraph of the report</p>
</section>

report.css:

@page {
size: A4;
font-family: Lato;
@top-right {
content: counter(page);
}
}

@page :first {
counter-reset: page 0;
border: 1px solid black;
@top-right {
content: none;
}
}

section {
break-before: always;
min-height: 100%;
font-family: Lato;
}

section#cover {
margin-top: 33%;
}

section#cover h1,
section#cover h2 {
text-align: center;
}

section#cover #author {
position: absolute;
bottom: 10%;
right: 10%;
}

ul {
list-style-type: none;
}

ul li a {
text-decoration: none;
color: black;
}

ul li a::after {
content: target-counter(attr(href), page);
float: right;
}

section.last-page {
page: last-page;
}

@page last-page {
border: 1px solid black;
@top-right {
content: none;
}
}

report.pdf:

(The table of contents does not seem to be clickable in the preview. Download the PDF to see that in action!)

Conclusion

After a few hours spent with these libraries I’m quite impressed by what is possible with very little config and just some basic knowledge of HTML/CSS and Python. It’s really nice to have an option for rendering PDFs using technologies that we are familiar and comfortable with.

Anything that can be achieved on a web page can pretty much be translated 1:1 to a PDF (bearing in mind the page dimensions and the fact that it is a PDF, so don’t go trying to add animations or anything like that).

We’ve also explored other PDF solutions such as fpdf2 and Python Docx Template.

If you have any feedback please leave a comment below. I’m always happy to learn something new!

Holistic AI is an AI risk management company that aims to empower enterprises to adopt and scale AI confidently. We have pioneered the field of AI risk management and have deep practical experience auditing AI systems, having reviewed over 100 enterprise AI projects covering 20k+ different algorithms. Our clients and partners include Fortune 500 corporations, SMEs, governments and regulators.

We’re hiring :)
