Puppeteer vs WkHtmlToPdf and why I created a new module
In one of our recent projects at Coletiv the user had to be able to generate a PDF invoice with the list of orders he did on the platform we have built using Elixir. By that time I didn’t know what could be the best solution or if there was some native implementation to use.
With this article I want to share with you the steps I followed, the decisions I have made until I reached the final solution and why I ended up creating our own module. Hopefully the article can also help you solve a similar problem and make you contribute to the package we created.
I started by researching for existing solutions and quickly discovered that without a doubt the best solution to generate a PDF file was to have it designed in HTML and then convert it to PDF. Although I’m a backend developer at heart, I have some HTML and CSS skills that I could use on this task. It appeared that pdf_generator was the top choice for Elixir developers when there was a need to generate PDF files from code. pdf_generator is a wrapper to wkhtmltopdf, a C++ tool to create PDF files using HTML files.
I also found another alternative, a native PDF generator named gutenex, but quickly found that doing some fancy design with this module would be a more difficult process than writing HTML code (at least if you know HTML and CSS). Also this module hasn’t seen any activity over 2 year, another reason why we decided to go for pdf_generator instead.
This software is widely used over the internet as one of the main tools to convert HTML to PDF. At first I was very satisfied with the output generated, being the only problem the HTML table entity containing the orders being badly divided between multiple pages. I ended up calculating how many table rows I could fit on each page, but as I kept evolving the design I had to redo the calculations over and over again. This was very error prone, specially for the other developers working in the project as we couldn’t forget to recheck the calculations every time we did a change.
The second issue I encountered was with unicode characters, like Chinese characters. The solution for this issue was to convert each character in HTML entities first in order for them to appear in the PDF.
The third and most annoying issue was that wkhtmltopdf uses the machine display to generate the PDF file. For instance, being on an Macbook Retina (2560x1600px 13" display) I needed to exaggerate the document size in the CSS style (i.e. font-size, padding) in order to have a proper sized PDF and not a tiny one. For example, I needed to have
font-size: 42px instead of
font-size: 12px that was what I would normally use.
Although this wasn’t a very difficult problem to fix, when I deployed to a development server, which isn’t connected to a screen, the default screen size used was very small (1366x768) and the PDF file generated had the layout designed for my screen (2560x1600px 13" display) which in comparison was too big. Since I needed to do some calculation previously to show the content properly, this was unbearable to maintain.
Some other possible solutions was to use the
--dpi option, having also the
--zoom option to reduce the overall size on the PDF generated on the server side, but in the end I couldn’t replicate the same design that I’ve tested on my computer.
At this point I decided to take a step back and rethink the solution, that’s when puppeteer came into play.
Taking the suggestion of Daniel Ruf related to the display size issues, I end up exploring puppeteer, a Node API that allows you to take screenshots from webpages as well as generate PDF files using a version of Google Chrome browser in headless mode.
With this software, I finally could have the same PDF design replicated in both my laptop and on the server. Globally the PDF rendered was better than wkHtmlToPdf, being the only issue so far the file size which is 10 times higher than the size of the file generated by wkHtmlToPdf.
This also allowed me to implement the header and footer in HTML, and after some tweaking with margin parameters. I could finally generate a PDF file without having to do the previous calculations on the template items. This feature is available in the wkHtmlToPdf, but I just noticed that after exploring the puppeteer options.
Puppeteer vs WkHtmlToPdf
In order to help you with the decision of picking one of the two let’s highlight some possible reasons to choose one over the other. In the end it all comes to your specific needs.
Advantages of using puppeteer over wkHtmlToPdf
- Better PDF rendering
- Easier to use
- Uses a well maintained software (puppeteer)
- Uses less Elixir dependencies
Disadvantages of using puppeteer over wkHtmlToPdf
- Needs NodeJS
- Larger footprint (needs NodeJS plus Google Chrome image ~90MB)
- Generated file size is way bigger than the one generated by wkHtmlToPdf
If you are currently using pdf_generator wrapper and you are happy with the results that you have, you shouldn’t move away from it. If you’re searching for a PDF generator module for your Elixir project, take some time to give puppeteer_pdf a try.
HTML and PDF icon, created by Dimitry Miroliubov