Delving into the Infinite Loop issue for a Special Character with MPDF library

Sanjay
Published in Pepperfry Tech
4 min read · Dec 10, 2020

As developers, one of the things we fret about in coding is an infinite loop. Not only are infinite loops inconvenient, but they also make applications, especially time-sensitive ones, unresponsive. Other potential problems include lost work or output and denied access to application functionality.

Technically, if two consecutive loop iterations produce the same state, the application is in an infinite loop. The code will need to transfer control to a statement following the loop to allow the application to escape the rut and, ideally, continue its productive execution.
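The "same state twice in a row" test can be turned into a small guard. The sketch below is only an illustration: iterateWithProgressGuard is a hypothetical helper, and the iteration cap of 1000 is an arbitrary assumption. It leaves the loop as soon as one iteration fails to change the state.

```php
<?php
/**
 * Repeatedly applies $step to $state. Stops when an iteration returns the
 * same state it was given (no progress) or when $maxIterations is reached.
 * Hypothetical helper for illustration only.
 */
function iterateWithProgressGuard(callable $step, $state, int $maxIterations = 1000): array
{
    for ($i = 0; $i < $maxIterations; $i++) {
        $next = $step($state);
        if ($next === $state) {
            // Two consecutive identical states: further iterations change nothing.
            return ['state' => $state, 'iterations' => $i + 1, 'stalled' => true];
        }
        $state = $next;
    }
    // The cap was hit while the state was still changing.
    return ['state' => $state, 'iterations' => $maxIterations, 'stalled' => false];
}

// Example step: strip one leading space per iteration.
$stripLeadingSpace = function (string $s): string {
    return ($s !== '' && $s[0] === ' ') ? substr($s, 1) : $s;
};

$result = iterateWithProgressGuard($stripLeadingSpace, '  invoice');
```

Here the guard returns control to the caller after the third iteration, once the string no longer changes, instead of spinning on it.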

The goal is to let the application run long enough to save any pending work, finish any in-progress computations, or respond to any urgent events.

There are several situations where a single PDF (Portable Document Format) file is the right fit for digital transactions. For example, invoices can be generated reliably and efficiently as PDFs.

Several PDF libraries are available for PHP. With one of them, the mPDF library, you can easily create dynamic PDF files whose content comes from static HTML or from dynamic HTML web forms.

At Pepperfry, we used mPDF 6.0 to generate PDF files on the fly from UTF-8 encoded HTML. mPDF converts text into richly formatted PDF and offers many enhancements over the original FPDF, HTML2FPDF, and UPDF scripts, including:

  • The acceptance of UTF-8 encoded HTML as the standard input.
  • Support for Right-to-left languages like Arabic with automatic detection of RTL characters within a document. This is useful in processing multi-lingual, cross-border invoices.
  • Automatic detection of non-RTL characters and their display in the original order.
  • Support for bookmark and meta-tag information in all character sets.
  • A single CSS stylesheet applied to all pages, with automatic font substitution for CJK characters.
  • Word and character spacing to justify text, available in Unicode mode and for CJK characters.
  • Support for all HTML entities as well as decimal and hex character references, e.g. &#39; and &#x21a4;
  • The PDF file can be set with password protection.
  • NB The original commands from FPDF can be used, e.g., Write(), but some are altered to allow UTF-8 encoding and RTL text to be processed, e.g., use WriteCell() and WriteMultiCell() instead of Cell() and MultiCell().

With these handy enhancements, mPDF reduces dynamic PDF file generation to three simple steps.

Creating a Simple Application for PDF File Generation using mPDF

While creating a simple application to generate a PDF file using the mPDF library, the following dependencies need to be addressed.

  1. PHP mbstring and gd extensions must be enabled.
  2. Some advanced features need additional extensions: zlib for compression of output and embedded resources such as fonts, bcmath for generating barcodes, and xml for character set conversion and SVG handling.
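A quick way to verify these dependencies on a server is to ask PHP which extensions are loaded. This is a minimal sketch; missingExtensions is a hypothetical helper name, and the split into "required" and "optional" simply mirrors the two list items above.

```php
<?php
/**
 * Returns the subset of $names that is NOT loaded in the running PHP process.
 * Hypothetical helper for illustration only.
 */
function missingExtensions(array $names): array
{
    return array_values(array_filter($names, function (string $name): bool {
        return !extension_loaded($name);
    }));
}

$required = ['mbstring', 'gd'];
$optional = ['zlib', 'bcmath', 'xml'];

$missingRequired = missingExtensions($required);
$missingOptional = missingExtensions($optional);

if ($missingRequired !== []) {
    echo 'Missing required extensions: ' . implode(', ', $missingRequired) . PHP_EOL;
}
if ($missingOptional !== []) {
    echo 'Missing optional extensions: ' . implode(', ', $missingOptional) . PHP_EOL;
}
```

Running the script from the command line and again through the web server can be useful, since the two often load different php.ini files.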

Usually, mPDF version 7.0 is preferred for creating PDF files from UTF-8 encoded HTML, since this version recognizes non-UTF-8 characters during conversion.

With mPDF version 6.0, which we used, all went well until the three-step process for creating the PDF file (described below) encountered special characters.

1. Download the library from GitHub, extract it, and place it in xampp/htdocs/projectname/mpdf, or install it with the Composer command composer require mpdf/mpdf.

2. Create dynamic HTML content, or pick static content, and assign it to a PHP variable in the index.php file.

A sample HTML form looks like the following.

The mPDF code to generate the PDF would be similar to the one given below.

3. Finally, include the library class mPDF.
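Steps 2 and 3 could look roughly like the sketch below. The order data, field names, and the two helper functions are hypothetical and are not the code we ran in production. The only mPDF calls used are WriteHTML() and Output(), and the usage comment assumes mPDF 6.0 unpacked into an mpdf directory next to the script.

```php
<?php
// Step 2: build the HTML string from (hypothetical) order data.
function buildInvoiceHtml(array $order): string
{
    $rows = '';
    foreach ($order['items'] as $item) {
        $rows .= sprintf(
            '<tr><td>%s</td><td>%d</td><td>%.2f</td></tr>',
            htmlspecialchars($item['name'], ENT_QUOTES, 'UTF-8'),
            $item['qty'],
            $item['price']
        );
    }

    return '<h1>Invoice ' . htmlspecialchars($order['id'], ENT_QUOTES, 'UTF-8') . '</h1>'
        . '<table><tr><th>Item</th><th>Qty</th><th>Price</th></tr>' . $rows . '</table>';
}

// Step 3: hand the HTML to an mPDF instance and save the result.
function writePdf($mpdf, string $html, string $path): void
{
    $mpdf->WriteHTML($html);
    $mpdf->Output($path, 'F'); // 'F' saves the PDF to a local file
}

// With mPDF 6.0 the calls would be wired up along these lines:
//   require_once __DIR__ . '/mpdf/mpdf.php';
//   writePdf(new mPDF(), buildInvoiceHtml($order), 'invoice.pdf');
```

Passing the mPDF object in as a parameter keeps the HTML-building code independent of the library version.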

The hitch was with strings containing non-UTF-8 characters. In this case, the code that converts the HTML string to PDF would get stuck in an infinite loop without throwing an exception.

As a result, every repeated request to convert the same HTML to PDF started another infinite loop, which put a high load on the server.

The root cause of the problem was the code below.

Here, the code purifies the HTML and checks for invalid UTF-8 characters. If any are found, it converts the string using iconv and compares the result with the original string.

The presence of a non-UTF-8 character makes the code do the following.

  1. Return a blank string from the conversion and take the length of that string, which is zero.
  2. Check the ASCII value of the actual string at position zero.
  3. Throw an error if the ASCII value is beyond 128, since a proper HTML string always starts with a value < 128.
  4. Run into an infinite loop (highlighted in red in the code).
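The steps above describe a scan that stops moving forward through the string. A defensive variant, sketched here with a hypothetical helper (this is not mPDF's source code), advances by at least one byte on every iteration, so it is guaranteed to terminate even on input that starts with an invalid byte.

```php
<?php
/**
 * Returns the byte offsets in $html that do not start a valid UTF-8 sequence.
 * Hypothetical helper for illustration; not part of mPDF.
 */
function findInvalidUtf8Offsets(string $html): array
{
    $offsets = [];
    $pos = 0;
    $length = strlen($html);

    while ($pos < $length) {
        $advanced = false;
        // A UTF-8 character is 1 to 4 bytes long; try the shortest candidate first.
        for ($size = 1; $size <= 4 && $pos + $size <= $length; $size++) {
            if (mb_check_encoding(substr($html, $pos, $size), 'UTF-8')) {
                $pos += $size;
                $advanced = true;
                break;
            }
        }
        if (!$advanced) {
            // Record the bad byte and still move forward by one byte.
            $offsets[] = $pos;
            $pos++;
        }
    }

    return $offsets;
}
```

Because $pos grows on every pass through the outer loop, the state can never repeat, which is exactly the property the failing loop lacked.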

How We Solved the Problem

We upgraded the mPDF library to version 8.0.7, which resolved this issue. The special-character infinite loop issue has been fixed in this release by the mPDF maintainers.
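An upgrade from 6.0 to 8.x also changes how the class is loaded and instantiated: from version 7.0 on, mPDF is namespaced and installed through Composer. The sketch below uses a hypothetical createMpdf helper to show both styles side by side; it returns null when neither version is installed.

```php
<?php
/**
 * Returns an mPDF instance for whichever major version is available,
 * or null if the library is not installed. Hypothetical helper.
 */
function createMpdf()
{
    if (class_exists(\Mpdf\Mpdf::class)) {
        // mPDF 7.x and 8.x: namespaced class, loaded via Composer's autoloader
        // (require 'vendor/autoload.php' earlier in the script).
        return new \Mpdf\Mpdf(['mode' => 'utf-8']);
    }
    if (class_exists('mPDF')) {
        // mPDF 6.x: global class defined in mpdf.php.
        return new mPDF('utf-8');
    }

    return null;
}

$mpdf = createMpdf();
```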

Alternatively, the following code can be executed before writing the content to the PDF file, which also avoids the infinite loop.
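One possible pre-check is sketched below, under the assumption that invalid bytes may be replaced rather than rejected. toValidUtf8 is a hypothetical helper and not part of mPDF; it relies only on PHP's mbstring functions.

```php
<?php
/**
 * Returns $html unchanged when it is valid UTF-8. Otherwise re-encodes it so
 * that invalid byte sequences are replaced by mbstring's substitute character.
 * Hypothetical helper for illustration only.
 */
function toValidUtf8(string $html): string
{
    if (mb_check_encoding($html, 'UTF-8')) {
        return $html;
    }

    // Converting from UTF-8 to UTF-8 makes mbstring substitute invalid sequences.
    return mb_convert_encoding($html, 'UTF-8', 'UTF-8');
}

// Usage before handing the content to mPDF:
//   $mpdf->WriteHTML(toValidUtf8($html));
```

If silently altering the content is not acceptable, for example on invoices, the same mb_check_encoding() test can be used to reject the request and log the offending input instead.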

Thanks to Mudassar for solving the issue and our technical writer Sowmya Narayanan.

Let us know how you would approach this issue. We’d love to hear from you.

Connect with our tech experts @Pepperfry to learn how we solve programming challenges every day. Happy Coding to you!
