Convert Selected Pages from Multipage PDF into Images (PHP)

Last month we faced the challenging task of converting selected pages from a huge PDF document to TIFF files (or any other image format) so that people could preview them before downloading. I thought I should share how this is done.

Step 1: Upload a multipage PDF file to the server (e.g. a 40-page PDF).
Step 2: Count the number of pages in the PDF.
Step 3: Give the user an option to select pages (e.g. the user selects 10, 12, 30, 40).
Step 4: Generate another PDF with the selected pages.
Step 5: Convert the PDF to an image.

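I used Fine Uploader to upload multiple documents to the server. The second piece of software you need is PDFlib, which the code below uses to count the pages and build the new PDF. The final conversion to an image is done with nconvert.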

Uploadify.php contains the following code, which uploads a file and counts the pages in the PDF.

Step 1 & 2

/**
* Uploadify.php
* @copyright (C) Xenopsi
* @author $Author: khuram $
*/
if (move_uploaded_file($tempFile, $targetFile))
{
    $searchpath = (dirname(__FILE__)) . "/input";
    $outfile_basename = "split_document";
    $title = "Split PDF Document";
    $infile = "uploads/newfile.pdf";

    $p = new pdflib();

    $p->set_option("searchpath={" . $searchpath . "}");

    /* This means we must check return values of load_font() etc. */
    $p->set_option("errorpolicy=return");
    $p->set_option("stringformat=utf8");

    $indoc = $p->open_pdi_document($infile, "");
    if ($indoc == 0)
        throw new Exception("Error: " . $p->get_errmsg());

    /* Determine the number of pages in the input document. */
    $page_count = (int) $p->pcos_get_number($indoc, "length:pages");

    /* Close the input document once the page count is known */
    $p->close_pdi_document($indoc);

    echo json_encode(array('success' => true, 'filename' => $new_name, 'page_count' => $page_count, "display_filename" => $_FILES['qqfile']['name']));
}
else
{
    echo json_encode(array('success' => false));
}
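The upload snippet relies on $tempFile, $targetFile and $new_name, which are set earlier in the script and are not shown here. Below is a minimal sketch of how they might be derived from the qqfile upload field. The helper name build_upload_target() and the naming scheme are assumptions for illustration, not the original code.

```php
<?php
/**
 * Hypothetical helper: derive a safe target path for an uploaded PDF.
 * Returns null when the upload failed or the file is not a .pdf.
 */
function build_upload_target(array $file, string $uploadDir): ?array
{
    // UPLOAD_ERR_OK (0) means PHP received the file without errors.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }

    // basename() drops any directory part the client may have sent.
    $original = basename($file['name']);
    if (strtolower(pathinfo($original, PATHINFO_EXTENSION)) !== 'pdf') {
        return null;
    }

    // A unique prefix keeps two uploads with the same name from colliding.
    $newName = uniqid('pdf_', true) . '.pdf';

    return [
        'tempFile'   => $file['tmp_name'],
        'targetFile' => rtrim($uploadDir, '/\\') . '/' . $newName,
        'new_name'   => $newName,
        'display'    => $original,
    ];
}
```

With this in place, the script could call build_upload_target($_FILES['qqfile'], 'uploads') and pass the returned tempFile and targetFile to move_uploaded_file(), answering with success => false when the helper returns null.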

Step 3: Once you get the response from Uploadify.php, give the user an option to select the page(s).
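How the selection reaches the server depends on your form. As a rough sketch, assuming the user types the pages as a comma-separated string such as "10,12,30,40", a small helper can turn it into a clean list of page numbers checked against the page count returned in Step 2. The function name parse_selected_pages() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: turn "10,12,30,40" into a sorted list of valid
 * 1-based page numbers, dropping duplicates and out-of-range values.
 */
function parse_selected_pages(string $input, int $pageCount): array
{
    $pages = [];
    foreach (explode(',', $input) as $token) {
        $token = trim($token);
        // ctype_digit() accepts only non-empty strings made of digits.
        if ($token === '' || !ctype_digit($token)) {
            continue;
        }
        $page = (int) $token;
        if ($page >= 1 && $page <= $pageCount) {
            $pages[$page] = true; // array keys remove duplicates
        }
    }
    $pages = array_keys($pages);
    sort($pages);
    return $pages;
}

// For a 40-page document:
// parse_selected_pages('10, 12,30,40,41,abc,10', 40) returns [10, 12, 30, 40]
```

The resulting array can be sent on as userSelectedPages, so the server only ever sees page numbers that exist in the document.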

Step 4: Generate another PDF with the selected pages.

/**
* generate.php
* @copyright (C) Xenopsi
* @author $Author: khuram $
*/

/*
 * Document will be split into sub-documents where each document has
 * this many pages (except the last sub-document potentially).
 * Using total_pages + 1 lets a single pass cover every page.
 */
define("SUBDOC_PAGES", (int) $_REQUEST["total_pages"] + 1);

/* 1-based page numbers chosen by the user; in_array() below expects an array */
$new_pages = $_POST["userSelectedPages"];

/*
 * $searchpath, $infile, $outfile_basename and $title are not defined in
 * this listing; they are assumed to be set as in Uploadify.php.
 */
$buf = "";

try {
    $p = new pdflib();

    $p->set_option("searchpath={" . $searchpath . "}");

    /* This means we must check return values of load_font() etc. */
    $p->set_option("errorpolicy=return");
    $p->set_option("stringformat=utf8");

    $indoc = $p->open_pdi_document($infile, "");
    if ($indoc == 0)
        throw new Exception("Error: " . $p->get_errmsg());

    /*
     * Determine the number of pages in the input document and compute
     * the number of output documents.
     */
    $page_count = (int) $p->pcos_get_number($indoc, "length:pages");

    $outdoc_count = (int) ceil($page_count / SUBDOC_PAGES);

    /*
     * The loop only produces a single output document.
     *
     * For producing all output documents, change the loop condition like this:
     *
     * $outdoc_counter < $outdoc_count
     */
    for ($outdoc_counter = 0, $page = 0; $outdoc_counter < 1; $outdoc_counter += 1)
    {
        $outfile = $outfile_basename . "_" . ($outdoc_counter + 1) . ".pdf";

        /* Open new sub-document (empty filename: generate it in memory). */
        if ($p->begin_document("", "") == 0)
            throw new Exception("Error: " . $p->get_errmsg());

        $p->set_info("Creator", "Khuram Noman");
        $p->set_info("Title", $title . ' $Revision: 1.2 $');
        $p->set_info("Subject", "Sub-document " . ($outdoc_counter + 1)
            . " of " . $outdoc_count . " of input document '" . $infile . "'");

        for ($i = 1; $page < $page_count && $i < SUBDOC_PAGES;
             $page += 1, $i += 1) {

            /* Copy only the pages the user selected */
            if (in_array($i, $new_pages))
            {
                /* Dummy page size; will be adjusted later */
                $p->begin_page_ext(10, 10, "");

                $pagehdl = $p->open_pdi_page($indoc, $page + 1, "");
                if ($pagehdl == 0)
                    throw new Exception("Error opening page: " . $p->get_errmsg());

                /*
                 * Place the imported page on the output page, and adjust
                 * the page size
                 */
                $p->fit_pdi_page($pagehdl, 0, 0, "adjustpage");
                $p->close_pdi_page($pagehdl);
                $p->end_page_ext("");
            }
        }

        /* Close the current sub-document */
        $p->end_document("");

        /*
         * Fetch the sub-document from memory. To return it to the user
         * directly, enable the headers below. If all split documents are
         * to be processed, do something different, e.g. write the documents
         * to disk and create an HTML page with a list of links for the
         * sub-documents.
         */
        $buf = $p->get_buffer();
        $len = strlen($buf);
        /* header("Content-type: application/pdf");
        header("Content-Length: $len");
        header("Content-Disposition: inline; filename=" . $outfile);
        print $buf; */
    }

    /* Close the input document */
    $p->close_pdi_document($indoc);
} catch (Exception $e) {
    /* The original listing had no catch block, which is a syntax error */
    die($e->getMessage());
}

/* basename() keeps the client-supplied name from pointing outside uploads/ */
$createPDF = "uploads/converted_" . basename($_REQUEST["file_name"]);
$fh = fopen($createPDF, 'w');
fwrite($fh, $buf);
fclose($fh);

$source_file = "\\uploads\\" . $_POST["file_name"];

Step 5: Convert PDF to image.

$output = exec("nconvert.exe -in tiff -multi -out tiff -c 7 -o \\uploads\\pdf_filename.pdf \\downloads\\1.tif");
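Called with a single argument, exec() returns only the last line the program printed, so a failed conversion is easy to miss. The sketch below is one way to capture every output line and the exit status, and to escape each argument before it reaches the shell. The helper name run_tool() is hypothetical, and nothing here is specific to nconvert.

```php
<?php
/**
 * Hypothetical helper: run an external program with escaped arguments
 * and return its output lines together with the exit status.
 */
function run_tool(string $executable, array $args): array
{
    // Escape the program name and every argument separately.
    $command = escapeshellarg($executable);
    foreach ($args as $arg) {
        $command .= ' ' . escapeshellarg((string) $arg);
    }

    $lines = [];
    $status = -1;
    // The second and third parameters of exec() receive all output lines
    // and the program's exit code.
    exec($command, $lines, $status);

    return ['command' => $command, 'lines' => $lines, 'status' => $status];
}
```

You would pass 'nconvert.exe' as the executable and the same options and file names used in the command above as the argument array, then treat a non-zero status as a failed conversion and log the returned lines.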