Convert PDF to Images or Images to PDF Using Python
PDF (Portable Document Format) is widely used for sharing documents while maintaining their formatting, but there are instances where the need arises to convert these documents into image formats like JPEG or PNG.
Similarly, converting images to PDF allows for combining multiple images into a single document for easier sharing and archival purposes.
In this article, I am going to introduce how to convert PDF to images and vice versa in Python using Spire.PDF for Python.
- Convert a Specific PDF Page into an Image in Python
- Convert a PDF Document into Multiple Images in Python
- Combine Multiple Images in a Single PDF in Python
Install Dependency
This solution requires Spire.PDF for Python to be installed as the dependency, which is a Python library for reading, creating and manipulating PDF documents in a Python program. You can install Spire.PDF for Python by executing the following pip command.
pip install Spire.PDF
Convert a Specific PDF Page into an Image in Python
The PdfDocument.SaveAsImage() method provided by Spire.PDF for Python allows you to convert a specific page into an image stream. You can save this stream as an image file with the desired format extension, such as PNG, JPG, or BMP.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("template.pdf")
# Convert a specific page to an image
with pdf.SaveAsImage(2) as imageS:
# Save the image as a JPG file
imageS.Save("PageToPNG.jpg")
pdf.Close()
Convert a PDF Document into Multiple Images in Python
We already know how to convert a specific page to an image. To convert each page into separate images, use a for loop.
from spire.pdf.common import *
from spire.pdf import *
# Create a PdfDocument object
pdf = PdfDocument()
# Load a PDF document
pdf.LoadFromFile("template.pdf")
# Iterate through all pages in the document
for i in range(pdf.Pages.Count):
# Save each page as an PNG image
fileName = "Output\\ToImage-{0:d}.png".format(i)
with pdf.SaveAsImage(i) as imageS:
imageS.Save(fileName)
pdf.Close()
Combine Multiple Images in a Single PDF in Python
To convert all images in a folder to a PDF, I iterate through each image, create a new page in the PDF with matching dimensions, and then render the image onto the page.
from spire.pdf.common import *
from spire.pdf import *
import os
# Create a PdfDocument object
doc = PdfDocument()
# Set the page margins to 0
doc.PageSettings.SetMargins(0.0)
# Get the folder where the images are stored
path = 'C:/Users/Administrator/Desktop/Images/'
files = os.listdir(path)
# Iterate through the files in the folder
for root, dirs, files in os.walk(path):
for file in files:
# Load a particular image
image = PdfImage.FromFile(os.path.join(root, file)) #FromFile(os.path.join(root, file))
# Get the image width and height
width = image.PhysicalDimension.Width
height = image.PhysicalDimension.Height
# Add a page that has the same size as the image
page = doc.Pages.Add(SizeF(width, height))
# Draw image at (0, 0) of the page
page.Canvas.DrawImage(image, 0.0, 0.0, width, height)
# Save to file
doc.SaveToFile('output/ConvertImagesToPdf.pdf')
doc.Dispose();
Conclusion
In this blog article, we’ve explored how to convert PDF to images and how to combine multiple images in a single PDF file using the Spire.PDF for Python library. This library supports many more file format conversions. If you’re interested, check out their official documentation.