Merge PDF Files or Pages into One with Python

Alice Yang
4 min read · Sep 12, 2023

Merging multiple PDF files into a single document keeps related information together and makes it easier to locate when needed. This is particularly useful when you work with large sets of documents, such as research papers, business reports, or legal documents. This article explains how to merge PDF files, or selected pages from them, into one using Python.

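We will first introduce the library used, then walk through four scenarios: merging whole files, merging files through streams, merging specific pages from several files, and combining several pages of one file into a single page.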

Python Library to Merge PDF Files

To merge PDF files with Python, we can use the Spire.PDF for Python library.

Spire.PDF for Python is a library for creating, reading, editing, and converting PDF files within Python applications. It supports a wide range of PDF operations, including adding text or images, extracting text or images, adding digital signatures, adding or deleting pages, merging or splitting PDFs, creating bookmarks, adding text or image watermarks, and inserting fillable forms. You can also convert PDF files to other formats, such as Word, Excel, images, HTML, SVG, XPS, OFD, PCL, and PostScript.

You can install Spire.PDF for Python from PyPI using the following pip command:

pip install Spire.Pdf

For more details about the installation, see the official documentation: How to Install Spire.PDF for Python in VS Code.
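To confirm that the installation succeeded before running the examples, a quick check like the one below can help. It is a minimal sketch that uses only the Python standard library. The helper name installed_version is hypothetical, and the distribution name passed to it is assumed to match the name used in the pip command.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "Spire.Pdf" is the name used in the pip command (assumed to match the
    # installed distribution name).
    version = installed_version("Spire.Pdf")
    if version is None:
        print("Spire.Pdf is not installed; run: pip install Spire.Pdf")
    else:
        print(f"Spire.Pdf {version} is installed")
```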

Merge Multiple PDF Files into One PDF with Python

To merge multiple PDF files into one PDF, you can use the PdfMerger.MergeByFile() method.

Here is a simple example that shows how to merge multiple PDF files into one PDF using Python and Spire.PDF for Python:

from spire.pdf.common import *
from spire.pdf import *

# Store the paths of the files to be merged into a list
pdfs = ["File1.pdf", "File2.pdf", "File3.pdf"]

# Create merge options
mergeOptions = MergerOptions()

# Specify the output file path
outputPdf = "MergePdfFiles.pdf"

# Merge the files into a single PDF
PdfMerger.MergeByFile(pdfs, outputPdf, mergeOptions)
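In practice the list of input files often comes from a folder rather than being typed by hand. The sketch below shows one way to build that list with the standard library only. The helper names natural_key and collect_pdfs are hypothetical and are not part of Spire.PDF. It sorts names in natural order so that File2.pdf comes before File10.pdf, which matters because the files are presumably merged in list order.

```python
import os
import re


def natural_key(name):
    """Split a name into text and number parts so File2 sorts before File10."""
    parts = re.split(r"(\d+)", name.lower())
    # re.split with a capturing group alternates text, digits, text, ...
    # so every odd index holds a run of digits.
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


def collect_pdfs(folder):
    """Return the paths of all .pdf files in a folder, in natural name order."""
    names = [n for n in os.listdir(folder) if n.lower().endswith(".pdf")]
    return [os.path.join(folder, n) for n in sorted(names, key=natural_key)]
```

The returned list can then be passed as the first argument to PdfMerger.MergeByFile() in place of the hard-coded pdfs list.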

Merge Multiple PDF Files using Streams with Python

To merge multiple PDF files using Streams, you can use the PdfMerger.MergeByStream() method.

Here is a simple example that shows how to merge multiple PDF files through streams with Python and Spire.PDF for Python:

from spire.pdf.common import *
from spire.pdf import *

# Read PDF files into streams
stream1 = Stream("File1.pdf")
stream2 = Stream("File2.pdf")
stream3 = Stream("File3.pdf")

# Create a list from the streams
streams = [stream1, stream2, stream3]

# Specify merge options
mergeOptions = MergerOptions()

# Merge the PDF streams
outputPdf = Stream("MergeFilesByStream.pdf")
PdfMerger.MergeByStream(streams, outputPdf, mergeOptions)

# Close the input streams
for stream in streams:
    stream.Close()
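If the merge raises an error, the closing loop at the end of the example never runs. One possible way to guarantee cleanup is a small context manager, sketched below with the standard library. The names closing_all and FakeStream are hypothetical: FakeStream only stands in for a real stream so the sketch runs without Spire.PDF, and the only assumption about real streams is that they expose a Close() method, as in the example.

```python
from contextlib import contextmanager


@contextmanager
def closing_all(*resources):
    """Yield the resources, then call Close() on each one, even after an error."""
    try:
        yield resources
    finally:
        for resource in resources:
            resource.Close()


class FakeStream:
    """Stand-in for a stream; it only records whether Close() was called."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def Close(self):
        self.closed = True


if __name__ == "__main__":
    with closing_all(FakeStream("File1.pdf"), FakeStream("File2.pdf")) as streams:
        # A real script would call the merge method here.
        print([s.name for s in streams])
    print(all(s.closed for s in streams))
```

With real streams, the merge call would go inside the with block, and the explicit closing loop would no longer be needed.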

Merge Specific Pages of Multiple PDF Files into One PDF with Python

Besides merging entire PDF files, you can merge a single page or a range of pages from several PDF files into one PDF using the PdfDocument.InsertPage() or PdfDocument.InsertPageRange() method. Page indexes are zero-based, so index 0 refers to the first page.

Here is a simple example that shows how to merge a specific page or a range of pages of different PDF files into one PDF using Python and Spire.PDF for Python:

from spire.pdf.common import *
from spire.pdf import *

# Load two PDF files
pdf1 = PdfDocument("File1.pdf")
pdf2 = PdfDocument("File2.pdf")

# Create a new PDF file
newPdf = PdfDocument()

# Import page 1 of the first PDF (zero-based index 0) into the new PDF
newPdf.InsertPage(pdf1, 0)
# Import pages 1-2 of the second PDF (zero-based indexes 0 to 1) into the new PDF
newPdf.InsertPageRange(pdf2, 0, 1)

# Save the resulting PDF file
newPdf.SaveToFile("MergePdfPagesIntoOnePdf.pdf")

pdf1.Close()
pdf2.Close()
newPdf.Close()
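Because the example describes "pages 1-2" but passes the indexes 0 and 1, off-by-one mistakes are easy to make. The helper below is a minimal sketch of the conversion, assuming the indexes are zero-based and inclusive as the example's comments imply. The name to_index_range is hypothetical, and page_count is a value you would supply yourself from the loaded document.

```python
def to_index_range(first_page, last_page, page_count):
    """Convert 1-based, inclusive page numbers to 0-based, inclusive indexes."""
    if not 1 <= first_page <= last_page <= page_count:
        raise ValueError(
            f"Invalid page range {first_page}-{last_page} "
            f"for a {page_count}-page file"
        )
    return first_page - 1, last_page - 1


if __name__ == "__main__":
    # Pages 1-2 of a five-page file (the page count here is illustrative).
    start, end = to_index_range(1, 2, page_count=5)
    print(start, end)
```

The two returned values correspond to the second and third arguments passed to InsertPageRange() in the example.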

Merge Multiple Pages of a PDF File into a Single Page with Python

Sometimes, you may need to combine the contents of two or more pages of a PDF file into a single page. With Spire.PDF for Python, you can achieve this by creating a new larger page and then drawing the contents of the pages on specific locations of the newly created page.

Here is a simple example that shows how to merge multiple pages of a PDF file into a single page of another PDF file using Python and Spire.PDF for Python:

from spire.pdf.common import *
from spire.pdf import *

# Load a PDF file
pdf = PdfDocument("File1.pdf")

# Get the page width and page height of the loaded PDF
pageWidth = pdf.PageSettings.Width
pageHeight = pdf.PageSettings.Height

# Specify the start index and end index of the pages to be merged
startPageIndex = 0
endPageIndex = 1

# Create a new PDF file
newPdf = PdfDocument()

# Create a new page width that is the sum of the widths of the pages to be merged
newPageWidth = pageWidth * (endPageIndex - startPageIndex + 1)

# Add a new page with the new page width and the same page height to the new PDF file
newPage = newPdf.Pages.Add(SizeF(newPageWidth, pageHeight), PdfMargins(0.0))

# Specify the initial x and y coordinates
x = 0.0
y = 0.0

# Loop through the pages to be merged in the loaded PDF
for i in range(startPageIndex, endPageIndex + 1):
    page = pdf.Pages[i]
    # Draw the content of each page on a specific location of the new page of the new PDF file
    newPage.Canvas.DrawTemplate(page.CreateTemplate(), PointF(x, y))
    # Move the x coordinate right by one page width for the next page
    x += pageWidth

# Save the new PDF file to a specific path
newPdf.SaveToFile("MergePdfPagesIntoOnePage.pdf")

pdf.Close()
newPdf.Close()
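The layout arithmetic in this example can be checked on its own, without loading a PDF. The sketch below reproduces the same calculation: the new page is as wide as all source pages together, and each page is drawn one page width further to the right. The function name side_by_side_layout is hypothetical, and it assumes every source page has the same size, as the example does.

```python
def side_by_side_layout(page_width, page_height, page_count):
    """Return the combined page size and the x offset at which to draw each page."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    new_width = page_width * page_count
    x_offsets = [page_width * i for i in range(page_count)]
    return new_width, page_height, x_offsets


if __name__ == "__main__":
    # 595 x 842 points is roughly an A4 page; the values are illustrative only.
    width, height, offsets = side_by_side_layout(595.0, 842.0, 2)
    print(width, height, offsets)
```

Stacking pages vertically would follow the same pattern, with the height multiplied and the y coordinate advanced instead.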

Conclusion

This article demonstrated several ways to merge PDF files and pages using Python and Spire.PDF for Python. We hope you find it helpful.

Related Topics

Convert PDF to Word DOCX or DOC with Python

Read or Extract Text from PDF with Python — A Comprehensive Guide

Convert PDF to Images (PNG, JPG, BMP, EMF) with Python

Split PDF Files or Pages with Python

Add Watermarks to PDF with Python
