Python PDF Editor

buzonliao
Published in Python 101
2 min read · Oct 12, 2023

Explore the pypdf module for Python and learn how to manipulate PDF files. This guide covers rotating pages, merging PDF files, adding watermarks, and removing watermarks from PDF documents. See the pypdf documentation for further details on its capabilities and functions.

Rotate Pages

  • Read the PDF file, rotate its first page 270 degrees clockwise, and save the result
from pypdf import PdfReader, PdfWriter

# rotate pdf file
reader = PdfReader("dummy.pdf")
page = reader.pages[0]
page.rotate(270)
writer = PdfWriter()
writer.add_page(page)
writer.write('tilt.pdf')
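
The snippet above rotates only the first page. As a minimal sketch of how the same idea could extend to every page, the helper names below (normalize_angle, rotate_all_pages) are made up for illustration and are not part of pypdf. It assumes pypdf is installed and relies on page.rotate accepting clockwise multiples of 90.

```python
def normalize_angle(angle: int) -> int:
    """Map any multiple of 90 (including negatives) to 0, 90, 180, or 270."""
    if angle % 90 != 0:
        raise ValueError(f"angle must be a multiple of 90, got {angle}")
    return angle % 360


def rotate_all_pages(src: str, dst: str, angle: int) -> int:
    """Rotate every page of src clockwise by angle and save the result to dst."""
    # Imported lazily so normalize_angle can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    step = normalize_angle(angle)
    reader = PdfReader(src)
    writer = PdfWriter()
    for page in reader.pages:
        page.rotate(step)  # clockwise, in 90-degree increments
        writer.add_page(page)
    writer.write(dst)
    return len(reader.pages)
```

For example, rotate_all_pages("dummy.pdf", "tilt.pdf", -90) should give the same orientation as rotate(270), since a quarter turn counterclockwise equals three quarter turns clockwise.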

Merge PDF files

  • Input multiple PDF files and merge them
# run python3 pdf_merger.py dummy.pdf twopage.pdf wtr.pdf
import sys
from pypdf import PdfWriter

inputs = sys.argv[1:]

# merge pdf files
def pdf_merger(pdf_list):
    merger = PdfWriter()

    for pdf in pdf_list:
        merger.append(pdf)
    merger.write("merged-pdf.pdf")
    merger.close()

pdf_merger(inputs)
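
The script above passes every command-line argument straight to append, so a typo in a file name only surfaces as an error halfway through. One possible way to guard against that is sketched below. The names collect_pdfs and merge_pdfs are hypothetical helpers, not pypdf functions, and the "skip anything that is not an existing .pdf file" rule is an assumption about what you might want.

```python
from pathlib import Path


def collect_pdfs(paths):
    """Return the inputs that exist and end in .pdf, keeping their order."""
    found = []
    for raw in paths:
        path = Path(raw)
        if path.is_file() and path.suffix.lower() == ".pdf":
            found.append(path)
    return found


def merge_pdfs(paths, output="merged-pdf.pdf"):
    """Merge the usable inputs into output and return how many were merged."""
    pdfs = collect_pdfs(paths)
    if not pdfs:
        raise ValueError("no existing .pdf inputs were given")

    # Imported lazily so collect_pdfs can be used without pypdf installed.
    from pypdf import PdfWriter

    merger = PdfWriter()
    for pdf in pdfs:
        merger.append(str(pdf))
    merger.write(output)
    merger.close()
    return len(pdfs)
```

Calling merge_pdfs(sys.argv[1:]) would then behave like pdf_merger(inputs), except that missing or non-PDF arguments are ignored instead of raising partway through the merge.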

Add Watermark

  • Add a watermark, taken from the first page of wtr.pdf, to every page of the PDF file
from pypdf import PdfWriter, PdfReader

watermark = PdfReader("wtr.pdf").pages[0]
writer = PdfWriter(clone_from="twopage.pdf")
for page in writer.pages:
    page.merge_page(watermark, over=False)  # over=False places the watermark beneath the page content

writer.write("add_watermark.pdf")
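
The over argument decides whether the merged page is drawn beneath the existing content (a watermark) or on top of it (a stamp). The sketch below wraps that choice in a small function. The names over_flag and overlay_pdf and the mode strings are illustrative assumptions, not pypdf API; only PdfReader, PdfWriter(clone_from=...), and merge_page(..., over=...) come from the code above.

```python
def over_flag(mode: str) -> bool:
    """Translate a mode name into the value passed to merge_page(over=...)."""
    modes = {"watermark": False, "stamp": True}
    if mode not in modes:
        raise ValueError(f"mode must be one of {sorted(modes)}, got {mode!r}")
    return modes[mode]


def overlay_pdf(src: str, mark: str, dst: str, mode: str = "watermark") -> None:
    """Merge the first page of mark into every page of src and save to dst."""
    over = over_flag(mode)

    # Imported lazily so over_flag can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    mark_page = PdfReader(mark).pages[0]
    writer = PdfWriter(clone_from=src)
    for page in writer.pages:
        page.merge_page(mark_page, over=over)
    writer.write(dst)
```

With these assumptions, overlay_pdf("twopage.pdf", "wtr.pdf", "add_watermark.pdf") mirrors the script above, while mode="stamp" would draw wtr.pdf over the text instead.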

Remove Watermark

  • Read watermarked PDF file
  • Remove the watermark, then save it
import sys
from pypdf import PdfReader, PdfWriter

# run python3 pdf_add_watermark.py
# run python3 pdf_remove_watermark.py add_watermark.pdf

original_pdf = sys.argv[1]
with open(original_pdf, "rb") as input_file, open('remove_watermark.pdf', "wb") as output_file:
    reader = PdfReader(input_file)
    writer = PdfWriter()

    for page in reader.pages:
        # assumes /Contents is an array whose first stream is the watermark
        del page["/Contents"][0]
        writer.add_page(page)

    writer.write(output_file)
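
Deleting index 0 of /Contents only makes sense when that entry is an array of several content streams and the watermark is the first one. Whether that holds depends on how the file was produced, so treat it as an assumption. A more defensive variant might look like the sketch below. drop_first_stream and remove_watermark are hypothetical names, and pages that do not match the assumption are copied unchanged.

```python
def drop_first_stream(contents) -> bool:
    """Delete the first entry if contents is a list with more than one stream.

    Returns True when something was removed, False when it was left alone.
    """
    if isinstance(contents, list) and len(contents) > 1:
        del contents[0]
        return True
    return False


def remove_watermark(src: str, dst: str) -> int:
    """Copy src to dst, dropping the first content stream where possible.

    Returns the number of pages that were changed.
    """
    # Imported lazily so drop_first_stream can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    changed = 0
    for page in reader.pages:
        # pypdf's array object subclasses list, so the isinstance check applies.
        if "/Contents" in page and drop_first_stream(page["/Contents"]):
            changed += 1
        writer.add_page(page)
    writer.write(dst)
    return changed
```

If remove_watermark returns 0, none of the pages stored their content as a multi-stream array, and the watermark would have to be removed some other way.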
