Convert Word to PDF with Python: 9 Code Examples

Alexander Stock
Oct 17, 2023

Document conversion, particularly from Word to PDF, is vital for many tasks. Converting Word to PDF has numerous advantages, such as preserving formatting, ensuring compatibility across devices and systems, enhancing security, and enabling easy sharing and storage.

Converting a Word document to PDF isn’t always enough. It’s important to have the flexibility to customize conversion settings based on your specific requirements. In this article, I’ll provide you with the following 9 code examples that demonstrate various adjustable settings during the conversion process.

Install Dependency

This solution requires Spire.Doc for Python to be installed as a dependency, which is a Python library for reading, creating and manipulating Word documents in a Python program. You can install Spire.Doc for Python by executing the following pip command.

pip install Spire.Doc
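Before running the examples, it can help to confirm that the package is importable in the active environment. The following is a minimal sketch using only the standard library. The helper name is_importable is my own and not part of Spire.Doc, and the sketch assumes the package is imported as spire.doc, as in the examples in this article.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the named module can be located in this environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing
        return False


if not is_importable("spire.doc"):
    print("Spire.Doc not found; run: pip install Spire.Doc")
```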

Convert Word to PDF in Python

Converting a Word document to PDF with Spire.Doc is a simple task. Start by loading the Word document using the LoadFromFile or LoadFromStream method of the Document class. Afterward, convert and save the document as a PDF using the SaveToFile method.

from spire.doc import *
from spire.doc.common import *

# Create word document
document = Document()

# Load a doc or docx file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Save the document to PDF
document.SaveToFile("output/ToPDF.pdf", FileFormat.PDF)

# Dispose resources
document.Dispose()
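The example above writes into an output folder and calls Dispose only if every earlier step succeeds. As a hedged sketch, the same calls can be wrapped in a small function that creates the folder up front and releases resources in a finally block. I have not verified whether SaveToFile creates missing folders on its own, so the sketch does not rely on it. The names pdf_path_for and convert_to_pdf are hypothetical helpers, not part of Spire.Doc.

```python
from pathlib import Path


def pdf_path_for(word_path: str, out_dir: str = "output") -> Path:
    """Build the output PDF path from the input file name (hypothetical helper)."""
    return Path(out_dir) / (Path(word_path).stem + ".pdf")


def convert_to_pdf(word_path: str, out_dir: str = "output") -> Path:
    """Convert one Word file to PDF using the same Spire.Doc calls as above."""
    # Imported here so the path helper can be used without Spire.Doc installed
    from spire.doc import Document, FileFormat

    target = pdf_path_for(word_path, out_dir)
    # Create the output folder up front in case it does not exist yet
    target.parent.mkdir(parents=True, exist_ok=True)

    document = Document()
    try:
        document.LoadFromFile(word_path)
        document.SaveToFile(str(target), FileFormat.PDF)
    finally:
        # Release resources even if loading or saving raises
        document.Dispose()
    return target
```

Calling convert_to_pdf("input.docx") would then write output/input.pdf, assuming input.docx exists in the working directory.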

Convert Word to PDF/A in Python

PDF/A is a specialized format created to ensure the long-term preservation of digital documents, guaranteeing that their content remains accessible and unaltered as time goes on.

To specify the PDF conformance level, you can make use of the PdfConformanceLevel property found in the ToPdfParameterList object. By passing this object as an argument to the SaveToFile method, you can achieve the desired result.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\Input.docx")

# Create a ToPdfParameterList object
parameters = ToPdfParameterList()

# Set the conformance level for PDF
parameters.PdfConformanceLevel = PdfConformanceLevel.Pdf_A1A

# Save the Word document to PDF
document.SaveToFile("output/ToPDFA.pdf", parameters)

# Dispose resources
document.Dispose()

Convert Word to PDF with Specified Page Size in Python

During the conversion from Word to PDF, you may need to modify the page size to align with standard paper sizes like Letter, Legal, Executive, A4, A5, B5, and others. Alternatively, you might want to customize the page dimensions for non-standard sizes or optimize the document for electronic distribution.

By utilizing the PageSetup.PageSize property, you can adjust the page size of the Word document to either a standard or custom paper size. This page configuration will be applied when converting Word to PDF.

from spire.doc import *
from spire.doc.common import *

# Create word document
document = Document()

# Load a doc or docx file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Iterate through the sections in the document
for i in range(document.Sections.Count):

    # Get a specific section
    section = document.Sections.get_Item(i)

    # Change the page size of each section to A4
    section.PageSetup.PageSize = PageSize.A4()

    # Change the page size of each section to a custom size
    # section.PageSetup.PageSize = SizeF(400.0, 800.0)

# Save the document to PDF
document.SaveToFile("output/ToPDF_A4.pdf", FileFormat.PDF)

# Dispose resources
document.Dispose()
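The commented-out custom size passes raw numbers to SizeF. Assuming those values are in points (1/72 inch), which is my understanding of how page dimensions are expressed here but is not confirmed in this article, a small converter makes it easier to start from millimetres. mm_to_points is a hypothetical helper.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_points(mm: float) -> float:
    """Convert millimetres to points, assuming 1 point = 1/72 inch."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


# A4 is 210 mm x 297 mm
width_pt = mm_to_points(210)
height_pt = mm_to_points(297)
print(round(width_pt, 1), round(height_pt, 1))  # prints 595.3 841.9
```

The results could then be passed as SizeF(width_pt, height_pt) in place of the values in the commented-out line.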

Convert Word to Password-Protected PDF

By converting a Word document into a PDF that is password-protected, you can easily and effectively safeguard sensitive information, ensuring its confidentiality and security.

To accomplish this, you can utilize the PdfSecurity.Encrypt method within the ToPdfParameterList object. This method enables you to specify both the open password and permission password for the resulting PDF file.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Create a ToPdfParameterList object
parameter = ToPdfParameterList()

# Specify open password and permission password
openPsd = "abc-123"
permissionPsd = "permission"

# Protect the PDF to be generated with open password and permission password
parameter.PdfSecurity.Encrypt(openPsd, permissionPsd, PdfPermissionsFlags.Default, PdfEncryptionKeySize.Key128Bit)

# Save the Word document to PDF
document.SaveToFile("output/ToPdfWithPassword.pdf", parameter)

# Dispose resources
document.Dispose()
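The example above keeps both passwords in the source code, which is fine for a demo but easy to leak in a real project. One possible alternative, sketched here with the standard library only, is to read them from environment variables. The variable names PDF_OPEN_PASSWORD and PDF_PERMISSION_PASSWORD and the helper read_pdf_passwords are illustrative choices of mine, not something Spire.Doc defines.

```python
import os


def read_pdf_passwords(env=None):
    """Return (open_password, permission_password) from environment variables.

    The variable names are illustrative, not defined by Spire.Doc.
    """
    env = os.environ if env is None else env
    names = ("PDF_OPEN_PASSWORD", "PDF_PERMISSION_PASSWORD")
    # Treat unset and empty variables the same way
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair could replace the openPsd and permissionPsd literals before the call to PdfSecurity.Encrypt.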

Convert a Specific Section in Word to PDF in Python

Converting a specific section of a Microsoft Word document into a PDF can be incredibly helpful when you’re looking to share or save just a part of a bigger document.

Spire.Doc offers the Section.Clone method, which creates a copy of a section; the copy can then be added to a new document with the Sections.Add method. This lets you extract a specific section from the source document into its own document and save that document as a PDF.

from spire.doc import *
from spire.doc.common import *

# Create word document
document = Document()

# Load a doc or docx file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Get a specific section of the document
section = document.Sections.get_Item(0)

# Create a new document object
newDoc = Document()

# Clone the default style to the new document
document.CloneDefaultStyleTo(newDoc)

# Clone the section to the new document
newDoc.Sections.Add(section.Clone())

# Save the new document to PDF
newDoc.SaveToFile("output/SectionToPDF.pdf", FileFormat.PDF)

# Dispose resources of both documents
document.Dispose()
newDoc.Dispose()

Set Image Quality when Converting Word to PDF in Python

Adjusting image quality during a Word to PDF conversion is essential for balancing image clarity with file size. High-quality images lead to large files, which can be inconvenient for sharing. Reducing image quality can decrease the file size without affecting the document’s readability, especially if the images are primarily for reference rather than critical visual details.

To adjust the image quality before conversion, you can use the JPEGQuality property of the Document object. For example, setting JPEGQuality to 40 compresses the images to 40% of their original quality.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Compress image to 40% of its original quality
document.JPEGQuality = 40

# Preserve original image quality
# document.JPEGQuality = 100

# Save the Word document to PDF
document.SaveToFile("output/SetImageQuality.pdf", FileFormat.PDF)

# Dispose resources
document.Dispose()
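To judge whether a lower quality setting is worth it, it helps to measure how much smaller the resulting PDF actually is. The sketch below compares two files on disk using only the standard library; percent_saved is a hypothetical helper, and the file names in the usage note are placeholders.

```python
import os


def percent_saved(original_path: str, compressed_path: str) -> float:
    """Return how much smaller the second file is, as a percentage of the first."""
    original = os.path.getsize(original_path)
    compressed = os.path.getsize(compressed_path)
    if original == 0:
        raise ValueError("original file is empty")
    return (original - compressed) / original * 100.0
```

For instance, after saving one PDF with JPEGQuality set to 100 and another with 40, percent_saved("output/Quality100.pdf", "output/Quality40.pdf") would report the reduction as a percentage.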

Embed Fonts when Converting Word to PDF in Python

Embedding fonts in PDFs ensures that the original font styles and effects are preserved, even if the recipient doesn’t have those fonts installed on their device. This is particularly important for documents that use unique or non-standard fonts.

To ensure that all fonts used in a Word document are embedded in the corresponding PDF, set the IsEmbeddedAllFonts property of the ToPdfParameterList object to True. For a more selective approach, you can define a list of fonts to embed using the EmbeddedFontNameList property.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Create a ToPdfParameterList object
parameter = ToPdfParameterList()

# Embed fonts in PDF
parameter.IsEmbeddedAllFonts = True

# Save the Word document to PDF
document.SaveToFile("output/EmbedFonts.pdf", parameter)

# Dispose resources
document.Dispose()

Create Bookmarks when Converting Word to PDF in Python

Creating bookmarks when converting a Word document to PDF is particularly useful for reports, manuals, and other documents where users need to jump to different parts of the content.

When converting Word documents to PDF using Spire.Doc, you can choose to automatically generate bookmarks based on existing bookmarks or headings. This can be accomplished by setting the CreateWordBookmarks property or the CreateWordBookmarksUsingHeadings property of the ToPdfParameterList object to True, and then passing that object to the SaveToFile method.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Create a ToPdfParameterList object
parames = ToPdfParameterList()

# Create bookmarks from Word headings
parames.CreateWordBookmarksUsingHeadings = True

# Create bookmarks in PDF from existing bookmarks in Word
# parames.CreateWordBookmarks = True

# Save the document to PDF, passing the parameters so the bookmark setting is applied
document.SaveToFile("output/ToPdfWithBookmarks.pdf", parames)

# Dispose resources
document.Dispose()

Disable Hyperlinks when Converting Word to PDF in Python

By deactivating hyperlinks during the conversion process, you can maintain better control over the document’s presentation and ensure a seamless reading experience without any unintended redirections.

To prevent hyperlinks from functioning in the generated PDF documents, set the DisableLink property of the ToPdfParameterList object to True.

from spire.doc import *
from spire.doc.common import *

# Create a Document object
document = Document()

# Load a Word file
document.LoadFromFile("C:\\Users\\Administrator\\Desktop\\input.docx")

# Create an instance of ToPdfParameterList
parameter = ToPdfParameterList()

# Set DisableLink to True to remove the hyperlink effect from the resulting PDF
parameter.DisableLink = True

# Save the Word document to PDF
document.SaveToFile("output/DisablePDF.pdf", parameter)

# Dispose resources
document.Dispose()

Get a Free Trial License

Without a license, Spire.Doc for Python converts only the first three pages of a Word document to PDF. To overcome this limitation, you can get a free trial license, which removes the page limit and allows for unlimited conversions.

Conclusion

In this blog post, we have covered various methods and considerations for converting Word documents to PDF using Python. We discussed techniques such as converting to PDF/A format, adjusting page sizes, and many others. I hope you find it helpful.

I'm Alexander Stock, a software development consultant and blogger with 10+ years' experience. Specializing in office document tools and knowledge introduction.