Split an Excel File into Multiple Files with Python

Alice Yang
7 min readDec 19, 2023

--

Splitting an Excel file into multiple files can significantly enhance the effectiveness of data management. By separating related information into distinct files, you can establish a clear hierarchy and organization that aligns with your specific needs. This allows for more efficient data retrieval and manipulation, as you can quickly locate and access the relevant files without having to sift through a single, large file. Moreover, it also minimizes the risk of accidental modifications or deletions to unrelated data while working on a specific subset, ensuring data integrity and accuracy. In this article, we will explore how to split an Excel file into multiple files using Python.

We will discuss the following topics:

Python Library to Split Excel Files

To split an Excel file into multiple files with Python, we can use the Spire.XLS for Python library. It is an easy-to-use and feature-rich library for creating, reading, editing, and converting Excel files within Python applications. With this library, you can work with a lot of spreadsheet formats, such as XLS, XLSX, XLSB, XLSM, and ODS. Moreover, you are also able to render Excel files to other types of file formats, such as PDF, HTML, CSV, Text, Image, XML, SVG, ODS, PostScript, and XPS.

You can install Spire.XLS for Python from PyPI by running the following commands in your terminal:

pip install Spire.Xls

For more detailed information about the installation, you can check this official documentation: How to Install Spire.XLS for Python in VS Code.

Split an Excel File into Multiple Files by Worksheets with Python

One common way to split an Excel file is to divide it into separate files based on worksheets. This method is helpful when you have an Excel file that contains multiple worksheets that are conceptually distinct and you want to separate them into individual files.

Here is a simple example that shows how to split an Excel file into multiple files, with each file containing only one worksheet from the original Excel file using Python and Spire.XLS for Python:

from spire.xls import Workbook, FileFormat

# Specify the input Excel file path
input_file_path = "Input.xlsx"

# Create a Workbook object
workbook = Workbook()
# Load the Excel file
workbook.LoadFromFile(input_file_path)

# Loop through the worksheets in the loaded Excel file
for worksheet in workbook.Worksheets:
# Create a new Workbook object
new_workbook = Workbook()
# Clear the default worksheets in the new workbook
new_workbook.Worksheets.Clear()

# Copy the worksheets from the loaded Excel file to the new workbook
new_workbook.Worksheets.AddCopy(worksheet)

# Specify the output file path
output_file_path = "Output/SplittingByWorksheet/" + worksheet.Name + ".xlsx"
# Save the new workbook to an xlsx file
new_workbook.SaveToFile(output_file_path, FileFormat.Version2016)

workbook.Dispose()
Split an Excel File by Worksheets with Python
Split an Excel File by Worksheets with Python

Split an Excel Worksheet into Multiple Files by Rows with Python

Splitting an Excel worksheet by rows can be helpful when you have a large dataset and want to divide it into smaller, more manageable files. This method allows you to split the data based on a specified number of rows per file. It can be beneficial for situations where you need to process or analyze smaller subsets of data or when you want to share or distribute specific portions of the dataset.

Here is a simple example that shows how to split an Excel worksheet into multiple files by specific rows using Python and Spire.XLS for Python:

from spire.xls import Workbook, FileFormat

# Define an extract_rows() function to extract specific rows from an Excel worksheet to new Excel files
def extract_rows(worksheet, start_row, end_row, output_file_path):
# Create a new workbook for the extracted rows
new_workbook = Workbook()
# Clear the default worksheets in the new workbook
new_workbook.Worksheets.Clear()
# Add a new worksheet to the new workbook
new_worksheet = new_workbook.Worksheets.Add("Sheet1")

if start_row == 1:
# Copy specific rows from the source worksheet to the new worksheet
worksheet.Range[start_row, 1, end_row, len(worksheet.Columns)].Copy(new_worksheet.Range[1, 1])
else:
# Copy the header row from the source worksheet to the new worksheet
worksheet.Range[1, 1, 1, len(worksheet.Columns)].Copy(new_worksheet.Range[1, 1])
# Copy specific rows from the source worksheet to the new worksheet
worksheet.Range[start_row, 1, end_row, len(worksheet.Columns)].Copy(new_worksheet.Range[2, 1])

# Auto-adjust the columns' widths in the new worksheet based on cell content
new_worksheet.Range.AutoFitColumns()
# Save the new workbook to an xlsx file
new_workbook.SaveToFile(output_file_path, FileFormat.Version2016)
new_workbook.Dispose()

# Specify the input and output file paths
input_file_path = "Input1.xlsx"
output_file_path1 = "Output/SplittingByRow/Rows1-3.xlsx"
output_file_path2 = "Output/SplittingByRow/RemainingRows.xlsx"

# Load an Excel file
workbook = Workbook()
workbook.LoadFromFile(input_file_path)
# Get the first worksheet
worksheet = workbook.Worksheets[0]

# Call the extract_rows() function to extract rows 1-3 from the first worksheet to a new Excel file
extract_rows(worksheet, 1, 3, output_file_path1)
# Call the extract_rows() function to extract the remaining rows from the first worksheet to a new Excel file
extract_rows(worksheet, 4, len(worksheet.Rows), output_file_path2)

workbook.Dispose()
Split Excel Worksheet by Rows with Python
Split Excel Worksheet by Rows with Python

Split an Excel Worksheet into Multiple Files by Columns with Python

When you have a wide dataset with many columns, splitting an Excel worksheet by columns can be useful. This method allows you to divide the data into multiple files based on a specified number of columns per file. It can be beneficial when you want to focus on specific sets of columns or when you need to export or share specific subsets of data that are organized by columns.

Here is a simple example that shows how to split an Excel worksheet into multiple files by specific columns using Python and Spire.XLS for Python:

from spire.xls import Workbook, FileFormat

# Define an extract_columns() function to extract specific columns from an Excel worksheet to new Excel files
def extract_columns(worksheet, start_col, end_col, output_file_path):
# Create a new workbook for the extracted columns
new_workbook = Workbook()
# Clear the default worksheets in the new workbook
new_workbook.Worksheets.Clear()
# Add a new worksheet to the new workbook
new_worksheet = new_workbook.Worksheets.Add("Sheet1")

# Copy specific columns from the source worksheet to the new worksheet
worksheet.Range[1, start_col, len(worksheet.Rows), end_col].Copy(new_worksheet.Range[1, 1])

# Auto-adjust the columns' widths in the new worksheet based on cell content
new_worksheet.Range.AutoFitColumns()
# Save the new workbook to an xlsx file
new_workbook.SaveToFile(output_file_path, FileFormat.Version2016)
new_workbook.Dispose()

# Specify the input and output file paths
input_file_path = "Input1.xlsx"
output_file_path1 = "Output/SplittingByColumn/Columns1-3.xlsx"
output_file_path2 = "Output/plittingByColumn/RemainingColumns.xlsx"

# Create a Workbook object
workbook = Workbook()
# Load an Excel file
workbook.LoadFromFile(input_file_path)
# Get the first worksheet
worksheet = workbook.Worksheets[0]

# Call the extract_columns() function to extract columns 1-3 from the first worksheet to a new Excel file
extract_columns(worksheet, 1, 3, output_file_path1)
# Call the extract_columns() function to extract the remaining columns from the first worksheet to a new Excel file
extract_columns(worksheet, 4, len(worksheet.Columns), output_file_path2)

workbook.Dispose()
Split Excel Worksheet by Columns with Python
Split Excel Worksheet by Columns with Python

Split an Excel Worksheet into Multiple Files Based on Specific Values with Python

In certain scenarios, you may want to split an Excel worksheet into multiple files based on specific values. For example, if you have an Excel worksheet containing sales data for various product categories like components, clothing, and accessories, you may find it beneficial to split the worksheet into multiple files. Each resulting file would exclusively contain the data relevant to a specific product category.

Here is a simple example that shows how to split an Excel worksheet into multiple files by specific values using Python and Spire.XLS for Python:

from spire.xls import Workbook, FileFormat, FindType, ExcelFindOptions

# Define an extract_rows_by_value() function to extract rows based on specific values from an Excel worksheet to new Excel files
def extract_rows_by_value(worksheet, value, output_file_path):
# Create a new workbook for the extracted rows
new_workbook = Workbook()
# Clear the default worksheets in the new workbook
new_workbook.Worksheets.Clear()
# Add a new worksheet to the new workbook
new_worksheet = new_workbook.Worksheets.Add("Sheet1")

# Copy the header row from the source worksheet to the new worksheet
worksheet.Range[1, 1, 1, len(worksheet.Columns)].Copy(new_worksheet.Range[1, 1])

# Find the cells containing a specific string value
ranges = worksheet.FindAll(value, FindType.Text, ExcelFindOptions.MatchEntireCellContent)

i = 2
# Loop through the found cells
for range in ranges:
# Get the rows where the cells are located
row = range.Row
# Copy the rows to the new worksheet
worksheet.Range[row, 1, row, len(worksheet.Columns)].Copy(new_worksheet.Range[i, 1])
i += 1

# Auto-adjust the columns' widths in the new worksheet based on cell content
new_worksheet.Range.AutoFitColumns()
# Save the new workbook to a new file
new_workbook.SaveToFile(output_file_path, FileFormat.Version2016)
new_workbook.Dispose()

# Specify the input and output file paths
input_file_path = "Template.xlsx"
output_file_path1 = "Output/SplittingByValue/Components.xlsx"
output_file_path2 = "Output/SplittingByValue/Clothing.xlsx"

# Create a Workbook object
workbook = Workbook()
# Load an Excel file
workbook.LoadFromFile(input_file_path)
# Get the first worksheet
worksheet = workbook.Worksheets[0]

# Call the extract_rows_by_value() function to extract rows containing a speicic string value "Components" to a new Excel file
extract_rows_by_value(worksheet, "Components", output_file_path1)
# Call the extract_rows_by_value() function to extract rows containing a speicic string value "Clothing" to a new Excel file
extract_rows_by_value(worksheet, "Clothing", output_file_path2)
workbook.Dispose()
Split Excel Worksheet Based on Specific Value with Python
Split Excel Worksheet Based on Specific Value with Python

Conclusion

This article demonstrated 4 different scenarios for splitting Excel files using Python. We hope you find it helpful.

Related Topics

--

--

Alice Yang

Skilled senior software developers with five years of experience in all phases of software development life cycle using .NET, Java and C++ languages.