Convert multiple corrupt XLS files to XLSX | Python in Finance #5

Edward Jones
Published in Geek Culture
2 min read · May 27, 2021

In this article you will learn how to convert a batch of corrupt XLS files to XLSX files.

Trade-off of automating this process

The approach automates the conversion of (corrupt) XLS files into regular XLSX files with Python.

Data can be found: here

Packages

  • pywin32: This package gives Python access to Windows COM automation, much as VBA does. It lets us control and automate Windows applications such as Excel from Python.
  • os: This standard-library module lets us work with the operating system, for example to build file paths.
  • glob: This standard-library module lets us create a list of the file paths that match a pattern.

import win32com.client
import os
import glob

Initialization

  • Initialize win32com and have it start an Excel application.
  • Set “o.Visible = False”, where o is the Excel application object, to hide the Excel application you have started.
  • Set the input directory: This is the directory where all your corrupt XLS files are located.
  • Set the output directory: This is the directory where you want to store the converted XLSX files.
  • Create a list of the paths of all the files in the input directory by using the glob.glob function.
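
The initialization steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the author's exact code: the function names start_hidden_excel and list_input_files and the "*.xls" pattern are hypothetical choices made for illustration. The Excel part only works on Windows with Excel and pywin32 installed, so win32com.client is imported inside the function that needs it.

```python
import glob
import os


def start_hidden_excel():
    """Start Excel through COM and hide its window (Windows + pywin32 only).

    Hypothetical helper name; the article calls the returned object "o".
    """
    import win32com.client  # imported here so the file listing works without Excel

    o = win32com.client.Dispatch("Excel.Application")
    o.Visible = False  # keep the Excel window hidden while it is automated
    return o


def list_input_files(input_dir, pattern="*.xls"):
    """Return the sorted absolute paths of the files in input_dir matching pattern.

    The "*.xls" default is an assumption; adjust it to the files you have.
    """
    search = os.path.join(os.path.abspath(input_dir), pattern)
    return sorted(glob.glob(search))
```

On a Windows machine, o = start_hidden_excel() followed by files = list_input_files(input_dir) would give the hidden Excel object and the list of paths to convert, where input_dir is whatever folder holds your XLS files. Calling o.Quit() once the work is done avoids leaving an Excel process running in the background.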
