Business Breakthroughs: Automating Data Entry With AI

Jake Nolan
14 min read · Jan 9, 2024


[Image: An AI brain reading text from a PDF]

Introduction to Data Entry and AI

Welcome to the world where tedious manual data entry becomes a thing of the past! In this AI era of digital transformation, businesses are constantly seeking ways to enhance efficiency and productivity, and many of today's business solutions leverage AI to make those enhancements happen. One such leap in business productivity comes from automating data entry, particularly with AI. Data entry automation has become a necessity in this world of big data. Previously, this process required highly customized solutions with rigid, case-specific requirements, but with the boom of OpenAI and large language models (LLMs) we can create faster and more flexible data entry solutions.

In this article, we’re diving into the foundation of setting up an automated data entry stream. Our journey will take us from the initial steps of uploading PDFs to extracting their text, and finally, to the exciting part — organizing that text in JSON format with OpenAI’s API. Whether you’re a seasoned software engineer or just dipping your toes into the world of AI, this tutorial aims to provide you with a solid start to build upon.

But this isn’t just about the ‘what’ and the ‘how’; it’s also about the ‘why’. As we walk through the code and its functionalities, we’ll explore how these technologies not only streamline workflows but also open up new possibilities for data handling and analysis. So, gear up for an insightful journey into the realm of AI-driven data automation. By the end of this article, you’ll have a working setup that not only automates data entry from PDFs but also sets the stage for building more advanced applications on top of this solution for your business processes. Let’s get started!

The Need for Automation in Data Entry

In an era where data is akin to gold, managing it becomes crucial for business success. But let’s face it, manual data entry is a thankless job. It’s repetitive, prone to human error, and let’s not forget, incredibly time-consuming. This is where automation (particularly in data entry) steps in as a savior for your time and sanity.

Why Automate Data Entry?

  1. Accuracy and Consistency: Human error is an inevitable part of manual data entry. Automation significantly reduces these errors, ensuring data accuracy and consistency.
  2. Time Efficiency: Automated systems work tirelessly, processing data much faster than a human ever could. This speed translates into more time for your team to focus on the tasks that matter: tasks that require uniquely human ingenuity.
  3. Cost-Effectiveness: While setting up automation might require an initial investment, it saves money in the long run by reducing extensive work hours and costly error corrections.
  4. Scalability: Automation systems are highly scalable. They can handle increased volumes of data without the need for proportionate increases in staff or resources.
  5. Data Analysis and Insights: Automated systems don’t just process data; they can also analyze it, providing valuable insights that can guide business decisions.

The Big Data Challenge

In the context of big data, the need for automation becomes even more pronounced. The sheer volume, variety, and velocity of data generated today make manual processing impractical, if not impossible. Automation not only handles the volume but also brings structure to this vast amount of data, making it usable and insightful, and making processing it all feasible in the first place.

AI + Data Entry: A Perfect Match

Introducing AI into this equation takes things a step further. AI doesn’t just automate tasks; it brings intelligence to them. For instance, when extracting data from PDFs, AI can understand and interpret various formats and layouts, adjusting intelligently on the go, something traditional automation might struggle with. This adaptability and learning capability of AI are what make it an invaluable asset in the journey toward efficient data entry and management. Now let’s dive into the practical steps to achieve this, starting with the tools and technologies you’ll need.

What You’ll Need

Now let’s get into it. First, we need to gather our toolkit. This section will guide you through the essential tools and technologies you’ll need to follow along with this tutorial. Don’t worry, I have kept it straightforward and accessible.

Essential Tools and Technologies

  1. Python: The backbone of our project. Python’s simplicity and vast library ecosystem make it the perfect language for this task, not to mention it is the most widely used language for AI. Ensure you have Python installed on your system; if you haven’t, you can download it from python.org.
  2. OpenAI and API Key: To leverage AI for data entry, we’ll use OpenAI’s powerful language models. First, install the library, which is as simple as running pip install openai. Next, you'll need an API key from OpenAI. If you don't have one, you can obtain it by signing up on OpenAI's API platform; after signing up, you can create your key on the API Keys page. Important: using OpenAI’s API costs money, and depending on the model you use, costs can add up quickly. We will come back to how to adjust the model later in the tutorial.
  3. Streamlit: For creating a user-friendly interface for uploading PDFs and viewing results. Streamlit is a fast and easy way to build web apps for your Python projects. Installation takes just a quick pip install streamlit in your terminal.
  4. PyPDF: A Python library to read text from PDF files. It’s a crucial part of our data extraction process. Install it using pip install pypdf.
  5. A Code Editor: Any code editor will do, but if you need a suggestion, Visual Studio Code is a quick and easy-to-pick choice.

Setting Up Your Environment

Once you have these tools ready, it’s time to set up your environment. Create a new Python project in your preferred code editor and ensure you have all the libraries installed. Here’s a quick recap of the necessary installations:

pip install openai streamlit pypdf

In the next section, we’ll delve into the actual code in detail, explaining how it works and how you can modify it to suit your specific needs. So, grab a cup of coffee, and let’s get into the code!

Getting Started with Code: Setting Up Imports and Streamlit

Before diving into the main functionalities of our data entry automation tool, let’s lay the groundwork. This includes setting up the necessary imports and configuring our Streamlit application. This section will guide you step by step through this initial setup process.

Setting Up Imports

First things first, we need a Python script. I named mine main.py. Now, let’s import the libraries that our project relies on. Here’s a breakdown of our necessary imports:

# Imports
from pypdf import PdfReader
from openai import OpenAI
import streamlit as st
import json
import time

Streamlit Setup

Now, let’s set up Streamlit. This involves configuring various elements of our web app, such as the API key input, file uploader, and history management. Here’s how we start:

# Initial setup
if "OPENAI_API_KEY" not in st.session_state:
    st.session_state["OPENAI_API_KEY"] = ""
if "history" not in st.session_state:
    st.session_state["history"] = []
if "view_history" not in st.session_state:
    st.session_state["view_history"] = False
if "viewing_history_entry" not in st.session_state:
    st.session_state["viewing_history_entry"] = {}
if "pdf_uploader_int" not in st.session_state:
    st.session_state["pdf_uploader_int"] = 0

Here, we’re initializing several session states in Streamlit. These states will help us manage different aspects of the application:

  • OPENAI_API_KEY: This stores the OpenAI API key entered by the user.
  • history: A list to keep track of the files processed and their results.
  • view_history: A boolean to control whether the user should be viewing a stored history result.
  • viewing_history_entry: Stores the data of the currently viewed history entry.
  • pdf_uploader_int: A helper integer used to reset the state of the PDF uploader (for cleanliness).

Running the Script

From here on out we can run and test our script’s functionality simply by starting a local Streamlit run of our code. Since my script is named main.py, I would start the session with:

streamlit run main.py

With this setup in place, we’re ready to build the side menu which will handle OpenAI key input and the history section. Stay tuned as we start to bring our AI-powered data entry tool to life!

Building the Interface: The Side Menu in Streamlit

Having set up our imports and initialized Streamlit, our next step is to craft the user interface of our application. We’ll begin by building the side menu using Streamlit’s sidebar functionality. This menu is crucial for user interaction, housing both the OpenAI API key input and the history of processed files.

Streamlit Sidebar: The Core of User Setup and History

Streamlit’s sidebar, st.sidebar, offers a dedicated space for controls and information that are not part of the main workflow but are essential to the application's functionality. In our case, we're utilizing it for two primary functions: managing the OpenAI API key input and displaying the processing history.

Setting Up the OpenAI API Key Input

The first component in our sidebar is the section for the OpenAI API key. This key is essential for accessing OpenAI’s services. Here’s how we set it up:

# OpenAI API key section in sidebar
with st.sidebar:
    if st.session_state["OPENAI_API_KEY"] == "":
        with st.form(key="openai_api_key_form"):
            st.subheader("🔑 OpenAI API Key")
            st.session_state["OPENAI_API_KEY"] = st.text_input(
                "OpenAI API Key", label_visibility="collapsed",
                key="openai_api_key", type="password")
            submit_button, needs_key_button = st.columns(spec=[1, 1])
            with submit_button:
                api_key_submitted = st.form_submit_button("Save key", type="primary")
            with needs_key_button:
                st.link_button("Need a key?",
                               "https://platform.openai.com/account/api-keys")
            if api_key_submitted:
                st.rerun()
    else:
        with st.form(key="openai_api_key_form"):
            st.subheader("🔑 OpenAI API Key")
            st.success("API key saved!")
            new_key_requested = st.form_submit_button("Change key")
            if new_key_requested:
                st.session_state["OPENAI_API_KEY"] = ""
                st.rerun()

This code snippet sets up a form within the Streamlit sidebar for users to enter and save their OpenAI API key. It’s a crucial step to ensure that the application can interact with OpenAI’s API.

Managing the History Section in the Sidebar

Following the API key input, we have the history section:

# History section in sidebar
with st.sidebar:
    # OpenAI API key section here...
    st.markdown("# 📑 History")
    for entry in st.session_state["history"]:
        with st.form(key=entry["file_name"]):
            st.subheader(entry["file_name"])
            col1, col2, col3 = st.columns(spec=[3, 4, 3])
            with col1:
                delete_button_submitted = st.form_submit_button("Delete")
            with col2:
                download_button_submitted = st.form_submit_button("Download")
            with col3:
                view_button_submitted = st.form_submit_button("View")
            if delete_button_submitted:
                for to_delete_entry in st.session_state["history"]:
                    if to_delete_entry["file_name"] == entry["file_name"]:
                        st.session_state["history"].remove(entry)
                        st.rerun()
            if download_button_submitted:
                with open(f"{entry['file_name']}.json", "w") as f:
                    json.dump(entry["json"], f)
                st.session_state["view_history"] = True
                st.session_state["viewing_history_entry"] = entry
                st.session_state["pdf_uploader_int"] += 1
                st.rerun()
            if view_button_submitted:
                st.session_state["view_history"] = True
                st.session_state["viewing_history_entry"] = entry
                st.session_state["pdf_uploader_int"] += 1
                st.rerun()

In the history section, we display each processed file with options to delete, download, or view it. This enhances the application’s usability by allowing users to interact with their previously processed data directly from the sidebar. Note that the history only lives in the currently running session of the script; it is not saved once Streamlit shuts down.

In the upcoming section, we’ll start setting up the bread and butter of our application, uploading PDFs and reading text from them. As we continue, our AI-powered data entry tool becomes not only more functional but also intuitive and user-friendly. Let’s move on!

Uploading and Reading PDFs

An essential part of our data entry automation tool involves handling PDF files. In this section, we’ll explore the code to upload PDFs using Streamlit and then read their content using PyPDF.

Uploading PDFs with Streamlit

The first step in processing a PDF is, of course, getting the PDF into our system. This is where Streamlit’s file uploader comes into play. The following code snippet sets up a simple upload interface:

# Upload window
st.title("📚 PDF Data Extractor")
st.divider()
uploaded_file = st.file_uploader("Upload a PDF", type="pdf",
                                 key=f"pdf_uploader_{st.session_state['pdf_uploader_int']}")
if uploaded_file:
    st.session_state["view_history"] = False
    st.session_state["viewing_history_entry"] = {}
    st.success("PDF uploaded successfully!")
    st.divider()

Here, we create an upload window titled ‘PDF Data Extractor’ where users can upload their PDF files. The st.file_uploader Streamlit function is used for this purpose, and it is designed to handle PDF files seamlessly. Upon uploading a file, the user receives a success message, indicating that the file upload was successful.

Reading PDF Content with PyPDF

After uploading the PDF, the next step is to read its content. To achieve this, we use the PyPDF library. Our function pdf_to_text handles this:

# pdf to text handling
def pdf_to_text(uploaded_file):
    try:
        pdf_content = {}
        reader = PdfReader(uploaded_file)
        for i in range(len(reader.pages)):
            raw_text = reader.pages[i].extract_text()
            pdf_content[f"page_{i}"] = raw_text
        print(pdf_content)
        return pdf_content
    except Exception as e:
        st.error(e)
        return

This function takes the uploaded file and uses PdfReader to read it. We iterate through each page of the PDF, extracting the text using the extract_text method. The extracted text from each page is then stored in a dictionary, making it organized and accessible for further processing.

Challenges and Considerations

Extracting text from PDFs can be achieved in different ways. This particular solution is simple and easy to deploy online. However, while solutions like this excel at typed, text-based PDFs, they can struggle with PDFs that contain handwritten text or images. More complex PDFs (like handwritten ones) call for more complex reading solutions. Another option worth exploring, particularly for those harder PDFs, is Optical Character Recognition (OCR) via Python’s pytesseract library. This library wraps Google’s Tesseract-OCR engine and is better equipped to handle handwritten text and images. We are not using it in this tutorial because it also requires a local installation of Tesseract, which can complicate deployment and is outside our scope.
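If you want a rough signal for when a PDF might need the OCR route instead of plain text extraction, you can check how much text pypdf actually recovered per page. The helper below is a hypothetical heuristic of my own (the name needs_ocr and the 20-character threshold are assumptions, not part of the tutorial's code); it works on the page dictionary our pdf_to_text function returns:

```python
def needs_ocr(pdf_content, min_chars_per_page=20):
    """Flag a PDF as a likely scan if text extraction recovered almost nothing.

    pdf_content is the {"page_0": text, ...} dict produced by pdf_to_text.
    The threshold is an arbitrary assumption; tune it for your documents.
    """
    if not pdf_content:
        return True
    # Count pages where extraction came back (nearly) empty.
    empty_pages = sum(
        1 for text in pdf_content.values()
        if len((text or "").strip()) < min_chars_per_page
    )
    # If most pages are empty, the PDF is probably image-based.
    return empty_pages / len(pdf_content) > 0.5
```

If needs_ocr returns True, you could route that file to a pytesseract-based pipeline instead of sending near-empty text to the model.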

Next up we are going to build out the core code that includes AI in this tutorial. So take a sip of that coffee and let’s start setting up OpenAI’s API handling and its accompanying functions.

Organizing Data with OpenAI

With our PDFs uploaded and their text extracted, the next step is to organize this data. For this, we take advantage of OpenAI’s GPT models, specifically gpt-4-1106-preview, the newest GPT model available at the time of writing this article (January 2024). It is a preview of the GPT-4 Turbo model, said to be cheaper and faster than earlier GPT-4 models. That said, it is still not the cheapest model available. If you choose to, you can switch to a cheaper model like gpt-3.5-turbo, just know it comes at the cost of some capability. Now let's dive into how we can use OpenAI's API to transform our extracted text into a structured format.

Integrating OpenAI for Data Organization

We use the OpenAI API to take the extracted text and organize it more meaningfully. Here’s the core function that handles this:

# OpenAI handling
client = OpenAI(api_key=st.session_state["OPENAI_API_KEY"])

def organize_data_with_openai(uploaded_file):
    try:
        pdf_content = pdf_to_text(uploaded_file)
        response = client.chat.completions.create(
            model="gpt-4-1106-preview",
            response_format={"type": "json_object"},
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant designed to output JSON."
                },
                {
                    "role": "user",
                    "content": f"Optimally organize this text data in JSON: {pdf_content}"
                }
            ]
        )
        return response.choices[0].message.content
    except Exception as e:
        st.error(e)
        return

This function first calls pdf_to_text to extract text from the uploaded file. It then sends this data to OpenAI's API, requesting that the model organize the text into JSON format. The model's response is expected to be a well-structured JSON object, making the data easier to handle and analyze. This is where the real power of AI in data entry shows: instead of parsing the text and handling data organization manually, AI offers a flexible approach to organizing data effectively for later use.
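Because the model's response arrives as a string, it is worth treating parsing as fallible even with response_format set to a JSON object. A minimal validation helper could look like this (the name parse_model_json is my own, not from the tutorial's code, which does its parsing inline in the output window):

```python
import json

def parse_model_json(raw):
    """Return the parsed dict if `raw` is valid JSON, else None.

    Keeps the rest of the app robust even when the API returns
    malformed or unexpected output.
    """
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    # We expect a JSON object (dict) at the top level, not a bare list/value.
    return parsed if isinstance(parsed, dict) else None
```

A caller can then branch on None instead of wrapping every use site in try/except.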

Handling API Responses and Errors

In this implementation, it’s important to handle exceptions effectively. Network issues or API errors can occur, and this code ensures that these are caught and displayed to the user, maintaining a smooth user experience.
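One way to make that error handling more resilient is a small retry wrapper with a pause between attempts. This is a generic sketch of the pattern rather than part of the tutorial's code, and the function and parameter names are illustrative:

```python
import time

def with_retries(fn, attempts=3, delay_seconds=2):
    """Call fn(), retrying on any exception up to `attempts` times.

    Re-raises the last exception if every attempt fails, so existing
    error-display logic (e.g. st.error) still sees the failure.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as e:  # in practice, catch OpenAI-specific errors
            last_error = e
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_error

# Usage sketch:
# organized = with_retries(lambda: organize_data_with_openai(uploaded_file))
```

Catching only the specific exception types you expect (rate limits, timeouts) is safer than a bare Exception in production code.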

Managing History

Our application not only processes data but also manages it. This involves maintaining a history of processed files, and for added convenience, I included a function that renames duplicate uploads for those of us who upload the same file more than once.

Renaming Duplicate Files

To manage duplicates, we have the function rename_duplicate that handles detected duplicates:

# Duplicate renaming
def rename_duplicate(file_name, json, first):
    if first:
        file_name = file_name + "_0"
        json["file_name"] = file_name
    else:
        original_name, num = file_name.rsplit("_", 1)
        file_name = f"{original_name}_{int(num) + 1}"
        json["file_name"] = file_name
    update_history(file_name, json, first=False)
This helper function intelligently handles filenames to avoid overwriting existing files. It appends a number to the filename if a duplicate is found, ensuring each file is uniquely identified.

Updating the History

Our application also keeps a history of processed files, which is handled by the update_history function (updates to history will be found in the sidebar previously created):

# History handling
def update_history(file_name, json, first=True):
    if not st.session_state["history"]:
        st.session_state["history"].append({"file_name": file_name, "json": json})
        return
    for entry in st.session_state["history"]:
        if entry["file_name"] == file_name:
            rename_duplicate(file_name, json, first)
            return
    st.session_state["history"].append({"file_name": file_name, "json": json})

This function adds the processed file to the application’s history. If a file with the same name already exists, it calls rename_duplicate to handle it. This way, users can keep track of all processed files, access them easily, and avoid any confusion with files of similar names.
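To see how these two functions interact, here is a standalone trace of the renaming behavior, with a plain list standing in for st.session_state["history"]. This is a simplified re-implementation for illustration only, not the Streamlit code itself:

```python
history = []  # stands in for st.session_state["history"]

def rename_duplicate(file_name, data, first):
    if first:
        # First collision: tag the name with "_0".
        file_name = file_name + "_0"
    else:
        # Later collisions: bump the trailing number.
        original_name, num = file_name.rsplit("_", 1)
        file_name = f"{original_name}_{int(num) + 1}"
    data["file_name"] = file_name
    update_history(file_name, data, first=False)

def update_history(file_name, data, first=True):
    for entry in history:
        if entry["file_name"] == file_name:
            rename_duplicate(file_name, data, first)
            return
    history.append({"file_name": file_name, "data": data})

update_history("report", {})
update_history("report", {})
update_history("report", {})
print([e["file_name"] for e in history])  # -> ['report', 'report_0', 'report_1']
```

The two functions recurse into each other until a free name is found, so each processed file ends up with a unique history entry.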

Next up, we’ll add the last section of code to present processed data to the user and allow them to interact with it. Let’s finish this!

Presenting The Data

With our data extraction and organization mechanisms in place, the final step in our journey is to present the processed data to the user. This involves displaying the data in a user-friendly format and managing user interactions with the processed files.

Implementing the Output Window

Our application’s output window is where the results of the data processing are displayed. The following code snippet handles this functionality:

# Output window
if uploaded_file:
    valid_data = None  # stays None until the API returns parseable JSON
    for x in range(0, 2):  # allow one retry on invalid JSON
        with st.spinner("Extracting data..."):
            organized_data = organize_data_with_openai(uploaded_file)
            if organized_data:
                try:
                    valid_data = json.loads(organized_data)
                except Exception as e:
                    st.error("Invalid data returned from OpenAI API.")
                    st.error("Retrying...")
                    time.sleep(3)
                    continue
            else:
                st.error("OpenAI API error.")
                st.error("Please try re-uploading the PDF.")
                break
        if valid_data:
            file_name = uploaded_file.name.rsplit(".", 1)[0]
            update_history(file_name, valid_data)
            st.session_state["pdf_uploader_int"] += 1
            st.session_state["view_history"] = True
            st.session_state["viewing_history_entry"] = st.session_state["history"][-1]
            st.rerun()
            break
        else:
            st.error("Data extraction failed.")
            st.error("Please try re-uploading the PDF.")
            break
elif st.session_state["view_history"] and st.session_state["viewing_history_entry"] in st.session_state["history"]:
    st.header(st.session_state["viewing_history_entry"]["file_name"])
    st.json(st.session_state["viewing_history_entry"]["json"])

In this section, we process the uploaded PDF file and display the results. The data is organized using the organize_data_with_openai function and then the data is validated to be in JSON format for easy viewing. The st.spinner provides a visual indicator while the data is being processed.

Error Handling and User Feedback

The code also includes robust error handling to manage any issues that arise during data processing. If the data returned from OpenAI’s API is invalid or if there is an error during processing, the user is notified and given instructions on how to proceed, such as re-uploading the PDF.

Updating and Viewing History

After successful data processing, the file’s name and its processed content are added to the application’s history using the above update_history function. This enables users to view the history of processed files, which can be accessed through the Streamlit sidebar created earlier.

Final Output

The final output is presented in a clean, structured JSON format, making it easy for users to read and understand the organized data. This is achieved through the st.json function, which renders JSON content in a readable format within the Streamlit app.

This marks the completion of our data entry automation tool. From uploading PDFs to extracting and organizing text with AI, and finally presenting the processed data, this tool showcases the power of combining Python, Streamlit, and OpenAI. It’s a testament to how AI and new technology can streamline and enhance data management processes.

Taking It Further: Experimentation and Feedback

Congratulations on reaching the end of this tutorial! We’ve covered a lot of ground, from uploading and reading PDFs to organizing data with AI, and finally, presenting it in a user-friendly format. But remember, this is just the foundation.

Encouragement to Experiment

I encourage you to take this foundation and build upon it. Experiment with the code, tweak it, and adapt it to your specific needs. Here are a few ideas:

  • Integrate OCR functionality for handling scanned PDFs or image-based documents.
  • Explore different models from OpenAI to see how they affect accuracy and efficiency. Or even try out an open-source model like Llama2!
  • Add more features to the UI, like advanced data filtering or search functionalities.

Invitation for Feedback

Your thoughts, feedback, and questions are greatly appreciated. Please feel free to share your experiences, challenges, and triumphs in the comments below. All of it helps me improve and also contributes to the learning of the entire community.

Additional Resources From Me

Looking for more? Here is where you can find more articles and tutorials, or learn more about me:

I hope you enjoyed that coffee. Happy coding!


Jake Nolan

Nice to meet you! I am a full-time machine learning engineer. After my day job I consult, write, and research about AI.