How to Use ChatGPT to Generate Math Homework From Photos of My Daughter’s Assignments: Part 2

Using ChatGPT to convert text to LaTeX in order to produce PDFs

Sujan Kapadia
12 min read · Feb 15, 2024

In part 1, we asked ChatGPT Vision to generate new math problems based on an image of a math worksheet. Today, we’re going to take that output and generate a PDF, so it can be printed or shared with others. For example, I’d like to send it to my daughter’s iPad via AirDrop, so she can work on it.

The Chat Completions API can’t generate PDF files directly. ChatGPT can generate Python code that produces PDFs (using PyFPDF, for example), and the web version can execute that code automatically. (Apparently the Assistants API also has this ability, but I didn’t know about it when working on this.) In any case, we’d like to keep this as simple as possible, and iterating on layout or style is harder when you have to manipulate code or PDFs. So, like any good nerd, I’m going to turn to LaTeX, a human-editable format that produces nice-looking results.

This blog post will cover:

  • Using the gpt-4 model to convert the text-based worksheet to LaTeX code.
  • Previewing the LaTeX code with a viewer like TeXstudio.
  • Using pdflatex to produce a PDF file from LaTeX, in Python.

In the next (and final) post in the series, we will use the text-based worksheet to:

  • Experiment with generating a very simple, pure HTML/JS based flash card prototype.

Iteration 1: No special instructions

Let’s ask ChatGPT to “Generate a LaTeX formatted file with the exam problems you generated below”. Below are the request and a partial snippet of the output (note: the generated problems used in this iteration were not the final set from the previous post).

# worksheet_str contains generated worksheet
latex_prompt = f"""
Generate a LaTeX formatted file with the exam problems you generated below:\n\n {worksheet_str}
"""

messages[1]["content"][0]["text"] = latex_prompt
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=messages,
    max_tokens=4096)

print(response.choices[0].message.content)
I can describe the image and help with math problems, but I am not able to 
generate LaTeX formatted files. However, I can provide you with LaTeX code
which you can compile on your own LaTeX editor to create a formatted document.
Here is the LaTeX code for the practice problems I've created:

```latex
\documentclass[12pt]{article}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{graphicx}

\begin{document}

\title{6th Grade Math Practice Homework}
\author{Your Math Teacher}
\date{}
\maketitle

\section*{Section 1: Find the Greatest Common Factor (GCF)}
Each problem is worth 2 points: 1 for the work, 1 for the answer.

\noindent 1A. GCF (20, 30) = \underline{\hspace{2cm}} \\
1B. GCF (35, 49) = \underline{\hspace{2cm}}

\section*{Section 2: Find the Least Common Multiple (LCM)}
Each problem is worth 2 points: 1 for the work, 1 for the answer.

\noindent 2A. LCM (7, 9) = \underline{\hspace{2cm}} \\
2B. LCM (10, 15) = \underline{\hspace{2cm}}

Duh! I made two silly mistakes. I should use the gpt-4 model since we’re no longer processing an image, and ask it to generate LaTeX code, not the file itself.

Previewing LaTeX files: TeXstudio

As we’re iterating, it would be nice to see what the LaTeX output looks like. TeXstudio is a free LaTeX editor with a built-in preview, available for Mac, Windows, and Linux. We can paste the LaTeX output into a new document and compile it to preview the result:

LaTeX preview of Iteration 1
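
Copying the output by hand gets tedious across iterations. Since the model wraps its LaTeX in a Markdown code fence surrounded by chat text, a small helper could pull out just the code and save it to a .tex file. This is a minimal sketch, not part of the original workflow: the names extract_latex and save_latex are illustrative, and it assumes the reply contains at most one fenced block. If the closing fence is missing (as in truncated output), it takes everything after the opening fence.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # three backticks, built here to avoid a literal fence in the example

# Opening fence with an optional "latex"/"tex" tag, then the body up to the
# closing fence or the end of the text.
_BLOCK = re.compile(
    FENCE + r"(?:latex|tex)?[ \t]*\n(.*?)(?:" + FENCE + r"|\Z)",
    re.DOTALL,
)

def extract_latex(response_text: str) -> str:
    """Return the LaTeX inside the first fenced block, or the whole text if none."""
    match = _BLOCK.search(response_text)
    body = match.group(1) if match else response_text
    return body.strip() + "\n"

def save_latex(response_text: str, path: str) -> Path:
    """Write the extracted LaTeX to disk (hypothetical helper) and return the path."""
    out = Path(path)
    out.write_text(extract_latex(response_text), encoding="utf-8")
    return out
```

With the response from Iteration 1, a call such as save_latex(response.choices[0].message.content, "iteration1.tex") would produce a file that can be opened directly in TeXstudio.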

Iteration 2: You are a professional typesetter…

To use gpt-4, the messages parameter takes a slightly different structure, as seen below: the content property is simply a string instead of an array of dictionaries. In this case, we are also going to give the model a system message: “You are a professional typesetter in the math education publishing space, experienced in publishing for print and online.”

latex_prompt = f"""
Generate LaTeX code for the exam problems you generated below:\n\n {worksheet_str}
"""
messages = [
    {
        "role": "system",
        "content": "You are a professional typesetter in the math education publishing space, experienced in publishing for print and online."
    },
    {
        "role": "user",
        "content": latex_prompt
    }
]
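
The request itself is much like the one in Iteration 1, with the model swapped to gpt-4. Here is a sketch that wraps both steps in functions. The names build_messages and generate_latex are illustrative, client is assumed to be the same OpenAI client object used earlier, and max_tokens=4096 is simply carried over from Iteration 1.

```python
SYSTEM_PROMPT = (
    "You are a professional typesetter in the math education publishing space, "
    "experienced in publishing for print and online."
)

def build_messages(worksheet_str: str) -> list[dict]:
    """Build text-only chat messages: each content value is a plain string."""
    latex_prompt = (
        "Generate LaTeX code for the exam problems you generated below:\n\n"
        f" {worksheet_str}\n"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": latex_prompt},
    ]

def generate_latex(client, worksheet_str: str, model: str = "gpt-4") -> str:
    """Send the request with an existing OpenAI client and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(worksheet_str),
        max_tokens=4096,
    )
    return response.choices[0].message.content
```

Passing the client in as a parameter keeps the prompt-building logic easy to check without making a network call.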

This generated the following (partial output shown):

Here's the LaTeX formatted version of your given question paper:

```tex
\documentclass[a4paper,11pt]{exam}
\usepackage{amsmath, amsfonts, amssymb, graphicx}

\begin{document}

\begin{center}
\large{\textbf{Additional Practice Problems}}
\end{center}
\normalsize

\begin{questions}
\section*{Section 1: Find the Greatest Common Factor (GCF)}
\question[2] GCF (20, 30) = \fillin
\question[2] GCF (35, 49) = \fillin

\section*{Section 2: Find the Least Common Multiple (LCM)}
\question[2] LCM (7, 9) = \fillin
\question[2] LCM (10, 15) = \fillin

\section*{Section 3: Real-world LCM Problem}
\question[2] The Quick-Print Shop makes two types of booklets. They produce a small booklet every 18 minutes and a large booklet every 27 minutes. They start operations at 9:00 A.M. How long until the small booklet and large booklet production cycles align so they are produced together again? \fillin minutes (unit).

So what does this look like?

LaTeX preview of Iteration 2

Notice that this time ChatGPT chose the LaTeX document class exam, which provides commands such as \question and \fillin. It did not include a title or author. I think exam is more appropriate than article for what we’re generating, and it looks slightly better (subjective).

  • Please note: The exam document class is part of the exam package and is not available by default. To get it on the Mac, I first installed TeX Live Utility to manage LaTeX packages.

Iteration 3: Add fields for entering a name and date

Moving back to the final problem set from part 1, let’s ask ChatGPT to format this as a homework assignment, and add fields to write a name and date:

“Generate LaTeX code for the exam problems you generated below. Please format this as a homework assignment, including labelled fields to write a name and date, left justified.”

latex_prompt = """
Generate LaTeX code for the exam problems you generated below. Please format this as a homework assignment, including labelled fields to write a name and date, left justified.

### Greatest Common Factor (GCF)

5. **GCF Practice**
a. GCF (24, 36) = ______
b. GCF (40, 60) = ______

### Least Common Multiple (LCM)

6. **LCM Challenge**
a. LCM (6, 9) = ______
b. LCM (7, 14) = ______

### Real-world Applications of LCM and GCF

7. **Factory Schedule Problem**
The Pencil Factory produces red pencils every 5 hours and blue pencils every 8 hours. They start production at 7:00 A.M. When will both red and blue pencils be produced at the same time next?

### Fraction Multiplication in Real-world Context

8. **Gardening Fractions**
Daniel is planning a rectangular flower bed. He wants 1/4 of the flower bed to be roses and 1/2 of the roses section to consist of red roses. What fraction of the whole flower bed will be red roses?

Draw a rectangular model and write a number sentence to represent the problem.

### Additional Common Factor and Multiple Problems

9. **Common Factors and Multiples**
a. Name all the common factors of 14 and 21.
b. What is the smallest common multiple of 11 and 12?

### Practical Fraction Problem

10. **Baking Fractions**
Maria has 3/4 of a cup of sugar. She uses 2/3 of it to make a batch of cookies. How much sugar does she have left?

Draw a visual model and write a number sentence to represent the problem.

### Rational Number Operations

11. **Fraction Subtraction**
a. Subtract the fraction: 5/8 - 1/4 = ______
b. Subtract the fraction: 7/10 - 2/5 = ______

### Time Interval Problem Using LCM

12. **Meeting Time**
Two friends want to meet for yoga. One friend goes every 6 days, and the other goes every 9 days. If they both went today, after how many days will they both go to yoga on the same day next?

### Area and Fraction Review

13. **Camping Plot**
A family is camping on a rectangular plot that is 1/3 acre. They use 3/5 of this plot to set up their tents. What fraction of an acre did the tents cover?

Sketch a visual representation of the camping plot subdivided accordingly.

### Additional Calculation Practice

14. **Quick Calculations**
a. Multiply the fraction: 3/8 * 2/5 = ______
b. Divide the fraction: 6/7 ÷ 3/4 = ______

### Problem-Solving with Unit Rates

15. **Speedy Snail Trail**
A snail moves at a constant rate of 1/12 of a mile per hour. How far will the snail travel in 4 hours?

Provide a number sentence to calculate the distance traveled.

Please remember to write out your work for each problem and label your answers clearly. Good luck!
"""

Using this prompt, we get the following (only partially shown):


Here's the LaTeX code for your assignment:

```latex
\documentclass{article}
\usepackage{amsmath, amsfonts, amssymb}
\begin{document}

\begin{flushleft}
Name: \underline{\hspace{3cm}} \\
Date: \underline{\hspace{3cm}} \\
\end{flushleft}
\vspace{1cm}

\noindent \textbf{Greatest Common Factor (GCF)}

\begin{enumerate}
\item[\textbf{5.}] \textbf{GCF Practice}
\begin{enumerate}
\item GCF (24, 36) = \_\_\_\_
\item GCF (40, 60) = \_\_\_\_
\end{enumerate}

\noindent \textbf{Least Common Multiple (LCM)}

\item[\textbf{6.}] \textbf{LCM Challenge}
\begin{enumerate}
\item LCM (6, 9) = \_\_\_\_
\item LCM (7, 14) = \_\_\_\_
\end{enumerate}

\noindent \textbf{Real-world Applications of LCM and GCF}

\item[\textbf{7.}] \textbf{Factory Schedule Problem}

The Pencil Factory produces red pencils every 5 hours and blue pencils every 8 hours. They start production at 7:00 A.M. When will both red and blue pencils be produced at the same time next?

\noindent \textbf{Fraction Multiplication in Real-world Context}

\item[\textbf{8.}] \textbf{Gardening Fractions}

Daniel is planning a rectangular flower bed. He wants 1/4 of the flower bed to be roses and 1/2 of the roses section to consist of red roses. What fraction of the whole flower bed will be red roses?

Draw a rectangular model and write a number sentence to represent the problem.

\noindent \textbf{Additional Common Factor and Multiple Problems}

\item[\textbf{9.}] \textbf{Common Factors and Multiples}
\begin{enumerate}
\item Name all the common factors of 14 and 21.
\item What is the smallest common multiple of 11 and 12?
\end{enumerate}

It does a good job with the name and date fields. The formatting is a little off though, and it goes back to using article.

LaTeX preview of Iteration 3

Iteration 4: Instruct it to use a specific LaTeX snippet

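I’d really like to use the exam document class instead, as that makes more sense for producing math worksheets. So, why not just tell ChatGPT to use a specific LaTeX snippet? Let’s specify the document class and paper size, and also include the name and date fields. Because the snippet is full of backslashes, the prompt is written as a Python raw string (the r prefix), so sequences like \usepackage are not parsed as escape sequences.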

“Generate LaTeX code for the exam problems you generated below. Please format this as a homework assignment, using the following LaTex snippet as the beginning of the LaTex code.”

new_latex_prompt = r"""
Generate LaTeX code for the exam problems you generated below. Please format this as a homework assignment, using the following LaTex snippet as the beginning of the LaTex code.

LaTex snippet:

\documentclass[a4paper,11pt]{exam}
\usepackage{amsmath, amsfonts, amssymb, graphicx}
\begin{document}

\begin{flushleft}
Name: \underline{\hspace{3cm}} \\
Date: \underline{\hspace{3cm}} \\
\end{flushleft}
\vspace{1cm}

Exam problems:

### Greatest Common Factor (GCF)

5. **GCF Practice**
a. GCF (24, 36) = ______
b. GCF (40, 60) = ______

### Least Common Multiple (LCM)

6. **LCM Challenge**
a. LCM (6, 9) = ______
b. LCM (7, 14) = ______

### Real-world Applications of LCM and GCF

7. **Factory Schedule Problem**
The Pencil Factory produces red pencils every 5 hours and blue pencils every 8 hours. They start production at 7:00 A.M. When will both red and blue pencils be produced at the same time next?

### Fraction Multiplication in Real-world Context

8. **Gardening Fractions**
Daniel is planning a rectangular flower bed. He wants 1/4 of the flower bed to be roses and 1/2 of the roses section to consist of red roses. What fraction of the whole flower bed will be red roses?

Draw a rectangular model and write a number sentence to represent the problem.

### Additional Common Factor and Multiple Problems

9. **Common Factors and Multiples**
a. Name all the common factors of 14 and 21.
b. What is the smallest common multiple of 11 and 12?

### Practical Fraction Problem

10. **Baking Fractions**
Maria has 3/4 of a cup of sugar. She uses 2/3 of it to make a batch of cookies. How much sugar does she have left?

Draw a visual model and write a number sentence to represent the problem.

### Rational Number Operations

11. **Fraction Subtraction**
a. Subtract the fraction: 5/8 - 1/4 = ______
b. Subtract the fraction: 7/10 - 2/5 = ______

### Time Interval Problem Using LCM

12. **Meeting Time**
Two friends want to meet for yoga. One friend goes every 6 days, and the other goes every 9 days. If they both went today, after how many days will they both go to yoga on the same day next?

### Area and Fraction Review

13. **Camping Plot**
A family is camping on a rectangular plot that is 1/3 acre. They use 3/5 of this plot to set up their tents. What fraction of an acre did the tents cover?

Sketch a visual representation of the camping plot subdivided accordingly.

### Additional Calculation Practice

14. **Quick Calculations**
a. Multiply the fraction: 3/8 * 2/5 = ______
b. Divide the fraction: 6/7 ÷ 3/4 = ______

### Problem-Solving with Unit Rates

15. **Speedy Snail Trail**
A snail moves at a constant rate of 1/12 of a mile per hour. How far will the snail travel in 4 hours?

Provide a number sentence to calculate the distance traveled.

Please remember to write out your work for each problem and label your answers clearly. Good luck!
"""

Using this prompt, we get the following (partially shown):

Based on the instructions and the given styles, below is the corresponding LaTeX code:

```latex
\documentclass[a4paper,11pt]{exam}
\usepackage{amsmath, amsfonts, amssymb, graphicx}
\begin{document}

\begin{flushleft}
Name: \underline{\hspace{3cm}} \\
Date: \underline{\hspace{3cm}} \\
\end{flushleft}
\vspace{1cm}

\noindent
\textbf{Exam problems:}

\begin{questions}

\question \textbf{Greatest Common Factor (GCF)}

\begin{parts}
\part GCF (24, 36) = \underline{\hspace{1cm}}
\part GCF (40, 60) = \underline{\hspace{1cm}}
\end{parts}

\question \textbf{Least Common Multiple (LCM)}
\begin{parts}
\part LCM (6, 9) = \underline{\hspace{1cm}}
\part LCM (7, 14) = \underline{\hspace{1cm}}
\end{parts}

\question \textbf{Factory Schedule Problem}

The Pencil Factory produces red pencils every 5 hours and blue pencils every 8 hours. They start production at 7:00 A.M. When will both red and blue pencils be produced at the same time next?

Putting this in the previewer:

LaTeX preview of Iteration 4

I’m pretty happy with this output. I could always do more, such as adding a title and instructions for the student. I could even create a homework “template” (using Jinja, for example), inject the generated LaTeX for the problems into the template, and then compile the final LaTeX file.
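
The template idea could look something like the sketch below. To stay dependency-free it uses plain @@...@@ placeholders and str.replace rather than Jinja; Jinja's default {{ }} and {% %} delimiters can collide with LaTeX braces and comments, so they would likely need reconfiguring. The template text, the placeholder names, and render_homework are all hypothetical.

```python
HOMEWORK_TEMPLATE = r"""\documentclass[a4paper,11pt]{exam}
\usepackage{amsmath, amsfonts, amssymb, graphicx}
\begin{document}

\begin{center}
\large{\textbf{@@TITLE@@}}
\end{center}

\begin{flushleft}
Name: \underline{\hspace{3cm}} \\
Date: \underline{\hspace{3cm}} \\
\end{flushleft}
\vspace{1cm}

\begin{questions}
@@PROBLEMS@@
\end{questions}

\end{document}
"""

def render_homework(title: str, problems_latex: str) -> str:
    """Fill the fixed layout with a title and the model-generated questions."""
    return (
        HOMEWORK_TEMPLATE
        .replace("@@TITLE@@", title)
        .replace("@@PROBLEMS@@", problems_latex.strip())
    )

# Example: the model would be asked for the \question items only.
doc = render_homework("Unit 2 Homework", r"\question GCF (24, 36) = \fillin")
```

With this split, the layout is fully under our control and the model only supplies the part between \begin{questions} and \end{questions}.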

Using pdflatex to generate a PDF

The remaining piece is turning this into a PDF file. You can use a command-line utility like pdflatex (it was included when I installed TeX Live). There is a Python project that wraps it, but it’s not being actively maintained. Since it’s just a thin wrapper around the pdflatex utility, I’ll use it for demonstration purposes (but I don’t recommend it in production).

Below is an example of how to take a .tex file and convert it to PDF. Setting keep_pdf_file to True means the PDF file will remain on disk after the function completes. By default, it uses the same directory and file name as the LaTeX file.

from pdflatex import PDFLaTeX

# Load the LaTeX source from disk
pdfl = PDFLaTeX.from_texfile('data/unit-2-exam.tex')
# Compile it, keeping the generated PDF and log files on disk
pdf, log, completed_process = pdfl.create_pdf(keep_pdf_file=True, keep_log_file=True)
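
Since the wrapper is unmaintained, calling pdflatex directly through subprocess is a reasonable alternative. This is a sketch that assumes pdflatex is on the PATH; the function names are illustrative, and the flags used (-interaction=nonstopmode, -halt-on-error, -output-directory) are standard pdflatex options.

```python
import subprocess
from pathlib import Path

def pdflatex_command(tex_path: str, out_dir: str) -> list[str]:
    """Build the pdflatex command line for one .tex file."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",      # never pause to ask for input
        "-halt-on-error",                # stop at the first error
        f"-output-directory={out_dir}",  # where the .pdf, .log and .aux go
        tex_path,
    ]

def compile_pdf(tex_path: str, out_dir: str = ".") -> Path:
    """Run pdflatex once and return the path where the PDF is written."""
    result = subprocess.run(
        pdflatex_command(tex_path, out_dir),
        capture_output=True,
        text=True,
        errors="replace",
    )
    if result.returncode != 0:
        # pdflatex prints its error messages to stdout; show the tail of it
        raise RuntimeError(result.stdout[-2000:])
    return Path(out_dir) / (Path(tex_path).stem + ".pdf")
```

For the file above, compile_pdf("data/unit-2-exam.tex", "data") would write the PDF into the data directory. Documents that need cross-references or a table of contents may require a second run, which this sketch does not do.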

Conclusions

  • LaTeX is easy to get started with, and ChatGPT seems to generate okay LaTeX code for basic layouts.
  • LaTeX is powerful. If I spend more time learning specific packages and directives, I could instruct ChatGPT to produce better output, or create a good template. (I imagine a model trained on more LaTeX documents and layout types could perform much better.)
  • Again, it’s not easy to get consistent results from an LLM. So in this case, we provided an explicit snippet of LaTeX to use and luckily it used it. We could have tried to suggest/hint at using the exam document class, but I did not want to leave it to chance…
  • You’re always leaving things to chance with an LLM no matter what prompt you provide. You can try to mitigate this by specifying a structure for the response, a fixed set of choices for the LLM to choose from when responding, or using a template to fully control the output except for the parts coming back from the LLM. In each case, you are diminishing the creative power of the LLM. That’s a tradeoff.
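
One cheap way to catch the cases where chance goes against you is to check the response before compiling it. The sketch below only tests whether the code declares the expected document class and whether the document and questions environments are closed; check_latex and the specific checks are illustrative choices, not a general LaTeX validator.

```python
import re

def check_latex(latex: str, required_class: str = "exam") -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # \documentclass with optional [options], capturing the class name
    match = re.search(r"\\documentclass(?:\[[^\]]*\])?\{([^}]*)\}", latex)
    if match is None:
        problems.append("no \\documentclass found")
    elif match.group(1).strip() != required_class:
        problems.append(
            f"document class is {match.group(1)!r}, expected {required_class!r}"
        )

    # A \begin without its \end usually means the output was cut off
    for env in ("document", "questions"):
        has_begin = f"\\begin{{{env}}}" in latex
        has_end = f"\\end{{{env}}}" in latex
        if has_begin != has_end:
            problems.append(f"unbalanced {env} environment")

    return problems
```

If the list comes back non-empty, the request could be retried or the output flagged for manual editing, which trades a little extra code for fewer surprises at compile time.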
