Doing Math with Large Language Models

Karl Weinmeister
Google Cloud - Community
8 min read · May 21, 2024

How can Large Language Models (LLMs) help you with math problems? This blog post will explore a range of techniques with Google’s Gemini. Whether your challenge is at work, at home, or in school, get ready to unlock Gemini as your go-to math problem solver.

We’re going to walk through the following scenarios:

  • Solving techniques
  • Visualization
  • Multimodal problems
  • Communicating results

We’ll use Vertex AI Studio to submit prompts using the Gemini 1.5 Pro model. We’ll also provide Python code samples to run in the Colab Enterprise notebook environment in Vertex AI.

Basic solving techniques

Let’s start with the example of summing an arithmetic series 1..n. Given n=3, for example, the result is: 1 + 2 + 3 = 6.

The most straightforward approach is to simply ask Gemini, “What is the sum of numbers between 1..10?”

Vertex AI Studio prompt interface

Gemini provides the answer 55, along with an equation and an explanation of how to calculate it. Note that Gemini’s equation is more general than the one we’ll use in this blog post, n(n+1)/2, because it accounts for sequences that start at any number, not just 1.

Vertex AI Studio response interface

Chain-of-thought

As we scale up the complexity of the problems we give Gemini, we can use chain-of-thought prompting techniques. Chain-of-thought prompting can improve performance on reasoning tasks by breaking the problem down into multiple smaller steps.

For instance, we can use the information we know from our first answer to build out a multi-step prompt:

What is the sum of numbers between 1..123?
Using the formula n(n+1)/2 where n=123, follow these steps:
* Add 1 to n
* Multiply by n
* Divide by 2
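
The three steps in the prompt are easy to check by hand. As a minimal sketch (the variable names are chosen here for illustration), the same sequence in Python looks like this:

```python
# Trace the chain-of-thought steps for n = 123 (variable names are illustrative).
n = 123

step1 = n + 1        # Add 1 to n
step2 = step1 * n    # Multiply by n
step3 = step2 // 2   # Divide by 2 (exact, since n * (n + 1) is always even)

print(step1, step2, step3)  # 124 15252 7626
```

Comparing each intermediate value with the steps in the model’s response makes it easier to spot where a calculation went wrong.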

Solve with code

My go-to approach is to ask Gemini to provide code to solve the problem. It’s one quick step of indirection, and provides reliable results.

Let’s scale up the problem even more with this prompt:

Provide me the code to calculate the sum of numbers between 1 and 5248206

Gemini provides us this code snippet that we can execute in a Colab Enterprise notebook:

def sum_of_range(n):
    """
    This function calculates the sum of numbers from 1 to n using the arithmetic series formula.
    Args:
        n: An integer representing the end of the range.
    Returns:
        The sum of numbers from 1 to n.
    """
    return n * (n + 1) // 2

# Calculate the sum from 1 to 5248206
result = sum_of_range(5248206)
print(result)
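
One way to gain confidence in generated code like this is to compare it against a brute-force loop on inputs small enough to sum directly. The sketch below assumes the same closed-form function; the helper name brute_force_sum is hypothetical and exists only for this check.

```python
def sum_of_range(n):
    """Closed-form sum of 1..n, using the formula n(n+1)/2."""
    return n * (n + 1) // 2

def brute_force_sum(n):
    """Hypothetical helper: add the numbers one at a time."""
    return sum(range(1, n + 1))

# The two approaches should agree for every n we try.
for n in (1, 10, 123, 10_000):
    assert sum_of_range(n) == brute_force_sum(n)

print(sum_of_range(10))  # 55
```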

Visualization

Like solving with code, asking Gemini for the code to plot is an indirect approach. And it works really well!

Let’s say we want to calculate the area under the curve of y=2x+3 from x=1 to x=4:

Matplotlib visualization generated by Gemini

Gemini generates this code to create the visualization:

import numpy as np
import matplotlib.pyplot as plt

# Define the function
def f(x):
    return 2*x + 3

# Create an array of x values from 0 to 5
x = np.linspace(0, 5, 100)

# Calculate the corresponding y values
y = f(x)

# Create the plot
plt.plot(x, y)

# Shade the area under the curve between x = 1 and x = 4
plt.fill_between(x, y, where=(x >= 1) & (x <= 4), color='blue', alpha=0.5)

# Set plot labels and title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Area under the curve of y=2x+3 from x=1 to x=4')

# Set x and y limits to clearly show the bounds
plt.xlim(0, 5)
plt.ylim(0, f(5))

# Show the plot
plt.show()

Visualizations can also help us answer problems. Let’s now explore a classic question: what is 0⁰?

We know that b^n is b × b × … × b (n times). We also know that b⁰ = 1 for any nonzero b, while 0^n = 0 for any positive n. So what happens when both b and n are zero (0⁰)? Let’s ask Gemini for a visualization that will help our understanding:

Provide code to plot b^n where b=n from 0 to 1.

What do we see? As b and n approach 0, b^n approaches 1.

Matplotlib visualization using Gemini generated code
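
To complement the plot numerically, this sketch evaluates b^n with b = n at values that shrink toward zero. The function name power_self is an assumption made for illustration, not something Gemini generated.

```python
def power_self(x):
    """Return x**x, i.e. b^n with b = n = x (illustrative helper)."""
    return x ** x

# Evaluate at points approaching 0 from the right.
for x in (1.0, 0.5, 0.1, 0.01, 0.001, 1e-6):
    print(f"x = {x:<8} x**x = {power_self(x):.6f}")
```

The printed values dip below 1 and then climb back toward 1 as x shrinks, which matches the limit of x^x as x approaches 0 from the right. Python itself evaluates 0 ** 0 as 1.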

Instructing Gemini to use a solving technique

Next, let’s look at how we can instruct Gemini to use specific approaches to solve problems, using approximation, simulation, or symbolic math.

Approximating an answer

This approach breaks down a more complex problem into smaller chunks that are simpler to solve.

Let’s examine the area under the curve problem we just visualized. We can solve it numerically, using the trapezoidal rule to add up many small trapezoids under the area of the curve. Simply ask Gemini to approximate an answer. We get the following result:

def f(x):
    """
    This function represents the equation y = 2x + 3
    """
    return 2*x + 3

def area_under_curve(a, b, n):
    """
    This function calculates the area under the curve using the trapezoidal rule.
    a: starting point
    b: ending point
    n: number of intervals
    """
    delta_x = (b - a) / n
    sum = 0
    for i in range(n):
        x1 = a + i * delta_x
        x2 = a + (i + 1) * delta_x
        sum += (f(x1) + f(x2)) * delta_x / 2
    return sum

# Example usage:
a = 1
b = 4
n = 100 # Number of intervals for approximation

area = area_under_curve(a, b, n)
print("The area under the curve is approximately:", area)

# The area under the curve is approximately: 24.000000000000004

Solving symbolically

Alternatively, when there is a closed-form solution, we can ask Gemini to symbolically integrate. Using SymPy, you can get an exact answer to the problem more concisely. We can ask Gemini specifically to use SymPy or use a symbolic approach to go this route, and receive this generated response:

from sympy import integrate
from sympy.abc import x

# Define the function
f = 2*x + 3

# Calculate the indefinite integral
integral = integrate(f, x)

# Evaluate the definite integral from 1 to 4
area = integral.subs(x, 4) - integral.subs(x, 1)

# Print the result
print(f"The expression is: {f}")
print(f"The integral is: {integral}")
print(f"The area under the curve is: {area}")

# The expression is: 2*x + 3
# The integral is: x**2 + 3*x
# The area under the curve is: 24
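
The same result can be checked by hand using the antiderivative x² + 3x. Written out as a worked equation:

```latex
\int_{1}^{4} (2x + 3)\,dx
  = \left[ x^{2} + 3x \right]_{1}^{4}
  = (16 + 12) - (1 + 3)
  = 24
```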

Simulation

Like approximation, simulation does not provide an exact result, but can be very helpful in solving some tricky problems.

Here we will show how to apply Monte Carlo methods, where we run a simulation of many repeated trials. For instance, to approximate π, we can ask Gemini to count random points inside and outside a quadrant. The prompt can be as straightforward as:

Estimate pi using a simulation technique.

Matplotlib visualization generated by Gemini

Gemini provides this simulation code to answer the problem:

import random

def estimate_pi(num_points):
    inside_count = 0
    for _ in range(num_points):
        x = random.random() # Generate random x in [0, 1)
        y = random.random() # Generate random y in [0, 1)
        if x**2 + y**2 <= 1:
            inside_count += 1
    return 4 * (inside_count / num_points)

print(estimate_pi(1000000)) # Estimate pi with 1 million points
# 3.141844
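
Because the estimate is random, it changes from run to run, and its accuracy improves only slowly as points are added. The sketch below is a variation on the generated code, not Gemini’s output: it passes in a seeded random.Random instance (an assumption made here so that runs are repeatable) and prints the error for increasing sample sizes.

```python
import math
import random

def estimate_pi(num_points, rng):
    """Monte Carlo estimate of pi from random points in the unit square."""
    inside_count = 0
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        if x**2 + y**2 <= 1:  # point falls inside the quarter circle
            inside_count += 1
    return 4 * inside_count / num_points

rng = random.Random(42)  # fixed seed, chosen arbitrarily
for num_points in (1_000, 10_000, 100_000):
    estimate = estimate_pi(num_points, rng)
    print(num_points, estimate, abs(estimate - math.pi))
```

The error of a Monte Carlo estimate like this typically shrinks in proportion to 1/√N, so each extra digit of accuracy costs roughly 100 times more points.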

Multimodal problems

Many problems may have a graphical representation, and Gemini can shine in these scenarios, too. Let’s go back to the arithmetic series problem. I’ve removed the legend to avoid providing any text hints in the title or labels. Let’s see if Gemini can figure out what function this is, purely based on the data. I’ve uploaded the image and a text prompt, asking Gemini to derive the function.

Arithmetic series from 1..10 without any text hints

Gemini correctly identifies the arithmetic series formula n(n+1)/2, in an equivalent form: y = ½x² + ½x:

The chart appears to display exponential growth, as the values increase at an increasing rate. Let's analyze the data points to see if we can find a pattern and derive a possible formula.
Data Points:
(1, 1)
(2, 3)
(3, 6)
(4, 10)
(5, 15)
(6, 21)
(7, 28)
(8, 36)
(9, 45)
(10, 55)

Observations:
The difference between consecutive y-values seems to increase by 1 each time (2, 3, 4, 5, …). This is a characteristic of a quadratic function.

Possible Formula:
Based on the observation, let's assume the formula is a quadratic function of the form:
y = ax² + bx + c
We can use three data points to create a system of equations and solve for a, b, and c. Let's use the points (1, 1), (2, 3), and (3, 6):
1 = a(1)² + b(1) + c => a + b + c = 1
3 = a(2)² + b(2) + c => 4a + 2b + c = 3
6 = a(3)² + b(3) + c => 9a + 3b + c = 6

Solving this system of equations (using substitution, elimination, or matrices), we get:
a = 1/2, b = 1/2, c = 0
Therefore, the formula for the chart is:
y = (1/2)x² + (1/2)x
Verification:
We can verify this formula by plugging in other x-values and checking if we get the corresponding y-values from the chart. For example:
x = 4: y = (1/2)(4)² + (1/2)(4) = 8 + 2 = 10
x = 7: y = (1/2)(7)² + (1/2)(7) = 24.5 + 3.5 = 28

The formula seems to hold for the other points as well.
Conclusion:
The formula that represents the chart is y = (1/2)x² + (1/2)x. It's important to note that while this formula fits the given data points, there could be other functions that also fit the data or represent the underlying phenomenon more accurately depending on the context.
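
The system of equations in the response can be checked directly. The sketch below is an illustration rather than part of Gemini’s response: it solves for a, b, and c by elimination with exact fractions, then tests the resulting formula on all ten points.

```python
from fractions import Fraction

# Data points read from the chart: y is the sum 1 + 2 + ... + x.
points = [(1, 1), (2, 3), (3, 6), (4, 10), (5, 15),
          (6, 21), (7, 28), (8, 36), (9, 45), (10, 55)]

# Use the first three points (x = 1, 2, 3) to solve for a, b, c.
y1, y2, y3 = (Fraction(y) for _, y in points[:3])
a = (y3 - 2 * y2 + y1) / 2   # second difference equals 2a
b = (y2 - y1) - 3 * a        # first difference y2 - y1 equals 3a + b
c = y1 - a - b               # from a + b + c = y1

print(a, b, c)  # 1/2 1/2 0

# Check the fitted quadratic against every point on the chart.
assert all(a * x**2 + b * x + c == y for x, y in points)
```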

Communicating results

Gemini can provide results in a number of different formats. Let’s ask it to prove the arithmetic series formula and present the proof in LaTeX, the typesetting system popular in academia. By default, Gemini typically responds with Markdown instead, producing formatted sections, bold text, and so on.

Let’s be clear in our prompt that we want our answer in pure LaTeX:

Can you prove the arithmetic series is n(n+1)/2?

Provide your response as a full LaTeX document with no markdown.

Voila! After compiling the LaTeX output, we have the result:
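
Gemini’s exact output varies between runs. As a rough illustration of the kind of document to expect, here is a hand-written sketch of an induction proof; it is not Gemini’s actual response, and the choice of the amsthm proof environment is an assumption.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{amsthm}
\begin{document}

\textbf{Claim.} For every integer $n \ge 1$,
\[
  \sum_{k=1}^{n} k = \frac{n(n+1)}{2}.
\]

\begin{proof}
We use induction on $n$. For $n = 1$, both sides equal $1$.
Assume the formula holds for some $n \ge 1$. Then
\[
  \sum_{k=1}^{n+1} k
    = \frac{n(n+1)}{2} + (n+1)
    = \frac{(n+1)(n+2)}{2},
\]
which is the formula with $n$ replaced by $n+1$.
\end{proof}

\end{document}
```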

Next steps

In this blog post, we’ve laid out a roadmap for using Gemini on common math problems. From visualization to solving techniques, you can apply these approaches in your prompts to go further and faster.

And this is just the beginning. Researchers are pushing the limits of applying language models to math problems. Google DeepMind has released AlphaGeometry, an AI system for complex geometry problems that uses a language model. The Anima AI + Science Lab at Caltech is helping produce new proofs with a human-in-the-loop LLM theorem-proving solution.

You can get started today on Gemini with AI Studio, or try out Gemini with Vertex AI on Google Cloud.
