Better Programming

Advice for programmers.

Build Advanced Machine Learning Models With OpenAI’s API

5 min readMar 26, 2022

--

scalability — designed image

OpenAI released an API that allows researchers and developers to build advanced GPT machine learning models with few lines of code based on the paper “Language Models are Few-Shot Learners”.

The API has an effortless setup to start interacting with the state-of-the-art GPT models. It is scalable to advanced users to fine-tune the models based on business use cases or the user’s custom language.

Note to Readers: This article was updated in March 2025 to reflect the latest OpenAI API changes.

Let’s build a model to answer NASA solar system facts

source: nasa.gov — license in references

I tried to build a machine learning model using GPT models to answer questions from the NASA website about “10 Need-to-Know Things About the Solar System”. The model results were exciting, and this blog will work with you to build a step-by-step advanced question-answering model. Although the model might know the information from its own training, I used the reference approach in this article to show how to pass information to the model. For your company-specific needs or data, similar techniques will be helpful, or if you want to ensure the model information is up to date.

An example of the model exciting results: I asked the model how many planets in our solar system, and it managed to return “8” by looking at this paragraph: “Our solar system is made up of a star, eight planets, and countless smaller bodies such as dwarf planets, asteroids, and comets”.

Although it might look like intuitive information, the model learned the answer by itself from the given paragraph, which includes multiple details.

When asked the model how many years to complete an orbit around galactic? it returned “230 million years” from this paragraph:

It takes our solar system about 230 million years to complete one orbit around the galactic center.

This is the link for the target NASA article.

Open AI Setup

To start using the Open AI model, go to this website and create an account:

Get the API key to start building models, the “key” available under your profile/view API keys.

View API Keys

The API is available using python or node.js; this article will focus on the python code. To download Open AI using python run:

pip install openai

The Parameters

There is no need to have a deep knowledge of model building to interact with Open AI GPT models — the API democratizes the model for everyone to allow the interaction using basic parameters. However, it comes with fine-tuning capabilities for data science experts.

Model type

First, you need to select your model type; the API comes with multiple engines:

  • gpt-4: A powerful language model adept at understanding and generating human-like text.
  • gpt-4o: An enhanced version of GPT-4, this multimodal model can process both text and images.
  • gpt-4.5: Excels at tasks that benefit from creative and human like text.
  • o1: Designed with advanced reasoning abilities, the o1 model excels in complex problem-solving.

For more details: https://platform.openai.com/docs/models

Other parameters for completion API:

open AI completion parameters

Full code example to build the solar system answers model

source: unsplash.com

Import openai and set the key value:

from openai import OpenAI
client = OpenAI(api_key="enter_your_key")

Prepare the input template file that includes the context paragraph for the model to answer your questions; using files will make it easier to format the lines and draft the examples:

file — example of the model input

Load the input file using python:

with open('answer_ahmadai_template.txt', 'r') as f:     
prompt_input = f.read()

Call the openai completion API and store the response:

code — to call openai question answering

Now you can print the returned answer as follows:

answer = response.choices[0].message.content.strip()
print(answer)

Printed response: ‘8’

Parameters explanation:

  • model: the engine for your answering model.
  • prompt: the input text including the documents and the question.
  • max_tokens: max tokens to generate the answers.
  • temperature: higher values means the model will take more risks
  • stop: mark the encoding to not contain in the response.

Enhancement

When you send large chunks of text “like entire web pages” as input prompts, it quickly gets expensive, and responses slow down. The more content you submit, the longer the model takes to process, which can become impractical if you’re trying to build a responsive application.

A smart approach is to first run a semantic search, rather than submitting all the content at once. This method retrieves only a few relevant sections matching the user’s question. Doing so cuts costs, reduces response delays, and usually leads to more accurate results. You can further boost quality by sorting these snippets based on embedding similarity scores or by carefully fine-tuning your prompt wording.

Summary

OpenAI’s API makes it straightforward to set up machine learning applications. You can start simple, then gradually play around with different settings as your project grows. Currently, the main model choices are gpt-4o, gpt-4.5, and the lighter o1, each suited to slightly different tasks.

For competitive advantage, upload your custom data as a file and use the fine-tuning feature; this is helpful if you have unique data for your business or language. The follow-up article will focus more on the fine-tuning code and results.

Even though OpenAI doesn’t let you directly tweak the core of their models, using the fine-tuning features gets you pretty close to that level of customization. This approach helps keep your apps accurate, reliable, and ahead of competitors who rely just on general models.

References

Quotation: NASA content (images, videos, audio, etc) is generally not copyrighted and may be used for educational or informational purposes without needing explicit permissions.

I updated the blog in October 2022 and March 2025 to cope with the latest open API changes.

The article was written when Openai released its first models, Curie and DaVinci. It was then updated to include information about the latest API changes and models.

--

--

Ahmad Albarqawi
Ahmad Albarqawi

Written by Ahmad Albarqawi

Master’s data science scholar at UIUC.

No responses yet