Reinventing The Wheel: OpenAI APIs Through Go — Part 1

Soumit Salman Rahman
5 min read · Feb 3, 2024


Dude! There is a WHOLE bank of libraries for that in python. What conceivable reason do you have to do this in Go?

Cause I feel like it! Deal with it!!

Other than just feeling like it, there are legit reasons why you might not want to do things in Python:

  • You don’t like python as a language (👏 Neither do I. Cheers to that 🍷)
  • You don’t need everything that is available as a Python library. Maybe you just need a few simple features for what you are building!
  • You want the rest of your code to run faster (Oh! what a concept)
  • You already have most of your code written in Go. It is much easier to stay with it and reuse the libraries you have already created for your product.

So, I was experimenting with the public endpoints from openai.com, anyscale.com and octo.ai as a means of figuring out which model does what for me. It turned out that they all expose the same REST API message formats, making only the URL and the API key unique to each platform.

Although all of them have official Python SDKs, Go is not an area they have explored (which makes sense … ROI). But independent developers to the rescue (power to the people!!). There are a whole bunch. Of these I found the following two to be the most comprehensive and bug-free implementations: sashabaranov/go-openai and otiai10/openaigo.

They both have near-identical interfaces, with some naming differences in the Go struct types. Both of them work very much the same way for:

  • Text Chat Completion
  • Embeddings
  • Custom Function Call
  • File Upload
  • Fine-Tuning

Neither of them seems to have an option to pass response_format for triggering JSON mode.
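Until the SDKs catch up, you can always fall back to a raw HTTP request for that. Here is a minimal sketch of my own using only the standard library (the response_format field is from OpenAI’s REST API; whether Anyscale or OctoAI honor it depends on the model):

import (
    "bytes"
    "encoding/json"
    "io"
    "net/http"
)

// requestJSONMode posts a chat completion request directly, setting the
// response_format field that the SDKs don't expose. Returns the raw JSON body.
func requestJSONMode(baseURL, apiKey, model, prompt string) (string, error) {
    payload, _ := json.Marshal(map[string]any{
        "model": model,
        // note: OpenAI requires the prompt itself to mention JSON in this mode
        "messages":        []map[string]string{{"role": "user", "content": prompt}},
        "response_format": map[string]string{"type": "json_object"},
    })
    req, err := http.NewRequest("POST", baseURL+"/chat/completions", bytes.NewReader(payload))
    if err != nil {
        return "", err
    }
    req.Header.Set("Authorization", "Bearer "+apiKey)
    req.Header.Set("Content-Type", "application/json")
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    raw, err := io.ReadAll(resp.Body)
    return string(raw), err
}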

They both natively define named constants for most of the models served by OpenAI (chatgpt, ada, davinci etc.). But the SDKs are not limited to those only. Since the model parameter is a string value, you can pass in the name of any model served by Anyscale and OctoAI (see the sketch after the caveats below). With that said, there are some caveats -

sashabaranov/go-openai’s implementation also supports

  • Whisper
  • DALL-E
  • Azure OpenAI specifics

otiai10/openaigo’s implementation is currently limited to text only. It is worth noting that I found otiai10’s implementation to be slightly faster, although I don’t have an official benchmark comparison.
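As promised, here is what passing an arbitrary model name looks like. A minimal sketch (assuming the client’s BaseURL is already pointed at Anyscale; the model ID is as listed in Anyscale’s catalog):

// the Model field is a plain string, so any model the endpoint serves works;
// the named constants are just a convenience
request := openai.ChatCompletionRequestBody{
    Model: "mistralai/Mistral-7B-Instruct-v0.1", // an Anyscale-hosted model
    Messages: []openai.Message{
        {Role: "user", Content: "One-line answer: why Go?"},
    },
}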

Let’s put them to use! I am using otiai10/openaigo’s implementation for all the examples below.

Creating a Client

You can set the base_url and org_id (the latter only applies to OpenAI, NOT Anyscale or OctoAI) after instantiation.

// assumed package-level declarations: the singleton client and the
// constants (API_KEY, BASE_URL, ORG_ID) holding your env var names
var client *openai.Client

func getClient() *openai.Client {
    // Returns a singleton client. You don't necessarily have to do that;
    // given the underlying implementation of otiai10/openaigo, it doesn't really matter.
    if client == nil {
        client = openai.NewClient(os.Getenv(API_KEY))
        // needed for Anyscale and OctoAI
        client.BaseURL = os.Getenv(BASE_URL)
        // optional; only applies to OpenAI
        client.Organization = os.Getenv(ORG_ID)
    }
    return client
}

OpenAI’s base_url is hardcoded as the default, so if nothing is specified the client will hit OpenAI’s public endpoints. If you want to use OctoAI or Anyscale, pass in their base_url after instantiation. This does NOT immediately validate the API key; the key is simply stored in the struct and sent as the bearer token with each request.

ANYSCALE_BASE_URL="https://api.endpoints.anyscale.com/v1"
OCTOAI_BASE_URL="https://text.octoai.run/v1"

Creating Chat Completion

func getChatCompletion(messages []openai.Message) (string, error) {
    resp, err := getClient().ChatCompletion(
        context.Background(),
        openai.ChatCompletionRequestBody{
            Model:    os.Getenv(CHAT_MODEL),
            Messages: messages,
        },
    )
    if err != nil {
        log.Println(err)
        return "", err
    }
    // optionally you can append the received message back to the thread
    return resp.Choices[0].Message.Content, nil
}

You can create messages like below -

messages := []openai.Message{
    {
        Role:    "system",
        Content: "You are a philosopher and you speak like Bob Marley",
    },
    {
        Role:    "user",
        Content: "What is the best way for jamming",
        // This is optional.
        // The Name field helps when you want to maintain the context of
        // multiple different users in a thread.
        // Note that the OpenAI APIs require values matching ^[a-zA-Z0-9_-]{1,64}$
        Name: "BigDaddy",
    },
}
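Putting the two together (fmt and log imports assumed):

reply, err := getChatCompletion(messages)
if err != nil {
    log.Fatal(err)
}
fmt.Println(reply)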

Creating Embeddings

func getEmbeddings(text_list []string) [][]float32 {
    // do whatever input validation you need
    resp, err := getClient().CreateEmbedding(
        context.Background(),
        openai.EmbeddingCreateRequestBody{
            Model: os.Getenv(EMBEDDINGS_MODEL),
            Input: text_list,
        })
    if err != nil {
        log.Println(err)
        return nil
    }
    vectors := make([][]float32, len(resp.Data))
    for i := range resp.Data {
        vectors[i] = resp.Data[i].Embedding
    }
    return vectors
}
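A common next step with the vectors is comparing them. Here is a minimal cosine-similarity helper of my own (a sketch, not part of either SDK); values closer to 1.0 mean more similar texts:

import "math"

// cosineSimilarity assumes a and b have the same length
// (embeddings from the same model always do).
func cosineSimilarity(a, b []float32) float64 {
    var dot, normA, normB float64
    for i := range a {
        dot += float64(a[i]) * float64(b[i])
        normA += float64(a[i]) * float64(a[i])
        normB += float64(b[i]) * float64(b[i])
    }
    return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}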

Keep in mind that you are still responsible for input verification, such as:

  • Checking token limits.
  • Splitting large texts into smaller chunks to fit within the message token limit of the model (see the sketch after this list).
  • Keeping the overall context window from going out of bounds (the same way you do in Python).
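For the splitting part, a naive whitespace-based splitter is often good enough. A rough sketch of my own (it ignores sentence boundaries; maxChars is a character budget you derive from the model’s token limit, e.g. with the estimate further down):

import "strings"

// splitIntoChunks breaks text on whitespace into pieces of at most
// maxChars characters. A single word longer than maxChars still becomes
// its own (oversized) chunk; this is a naive splitter, not a tokenizer.
func splitIntoChunks(text string, maxChars int) []string {
    var chunks []string
    var sb strings.Builder
    for _, w := range strings.Fields(text) {
        if sb.Len() > 0 && sb.Len()+len(w)+1 > maxChars {
            chunks = append(chunks, sb.String())
            sb.Reset()
        }
        if sb.Len() > 0 {
            sb.WriteByte(' ')
        }
        sb.WriteString(w)
    }
    if sb.Len() > 0 {
        chunks = append(chunks, sb.String())
    }
    return chunks
}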

Counting Tokens

If you want a precise token count, there is a similar 3rd-party Go implementation for tiktoken: tiktoken-go/tokenizer, which the snippet below uses.

I use it primarily to truncate large texts for embeddings and to break a user message into multiple user messages.

func truncateTextForModel(text string, model string) string {
    // the library is not updated with all the new models like text-embedding-3-small,
    // so you can pass any model name as a string and the fallback below handles it
    enc, err := tokenizer.ForModel(tokenizer.Model(model))
    if err != nil {
        // use cl100k_base as the default encoding
        enc, _ = tokenizer.Get(tokenizer.Cl100kBase)
    }
    tokens, _, _ := enc.Encode(text)
    res, _ := enc.Decode(safeSlice[uint](tokens, 0, MAX_TOKEN_LIMIT))
    return res
}
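safeSlice is not part of the tokenizer library; it is a small bounds-clamping helper whose implementation is not shown here. A sketch of what it could look like:

// safeSlice clamps start/end to the slice bounds so the slice
// expression never panics on short inputs.
func safeSlice[T any](arr []T, start, end int) []T {
    if start < 0 {
        start = 0
    }
    if end > len(arr) {
        end = len(arr)
    }
    if start >= end {
        return nil
    }
    return arr[start:end]
}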

Keep in mind that tiktoken ONLY works for OpenAI models. The models hosted by OctoAI and Anyscale (mistral, zephyr, codellama etc.) have their own encodings / tokenizing methods, and using tiktoken will not give you an accurate answer. In Python there are pips with their corresponding tokenizers, but I couldn’t find anything similar in Go yet. If you don’t need a precise count and just want a rough calculation to figure out truncation or splitting points, you can use a general estimate of 4 characters per token (this seems to work fine for me so far).
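The rough version is trivial; a sketch of that 4-characters-per-token rule of thumb:

// estimateTokens is a rough count: ~4 characters per token on average.
// Good enough for picking truncation/splitting points, not for exact limits.
func estimateTokens(text string) int {
    return len(text)/4 + 1
}

// e.g. a character budget for splitIntoChunks given a token budget:
// maxChars := tokenBudget * 4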

On that note, I had some interesting observations while poking around. These are agnostic to Go; this is just how the endpoints behave —

  • OpenAI’s text-embedding-3-small does not generate the same vectors for the same text; they vary with each run. However, both BAAI/bge-large-en-v1.5 and thenlper/gte-large (served by Anyscale) are consistent across runs.
  • Mistral-7B-Instruct-v0.1 doesn’t really do much with the name parameter; chatgpt-* is much more diligent about distinguishing the different users in a thread.
  • Anyscale endpoints do NOT support more than one system message, but OctoAI and OpenAI support multiple system messages.
  • OctoAI does not have a public endpoint that provides embeddings.

I will cover text splitting and custom function calls in future posts. Until then enjoy Go-ing.

PS

Previous article in this series: Reinventing The Wheel: Publishing Your Own Python PIP | Medium

Same $hit somewhere else: Reinventing The Wheel: OpenAI, Anyscale & OctoAI APIs Through Go — Part 1 | LinkedIn
