Fun with LLM “Turing Completions”

Sirsh Amarteifio
14 min read · Oct 12, 2023


The OpenAI interface using functions looks like the code block below. There is something wonderfully simple and powerful about this if you think about it. In this article we are going to think about it and create a working example of an agent programming model inspired by the most foundational idea in computing.

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=messages,
    functions=functions,
    function_call="auto",
)

When you look at the basic samples for OpenAI functions, you might be inclined to think of a small handful of functions and a small number of messages sitting in the finite context window being passed to the LLM. But suppose we wrote it like this:

while True:
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=infinite_messages,
        functions=infinite_functions,
        function_call="auto",
    )
    # apply functions and update state/messages/functions
    # check for halting condition

Looked at this way, it can be perceived as a universal interpreter with infinite memory. From this viewpoint, the LLM is a 1.7-trillion-parameter interpreter of a type of program that you can learn to write, as we will illustrate below: nothing more than a continuous conversation between your program (a set of functions) and the LLM interpreter.

But wait, you say: this cannot be infinite, neither the set of functions nor the message buffer. Actually it can. The context window is of course finite, but the stack of functions and the memory are infinite, as we shall see.

an infinite tape and the reader of a 1.7 trillion parameter interpreter

You really have to go back to the beginning of computer science on this one. The concept of a Turing machine with an infinite tape is fundamental to the idea of computation. I asked ChatGPT to give me a summary (I added the detailed output it provided to the endnotes, as it's useful to ponder the details):

A Turing machine consists of an infinite tape, a tape head that can read and write symbols on the tape, a finite set of internal states, and a transition function that dictates how the machine operates based on its current state and the symbol being read. The machine begins in an initial state and repeatedly applies the transition function until it enters a halting state, at which point the computation concludes. The power of Turing machines lies in their ability to recognize the class of all computable functions and serve as a theoretical foundation for the study of algorithms and computability theory.

So let's take that and think about what we can do with our instruction set (Python functions) and our memory. Let's treat the interpreter (the while loop above, basically) and our interface to the LLM as fixed: it is simply a cursor that slides our context window along. We are free, with respect to that context, to update the available set of functions (which we keep small at any given time) and what is in the message buffer (which we keep small at any given time). Keep in mind we are also backed by a large memory (e.g. databases, as used in Retrieval Augmented Generation). To make this generic, I will introduce a very particular Python decorator below that decorates any Python function, takes inputs and returns data.
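To make the "sliding cursor" idea concrete, here is a minimal sketch of the two bookkeeping moves described above: keeping a small visible message window over an unbounded backing memory, and selecting a small function subset per step. All names here are hypothetical, and the trivial keyword scoring stands in for e.g. a vector search over function descriptions.

```python
MAX_MESSAGES = 8   # keep the visible message buffer small
MAX_FUNCTIONS = 4  # keep the visible instruction set small

def slide_window(new_messages, all_messages):
    """Append new messages to the unbounded memory, but only expose the
    system prompt plus a fixed-size tail of recent messages to the LLM."""
    all_messages.extend(new_messages)
    system = [m for m in all_messages if m["role"] == "system"][:1]
    tail = [m for m in all_messages if m["role"] != "system"][-MAX_MESSAGES:]
    return system + tail

def select_functions(question, function_index):
    """Pick a small function subset for this step. A naive keyword overlap
    score stands in for a real similarity search over descriptions."""
    words = question.lower().split()
    scored = sorted(
        function_index,
        key=lambda f: -sum(w in f["description"].lower() for w in words),
    )
    return scored[:MAX_FUNCTIONS]
```

The point is that the LLM call itself never changes; only the small window it sees does.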

I think this is theoretically interesting and that is reason enough to play around with this. But I have grander yet grounded ambitions — I want an easy, pragmatic way to write RAG programs.

I am starting to believe the premise of Langchain is hugely overcomplicated. It is not a good way to control program flow for agent systems. What we think of as a "chain" is really just program flow. All we have is a continuous conversation between us and the LLM(s): it's just text. So all we need to do is read and write into the context window and set the available functions (from our program). Our program does not need to be specially written for agent systems as such. We need a way to reason about these programs and then interpret them via a simple interface. Examples will make this clearer.

First, to illustrate the general idea, a decorator signature could look something like the below. It can (rather, it has the option to) modify both the functions and messages while doing data lookups. It breaks some functional programming rules by allowing side effects and mutating state, so it could be refactored, but I don't want to obscure the idea at this stage. In this article I will omit the use of this decorator, but I wanted to make the point that we can create a general framework over functions with this approach. I'll build on it in future.

def program_step(
    obj, functions: typing.List[dict], messages: typing.List[dict], **kwargs
) -> typing.List[dict]:
    """
    A decorator that decorates an arbitrary function or tool.
    Applies the kwargs for the function and manipulates the functions and/or messages.
    """

We would rewrite our agent loop above to use decorated functions in this way, thus allowing the infinite set of functions and messages to be modified if required. This is the mechanism by which we control our agents; we do not change our interface to the LLM or worry about global prompts, etc.

while True:
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=infinite_messages,
        functions=infinite_functions,
        function_call="auto",
    )
    # determine the function my_function to call from the LLM response
    # determine parameters for the function as kwargs from the LLM response
    my_function(infinite_functions, infinite_messages, **kwargs)
    # check for halting condition and break

What is important to note here is the following:

  • the program/function can of course return data, i.e. the answers from a tool, API call, database lookup, etc.; these are added to messages
  • the program/function can also provide a transition function, e.g. another question to ask, a pointer to data, another function to call, etc. These are also added to messages. I touched on this very briefly in my previous article.
  • the program/function can modify the available functions in the context of that state — for example, by querying a vector store of function descriptions for the given context and adding the closest matches, plus some fixed functions, to the function scope.
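
A single tool can exhibit all three behaviours. Here is a hypothetical sketch (stubbed data, invented names) of a vector-search step that returns data, emits a transition hint when the match looks weak, and reshapes the function set for the next iteration:

```python
import json

def vector_search_step(functions, messages, question=None):
    """Illustrative only: a tool that returns data, suggests a transition,
    and modifies the functions available in the next state."""
    # 1) return data (stubbed; a real tool would query a vector store)
    hits = [{"text": "Sindbad settled in Bagdad", "_distance": 0.91}]
    messages.append({"role": "function", "name": "vector_search",
                     "content": json.dumps(hits)})
    # 2) provide a transition hint when the result looks weak
    if hits and hits[0]["_distance"] > 0.8:
        messages.append({"role": "function", "name": "vector_search",
                         "content": "The _distance seems high; try the "
                                    "columnar data function instead"})
    # 3) modify the function set in scope for the next step
    functions[:] = [f for f in functions if f["name"] != "vector_search"]
    functions.append({"name": "columnar_search",
                      "description": "SQL over tabular story data"})
    return messages
```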

I believe this, in its simplicity, is a powerful formulation of how to write LLM/agent programs. You are generally trying to “keep the conversation going” while you intelligently traverse your knowledge graph. You want to traverse the graph with the fewest possible steps to answer the question and then halt. You want it to be predictable, fast and cheap.

To see how this can work in an example more on the "finite" side of things, I'm going to take a setup I did in Langchain with three tools and rewrite it in this way. The tools become simple data functions, and we can lean on traditional data engineering approaches to design them without getting caught up in how to write them as agent tools per se. This is a nice separation of concerns. That's really a core design principle here: when writing agent programs, we don't want to feel like we are writing agent programs!

We could attach the decorator to these tools so this plays well in the ironically dubbed "Turing Completions" programming setup above, but we can postpone that for this article. We run the interpreter over our functions to answer questions. We illustrate a few things along the way:

  • the possibility of returning both data and “transition suggestions”
  • the possibility of using an LLM inside functions if needed

Today we do not cover (left for further investigation)

  • modifying the function set that is available for use
  • trying to re-summarize or otherwise prune or modify the memory during program execution
  • how our data are best organized to get the most out of this system, e.g. things like retrieving data, links, hints, function searches, etc.
  • auditing the program; states, transitions, data IO

Digging up my old examples that used Langchain: my initial foray into LLMs was here, and then I refactored it a little as part of another project here. In those cases I set up the Zero Shot agent in Langchain with three tools: an SQL tool, a vector store tool, and a key-value store tool. I'm just going to take those tools, wrap them in functions, and run the agent loop program. This was motivated by my first encounters with OpenAI functions, which I describe briefly here.

These three stores are very easy to understand. The SQL tool just takes a question, converts it to a query and runs the query against a DuckDB table — sure, it trips up sometimes but you can get reasonable performance with a modicum of effort. The key-value store resolves entity keys to dictionaries stored in Redis — this is quite predictable (I don’t employ that one in this article). The vector store uses LanceDB for vector similarity search.

I created some datasets based on these Fairy Tales. The notebook gives some idea about what I did. I scraped the sections for each story and used a VectorDataStore wrapper on LanceDB to ingest the data with an Instruct Embedding for the vectors. I also included some tabular data here by asking ChatGPT to generate some tabular data for these stories. The notebook also shows snippets for ingesting those data into a ColumnarDataStore.

Once the data are ingested, I expose a search function from each of these stores. These functions will be called in the code block below that calls the OpenAI API. A utility function `ops.parse_function_description` will inspect the functions and generate the descriptor that OpenAI expects.
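The repo's implementation of `ops.parse_function_description` isn't shown here, but a minimal sketch of what such a utility might do, building the JSON-schema descriptor OpenAI expects from a function's docstring and type hints, could look like this (the real version may differ):

```python
import typing

# map Python annotations to JSON-schema type names
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def parse_function_description(name: str, fn) -> dict:
    """Hypothetical sketch: derive the OpenAI function descriptor from a
    function's docstring and type hints."""
    hints = typing.get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the schema
    return {
        "name": name,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": {
                arg: {"type": TYPE_MAP.get(t, "string")} for arg, t in hints.items()
            },
            "required": list(hints),  # a fuller version would honour defaults
        },
    }
```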

def call_open_ai_with_functions(functions, question, limit=10):
    """
    (PoC testing - will improve the framework/interfaces later)
    Take in some functions and describe them in the way OpenAI expects,
    then loop and use the passed-in functions to answer the question.
    """

    plan = """You are an assistant that answers questions using provided data and functions."""

    messages = [
        {"role": "system", "content": plan},
        {"role": "user", "content": question},
    ]

    # apply any sort of aliasing in general - compatible with what we describe to OpenAI
    functions_map = {f.__name__ + ops.str_hash(): f for f in functions}
    function_descriptions = [
        ops.parse_function_description(k, f) for k, f in functions_map.items()
    ]

    for _ in range(limit):
        # call OpenAI with the current functions and data
        response = openai.ChatCompletion.create(
            model=DEFAULT_LLM_MODEL,
            messages=messages,
            functions=function_descriptions,
            function_call="auto",
        )

        # run any function the LLM wants us to run
        response_message = response["choices"][0]["message"]
        function_call = response_message.get("function_call")
        if function_call:
            function_response = functions_map[function_call["name"]](
                **json.loads(function_call["arguments"]),
            )

            # when we use the decorator described earlier in the article
            # we can omit this and pass messages and functions into the function
            # I omit this for clarity here
            messages.append(response_message)
            messages.append(
                {
                    "role": "function",
                    "name": function_call["name"],
                    "content": json.dumps(function_response),
                }
            )

            # this would be in the decorator
            # it spoofs what would happen when we add the program step decorators
            # to provide a transition function
            # messages.append(
            #     {
            #         "role": "function",
            #         "name": function_call["name"],
            #         "content": "If you don't find the answer that you are looking for "
            #         "or the _distance seems high, try 'The full columnar data' function instead",
            #     }
            # )

        # halting condition met
        if response["choices"][0]["finish_reason"] == "stop":
            break
    return response_message["content"]

To use this, below we construct the two data stores, one for each type of data, and then set up the functions to send to the OpenAI agent loop.

These stores no longer need any Langchain or LlamaIndex in them but I have left those implementations in the classes in the repo for ongoing comparisons. The design principle here is just to create simple stores to write and query data and then expose a function for use in the agent loop.

# the notebook provides context for these test Pydantic types
# FairyTaleCharacter is a Pydantic type matching the tabular data
cstore = ColumnarDataStore(FairyTaleCharacter)
# FairyTales is defined for text data as scraped from the web site
vstore = VectorDataStore(FairyTales)
# set up the functions - they are self-describing via docstrings and typing
functions = [cstore.as_function, vstore.as_function]

call_open_ai_with_functions(
    functions,
    "which character has the most friends and where did they eventually settle down",
)

The response to the question above is a compilation of data from the tabular data and the vector store data:

The character with the most friends is Sindbad, who has 15 friends. After all his adventures and wanderings, Sindbad eventually settled down in happiness and prosperity in Bagdad.

These data come partially from the story itself, while the bit about the number of friends Sindbad has comes from an SQL query executed by the columnar data tool.

In these functions, the docstrings inform the agent that certain tools are useful for questions of a quantitative or statistical nature while others are useful for narrative-type questions.
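For illustration, docstrings of the kind that steer tool selection might look like the following (hypothetical function names; the repo exposes these via `as_function` on the stores):

```python
def columnar_tool(question: str) -> list:
    """Use this function for quantitative or statistical questions about
    characters, e.g. counts, maxima, or comparisons over tabular attributes."""
    ...

def vector_tool(question: str) -> list:
    """Use this function for narrative questions about the stories themselves.
    Pass a full natural-language question, not keywords or table rows."""
    ...
```

These descriptions are the only "prompt engineering" the agent loop relies on.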

Under the hood, the ColumnarDataStore executes an SQL query against DuckDB. It does this by asking an LLM to convert a question to a query and then pushes down the query to DuckDB. The logs show the output…

SELECT character_name, MAX(how_many_friends_they_have) 
FROM 's3://<BUCKET>/stores/columnar/v0/default/FairyTaleCharacter/parts/0/data.parquet'
GROUP BY character_name
ORDER BY MAX(how_many_friends_they_have) DESC
LIMIT 1;
2023-10-11 19:43:24.541 | DEBUG | monologue.core.data.stores.ColumnarStore:as_function:142 - [{'character_name': 'Sindbad', 'max(how_many_friends_they_have)': 15.0}]
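The pushdown pattern behind that log is simple: ask an LLM to turn the question into SQL, then execute the generated query against the engine. A minimal self-contained sketch of the pattern is below; the LLM step is stubbed with a hard-coded query, and `sqlite3` stands in for DuckDB-over-parquet purely so the example runs anywhere.

```python
import sqlite3

def question_to_sql(question: str) -> str:
    """Stub for the LLM step that converts a question into SQL; the real
    store asks an LLM for this. Hard-coded here for illustration."""
    return (
        "SELECT character_name, MAX(how_many_friends_they_have) AS max_friends "
        "FROM characters GROUP BY character_name "
        "ORDER BY max_friends DESC LIMIT 1"
    )

def run_columnar_query(question: str, rows) -> list:
    """Push the generated query down to the engine and return rows as dicts
    (sqlite3 stands in for DuckDB here)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE characters (character_name TEXT, how_many_friends_they_have REAL)"
    )
    con.executemany("INSERT INTO characters VALUES (?, ?)", rows)
    cur = con.execute(question_to_sql(question))
    cols = [c[0] for c in cur.description]
    return [dict(zip(cols, r)) for r in cur.fetchall()]
```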

And the vector store tool uses a vector search to retrieve the relevant docs. Note that it is important to describe a vector search tool (in the function docstring) as something that requires full questions to be asked. Otherwise tabular or other structured results can be sent to the function, which does not work very well for vector search!

I asked another question that required multiple queries to the same vector data store tool. Persistence is a virtue here. The question below calls the columnar tool once and the vector store tool three times.

Query: List the three characters that have spent the most time in a castle and give a detailed summary of each of them and their closest friends

The three characters who have spent the most time in a castle are Sleeping Beauty, Sindbad, and Jack.

1. **Sleeping Beauty** is known for her story where she is pricked by an enchanted spindle and falls into a deep sleep in a castle. A wicked fairy cast a spell which led to Sleeping Beauty and everyone in the castle falling asleep for a hundred years. This story does not provide explicit information about her closest friends.

2. **Sindbad**, a sailor from Baghdad, shares stories of his adventures and wanderings with his friends. He has encountered a range of interesting scenarios in different castles throughout his seven voyages. In one instance, he ends up in the city of Serindib where he narrates his adventures to the king. He presents most beautiful pieces of rock crystal from the mountain he had climbed and it's written in letters of gold. Again, specific friends aren't mentioned but it can be inferred that he's close with his fellow sailors and the people he encounters during his voyages.

3. **Jack** is best known for his adventures including his encounters with giants and magic beans. On his journeys, Jack comes across many castles, often belonging to giants from whom he steals treasures. His kindness is evident when he gives one such castle and all its riches to those who have suffered. Jack teams up with King Arthur's son during one of his adventures, indicating a close friendship. One of his most famous journeys involves climbing a beanstalk multiple times to a giant's castle in the sky.

Do note that the descriptions of their friends are a bit general as the original stories don't provide specific information about close relationships. Jack's friendship with the prince is one exception to this.

The agent loop used the descriptions of the functions, which came from their docstrings, to select the tools to answer the question.

Note that the columnar data does have some information about closest friends that was not picked up here. That is perhaps reasonable, because those data are not of a statistical nature. The function description directs the agent to only use the columnar data function for quantitative details, which is misleading in this case. The schema of the table and some enums are in the context, though, so perhaps the agent should have picked it up.

The problem here was that the program arrived at a dead end: it thinks it has used the best tools for the job, but it failed to find the closest friends (I also added eye color to the question). This is where the Turing machine's transition functions are needed to avoid deadlock, as per the comment in the code block above. If I uncomment that section, it shows in effect what the decorators mentioned earlier would do under certain circumstances, e.g. high vector search distances. With this, the program recovers. I have added the example log output here. At this stage in testing there are too many round trips to the LLM, but it is a good test of persistence.

In future I plan to do more work with the design of the data stores, functions and the various interfaces while testing the limits of this approach. The objective is to allow for a methodology where we can focus on the data modeling and then create a simple functional program flow framework for controlling agent systems that is predictable and efficient — and requires little to no time on faff like prompt engineering.

Notes

ChatGPTs full response on Turing Machine

  1. Infinite Tape: A Turing machine operates on an infinite tape, which is divided into discrete cells. Each cell on the tape can hold a symbol from a finite alphabet, which typically includes characters such as 0, 1, and blank symbols. The tape extends infinitely in both directions, providing an unbounded amount of storage.
  2. Tape Head: The Turing machine has a tape head, which is a read/write mechanism. The tape head is positioned over a specific cell on the tape and can read the symbol at that cell. It can also write a new symbol to the cell and move left or right one cell at a time. The tape head’s position is crucial for determining the machine’s current state and guiding its behavior.
  3. States: The Turing machine has a finite set of internal states. These states represent different operational modes or configurations of the machine. The behavior of the machine depends on the current state it is in. There is an initial state in which the machine starts, and there may be one or more special halting states that signal the end of the computation.
  4. Transition Function: The transition function, often denoted as δ, is a set of rules that dictate the machine’s behavior. The transition function defines how the machine processes its input and changes its state as it goes through its computation. It specifies, for each combination of the current state and the symbol read from the tape under the tape head, the following: (i) The symbol to write on the tape (which can be the same symbol), (ii) whether to move the tape head one position to the left or right (or stay in the same position). (iii) the next state to transition to.
  5. Halting Condition: The Turing machine continues its operation until it reaches a halting state. A halting state is a special state that indicates the machine has completed its computation. When the machine enters a halting state, it stops all further transitions and effectively terminates. The contents of the tape at this point represent the output of the computation.

Co-evolution of data and programs

The deep design element here is how our data and programming methodology co-evolve to create reliable programs. Just like in traditional programming, if your data are not clean, organized, and indexed properly, you are screwed. So here we are trying to simultaneously figure out how to organize our data and how to write programs over those data when talking to the LLM interpreter. In the end, the separation between program and data should fade away! The "Turing states" that any program enters can be serialized and understood in the context of where in the knowledge graph we are pointing. We can start to recognize reusable motifs in these structures, concretely and/or abstractly.

I want to think a little bit more (systematically) about the structure of the data and actually how easy or hard it is for the program to forage for data to answer questions. And from there we can think about some sort of way to measure performance with respect to a knowledge store (a deep analysis of which will be left for a further article).

Examples of folks bridging the gap between LLM/agent programming and traditional programming

I pose the question of whether a lot of these things, and also some of the first generation of agent frameworks, are locally optimal solutions that have prematurely converged around the quality of first-generation LLMs. For example, it feels like some of the precautions we needed to take with GPT-3/3.5 were already not really needed in GPT-4. My stance is to keep doing what you're doing as a software engineer or data engineer and wait for LLMs to catch up, rather than rushing to create wrappers around contemporary LLMs that require rethinking how we program. Prompts, chain-of-X, etc. are probably already becoming obsolete.
