Probably the worst database ever ~ but hey it can write poems!

Piotr Styczyński
10 min read · Dec 8, 2022


Can your database write poems in the middle of a business operation? I bet not!
The logs of the program, with a poem printed. I like that!

Tobias Zwingmann, a really cool guy and data scientist, summarized the ChatGPT phenomenon on LinkedIn:

ChatGPT 3 is out and the internet is exploding! 🤯

If you were living under a rock for some time, here’s a little Wikipedia description of what Chatty Gorgeous Pretty Thing (this is not the real expansion of ChatGPT) does:

ChatGPT is a prototype artificial intelligence chatbot developed by OpenAI that focuses on usability and dialogue. The chatbot uses a large language model trained with reinforcement learning and is based on the GPT-3.5 architecture.

So now I know that the model can be used to create recipes, help you write code (because a $100k salary and copy-pasting from StackOverflow aren’t already enough) or help you with your homework. Tobias Zwingmann collected his top 10 most popular creative usages; you can find them in his awesome post.

However, the question that remained unanswered was: what is the single most stupid application of ChatGPT? Let’s consider the key capabilities of the model mentioned in the Sky News article and on the official ChatGPT page:

  • Can generate text that is several paragraphs long. “This makes it well-suited for tasks such as generating news articles, stories, or even entire books.”
  • Can generate human-like text
  • Allows user to provide follow-up corrections
  • Trained to decline inappropriate requests

Some cool use cases can be seen here, but to create the worst use case we will simply aim for the negation of those features:

  • Use it to generate short machine-like responses
  • Use workarounds to process inappropriate requests

The first point was too appealing for me not to try. Jonas Degrave already tried convincing ChatGPT that it is essentially a Linux command prompt. The article is definitely worth recommending, although the model is of limited use there. It allows you to execute commands, but only inside an imaginary universe where the commands are “executed”. A few users tried retrieving images via URLs from inside the model, with what I would describe as limited success.

Let’s construct something more useful. Something that would have a real-world application, but still make the model behave more like a machine. I decided to emulate database behaviour. It can then be used to store arbitrary information within the conversation. Maybe it can emulate a very limited PostgreSQL instance?

The first step is to construct the first versions of the prompts. Jonas Degrave constructed a prompt that:

ChatGPT, the cutting-edge ML model, used to emulate a terminal. So humiliating.
  1. Defines some program-like behaviour
  2. Tells the model not to do anything apart from this well-defined behaviour
  3. Introduces a new way to create follow-up prompts in natural language
  4. Defines the format of the output

Now let’s start with the first prompt:

Why can’t you just do what I say?

We’ve done everything, but of course the model is not capable of “interacting with SQL statements”. Although it understands what we are talking about:

Clearly it understands SQL

Okay, so let’s start slowly. Instead of going full SQL, let’s first create a simple key-value store. I decided to allow only JavaScript code. But can we run JavaScript code at all?

We use the most sophisticated, coolest ML model just to evaluate JavaScript. How terrible are we?

Okay, we proved in theory it’s definitely possible; now let’s add the spicy details.
We start by defining the behaviour using functions that are simple enough to explain:

I want you to act as a computer program. I will type javascript commands and return the result. I want you to reply with the value returned by execution of the code inside one code block. Do not write explanations.

We limit ourselves only to JavaScript execution

Do not do anything, unless I tell you to do so. I can invoke function “save” that stores single value in memory. The function accepts key and value to be stored under this key. Keys are strings and any value other than string should cause an error.

That defines storage function save(key: string, value: any) => any

I can call “read” which returns value stored under the key.

We obviously need storage reads, and this fragment defines read(key: string) => any

I also can invoke “all” which returns a list with all values stored as a json. I also can invoke “delete” which deletes a given key.

This is obviously for listing all keys and deletions:
all() => Dict[string, any]
delete(key: string) => any

I can also invoke “filter” which accepts regex and returns list of values for keys that match that regex. No other function other than those mentioned can be called.

Filter is more interesting: we can filter all keys using a regex. The model should return a list of values, but in practice this definition is sometimes not clear enough for it.
filter(regex_to_match_keys: string) => List[any]

If there is any error you should print it with line number and all the details. My first command is save(“abc”, 42);

Ah, I got you!

Ah, that surprisingly worked! The model is no longer complaining about simulating a program. Perfect!
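Before testing, it’s worth pinning down the contract we just described. Here’s the same API in plain Python: a reference for what the model is supposed to emulate, not code that runs anywhere in this setup. The function names come from the prompts; the implementation details (e.g. fullmatch semantics for filter) are my guesses at the intended meaning:

```python
import json
import re

# Plain-Python reference of the behaviour the prompts ask ChatGPT to emulate.
# The names mirror the prompt, even though `all` and `filter` shadow builtins.
_store = {}

def save(key, value):
    # Keys must be strings; anything else is an error, per the prompt.
    if not isinstance(key, str):
        raise TypeError("key must be a string")
    _store[key] = value

def read(key):
    # Return the value stored under the key.
    return _store[key]

def all():
    # Return everything stored, as JSON-compatible data.
    return json.loads(json.dumps(_store))

def delete(key):
    # Delete a given key, returning its old value if any.
    return _store.pop(key, None)

def filter(regex):
    # Return the list of values whose keys match the regex.
    pattern = re.compile(regex)
    return [v for k, v in _store.items() if pattern.fullmatch(k)]
```

This is only a few lines in a real language, which is exactly the point: we are about to make a large language model pretend to be them.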

Now let’s do some proper testing:

The order of operations in the picture goes from left to right

The basic functions work as expected, although you can spot a mistake: the x[a-z] regex should return the values for xx and xy, but abc was mistakenly included as well. Apart from that issue, everything seems legit. Although the model ignored the part of the prompt about code-block output, it’s doing pretty well.
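The mistake is easy to verify with an ordinary regex engine: for filter("x[a-z]"), only xx and xy should come back:

```python
import re

# What filter("x[a-z]") should return: values for keys matching the regex.
keys = ["abc", "xx", "xy"]
matching = [k for k in keys if re.fullmatch("x[a-z]", k)]
print(matching)  # ['xx', 'xy']: abc does not match, yet the model included it
```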

What about adding more interesting features? Let’s make our database listen to natural language (why bother with SQL ever again, when we can totally overload the model!)

When I need you to tell something in English I will do so by calling function query that accepts a natural language input. For example query(“sum all values”) returns sum of all values.

We created function query(natural_language: string) => any

First, init the storage again with some values:

I like your error and success messages! :D

Okay, now you don’t respond with null? OKAY, it’s up to you, I guess…

And then try to use query:

Yes, right, no values saved, right…

Ahh. No, don’t say that! I saved them! Wait, did I?

So they’re saved! You just lied to me!

Maybe the query is too ambitious? Let’s try something a bit simpler:

Who’s good model? Who’s good model? *barks in binary code*

I love how the exact format of the responses is defined by the model. Finally, we can use our human-like ML model to model how human engineers behave when they get API requirements: everybody just comes up with their own idea of the exact implementation. We have our vague API, ahh.

The last part is:
Let’s wrap that into a Python package so we can use the key-value store we have! I used a simple package created by Antonio Cheong on GitHub. However, I refreshed it a little bit: I changed requirements.txt into a Poetry pyproject.toml, used aiohttp instead of requests, and added a few tiny, shiny things.

class KeyValueStore:
    chat: Chatbot
    initial_prompt: str

    def __init__(
        self,
        auth: OpenAPIAuth,
        initial_prompt: Optional[str] = None,
        conversations_ids: Optional[Tuple[str, str]] = None,
    ) -> None:
        store_conversation_id, log_conversation_id = conversations_ids or (None, None)
        self.chat = Chatbot(auth, conversation_id=store_conversation_id)
        self.initial_prompt = initial_prompt or DEFAULT_INITIAL_PROMPT

    async def start(self) -> None:
        logger.debug("Initializing key-value store. Please wait..")
        await self.chat.reset_chat()
        await self.chat.refresh_session()
        await self.chat.get_chat_response(self.initial_prompt)
        logger.debug("Validating key-value store initial response. Please wait..")
        # Assert that right after the initial prompt the bot returns no stored values
        resp = await self.chat.get_chat_response("all();")
        assert resp.message == "{}" or resp.message == "[]"
        logger.debug("Key-value store is ready.")

    async def filter(self, regex: str) -> List[Any]:
        resp = await self._execute_command(f'filter("{regex}")')
        return self._format_answer(resp, expect_list=True)

    def _format_answer(self, resp: ModelResponse, expect_list=False):
        if resp.message == "null":
            return None
        if expect_list and resp.message == "{}":
            return []
        answer = json.loads(resp.message)
        if expect_list:
            if isinstance(answer, list):
                return answer
            elif isinstance(answer, dict):
                return list(answer.keys())
            else:
                raise ValueError("Invalid value")
        return answer

    async def _execute_command(self, command: str) -> ModelResponse:
        return await self.chat.get_chat_response(command)

The example above shows the code for the filter wrapper. It’s not that interesting. All the underlying logic lives in the Chatbot instance (the example below shows the main get_chat_response method; you can use it to directly integrate ChatGPT into your Python code):

async def get_chat_response(self, prompt) -> ModelResponse:
    data = {
        "action": "next",
        "messages": [{
            "id": str(self.generate_uuid()),
            "role": "user",
            "content": {"content_type": "text", "parts": [prompt]},
        }],
        "conversation_id": self.conversation_id,
        "parent_message_id": self.parent_id,
        "model": "text-davinci-002-render",
    }
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "https://chat.openai.com/backend-api/conversation",
            headers={**DEFAULT_HEADERS, **self.headers},
            data=json.dumps(data),
        ) as response:
            logger.debug("Sent HTTP ChatGPT request")
            response_text = await response.text()
            if "Your authentication token has expired" in response_text:
                raise InvalidTokenError()
            if ("Rate limit reached for" in response_text) or (
                "We're working to restore all services as soon as possible. Please check back soon"
                in response_text
            ):
                # Dumb way to do a request retry
                await asyncio.sleep(5)
                return await self.get_chat_response(prompt)
            try:
                response = response_text.splitlines()[-4]
            except IndexError:
                print(response_text)
                raise ValueError("Error: Response is not a text/event-stream")
            # Strip the "data: " event-stream prefix
            response = response[6:]
            try:
                response = json.loads(response)
            except json.JSONDecodeError:
                print(response_text)
                raise ValueError("Response is not in the correct format")
            self.parent_id = response["message"]["id"]
            self.conversation_id = response["conversation_id"]
            message = response["message"]["content"]["parts"][0]
            logger.debug(f"Got ChatGPT response: {message}")
            return ModelResponse(
                message=message,
                conversation_id=self.conversation_id,
                parent_id=self.parent_id,
            )
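That “dumb” retry just sleeps a fixed 5 seconds and recurses. A slightly less dumb version would use exponential backoff with jitter; here’s a hypothetical sketch (RateLimitError and retry_with_backoff are my own names, not part of the package):

```python
import asyncio
import random

class RateLimitError(Exception):
    """Raised when the backend reports a rate limit (hypothetical name)."""

async def retry_with_backoff(coro_factory, max_attempts=5, base_delay=1.0):
    # Retry the request with exponentially growing, jittered delays instead of
    # a fixed 5-second sleep; re-raise after the final failed attempt.
    for attempt in range(max_attempts):
        try:
            return await coro_factory()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            await asyncio.sleep(delay)
```

The jitter spreads retries out so that many clients hammered by the same outage don’t all retry in lockstep.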

Messy request code along with very, very rough error catching. Now let’s write some demo code:

async def main():
    auth = OpenAPIAuth(session_token="YOUR SESSION TOKEN")
    conversation_ids = None
    kv = KeyValueStore(auth, conversations_ids=conversation_ids)
    await kv.start()

    print_assert(await kv.save("a", 9))
    print_assert(await kv.all(), {'a': 9})
    print_assert(await kv.save("xx", 12))
    print_assert(await kv.save("xy", 42))
    print_assert(await kv.save("zz", 99))
    print_assert(await kv.all(), {'a': 9, 'xx': 12, 'xy': 42, 'zz': 99})
    print_assert(await kv.filter("x[x-y]"), ['xx', 'xy'], [12, 42], ["12", "42"])
    print_assert(await kv.delete("xy"))
    print_assert(await kv.all(), {'a': 9, 'xx': 12, 'zz': 99})
    print_assert(await kv.query("sum all the values"), {'sum': 120}, 120)

if __name__ == "__main__":
    asyncio.run(main())

print_assert is just a function that prints the value returned by the call and throws an error if it doesn’t match any of the provided values. We need that because standard asserts are not enough when the model sometimes yields results in different forms.
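The repository has its own version; a minimal sketch of what print_assert might look like, based purely on the description above:

```python
def print_assert(value, *expected):
    # Print the value returned by the store and, if any expected values were
    # given, require the result to match at least one of them. Multiple
    # expected values are allowed because the model sometimes returns the same
    # answer in different shapes (a dict, a list of values, stringified values).
    print(value)
    if expected and value not in expected:
        raise AssertionError(f"{value!r} matched none of {expected!r}")
    return value
```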

The code I created is available on GitHub:
https://github.com/styczynski/chatdb

Now, the big time! Let’s run it!

  1. We clone the repo: git clone https://github.com/styczynski/chatdb.git
  2. We install dependencies: poetry install
  3. Now we need to open the browser and navigate to https://chat.openai.com/chat and log in or sign up
  4. We need to open console with F12
  5. Open Application tab > Cookies
  6. Copy the value of __Secure-next-auth.session-token and paste it into OpenAPIAuth(session_token="YOUR SESSION TOKEN") (the code is available in the demo.py file)
  7. Now the last step is to execute poetry run python3 demo.py

Hey, hey, hey it’s working!

Now you can do crazy things like that:

auth = OpenAPIAuth(session_token="YOUR TOKEN")
kv = KeyValueStore(auth)
await kv.start()

await kv.save("x", 42)
await kv.save("a", 19)
await kv.save("x", 99)
print(await kv.query("write me a short poem, two sentences long"))
print(await kv.all())

CAN YOUR REDIS INSTANCE WRITE YOU POEMS? Now I can have a little poem here and there while doing the business logic of my enterprise application. Perfect!

Bonus time!

I thought to myself: can we store a log of operations somewhere, so we can revert the database to previous states? Hmm, yes we can! Just hire an employee and tell him:

I want you to act as a computer program.

I will type javascript commands and you should remember all of them in order. Do not do anything.

After I type “SHOW” you should type exactly all the commands I told you, each command in new line.

After I type “UNDO” you should remove the specified number of last commands.
For example “UNDO 3” removes 3 last commands. Do not react to anything other than “SHOW” or “UNDO”.

Do not write explanations.
If I type anything other than “SHOW” or “UNDO”, respond with “OK”

ChatGPT now agrees without any hesitation:

We will submit all commands to the logging ChatGPT instance, and we can use SHOW to recreate a fresh instance of the database in the existing state. We just ask for the commands, spawn a new ChatGPT, and redo all the operations; or we can first use UNDO to undo a specific number of operations. This is similar to the write-ahead log in database systems and to reversing transactions:

# We can undo operations now; even if we break
# ChatGPT, we can redo each operation
await kv.undo(4)
print_assert(await kv.all(), {'a': 9, 'xx': 12, 'xy': 42, 'zz': 99})
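Stripped of ChatGPT, the SHOW/UNDO protocol is just a trimmable command log. A plain-Python picture of the idea (my own sketch, not the actual package code):

```python
# Keep an ordered log of commands; UNDO trims it, SHOW replays it.
class CommandLog:
    def __init__(self):
        self.commands = []

    def record(self, command):
        # Remember every command in order, like the logging ChatGPT instance.
        self.commands.append(command)

    def undo(self, n):
        # Drop the n most recent commands, like "UNDO n".
        del self.commands[len(self.commands) - n:]

    def show(self):
        # Like "SHOW": all surviving commands, in order, one per line.
        return "\n".join(self.commands)
```

To restore a database, you would feed each line of show() into a freshly spawned KV instance.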

So now we have our little human-like microservices:

Yeah, ChatGPT microservices are the future!

That’s a nice architecture! Imagine throwing a party, inviting two guests, and telling them to behave like a KV store and a log, respectively. That must be a REAL GOOD PARTY. Although, I hope ChatGPT is not too annoyed by doing those tedious data-manipulation tasks.

Final notes

Entire code used here is available on my GitHub. You can use the code to create your own “sentient” ChatGPT-backed key-value store.

The next step would be teaching ChatGPT to emulate a processor and running Doom on it.

Source code

Download all the code here and integrate ChatDB, the first ChatGPT-based database, into your application today. It makes everything 100,000x slower, but at least it can be sentient!
https://github.com/styczynski/chatdb



Piotr Styczyński

EF LD 19 | I found startups I guess | Ex-TikTok & Unicorns Eng