Keeping 👀 on AI tools with Marvin

Published in The Prefect Blog · 5 min read · Jun 16, 2023

a tapestry of random, potentially related events

Since March 14, when I posted about instrumenting LLM tools with Prefect, a whole lot has happened in the AI tooling landscape.

Most significant to my world, Marvin has taken on a life of its own. Leveraging the diverse LLM support from the open-source community, the data engineering amenities offered by Prefect, and the type safety of Pydantic, Marvin has grown up to be a Pythonic LLM toolbox that you can drop right into traditional software contexts to solve problems. If you want to learn more about how Marvin works, read the docs!

expanding the original use case

Making a Slack-based assistant for our Prefect Community was the start of all things Marvin, and it has been the use-case where we prove out our concepts to make sure they work in practice.

Today we’ll look at a concrete example of observability in an AI system — namely how we can track Marvin’s global token usage with the Events and Automations systems offered by Prefect Cloud (generous free tier btw).

background

You can chat with our Prefect assistant bot (affectionately, also Marvin) in the #ask-marvin channel in our Slack community. There’s a parallel Marvin bot we run internally to assist with tasks like support, debugging, and summarizing internal process documents.

Each time you or I tag any of our Marvins, Slack will hit that Marvin app’s FastAPI endpoints with the Slack event data and prompt a Prefect-aware Bot with the user question.

Each time, the bot will leverage an LLM (like GPT-4) and may use an arbitrary number of tools in order to answer the user question. You can learn more about the state of Marvin’s OpenAI tool use here.

Between LLM calls and plugin execution, this can be time and token intensive stuff!

Prefect Cloud offers the idea of an Event, which is just a named and timestamped payload that we can emit from anywhere and then react to.

emitting events

Once Marvin settles on a response and sends it to you, we’ll run a little Prefect routine called emit_event from inside the endpoint:

from prefect.events import emit_event

...

async def _slackbot_response(event: SlackEvent):

    # other things happening ...

    response = await bot.say(text)

    slack_response = await _post_message(
        channel=event.channel, message=response.content, thread_ts=event.ts
    )

    prompt_tokens = count_tokens(text)
    response_tokens = count_tokens(response.content)

    # this will do nothing if Prefect credentials are not configured
    emit_event(
        event=f"bot.{bot.name}.responded",
        resource={"prefect.resource.id": f"bot.{bot.name}"},
        payload={
            "user": event.user,
            "channel": event.channel,
            "thread_ts": thread,
            "text": text,
            "response": response.content,
            "prompt_tokens": prompt_tokens,
            "response_tokens": response_tokens,
            "total_tokens": prompt_tokens + response_tokens,
        },
    )
    
    # more things happening ...

… basically saying, “hey! Marvin responded in Slack” to Prefect Cloud.
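The snippet calls count_tokens without showing it. Here's a rough stand-in for how the token fields of the payload could be computed. To be clear, approx_count_tokens, token_fields, and the four-characters-per-token heuristic are my own assumptions for illustration, not Marvin's actual helper; a real tokenizer would give exact counts.

```python
import math


def approx_count_tokens(text: str) -> int:
    """Crude estimate: assume roughly 4 characters per token (hypothetical)."""
    return math.ceil(len(text) / 4)


def token_fields(prompt: str, response: str) -> dict[str, int]:
    """Build the token-related part of the event payload."""
    prompt_tokens = approx_count_tokens(prompt)
    response_tokens = approx_count_tokens(response)
    return {
        "prompt_tokens": prompt_tokens,
        "response_tokens": response_tokens,
        "total_tokens": prompt_tokens + response_tokens,
    }
```

The resulting dict could be merged into the payload passed to emit_event, next to the user, channel, and text fields.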

As a technical note, our community Marvin bot is running as a GCP Cloud Run service and has two important environment variables set:

PREFECT_API_KEY=pnu_XXX
PREFECT_API_URL=https://api.prefect.cloud/api/accounts/<UUID>/workspaces/<UUID>

which allow it to talk to our Prefect Cloud workspace.
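Since emit_event quietly does nothing without credentials, it can be handy to check at startup that both variables are actually there. A minimal sketch, assuming the settings come only from environment variables (Prefect can also read them from a profile, which this doesn't look at); missing_prefect_settings is a hypothetical helper, not part of Prefect:

```python
import os

# The two variables named above
REQUIRED_VARS = ("PREFECT_API_KEY", "PREFECT_API_URL")


def missing_prefect_settings(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Logging a warning when the returned list is non-empty would make a misconfigured deployment easier to spot than silently missing events.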

So after I emit my event, I can pop into the marvin-bot workspace’s Event Feed and see some details about it:

Now this event lives in the Events DB of my Prefect Cloud workspace, so I can query events over time for token usage by user or whatever else.
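As a sketch of what "token usage by user" might look like, suppose the event payloads have already been pulled out of the workspace into a list of dicts (how they get fetched isn't covered here). The tokens_by_user function and the sample user ids are hypothetical; the field names match the payload emitted above.

```python
from collections import Counter


def tokens_by_user(payloads: list[dict]) -> dict[str, int]:
    """Sum total_tokens per user across a list of event payloads."""
    totals = Counter()
    for payload in payloads:
        # Payloads without a total_tokens field count as zero
        totals[payload["user"]] += payload.get("total_tokens", 0)
    return dict(totals)


# Made-up sample payloads, for illustration only
sample = [
    {"user": "U123", "total_tokens": 120},
    {"user": "U456", "total_tokens": 80},
    {"user": "U123", "total_tokens": 30},
]
usage = tokens_by_user(sample)
```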

Being able to track aggregate token usage is great, but most exciting to me is the idea of using this event to immediately trigger other actions.

the automations system

Automations in Prefect Cloud have the form: Trigger > Action

So, with Prefect Cloud as a listener, I can have the receipt of certain events trigger an additional action. Here’s my trigger:

Where any events with resource ids like `bot.*` will trigger my automation

… which reads as: trigger this automation once I receive 1 event where the resource id looks like bot.*.
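Prefect Cloud evaluates the trigger on its side, but the matching idea is easy to mimic locally. A minimal sketch, assuming glob-style semantics for the `bot.*` pattern; matches_trigger is a hypothetical helper and says nothing about how Prefect implements matching internally:

```python
from fnmatch import fnmatchcase


def matches_trigger(event: dict, pattern: str = "bot.*") -> bool:
    """Check whether an event's resource id matches a glob-style pattern."""
    resource_id = event.get("resource", {}).get("prefect.resource.id", "")
    return fnmatchcase(resource_id, pattern)


# Shaped like the resource dict passed to emit_event earlier
marvin_event = {"resource": {"prefect.resource.id": "bot.marvin"}}
other_event = {"resource": {"prefect.resource.id": "flow-run.abc"}}
```

Here marvin_event matches the pattern and other_event does not, which is the behavior I want from the automation.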

Now, we decide what our action will be. Let’s choose Run a deployment, since I already implemented a small flow to increment a token tracker:

I can use Jinja to template event payload data into string fields

Once I finish configuring the automation, a flow will run each time a user tags Marvin in Slack, incrementing my token tracker.

Flow run timeline view for token tracker flow
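The token tracker flow itself isn't shown in this post, so here's only a guess at the shape of its core logic. This sketch assumes the counts live in a local JSON file keyed by user, which is my simplification; increment_token_tracker is a hypothetical name, and in a real deployment the function would presumably be wrapped as a Prefect flow.

```python
import json
from pathlib import Path


def increment_token_tracker(path: Path, user: str, total_tokens) -> int:
    """Add total_tokens to the user's running count and return the new count."""
    counts = json.loads(path.read_text()) if path.exists() else {}
    # Templated parameters arrive as strings, so coerce before adding
    counts[user] = counts.get(user, 0) + int(total_tokens)
    path.write_text(json.dumps(counts))
    return counts[user]
```

The int() coercion is there because values templated into string fields would reach the flow as text rather than numbers.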

Now, if I only ever wanted to increment a token tracker, I could have just done that directly from within the _slackbot_response function in the Marvin app itself, without emitting an event that triggers a Prefect flow.

However, the larger idea here is that I can easily do arbitrary, observable work by eventing directly off of the user’s interaction with my application.

I can maintain global state with Prefect Cloud and use it in many ways:

To govern the state transitions of tasks within my application, and observe each unit of work performed (especially when the LLM is the one deciding how and when to use additional tools)
I could moderate user-bot interactions with my application(s), in a way that I can modify without re-deploying the Marvin service (i.e., edit an ACL Block in Cloud; the app reads from the Block in Cloud)
If I were using my own self-hosted LLM for my Marvin app, I could have this interaction trigger a flow to update a domain-specific training set
I could service user requests received by the Marvin app which require work from different applications, using the Prefect Cloud Events and Automations systems as the router of that work

going forward

I’m excited to expand the set of events that are emitted by Marvin, and in turn expand the set of actionable information I can leverage as a builder.

Having spent a lot of time in batch-schedule-land historically, it’s lovely to enter a fully event-driven world without breaking out a whole new streaming solution / tool.

As for Marvin’s future, we’ll be working on first-class, opt-in Event instrumentation of Marvin’s LLM utilities to make it as easy as possible to build transparent and reactive systems to control our use of AI in code.

Let us know if you have any questions in our Prefect Slack community or our Marvin Discord.

—

Happy Engineering!