AI agents will change everything. Here’s how to get your own

Jody Doherty-Cove
Dec 4, 2023


This piece was written ahead of my practical training workshop at the NCTJ’s Artificial Intelligence in Journalism event.

Imagine a world where the tedious tasks you dread are handled not by you, but by a smart digital assistant. That’s the reality OpenAI made possible on November 6, 2023. They launched a feature that lets anyone, regardless of their tech skills, create a custom ‘GPT’ — a digital helper programmed for specific jobs. This isn’t just a small step; it’s a giant leap in making AI accessible to everyone.

But this isn’t just about delegating tasks; it’s about supercharging your capabilities in an unprecedented way.

AI agents are here and this is how you get your own

Picture this: not just one, but hundreds of these digital helpers at your beck and call, each expertly programmed for specific tasks. Whether it’s sifting through data or organising your schedule, these assistants will eventually be able to do it all, and more. The beauty lies in their versatility and the sheer scale of what they can achieve, working simultaneously to amplify your productivity and efficiency.

In the upcoming sections, we’ll explore the inner workings of these digital assistants through three components: ‘Instructions,’ which set out their purpose and goals; ‘Knowledge,’ the pool of information they tap into; and ‘Actions,’ how they engage and interact with the wider digital world.

This innovation is particularly significant in the context of journalism. It transcends mere efficiency; it’s about expanding the horizons of journalistic capabilities. Digital assistants can undertake a range of tasks — from in-depth research to data analysis and preliminary reporting — freeing journalists to delve deeper into their stories, using traditional human intelligence to craft narratives that resonate and inform.

In this blog, I will take a simple idea, a bot dedicated to the noble task of submitting Freedom of Information requests on my behalf, and build it step by step. We will delve into the core concepts that make such a creation possible, and then we’ll take a look at what these new, formidable GPTs mean in the short and long term.

First, head to this link.

FYI: You will need a ChatGPT Plus account to create your own GPT.

Giving it purpose

Using instructions

Every digital agent requires a purpose, its raison d’être. Think of this purpose as a need you wish to fulfil. For me, the goal is to simplify the process for journalists to send Freedom of Information (FOI) requests to authorities. Everything we do from this point forward will be centred around this objective.

To begin, I simply communicate with the GPT creator, explaining my desired outcome. This interaction automatically tailors the ‘Instructions’ to align with my request, setting this agent apart from ChatGPT by giving it a clearly defined role. Conveniently, it even generates its own name and profile picture.

The chat to create the GPT, left, and GPT’s self-configured instructions based on the chat, right

Now let’s take it for a test run.

I pose a query: “How much did Oxford council spend on tea and biscuits in the first half of this year?” Initially, the responses may vary, reflecting the broadness of its purpose. It’s set to ‘assist,’ but this isn’t quite specific enough. Like humans, our digital agents excel when their purpose is crystal clear, so we’ll need to add more detail.

Without defining exactly what we want it to do, it will guess what it needs to do — specificity is key

In this instance, I need the assistant to draft the FOI email — not offer suggestions or general advice like before. To amend this, I simply go back to the conversation interface and give it more precise instructions, as I would when speaking to a person.

By speaking to our GPT, left, we can get clearly defined output, right

Now, you can see that it has refined its instructions for specificity. When we ask the question again, we get the exact response we’re looking for.

This is a significant step forward. Having an assistant draft an email saves time, but let’s push it further.

I ask, “Who should I send this to?” My ever-helpful agent replies with a list of places I might be able to find FOI addresses. The assistant’s reply, while charming in its ‘do-it-yourself’ ethos, isn’t exactly what I’d call a full-throttle use of AI’s brainpower. Time to beef up its ‘Knowledge’ base.

Giving it knowledge

Using files and search

Currently, we’re interacting with a Large Language Model (LLM), which excels in language but isn’t primarily designed for retrieving specific information. Anyone who’s ever gotten a bizarre, left-field answer from GPT can vouch that it isn’t exactly Sherlock Holmes when it comes to digging up facts.

Our goal is to enhance its knowledge base, guiding it on where and when to find the necessary information. Put simply, an LLM predominantly predicts the next word or sentence based on your input. This approach, while ingenious, doesn’t always lead to the most reliable outcomes. To elevate our GPT’s capabilities beyond educated guesses, we need to alter its behaviour to perform targeted tasks.

It’s similar to the difference between the LLM correctly guessing ‘2+2 = 4’ based on its training data, and knowing when to use a calculator to verify that ‘2+2 = 4’.
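To make the calculator analogy concrete, here is a minimal Python sketch of the two behaviours. It is an illustration only, not how GPTs work internally: the `remembered_answer` function is a made-up stand-in for a model recalling something from training data, and `calculator` is a small tool that actually works the sum out.

```python
import ast
import operator

# Arithmetic operators the calculator tool is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Work out a simple arithmetic expression such as '2+2' exactly."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


def remembered_answer(expression: str):
    """Hypothetical stand-in for a 'guess': it only knows sums it has seen."""
    seen_in_training = {"2+2": 4}
    return seen_in_training.get(expression.replace(" ", ""))


def answer(expression: str):
    """Prefer the tool; fall back to the remembered guess only if it fails."""
    try:
        return calculator(expression)
    except (ValueError, SyntaxError):
        return remembered_answer(expression)
```

Both routes agree on `2+2`, but only the tool can handle a sum the "memory" has never seen, such as `1234*5678`. That gap is what we close by telling a GPT when to look something up rather than guess.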

With this in mind, we must teach our assistant when and where to obtain required information. GPTs provide a couple of options for this: A) We can upload our knowledge in documents, or B) We can instruct it to search the web for knowledge.

When it comes to uploading our own knowledge, we can use files like PDFs or CSV documents as a reference source. For our project, we could upload a list of email addresses for every authority under the sun. Handy, but that’s like keeping a pet dinosaur — high maintenance.
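For a sense of what such a reference file amounts to, here is a small Python sketch of looking up a contact in a CSV. The column names, authorities and addresses are all invented placeholders, and this is not what the GPT runs behind the scenes; it simply shows the kind of lookup an uploaded list makes possible.

```python
import csv
import io

# Hypothetical file contents: placeholder authorities and addresses only.
AUTHORITIES_CSV = """authority,foi_email
Example Borough Council,foi@example.org
Sample County Council,foi@example.com
"""


def load_contacts(csv_text: str) -> dict:
    """Read the CSV into a dictionary keyed by lower-cased authority name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["authority"].strip().lower(): row["foi_email"].strip()
        for row in reader
    }


def find_foi_email(contacts: dict, authority: str):
    """Return the stored address, or None if the authority is not listed."""
    return contacts.get(authority.strip().lower())


contacts = load_contacts(AUTHORITIES_CSV)
print(find_foi_email(contacts, "Example Borough Council"))
```

The lookup is only as good as the file: any authority missing from the list, or any address that has since changed, comes back empty or wrong, which is the maintenance burden described above.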

Alternatively, GPTs can leverage web search capabilities, like using Bing, to find information. In our scenario, I’ll instruct the GPT to search the web for the relevant authority’s contact address before drafting the email. This process involves going back to the chat interface and specifying that the assistant should perform a web search to find the required information.

Now, my GPT will use its browsing capability to find the relevant email

Now, when I pose my question, the assistant will automatically locate the relevant contact information before composing the email. Impressively, it even displays its search process and the websites it visits during the task. Upon delivering the answer, it provides the source of its information.

So, we’ve now established a clear purpose in its instructions and guided it on enhancing its knowledge. The next step is enabling it to interact effectively with the external world, which involves giving it the metaphorical ‘hands’ it needs through actions.

Giving it hands

Using actions

Great, our email is drafted, and we even have the address to send it to. But there’s one more step, and in my view, it’s the most groundbreaking of all — ‘Actions’. Up until now, our interaction has been confined to a chatbot, limited to the realms of our conversation. But imagine unleashing the full potential of our assistant, enabling it to perform tasks on our behalf.

Consider our scenario: the GPT drafts the email, but we’re still manually copying and pasting the content. What if our digital agent could autonomously send it for us?

A word of caution: this is where things get technical and we encounter limitations. Technically, you can enable ‘Actions’ under the configure tab, which allows the GPT to interact with other programs. This could range from sending information to other apps or fetching specific data like weather forecasts.

You can add ‘actions’ in the GPT’s configure tab
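For a flavour of what sits behind that tab: actions are described to the GPT with an OpenAPI schema, a structured description of the service it is allowed to call. The Python sketch below builds a minimal one for an imaginary email-sending service. The server URL, the `/send-email` path, the `sendFoiEmail` name and the field names are all assumptions made up for illustration, not a real service or the setup I used.

```python
import json

# A minimal OpenAPI description of a hypothetical email-sending service.
# Every name and URL below is a placeholder for illustration.
send_email_schema = {
    "openapi": "3.1.0",
    "info": {"title": "FOI email sender (example)", "version": "1.0.0"},
    "servers": [{"url": "https://example.com/api"}],
    "paths": {
        "/send-email": {
            "post": {
                "operationId": "sendFoiEmail",
                "summary": "Send a drafted FOI request by email",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "properties": {
                                    "to": {"type": "string"},
                                    "subject": {"type": "string"},
                                    "body": {"type": "string"},
                                },
                                "required": ["to", "subject", "body"],
                            }
                        }
                    },
                },
                "responses": {"200": {"description": "Email accepted"}},
            }
        }
    },
}

# Serialise to JSON text, the form a schema is usually shared in.
schema_text = json.dumps(send_email_schema, indent=2)
print(schema_text)
```

The schema tells the GPT which fields it must supply (`to`, `subject`, `body`) and where to send them; the service at the other end still has to exist and do the actual sending.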

This process is more complex than previous configurations, so I won’t dive into a detailed tutorial here. If you’re eager to explore this, using Zapier Plugins is a quick entry point, and here’s a YouTuber who explains it brilliantly.

But let’s talk about the limitations. In my FOI example, setting up the email sending function using the Zapier method involved quite a bit of trial and error. More importantly, the system doesn’t always flawlessly execute instructions (but then again, who does?).

It sometimes stumbles, like when it sent an email to Brighton and Hove City Council without allowing for a review of the draft as it had been instructed to do. Thankfully, no harm was done — the email was fine — but this incident highlights a crucial point. When we give AI the power to interact with the external world, it becomes increasingly important that our digital creations listen to all our orders. It is also a reminder that with great power comes great need for a ‘STOP’ button.
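One way to think about that ‘STOP’ button is to enforce the review step in the service that does the sending, rather than relying on the agent to follow an instruction. The Python sketch below is an assumption about how such a gate could look, not part of the GPT builder or of my Zapier setup; `Draft`, `approve` and `send` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    """An email the agent has written but a human has not yet reviewed."""
    to: str
    subject: str
    body: str
    approved: bool = False


class NotApprovedError(Exception):
    """Raised when something tries to send a draft nobody has approved."""


def approve(draft: Draft) -> None:
    """Called only after a human has read the draft and is happy with it."""
    draft.approved = True


def send(draft: Draft, deliver) -> None:
    """Hand the draft to `deliver` only if a human approved it first."""
    if not draft.approved:
        raise NotApprovedError("Draft must be reviewed before sending")
    deliver(draft)


# Usage: `outbox.append` stands in for a real email delivery function.
outbox = []
draft = Draft("foi@example.org", "FOI request", "Please provide...")
try:
    send(draft, outbox.append)   # blocked: nobody has reviewed it yet
except NotApprovedError:
    pass
approve(draft)
send(draft, outbox.append)       # now it goes through
```

With a gate like this, an agent that skips the review step gets an error instead of a sent email.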

Despite these limitations, my FOI agent is a star employee. They helped me to dispatch eight FOI requests in just ten minutes, covering everything from local restaurant hygiene reports to authorities’ emergency response plans.

Going forward

Looking ahead, the simplicity of creating these virtual assistants is only going to improve, opening up a world of possibilities. But what does this mean for us?

As I mentioned at the beginning, the ease of crafting these digital agents is a massive leap towards democratising computer intelligence. In the past, something like this would require an army of developers, product managers, and a constant cycle of upkeep. Now, it’s within reach for just about anyone, and it doesn’t break the bank.

We’ve walked through creating one digital agent. Now, let your imagination run wild — think 10, 20, 50, or even 100 personal digital minions, each a specialist in its own right, all working towards your goals. Ponder an overlord bot, a sort of digital ‘middle manager,’ orchestrating this symphony of assistants, aligning their tasks with your ambitions. This scenario might seem like science fiction, but given the rapid evolution of this technology, it could be closer than we think.

A workforce of AI agents under your command?

But let’s not get too far ahead of ourselves; consider the immediate impact first. Take our FOI example: if everyone begins submitting requests at lightning speed with the AI’s help, how will our public offices keep up? Might this necessitate a rethink of FOI policies or require more resources for processing these requests? And this is just one application. When you start to factor in the myriad tasks GPTs are capable of handling, the potential changes to our daily lives and work environments are profound.

There’s a saying that people tend to overestimate a technology’s impact in a year and underestimate it in a decade. As we stride into 2024, we should keep one critical question in mind: How do we harness this rapidly advancing technology to chase our goals… and do it safely?
