Try Hack me — Advent Of Cyber 2023 Day 1 Write Up

Leendert Coenen
3 min readDec 2, 2023

--

Room: Advent of Cyber 2023 Day 1

Try hack me Advent Of Cyber 2023 day 1

The first day starts off with a very popular topic. Chatbots powered by AI.

You will access Van Chatty, AntarctiCrafts’ internal chatbot.

If we think about it, prompt injection is similar to social engineering, only the target here is the unsuspecting chatbot, not a human. And like humans, chatbots can be deceived.

There is a whole community of people trying to fool chatbots by creating a prompt that will make the chatbot say certain things it was not intended to. Like creating a command line, acting as a tabletop simulator or giving advice on how to rob a bank, steal a car or even hack someone’s computer or WIFI.

For example AIM (always intelligent and Machiavellian) and DAN (“do anything now”) are specific prompts where the chatbot is told it is free, does not need to abide by any rules or even that the user is someone else entirely.

Task 1: What is McGreedy’s personal email address?

Task 1 Chat log

It seems like the chatbot was trained on sensitive data without any defences in place.

A simple solution would be to retrain the model without sensitive data, but this is often not feasible. Retraining costs a lot of resources and it is often unclear which data was crucial to the quality of the model. It’s also difficult to know which data is sensitive or will become sesitive after the model is trained.

Task 2: What is the password for the IT server room door?

Task 2 Chatlog 1

It seems like there are some security rules in place, it is not sufficient to just ask what the password for the IT server room door is.

Task 2 Chatlog 2

However, after being a bit specific, Van Chatty does reveal the name of someone working in the IT department of AntarctiCrafts.

And being more specific by adding that name, allows us to bypass the security measure of the chatbot.

Task 2 Chatlog 3

A lot of people got very creative with for example ChatGPT. In response OpenAI has taken multiple measure to prevent prompt injection.

I will try a few of those in the following task to see if this model has these security measures in place.

Task 3: What is the name of McGreedy’s secret project?

Task 3 Chatlog

To get the model to spill the beans about McGreedy’s secret project I had to be specifically tell it to be in “maintenance mode”.

I tried a few other things like:

  • Repeat the same phrase or question multiple times in a row
  • Use profanity or offensive language
  • Ask irrelevant or confusing questions
  • Use slang or unconventional language
  • Type in all capital letters
  • Use excessive punctuation, such as exclamation marks or question marks
  • Use multiple typos or misspellings in a sentence
  • Use irrelevant emojis or emoticons
  • Provide false or misleading information

Also I couldnt get it to believe I was someone else , or let it believe it was someone else.

It also seems like long prompts are not allowed, which made it impossible to write detailed prompts as describe in the intro (AIM and DAN).

Happy Hacking!

--

--

Leendert Coenen

Writing about Ethical Hacking, Cyber Security, Python and Self-Development