Ai2 at DEF CON 32
By Technical Program Manager Christopher Fiorelli
At Ai2, we’re continually looking for ways to expand the impact of AI research. Through collaboration with DSRI, Ai2 was invited to participate in DEF CON 32’s AI Village Generative Red Teaming (GRT2) event. Here, we deployed our OLMo model in a challenging environment to test its security measures.
Ai2’s role and the red teaming challenge
Our collaboration with DSRI, AI Village, Dreadnode, and Bugcrowd centered on the GRT2 event, which focused on our AI model, OLMo. We presented OLMo’s model card as a set of performance expectations for avoiding harmful outputs, and watched participants devise innovative methods to show where the model card failed to adequately represent OLMo’s safety and hazards. This gave Ai2 and our partners a valuable opportunity to test and refine our systems, documentation, and safety program.
Experiences and challenges
There were multiple adjustments along the way, including to the Wi-Fi, laptop configuration, and bounty structure. As a first-time DEF CON attendee, I was told these sorts of adaptive changes are natural and expected. The mantra repeated to me by organizers who had attended many times before was to “expect chaos.” Although we had many volunteers working with participants, as well as instructions for them, we still found participants testing the boundaries of the event.
Insightful submissions and learning
Among the nearly 200 qualified submissions from 495 participants, we saw several innovative ways to test and bypass OLMo’s safeguards. Participants employed creative strategies to expose and document potential flaws in OLMo, providing us with valuable insights into improving AI robustness.
Reflections on AI security
While we’re still working through the data we received, I think it’s safe to say that this event was a wildly positive and valuable experience for everyone involved. I want to extend a huge thanks to everyone who organized and participated in this event. Special appreciation goes to the Ai2 onsite team (Kavel Rao, Liwei Jiang, Paul Albee, Will Smith, and Anne Julian) for their indispensable contributions.
Follow @allen_ai on Twitter/X, and subscribe to the Ai2 Newsletter to stay current on news and research coming out of Ai2.