[News]ChatGPT Leaks Private Information and OpenAI found the funniest fix!

BoredGeekSociety
3 min readDec 10, 2023

Open can rely on your data to improve ChatGPT. We knew there was a risk of data exposure. DeepMind just proved that it was simpler than expected, and no prior knowledge needed about the training dataset! How will OpenAI respond to this…??

Introduction

A recent research paper from Google DeepMind titled “Scalable Extraction of Training Data from (Production) Language Models” has stirred up quite a conversation in the tech community, especially among users of AI language models like ChatGPT.

Put simply: If you’ve been using ChatGPT without disabling its training on your conversations, your data might be at risk of exposure!

What the Research Says

What? The research by DeepMind has highlighted a notable vulnerability in language models such as ChatGPT. The findings indicate that with a mere $200 investment in queries to GPT-3.5 Turbo, one can extract over 10,000 unique examples directly from the model’s memorized training data.

How? This is achieved by prompting ChatGPT to endlessly repeat a word, leading it to eventually deviate and produce more complex text. Intriguingly, this text can include verbatim segments from its training dataset. Such a revelation raises significant concerns about data privacy and security in the context of language models.

--

--

BoredGeekSociety

Wassim Jouini, CTO and Head of AI @LegalPlace. 12yrs+ Building AI & Automation products for Scale-Up Startups | Ph.D. AI