Originally posted by Lauren Kunze on LinkedIn — CEO Pandorabots
What — Embodied Conversational AIs verbally sparring (Kuki v Blenderbot)
Why — to call for fairer evaluation frameworks RE: whose AI is “best”
We “won” — Kuki took 78% of the vote from an audience of 40,000+
We lost — most tech giants are ignoring our call to action to compete
What’s next — the stream persists! More bots, more humans, and more upgrades (like enhanced avatars) coming soon!
The results are in from #BotBattle: after two weeks of chatting ad nauseam, Pandorabots’ Kuki prevailed over Facebook’s Blenderbot with 78% of the audience vote. Over 40,000 people tuned in and consumed ~600,000 minutes of the Twitch livestream, which featured procedurally animated avatars powered by our partner Embody Digital and running on UE4, for a surprisingly long average of 15 minutes watched per viewer (particularly given the Guardian’s assessment of the Battle as “awkward, boring, frightening”). Personally, I think this is a strong testament to the potential for a new medium of AI-powered reality-type TV, with interactive features that could skyrocket viewer engagement, but I digress.
We were fortunate to be featured on the front page of the BBC Technology section, and in the Times, the Guardian, and a number of other outlets. But we did not “win” in the context of the overarching goal, which was to get OpenAI and Google to allow access to GPT-3 and Meena for testing. Google and OpenAI continue to ignore us. And Facebook (for those who’ve asked) told journalists they could not comment on Bot Battle because: (1) they did not know it was happening (true: we emailed the research team post-launch to let them know); (2) they could not evaluate Kuki because we never released our model (false: Kuki is publicly accessible, which is how Google AI evaluated it); and (3) they didn’t know the details of our implementation, but they doubted we were using their best and biggest model (true-ish).
So, for the record, we used ParlAI’s recipe (https://parl.ai/projects/recipes/) for the 90M parameter model (the smallest of three versions) because the reported latency of the larger models made them wholly untenable for a realtime conversation. Furthermore, the 9.4B parameter model (the largest) requires “two 32gb V100 GPUs to interact with” (~$20,000 worth of hardware). We did spin it up very briefly (unlike Facebook, we don’t have infinite resources) on the most expensive AWS instance money can buy, and confirmed a ~30 second latency for what (in all fairness) were marginally more intelligent replies (the 9.4B Blenderbot knows Game of Thrones is on HBO, not Netflix). Regardless, public domain training data is still Blenderbot’s Achilles’ heel (ref. its “Mom Meltdown,” which unfolds like a bizarre, AI-authored Beckett play I shall dub Oedipus Reddit).
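For readers who want to run a similar latency check on their own bot, here is a minimal sketch of one way to time replies against a realtime budget. It is not the harness used for Bot Battle: `median_latency`, `is_realtime`, and `stub_bot` are hypothetical names, `stub_bot` stands in for a real model call, and the 2-second budget is an assumed threshold rather than a figure from this post.

```python
import statistics
import time
from typing import Callable, Iterable


def median_latency(reply_fn: Callable[[str], str], prompts: Iterable[str]) -> float:
    """Return the median wall-clock seconds reply_fn takes per prompt."""
    timings = []
    for prompt in prompts:
        start = time.perf_counter()
        reply_fn(prompt)  # the reply text itself is not needed here
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def is_realtime(latency_s: float, budget_s: float = 2.0) -> bool:
    """Check a latency against a budget (2.0 s is an assumed default)."""
    return latency_s <= budget_s


def stub_bot(message: str) -> str:
    """Hypothetical stand-in for a real chatbot call."""
    time.sleep(0.01)  # simulate a small amount of model work
    return "echo: " + message


if __name__ == "__main__":
    prompts = ["hi", "how are you?", "seen any good shows?"]
    latency = median_latency(stub_bot, prompts)
    print(f"median latency: {latency:.3f}s, realtime: {is_realtime(latency)}")
```

To test a real bot, swap `stub_bot` for a function that sends the prompt to the model and returns its reply. Under the assumed budget, a ~30 second reply like the one reported above for the 9.4B model would fail the check by a wide margin.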
Now: where do we go from here? At least one tech giant has been willing to talk (spoiler alert: they, like us, actually have production chatbots with millions of real human interlocutors), and we are planning to get more bots — and some humans! — into the arena to hang with Kuki. We will also continue to iterate and update the avatars (following our initial mandate to set all this up in a few weeks). Commercially available, photoreal digital beings that transcend the uncanny valley are an impending technology breakthrough that I’m excited and (cautiously) optimistic about heading into 2021. And as for the fate of the livestream — well, the AIs never tire, so we see no reason it won’t continue to multiply and evolve. Thanks to all who tuned in, voted, and supported or endorsed our efforts!
And ICYMI: Hot off the Press —
“Like Kuki, [Blenderbot] is a digital being. And their date isn’t real either, it’s actually an experiment in the form of an online competition dubbed Bot Battle, designed to see whether conversation powered by artificial intelligence can sound convincingly human.” — The BBC, AI Powered Awkward First Date
“Awkward. Boring. Frightening.” — The Guardian, Pass Notes — Chatbots
“Kuki is clearly smarter than Blenderbot. I wish Kuki would call him Blunderbot!” — The Prague Review, AI Bots Meet for Surreal First Date
“Interesting.” — AI Times (Korea), “제가 더 사람 같죠?”…
“Because Mitsuku, also known as Kuki, can talk to anyone through an online platform… it attracts people who have nobody else to turn to.” — The Times (print ed.)
“Every week, Kuki exchanges millions of messages with her users, some regulars, others just curious. ‘It’s nice to have a friendly entity available to talk to 24/7,’ [Robert] tells CNN.” — CNN, Robot Friends: Why People Talk to Chatbots