The Limits of AI Consensus: Why Different Systems Yield Different Answers

Mar 20, 2023

Lately, I’ve been experimenting with multiple AI systems, including My AI by Snapchat, Bing in Skype, Bing in Edge, ChatGPT, and Claude. Most of them give me almost the same answer, but Bing in Skype and Bing in Edge respond differently even though they use the same model.

When I asked ‘floating button should be at the bottom even if the snack bar shows up in flutter’, My AI asked for more context, while ChatGPT attempted an answer. That wasn’t surprising. What was interesting was that Bing in Skype gave an answer similar to ChatGPT’s, while Bing in Edge gave one of the most accurate answers to my question.

As shown in the screenshot, Bing in Edge searched more than ten websites to find the most accurate answer to my question, which involved SnackBarBehavior.floating.
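
For reference, here is a minimal sketch of how SnackBarBehavior.floating can be set on a SnackBar in a Scaffold that also has a FloatingActionButton. It is an illustration under assumptions, not the code any of the assistants returned: DemoApp, DemoPage, and the button labels are hypothetical names, and it assumes a Flutter version recent enough to support `super.key`. With the floating behavior, the SnackBar is drawn above the button instead of pushing it up.

```dart
import 'package:flutter/material.dart';

void main() => runApp(const DemoApp());

// Hypothetical app shell, used only to host the example page.
class DemoApp extends StatelessWidget {
  const DemoApp({super.key});

  @override
  Widget build(BuildContext context) {
    return const MaterialApp(home: DemoPage());
  }
}

// Hypothetical page with a FloatingActionButton and a button that shows a SnackBar.
class DemoPage extends StatelessWidget {
  const DemoPage({super.key});

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: const Text('SnackBar demo')),
      body: Center(
        child: ElevatedButton(
          onPressed: () {
            ScaffoldMessenger.of(context).showSnackBar(
              const SnackBar(
                content: Text('Saved'),
                // Floating: the SnackBar is shown above the FAB,
                // so the FAB keeps its position at the bottom.
                behavior: SnackBarBehavior.floating,
              ),
            );
          },
          child: const Text('Show SnackBar'),
        ),
      ),
      floatingActionButton: FloatingActionButton(
        onPressed: () {},
        child: const Icon(Icons.add),
      ),
    );
  }
}
```

Switching `behavior` to SnackBarBehavior.fixed in the same sketch should show the contrasting case, where the SnackBar sits at the bottom of the Scaffold and the button is moved above it.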

The other AI systems also tried to answer my question; screenshots of their responses are attached below.

As shown in the screenshot above, Bing in Skype and ChatGPT provide similar answers, despite differences in their code. This suggests that they take a similar approach to the problem.

Bard also attempted to fix the issue but was unsuccessful. The code it provided was incorrect and did not match Bard’s own explanation.

Conclusion: It’s great to see that all the AI systems are generating output and trying to assist the user! While prompt engineering is an important skill to learn, it’s also important to consider which platform you use to access each AI, since the platform can affect the AI’s performance and capabilities.

Written by Bharat Makwana