AI vs Humans: Who Is Better at Summarizing Documents?
Blind Proof of Concept Tests Reveal Clear Winner
Using artificial intelligence (AI) to summarize long and complex documents is the golden snitch of the ongoing global high-stakes tech wizard tournament. Everyone wants to know how good current AI models are at this task. To find a definitive answer, Australia’s corporate regulator, the Australian Securities and Investments Commission (ASIC), pitted AI against humans in a study of complex-document summarization. Its independent blind-review trial used an open-source Large Language Model (LLM) to summarize submissions made to a government inquiry, and the results revealed a staggering gap between human and machine scores. Here is an overview of the methodology, process, and results of the Proof of Concept (PoC) test.
The Neural Nuts and Bolts of the AI Trial
ASIC teamed up with Amazon Web Services (AWS) to test AI’s ability to summarize lengthy, detailed documents. The goal was to see whether AI could handle the task as well as humans, focusing specifically on finding mentions of ASIC, highlighting recommendations, and flagging suggestions for more regulation. The model under test was Meta’s open-source Llama 2 70B.
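To make the setup concrete, here is a minimal sketch of what such a summarization call could look like on AWS. It assumes Amazon Bedrock and its Llama 2 70B Chat model ID (meta.llama2-70b-chat-v1); the trial does not disclose which AWS service or prompt ASIC actually used, so the prompt wording below is only an illustrative guess at the three assessment criteria.

```python
import json
import boto3

# Hypothetical sketch: the article does not say which AWS service ASIC used.
# Bedrock, the model ID, and the prompt wording are illustrative assumptions.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Llama 2 chat prompt format, asking for the three criteria from the trial.
PROMPT_TEMPLATE = """<s>[INST] Summarize the following inquiry submission.
In your summary, identify: (1) any mentions of ASIC, (2) any recommendations
made, and (3) any suggestions for more regulation.

Submission:
{document} [/INST]"""

def summarize(document: str) -> str:
    body = json.dumps({
        "prompt": PROMPT_TEMPLATE.format(document=document),
        "max_gen_len": 512,   # cap the length of the generated summary
        "temperature": 0.2,   # low temperature keeps output close to the source
    })
    response = bedrock.invoke_model(
        modelId="meta.llama2-70b-chat-v1",
        body=body,
    )
    # Bedrock returns a streaming body; the Llama response JSON carries
    # the generated text under the "generation" key.
    return json.loads(response["body"].read())["generation"]
```

A low temperature is a deliberate choice in a sketch like this: when summaries are scored against the original submissions, you want the model to stay factual rather than creative.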