WattzOn’s Mr Bill Surpasses Human Accuracy in Text Extraction from Structured Documents

Martha Amram
GLYNT.AI
Apr 17, 2018
Data from forms, tables, invoices and utility bills is valuable. Now it is easy to get, too.

This is a big day at WattzOn. Our team has done a fantastic job: the performance of our machine learning system, Mr Bill, has been studied and measured, and its accuracy exceeds that of human data entry teams and current human/software systems. So for high-value data that must be lifted out of forms, invoices, utility bills and other structured documents, Mr Bill is the accurate, scalable and speedy solution. See the press release announcing the results and describing our study in detail.

In our work with customers across a diverse range of industries, we find that automated text extraction performance comes down to three things:

— Precision: Does the system return accurate data?

— Recall: What percent of the data items that were supposed to be read come back with data?

— Training and Setup Costs: High document volume always comes with document variety. The ROI of text extraction often gets bogged down when large training sets are needed, or when customized, fragile extraction systems have to be set up for every single variation in document layout.

Mr Bill’s performance on these three key factors is extraordinary. As the press release reports, our study found F1 scores of 96–98%. The F1 score is the harmonic mean of Precision and Recall, so a high F1 means both are high. And Mr Bill needs only 20 example documents for training; document setup costs are vanishingly small.
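For readers who want to see how Precision, Recall and F1 fit together, here is a minimal sketch in Python. It uses the standard textbook definitions, which may differ in detail from how our study scored individual fields. The function name, the field names and the sample values are all made up for illustration; none of them come from Mr Bill or from the study.

```python
def extraction_scores(expected, extracted):
    """Return (precision, recall, f1) for one document's extracted fields.

    Hypothetical helper for illustration only, not Mr Bill's scoring code.
    `expected` maps field names to ground-truth values; `extracted` maps
    field names to the values the system returned (None if nothing came back).
    """
    # Fields for which the system actually returned a value.
    returned = {k: v for k, v in extracted.items() if v is not None}
    # Returned values that exactly match the ground truth.
    correct = sum(1 for k, v in returned.items() if expected.get(k) == v)

    precision = correct / len(returned) if returned else 0.0
    recall = correct / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up utility bill: one field misread, one field missing.
expected = {
    "account_number": "12345",
    "total_due": "87.20",
    "kwh_used": "412",
    "due_date": "2018-05-01",
}
extracted = {
    "account_number": "12345",
    "total_due": "87.20",
    "kwh_used": "421",   # misread digits: hurts precision and recall
    "due_date": None,    # nothing returned: hurts recall only
}

precision, recall, f1 = extraction_scores(expected, extracted)
print(f"{precision:.2f} {recall:.2f} {f1:.2f}")  # prints "0.67 0.50 0.57"
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two numbers, so a system cannot reach a high F1 by doing well on only one of them.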

It’s a happy day here at WattzOn. Give us a shout if you want to learn more.
