My last post explained why it makes sense to use neural networks for invoice extraction. Now we will take a look at how different neural network architectures perform on this task.
The development and optimization of neural network models has always been a heuristic domain. In recent years, deep learning research and its applications have followed one of the following paths:
Managing incoming documents in a company can be a very labor-intensive task, especially if you are working in accounting, where most invoices still arrive on paper by post. Recent studies have shown that German SMEs (small and medium-sized enterprises) send out an average of 1,895 invoices per month, of which 79% are still on paper. Larger companies deliver up to 90% of their invoices on paper. Even though some companies would like to switch to electronic formats or offer them in addition, the lack of standardization and system compatibility impedes these efforts.
So, what happens to all those printed invoices once they reach the recipient? Someone has to sort and read them in order to extract the relevant pieces of data (also called entities) and assign them to categories. In total, Hypatos has identified up to 70 different types of entities in invoices. Here are a few examples:
- total price,
- unit price,
- tax amount,
- sender address,
- sender bank account’s IBAN and BIC,
- product name and description,
- number of units bought,
- due date for payment. …
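
To make the idea of entities more concrete, here is a minimal sketch of how the examples above could be held in a single record. The class name `InvoiceEntities`, its field names, and the `missing` helper are assumptions made for illustration; they are not Hypatos' actual schema, and the values in the usage lines are made up.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class InvoiceEntities:
    """Hypothetical container for a few of the entity types listed above.

    Every field is optional because an extractor may not find every
    entity on every invoice.
    """
    total_price: Optional[Decimal] = None
    unit_price: Optional[Decimal] = None
    tax_amount: Optional[Decimal] = None
    sender_address: Optional[str] = None
    sender_iban: Optional[str] = None
    sender_bic: Optional[str] = None
    product_name: Optional[str] = None
    product_description: Optional[str] = None
    units_bought: Optional[int] = None
    payment_due_date: Optional[date] = None

    def missing(self) -> List[str]:
        """Return the names of all entities that were not extracted."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Usage with made-up values: only three entities were found.
invoice = InvoiceEntities(
    total_price=Decimal("119.00"),
    tax_amount=Decimal("19.00"),
    payment_due_date=date(2019, 1, 31),
)
print(invoice.missing())
```

Amounts are stored as `Decimal` rather than `float` so that monetary values are not subject to binary rounding errors.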