A completely generic scan solution using Microsoft Fabric and GPT-4 Turbo with Vision

Ronald Vermeire
Heroes Herald
Published in
4 min readJan 25, 2024

In my previous blog post, I wrote about ESG (reports about environment, social and governance). When I was thinking of the best way to describe this solution, the same term crossed my mind namely: Elegant, Simple and Generic. But I do not want to make it too confusing as this scanning solution is also part of our ESG-solution. An important part of the emissions from yourself or others will be taken from invoices, receipts, and other document sources. However, this solution is much more versatile and in no way exclusively tied to our ESG-solution. It is entirely independent and can replace any scanning solution.

Fabric Lakehouse

This implementation starts with a simple folder structure in Microsoft Fabric. Fabric is the place within the Azure cloud for everything to do with data. This includes data engineering, as well as data science and analytics, warehouses and of course PowerBI. For this solution, I am using the so-called Lakehouse, a component of data engineering.

The various parts of Microsoft Fabric

Other than the holiday feeling it gives me, it is also a platform for saving, managing, and analysing structured and unstructured data in one location. It is a flexible and scalable solution with which organisations can process and analyse enormous amounts of data.

A Lakehouse can also be reached through the“File Explorer”

Fabric Notebook

Another component that we use intensively in this solution is the so-called Notebook. It is a development environment in the cloud that allows you to get started immediately without having to set up anything in advance.

Left is our Lakehouse and right the Notebook with limited configuration necessary.

It works intuitively and gives fast results with little coding. Under the hood, it works with the opensource solution Apache Spark, an engine that enables you to perform data engineering and analysis tasks on a large scale. By working with this open-source engine and open file formats, you prevent your data from becoming locked in. Another important plus are the built-in data visualization capabilities.

Data visualisation in a Fabric Notebook

ChatGPT with Vision

The star of this solution is ChatGPT with Vision. ChatGPT was already incredibly skilled in performing tasks without any special training. But with vision, the model is multimodal, so it can see, hear, and speak. For this, we use the seeing part to scan images, we even convert textual documents to images as we see no data losses in our tests. If this were to happen unexpectedly, we have an extensive safety net.

The revolutionary thing about this, is that it will no longer be necessary to configure a separate scan for every file format (PDF, Word, Excel, etc.). The drawing of boxes around fields is completely in the past. One piece of coding for all your scans, however unlikely that may sound. The only thing needed, is composing a CSV-file with column headers. We have already begun experimenting with leaving this to ChatGPT as well and it is looking very promising.

A small selection of the infinite variety of file formats that are automatically supported. On the right an example of scanned receipts in a structured table format.

Conclusion

With the components described above, our simple solution elegantly comes together like Lego blocks. The only thing we need to do, and what we are particularly good at, is handling data and engineering a generic and efficient prompt. I hope you got a clear idea of how disruptive ChatGPT and Fabric are together. Specifically in this solution, but certainly also for many more applications regarding your data. You no longer need to set up a new scanning line for each document type or layout, and if new scanning needs arise, they can immediately be realized. Furthermore, you can store all your data (whether it comes from scans or not) in one place in an open and easily accessible format within the scalable Fabric SaaS solution. If you would like to schedule a demo, please contact Tom or Tom (tom.vangils@heroes.nl, tom.steenbakkers@heroes.nl).

--

--