DATA STORIES | ACCOUNTING | KNIME ANALYTICS PLATFORM

Summer Sun and Tax Runs: Navigating Italian Invoices with KNIME

Automate data access and parsing with low-code

Ludovico Ruggeri Laderchi
Low Code for Data Science

The Italian professional and his Dichiarazione dei Redditi (Dall-E).

Ah, summer in Italy! It is that magical time of year when the days stretch longer, a “caffè freddo” on a sun-dappled terrace tastes sweeter, and people stop worrying about the “campionato” (= football league) and start worrying about … fiscal declarations.

Yes, just as the tourists are ready to populate our beaches, we head to the “commercialista” (= accountant) for the “Dichiarazione dei Redditi”, Italy’s annual tax return ritual. If you like numbers and analytics, or simply like to process your own accounts, there is a silver lining amid this clash of sun and sums: digital tools that make the process fun. One such hero is the KNIME Analytics Platform.

Italy has been a front-runner in adopting electronic invoicing, making it mandatory for transactions between residents. This shift not only streamlines processes but also introduces a wave of digital data waiting to be harnessed. For businesses and professionals, this means that the traditional piles of paper invoices have transformed into structured XML electronic documents that can be easily analysed to ensure accurate tax reporting.

Accessing Italian E-Invoices

Before diving into data analysis, one must first obtain these digital documents. In Italy, electronic invoices, or “fatture elettroniche”, are stored within the “Cassetto Fiscale”, a digital tax folder accessible via the Italian Revenue Agency’s website. Invoices are archived there, and mass retrieval is possible, even if not really straightforward, via “Consultazione Massiva” (“mass query”).

“Consultazione Massiva”.

Some absurd system limitations let you download at most three months of data at a time, and you can only submit three mass retrieval queries per day. Processing normally takes one day, so you need to access your Cassetto Fiscale on three consecutive days to get a full year of data: on day one you submit the queries for the first nine months; on day two you retrieve that data and submit the query for the last quarter; on day three you retrieve the last quarter.
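To make the arithmetic concrete, here is a minimal Python sketch that splits a year into three-month windows and groups them into submission days under the two limits described above. The function names and the year 2023 are purely illustrative, and nothing here talks to the Revenue Agency's website; it only plans the dates you would type in by hand.

```python
from datetime import date, timedelta

MAX_QUERIES_PER_DAY = 3   # daily query cap described in the text
MONTHS_PER_QUERY = 3      # maximum period covered by one query


def quarter_windows(year):
    """Return (start, end) date pairs covering the year in 3-month blocks."""
    windows = []
    for start_month in range(1, 13, MONTHS_PER_QUERY):
        start = date(year, start_month, 1)
        next_month = start_month + MONTHS_PER_QUERY
        if next_month > 12:
            end = date(year, 12, 31)
        else:
            # First day of the following block, minus one day.
            end = date(year, next_month, 1) - timedelta(days=1)
        windows.append((start, end))
    return windows


def plan_requests(year):
    """Group the windows into submission days, respecting the daily cap."""
    windows = quarter_windows(year)
    return [windows[i:i + MAX_QUERIES_PER_DAY]
            for i in range(0, len(windows), MAX_QUERIES_PER_DAY)]


for day_number, batch in enumerate(plan_requests(2023), start=1):
    for start, end in batch:
        print(f"submission day {day_number}: request {start} to {end}")
```

Four windows with a cap of three per day give two submission days; since results arrive a day later, the last download falls on day three.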

Alternatively, if you receive invoices via PEC (certified email, another Italian digital specialty), you can get the XML files from there.

Note. Some files come in the digitally signed .p7m format; in a future post I will explain how they can be converted in batch via a PowerShell script.

Once the invoices have been downloaded, the next step is analysing them, and this is where KNIME comes into play. KNIME is a powerful, free and open-source analytics platform that excels in managing, cleaning, and parsing large datasets through its intuitive, graphical interface.

Step-by-Step Guide to Setting Up a KNIME Workflow

Importing Data

Start by importing the XML files (the format used for e-invoices in Italy) into KNIME. This can be done using the XML Reader node, which efficiently parses large volumes of files into a manageable table of XML documents (one invoice per row).
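For readers who think in code, the same idea (one parsed document per row) can be sketched in a few lines of Python. This is not what the KNIME node does internally, only an analogy; the function name, the file names and the tiny made-up XML contents are assumptions for the demo.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def read_invoices(folder):
    """Parse every .xml file in `folder`; return one (file name, root) row per file."""
    rows = []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            rows.append((path.name, ET.parse(path).getroot()))
        except ET.ParseError as err:
            # Report files that are not plain XML instead of stopping the run.
            print(f"skipped {path.name}: {err}")
    return rows


# Demo with two tiny made-up files standing in for downloaded invoices.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "invoice_a.xml").write_text(
        "<Invoice><Number>1</Number></Invoice>", encoding="utf-8")
    Path(tmp, "invoice_b.xml").write_text(
        "<Invoice><Number>2</Number></Invoice>", encoding="utf-8")
    table = read_invoices(tmp)

for name, root in table:
    print(name, root.findtext("Number"))
```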

Data Parsing

XML e-invoices contain various data points, each wrapped in “<>” tags, as in hypertext (HTML) files. Even if you are not familiar with XML, the XPath node helps you extract the information you need, from VAT numbers to payment terms, into separate columns.
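To give a feel for what an XPath query looks like, here is a small Python sketch that pulls a few fields out of a hand-written, heavily simplified invoice. The element names follow my reading of the Italian e-invoice layout and should be checked against your own files; the values are invented, and real files may carry a namespace on the root element, which this sketch ignores. Python's standard library supports only a subset of XPath, which is enough for simple paths like these.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up invoice; real files contain many more elements.
SAMPLE = """
<FatturaElettronica>
  <FatturaElettronicaHeader>
    <CedentePrestatore>
      <DatiAnagrafici>
        <IdFiscaleIVA><IdPaese>IT</IdPaese><IdCodice>01234567890</IdCodice></IdFiscaleIVA>
      </DatiAnagrafici>
    </CedentePrestatore>
  </FatturaElettronicaHeader>
  <FatturaElettronicaBody>
    <DatiGenerali>
      <DatiGeneraliDocumento>
        <Data>2023-05-10</Data>
        <Numero>42</Numero>
        <ImportoTotaleDocumento>122.00</ImportoTotaleDocumento>
      </DatiGeneraliDocumento>
    </DatiGenerali>
  </FatturaElettronicaBody>
</FatturaElettronica>
"""

# One XPath expression per output column (column names are illustrative).
XPATHS = {
    "supplier_vat": ".//CedentePrestatore/DatiAnagrafici/IdFiscaleIVA/IdCodice",
    "date": ".//DatiGeneraliDocumento/Data",
    "number": ".//DatiGeneraliDocumento/Numero",
    "total": ".//DatiGeneraliDocumento/ImportoTotaleDocumento",
}

root = ET.fromstring(SAMPLE.strip())
# findtext returns the text of the first match, or None if nothing matches.
row = {column: root.findtext(path) for column, path in XPATHS.items()}
print(row)
```

In the KNIME XPath node you would enter similar expressions, one per output column, without writing any of the surrounding code.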

Data Transformation

To analyse or aggregate the data (e.g., calculating total invoice amount, VAT due, etc.), use KNIME’s data manipulation nodes like GroupBy or Math Formula. These tools help summarize data and perform calculations across multiple invoices.
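As a rough code analogy, the sketch below does what a GroupBy followed by a Math Formula would do: it sums taxable amount and VAT per supplier, then adds the two into a total. The column names and figures are invented for illustration; Decimal is used instead of float so that cents add up exactly.

```python
from collections import defaultdict
from decimal import Decimal

# Made-up rows, as they might look after the parsing step.
invoices = [
    {"supplier_vat": "01234567890", "taxable": "100.00", "vat": "22.00"},
    {"supplier_vat": "01234567890", "taxable": "50.00", "vat": "11.00"},
    {"supplier_vat": "09876543210", "taxable": "200.00", "vat": "20.00"},
]

# "GroupBy": sum the amounts for each supplier VAT number.
totals = defaultdict(lambda: {"taxable": Decimal("0"), "vat": Decimal("0")})
for inv in invoices:
    group = totals[inv["supplier_vat"]]
    group["taxable"] += Decimal(inv["taxable"])
    group["vat"] += Decimal(inv["vat"])

# "Math Formula": derive a new column from the existing ones.
for group in totals.values():
    group["total"] = group["taxable"] + group["vat"]

for supplier, group in totals.items():
    print(supplier, group["taxable"], group["vat"], group["total"])
```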

Visualization and Reporting

Finally, visualize the results using KNIME’s native visualization nodes, or integrate with external tools like Power BI for more advanced visual analytics. A simple Excel table for further processing is my preferred option… and from there you can start the discussion with your “commercialista” (tax advisor)!

Here is an example:

Conclusion

As “taxing” as the Dichiarazione dei Redditi is, tools like KNIME can transform what was once a laborious process into an efficient and (almost) pleasant automated task. By leveraging KNIME’s robust data processing capabilities, professionals can navigate the sea of e-invoices with the precision of a seasoned sailor, leaving plenty of time … to enjoy the Italian summer!

For further information on Cassetto Fiscale:

For XPath syntax, you can find a great summary here:

Ludovico Ruggeri Laderchi

Independent business consultant specialized in fashion and luxury, angel investor and start-up advisor, data lover