ISO 20022 — A ready-to-use Knowledge Graph?

Pierre Oberholzer
12 min read · May 3, 2024


Abstract

Banks must approach the use of Generative AI cautiously, particularly in critical areas like Payments Operations and Financial Crime Compliance, due to its tendency to generate unreliable “hallucinated” content. The risk associated with inaccuracies in business-critical applications is a significant concern.

Meanwhile, the emerging standard for financial transactions, ISO 20022, is often seen merely as a migration challenge in cost-conscious environments, overshadowing its potential to bridge the gap between data handlers and data users. Yet this standard exhibits traits that are highly compatible with advanced AI technologies, promising to democratize and enhance data accessibility.

This article suggests converting structured ISO 20022 XML data into Knowledge Graphs (KGs) to enhance Retrieval Augmented Generation (RAG) systems. This approach moves beyond plain text by grounding information retrieval in a structured, enterprise-maintained data source, enabling more informed and reliable outputs.

When sufficient processing time is invested, a high-quality graph can be obtained using a GPT-4 level model; the result delivers impressive capabilities for human-to-data queries.

Bridging the gap between operations and analytics could lead to substantial improvements in efficiency and deeper insights.

Gen AI — Hype and reality

Artificial Intelligence (AI), particularly its latest form, Generative AI (Gen AI) relying on Large Language Models (LLMs), has been a major topic of interest since the release of ChatGPT at the end of 2022. It prompted widespread prototyping across the enterprise world throughout 2023, an interest driven in particular by the notable emergent capabilities these models demonstrate [1].

However, its promotion to production-level applications is hindered by the unreliability of its “hallucinated” content. This unreliability is especially problematic for high-risk, business-critical applications in sectors like banking, where even minor inaccuracies cannot be tolerated due to the high stakes involved.

“All hype aside, there are still only four use cases for LLMs: brainstorming, summarization, coding assistance, and semantic search.” [2].

In this article, we explore semantic search, focusing on how it enables information to be extracted and accessed.

RAG: A preliminary approach to grounding LLMs

The techniques aimed at “grounding” LLMs are gaining attention, particularly with advancements in Retrieval Augmented Generation (RAG) systems. These promising systems enable models to incorporate predefined data inputs into their searches and into the prompts passed to the LLMs, thereby enhancing the relevance of information retrieval. Additionally, the link to the original text document used can be included as a reference in the LLM output provided to the user for fact-checking.
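
To make these mechanics concrete, below is a minimal sketch of a plain RAG loop in Python. It assumes an OpenAI API key and uses illustrative document snippets; the corpus, the `top_k` parameter, and the prompt wording are ours, not taken from any specific framework.

```python
import numpy as np
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Illustrative corpus: in a real system, these would be chunks of enterprise documents.
docs = [
    "Payment pcs008bzmsgidr-1 settles 591636 RON on 2022-10-20.",
    "Jack London is the debtor, residing at 405 E 45th St, New York.",
    "Ascent Bank in Bucharest is the creditor of the transaction.",
]

embedder = OpenAIEmbeddings()
doc_vectors = np.array(embedder.embed_documents(docs))

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Return the top-k documents by cosine similarity to the query embedding."""
    q = np.array(embedder.embed_query(query))
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(scores)[::-1][:top_k]]

question = "Who is the debtor of the payment?"
context = "\n".join(retrieve(question))
llm = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)
print(llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}").content)
```

The hard `top_k` cutoff in this sketch is precisely where the limitations below originate.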

However, RAG systems encounter significant challenges related to data sampling:

- Embedding issues: Embeddings transform text into compact numeric vectors used by RAGs, but they might dilute the specific signal needed or fail to capture it entirely. Furthermore, the model used for the embeddings may not align with the business context of the application, potentially providing “out-of-context” embeddings.

- Selection limitations: Only a small “top-k” selection of promising data samples, identified by comparing embeddings of a user’s request with existing data, can be forwarded to the LLM due to token window size limitations. This restriction can cause the omission of crucial information because of arbitrary cutoffs. Long-context-window LLMs [3] might be able to overcome this issue in the near future.

- Beyond text: Data like that in the ISO 20022 XML format, which is highly structured, has already segregated data elements according to specific business realities, encapsulated in business components and tags. This structured data is often overlooked in RAG systems, despite its potential to enhance information accuracy and relevance.

These limitations highlight the need for further refinement in RAG methodologies to ensure that crucial data is not omitted and that the retrieved information accurately represents the intended business context.

Graph-RAG: Enhancing retrieval with Knowledge Graphs

A Knowledge Graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data [4].

A minimalistic graph

Knowledge Graphs are data structures that consist of nodes, which describe entities (e.g. persons, companies, locations, dates, abstract concepts), and edges, which capture the semantic relationships between these entities (e.g. relation, causality, inheritance). In the example provided, the graph includes two nodes (Jack London and New York) connected by one relationship (RESIDENCE).
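
For illustration, this minimal graph can be created in Neo4j with a single Cypher statement. The snippet below is a sketch using the official neo4j Python driver; the connection details are placeholders.

```python
from neo4j import GraphDatabase

# Placeholder credentials for a local Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Two nodes (Jack London, New York) joined by one RESIDENCE relationship.
    session.run(
        "MERGE (p:Person {id: $person}) "
        "MERGE (t:Location {id: $town}) "
        "MERGE (p)-[:RESIDENCE]->(t)",
        person="Jack London",
        town="New York",
    )
driver.close()
```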

Although the concept of Knowledge Graphs is not new, their practical application has been limited by the considerable human effort required to define data organization rules, collect and analyze data, and ultimately populate the graphs (as exemplified by the semantic web). LLMs have now proven to be exceptionally effective at extracting content from data, positioning them as ideal partners for constructing Knowledge Graphs. Additionally, LLMs can function as high-quality tools for querying data, particularly when it is highly structured. Like all systems, LLMs benefit from a structured approach to data, potentially overcoming some of the challenges mentioned earlier for standard RAG systems.

As a result, the community is increasingly focusing on a new approach, which involves leveraging Knowledge Graphs as a source of choice for RAG systems, referred to as Graph RAG systems.

An interesting combination: LLMs and Knowledge Graphs [5,15]

The combination of the two technologies appears to be promising.

ISO 20022: A natural fit for Knowledge Graphs

ISO 20022 is often, and rightly, described as a rich and structured data standard. Indeed, it encapsulates relevant information from transaction banking in a condensed dataset, detailing who interacts with whom, when, and why, making it a perfect enabler for traditional Artificial Intelligence (AI), better known as Machine Learning (ML), and for Business Intelligence (BI) [9].

Less commonly noted, the standard is somewhat self-explanatory as it includes English-like tags that accurately describe what the data elements truly mean [6]. This feature makes ISO 20022 not only intuitive and clear but also particularly well-suited to Large Language Models (LLMs) when performing semantic extraction. In short, it contains more metadata than actual data.

ISO 20022: An emblematic data standard

Yes, the adoption of ISO 20022 might be slower than expected; yes, it is currently confined to payments; and yes, banks use multiple other data sources. However, it is emblematic of what can propel the industry forward. It fosters collaboration across the entire sector, standardizing how data elements are described and communicated, aiming for clarity and self-explanation wherever possible. Moreover, ISO 20022 not only sets the benchmark for data quality but also exemplifies a direct connection between data producers and consumers [7].

This standard could become a model for other data types in the financial industry and help reshape how data is managed, taking advantage of current technologies and ways of working together.

“In 2024, the separation between data and AI is dissolving. It’s no longer about having a data strategy and a separate AI strategy.” [8]

Full data flow — From ISO 20022 to Users

Full data flow — From raw ISO 20022 to Knowledge Graph to User

In the following sections, we outline a two-step approach to implementing a Graph-RAG system using ISO 20022. This process primarily utilizes open-source code and methods previously detailed by Tomaz Bratanic [10], with some customizations specifically tailored to accommodate the unique aspects of ISO 20022.

Part I — Use LLM to build Knowledge Graph

Data flow part I — From raw ISO 20022 to Knowledge Graph

We utilize the LLMGraphTransformer class, recently introduced in detail [10], from the open-source langchain package [11]. Essentially, this class employs a customized prompt via the LLM chat interface to extract entities and relationships from a pre-processed XML document, possibly adhering to certain schema constraints. The resulting output from the LLM is then stored as graph objects in the Neo4j graph database [12] to maintain the obtained graph structure.
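
In outline, the construction step looks as follows. This is a condensed sketch based on [10]; it assumes an OpenAI API key, a local Neo4j instance, and a file pacs008_sample.xml holding the test message shown further below (the file name is ours).

```python
from langchain_core.documents import Document
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

# LLM used to extract entities and relationships from the pre-processed XML.
llm = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

with open("pacs008_sample.xml") as f:  # hypothetical local copy of the message
    message = f.read()

# One Document per ISO 20022 message; the transformer returns graph objects.
graph_documents = transformer.convert_to_graph_documents([Document(page_content=message)])

# Persist nodes and relationships in Neo4j, keeping a link back to the
# source document for later (unstructured) retrieval.
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")
graph.add_graph_documents(graph_documents, baseEntityLabel=True, include_source=True)
```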

The input message passed to the LLM is based on a sample from the CBPR+ test set provided by Swift [13]. It has been adapted for the purposes of this article, with modifications to BICs and entity details.

The input message used — a pacs.008 ISO 20022 test message [13]:

<Envelope xmlns="urn:swift:xsd:envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:swift:xsd:envelope ../../March21Schemas/Translator_envelope.xsd">
<head:AppHdr xmlns:head="urn:iso:std:iso:20022:tech:xsd:head.001.001.02">
<head:Fr>
<head:FIId>
<head:FinInstnId>
<head:BICFI>AAAAGBXXXXX</head:BICFI>
</head:FinInstnId>
</head:FIId>
</head:Fr>
<head:To>
<head:FIId>
<head:FinInstnId>
<head:BICFI>BBBBROXXXXX</head:BICFI>
</head:FinInstnId>
</head:FIId>
</head:To>
<head:BizMsgIdr>pcs008bzmsgidr-1</head:BizMsgIdr>
<head:MsgDefIdr>pacs.008.001.08</head:MsgDefIdr>
<head:BizSvc>swift.cbprplus.02</head:BizSvc>
<head:CreDt>2022-10-20T08:25:00+00:00</head:CreDt>
</head:AppHdr>
<pacs:Document xmlns:pacs="urn:iso:std:iso:20022:tech:xsd:pacs.008.001.08">
<pacs:FIToFICstmrCdtTrf>
<pacs:GrpHdr>
<pacs:MsgId>pcs008bzmsgidr-1</pacs:MsgId>
<pacs:CreDtTm>2022-10-20T08:25:00+00:00</pacs:CreDtTm>
<pacs:NbOfTxs>1</pacs:NbOfTxs>
<pacs:SttlmInf>
<pacs:SttlmMtd>INDA</pacs:SttlmMtd>
</pacs:SttlmInf>
</pacs:GrpHdr>
<pacs:CdtTrfTxInf>
<pacs:PmtId>
<pacs:InstrId>pcs008bzmsgidr-1</pacs:InstrId>
<pacs:EndToEndId>pacs008EndToEndId-001</pacs:EndToEndId>
<pacs:UETR>7a562c67-ca16-48ba-b074-65581be6f001</pacs:UETR>
</pacs:PmtId>
<pacs:IntrBkSttlmAmt Ccy="RON">591636</pacs:IntrBkSttlmAmt>
<pacs:IntrBkSttlmDt>2022-10-20</pacs:IntrBkSttlmDt>
<pacs:ChrgBr>DEBT</pacs:ChrgBr>
<pacs:InstgAgt>
<pacs:FinInstnId>
<pacs:BICFI>AAAAGBXXXXX</pacs:BICFI>
</pacs:FinInstnId>
</pacs:InstgAgt>
<pacs:InstdAgt>
<pacs:FinInstnId>
<pacs:BICFI>BBBBROXXXXX</pacs:BICFI>
</pacs:FinInstnId>
</pacs:InstdAgt>
<pacs:IntrmyAgt1>
<pacs:FinInstnId>
<pacs:BICFI>CCCCROXXXXX</pacs:BICFI>
</pacs:FinInstnId>
</pacs:IntrmyAgt1>
<pacs:Dbtr>
<pacs:Nm>Jack London</pacs:Nm>
<pacs:PstlAdr>
<pacs:StrtNm>E 45th St</pacs:StrtNm>
<pacs:BldgNb>405</pacs:BldgNb>
<pacs:PstCd>10017</pacs:PstCd>
<pacs:TwnNm>New York</pacs:TwnNm>
<pacs:CtrySubDvsn>NY</pacs:CtrySubDvsn>
<pacs:Ctry>US</pacs:Ctry>
</pacs:PstlAdr>
</pacs:Dbtr>
<pacs:DbtrAcct>
<pacs:Id>
<pacs:Othr>
<pacs:Id>25698745</pacs:Id>
</pacs:Othr>
</pacs:Id>
</pacs:DbtrAcct>
<pacs:DbtrAgt>
<pacs:FinInstnId>
<pacs:BICFI>AAAAGBXXXXX</pacs:BICFI>
</pacs:FinInstnId>
</pacs:DbtrAgt>
<pacs:CdtrAgt>
<pacs:FinInstnId>
<pacs:BICFI>CCCCBEXXXXX</pacs:BICFI>
</pacs:FinInstnId>
</pacs:CdtrAgt>
<pacs:Cdtr>
<pacs:Nm>Ascent Bank</pacs:Nm>
<pacs:PstlAdr>
<pacs:StrtNm>Main Street</pacs:StrtNm>
<pacs:TwnNm>Bucharest</pacs:TwnNm>
<pacs:Ctry>RO</pacs:Ctry>
</pacs:PstlAdr>
</pacs:Cdtr>
<pacs:CdtrAcct>
<pacs:Id>
<pacs:Othr>
<pacs:Id>65479512</pacs:Id>
</pacs:Othr>
</pacs:Id>
</pacs:CdtrAcct>
</pacs:CdtTrfTxInf>
</pacs:FIToFICstmrCdtTrf>
</pacs:Document>
</Envelope>

The figure below illustrates the Knowledge Graph generated from the aforementioned message. The nodes are organized according to the parties involved — Creditor, Debtor, Agents — and other attributes such as Transaction Details that are present in the message.

Output — Knowledge Graph obtained for a single pacs.008 ISO 20022 test message [13]

We compare four state-of-the-art Large Language Models (LLMs) working out-of-the-box with `LLMGraphTransformer`, all on the same message sample to minimize complexity and facilitate interpretation. As of the writing of this article, the Gemini 1.5 API is not yet available in Europe.

Comparison of LLM-assisted creation of Knowledge Graphs

This table is organized as follows:

  • LLM model: Specifies the model used in the test.
  • Execution time: The time required for the LLM to generate a Knowledge Graph (KG) from a single message.
  • Coverage: The ratio of attributes identified in the KG to the total number of unique attributes in the raw XML (see the sketch after this list).
  • Graph: The resulting Knowledge Graph, with nodes color-coded by entity properties.
  • Node labels: Labels assigned to the nodes by the LLM, derived from the raw XML.
  • Relationship types: Semantic relationships inferred by the LLM, based on the raw XML.
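
The coverage figure can be approximated by comparing the leaf values of the raw XML with the values that end up in the graph. Below is a rough sketch of this computation; the matching logic is simplified compared to our actual evaluation.

```python
from xml.etree import ElementTree

def xml_leaf_values(xml_text: str) -> set[str]:
    """Collect the unique attribute values (element texts) of the raw message.
    Note: XML attributes such as Ccy are not captured here; currency indeed
    needs to be parsed separately."""
    root = ElementTree.fromstring(xml_text)
    return {el.text.strip() for el in root.iter() if el.text and el.text.strip()}

def coverage(xml_text: str, graph_values: set[str]) -> float:
    """Share of unique XML leaf values that reappear in the Knowledge Graph."""
    expected = xml_leaf_values(xml_text)
    return len(expected & graph_values) / len(expected)

# graph_values can be collected from Neo4j, e.g.:
#   graph_values = {r["n.id"] for r in graph.query("MATCH (n) RETURN n.id")}
```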

Quality-wise, only the gpt-4-0125-preview model manages to extract nearly all entities (with the exception of currency, which should be parsed separately) while also providing meaningful node labels and relationship types. However, this high quality comes at the expense of slower execution time.

The other models, while demonstrating significantly faster execution times, exhibit notable quality issues:

  • The llama3-70b-8192 model fails to detect more than half of the entities but still manages to accurately infer address-related information. However, it incorrectly assigns the account number as a phone number.
  • The gpt-3.5-turbo-0125 model functions as a “pure parser,” extracting all content but failing to infer any entity types or relationships, rendering it ineffective for semantic search.
  • Finally, the mistral-large-2402 model misses most of the content and shows very limited entity recognition capabilities, significantly undermining its utility.

We continue our analysis using only the gpt-4-0125-preview model, applying it now to the entire CBPR+ dataset [13], which consists of 650 messages across various message types [14]. The image below is just an extract, displaying the first 300 nodes out of a total of 1738 nodes.

Knowledge Graph (first 300 nodes) obtained for the 650 ISO 20022 messages of the CBPR+ test dataset [13]
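
Scaling the construction step to these 650 messages is, in outline, a loop over the sample files, reusing the transformer and graph objects from the earlier sketch; the cbpr_samples/ folder name is ours. At roughly one minute per message with gpt-4-0125-preview, the full run takes several hours.

```python
import glob
import time

from langchain_core.documents import Document

start = time.time()
for path in sorted(glob.glob("cbpr_samples/*.xml")):  # hypothetical local folder
    with open(path) as f:
        doc = Document(page_content=f.read(), metadata={"source": path})
    graph_documents = transformer.convert_to_graph_documents([doc])
    graph.add_graph_documents(graph_documents, baseEntityLabel=True, include_source=True)

print(f"Done in {time.time() - start:.0f}s")
```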

This graph will be used to test the information extraction capabilities in the next chapter of our analysis.

Part II — Use LLM to query Knowledge Graph

Data flow part II — From Knowledge Graph to User

Now that we have constructed the ISO 20022 Knowledge Graph, we can utilize it to retrieve information. For all necessary details and code, we refer the reader to the work of Tomaz Bratanic [10]. Here, we note that two retrieval methods are combined:

  • Structured data retrieval: This method uses a full-text search engine to explore the neighborhood of the entities extracted from the user query, passing the most relevant results to the LLM as context in the prompt.
  • Unstructured data retrieval: This method employs a hybrid approach that combines embeddings based on ISO 20022 text content with keyword-based search on the same dataset.

Note that, compared to the original version, we make the following modifications (both reflected in the sketch below):

  • Retrieve entity labels as well in the structured retriever.
  • Use one LLM call to filter this result and remove irrelevant entity labels.
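
A simplified sketch of the structured retriever, adapted from [10] and including our two modifications, is shown below. It assumes a full-text index named entity over the node id property (created e.g. with CREATE FULLTEXT INDEX entity IF NOT EXISTS FOR (n:__Entity__) ON EACH [n.id]) and reuses the llm and graph objects from Part I.

```python
def structured_retriever(entity: str) -> str:
    """Explore the graph neighborhood of an entity, keeping entity labels."""
    rows = graph.query(
        """
        CALL db.index.fulltext.queryNodes('entity', $query, {limit: 5})
        YIELD node
        CALL {
            WITH node
            MATCH (node)-[r:!MENTIONS]->(nb)
            RETURN node.id + ' - ' + type(r) + ' -> ' + nb.id AS output,
                   labels(nb) AS nb_labels
            UNION ALL
            WITH node
            MATCH (node)<-[r:!MENTIONS]-(nb)
            RETURN nb.id + ' - ' + type(r) + ' -> ' + node.id AS output,
                   labels(nb) AS nb_labels
        }
        RETURN output, nb_labels LIMIT 50
        """,
        {"query": entity},
    )
    context = "\n".join(f"{r['output']} {r['nb_labels']}" for r in rows)
    # Our addition: one LLM call to discard irrelevant entity labels.
    prompt = f"Keep only the lines relevant to '{entity}':\n{context}"
    return llm.invoke(prompt).content
```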

We conducted queries on the entire dataset previously discussed, which consists of the 650 test messages from the CBPR+ dataset [13], using only the gpt-4-0125-preview model for the reasons previously outlined. It is crucial to test our query engine on a sufficiently large dataset to incorporate the typical “noise” that analysts would encounter in a real-life context.

In terms of schema, the test effectively simulates a virtual global bank that implements all 109 CBPR+ use cases in its operations. This ensures coverage of the “maximal complexity” in terms of metadata (use cases), but not in terms of data itself (transaction volumes are in the range of millions per day globally).

The table below lists four queries that we have experimented with. This represents just a small sample of the potential queries that can be tested. We encourage you to engage with us if you are interested in contributing to further testing and improvement.

A few queries asked to the Knowledge Graph

In our observations of intra-message queries (where all the information is present in one message), we noted that even though the messages mentioning Jack London were mixed among many others, the model delivered high-quality responses. However, it struggled with inter-message queries (where the information spans multiple messages), such as those asking to identify a common account held by unknown entities, because it failed to find the relevant context “around” Jack London as an entity across messages. By reframing the query to center on the specific account number, 25698745, we were able to overcome this issue and obtain a correct result.
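
Illustratively, with the question-answering chain from [10] wired to the retriever above (the chain variable is ours), the reframing looks like this:

```python
# Intra-message query: all facts sit in a single pacs.008 message.
chain.invoke({"question": "What is the address of Jack London?"})

# Inter-message query, reframed around the shared account number.
chain.invoke({"question": "Which persons are linked to account 25698745?"})
```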

We query the database separately to show the evidence in the next graph.

MATCH (n)-[r:!MENTIONS]-(c:Person) WHERE n.id = '25698745' RETURN n, r, c
Discovering an account owned by various parties

While many of the identified relationship types are redundant, we can confirm the list of four Person entities discovered in the earlier query.

This success demonstrates our ability to uncover unknown account owners solely by conducting human-language queries on the ISO 20022 knowledge graph, without supplying any specific domain knowledge to the model.

Conclusion

The integration of Knowledge Graphs with Retrieval Augmented Generation (RAG) into a so-called Graph RAG approach has been introduced and motivated by the limitations of full-text methods. This is particularly relevant when dealing with highly structured and semantically rich data, such as ISO 20022. The creation of a high-quality Knowledge Graph can be fully automated using a GPT-4 level model, though it requires a significant amount of time (approximately 60 seconds per message). In that case, the responses provided to users posing questions in plain English are impressively accurate. However, until LLMs further improve in performance (both quality and execution time) and cost-efficiency, applying such approaches to entire datasets remains challenging due to time and resource constraints. A more viable option may be a hybrid approach, combining Graph RAG with faster query engines capable of scanning numerous transactions, potentially focusing on structured data for more efficient processing.

Next step

Reach out directly to info@alpina-analytics.com to discuss your use case or visit Alpina Analytics for more information. We’ve confronted these challenges for years in real-life, high-risk application contexts, and we understand that banks need long-lasting proven expertise to mitigate risks, rather than quick wins that never make it to production.

About Us

Pierre Oberholzer is the founder of Alpina Analytics, a Switzerland-based data team dedicated to building the tools necessary to make inter-banking data, including Swift MT and ISO 20022, ready for advanced analytics.

George Serbanut is a Senior Technical Consultant at Alpina Analytics supporting Swiftagent — a conversational interface to navigate and query Swift MT and ISO 20022 data.

References

[1] https://openreview.net/pdf?id=yzkSU5zdwD
[2] https://www.linkedin.com/posts/valliappalakshmanan_all-hype-aside-there-are-still-only-four-activity-7127524292254199809-m6WF?utm_source=share&utm_medium=member_desktop
[3] https://blog.google/technology/ai/long-context-window-ai-models/
[4] https://en.wikipedia.org/wiki/Knowledge_graph
[5] Shirui Pan et al. Unifying Large Language Models and Knowledge Graphs: A Roadmap, 2024, https://arxiv.org/pdf/2306.08302
[6] https://medium.com/@pierre.oberholzer/iso-20022-navigating-rich-semantics-7571ea76f8a2
[7] https://medium.com/@pierre.oberholzer/iso-20022-front-to-back-to-front-5291c585214d
[8] https://data-ai-trends.withgoogle.com/
[9] https://medium.com/@pierre.oberholzer/iso-20022-an-enabler-for-ai-and-bi-0c2a54042c6c
[10] https://bratanic-tomaz.medium.com/constructing-knowledge-graphs-from-text-using-openai-functions-096a6d010c17
[11] https://github.com/langchain-ai/langchain
[12] https://neo4j.com/docs/getting-started/get-started-with-neo4j/graph-database/
[13] https://www2.swift.com/mystandards/#/c/cbpr/samples
[14] pain.001, pain.002, pain.008; pacs.002–004, pacs.008, pacs.009, pacs.010; camt.029, camt.052–060, camt.107–110
[15] John Tan Chong Min, ​​Large Language Models and Knowledge Graphs: Merging Flexibility and Structure, available on YouTube: https://www.youtube.com/watch?v=1RZ5yIyz31c&t=2574s

