Exploring the use of an LLM knowledge assistant for analysts at the Bank of Finland

Ilari Määttä
SPxFiva Data Science
4 min read · May 23, 2024

In recent years, the capabilities of large language models (LLMs) have captivated the world, sparking a search for practical applications. One potential use case, identified by many organizations, is a tool that improves access to information across internal documents. During the latter half of 2023, such a tool was tested for analyst work in the financial stability and statistics department of the Bank of Finland. This experiment was part of a broader LLM exploration by the Analytics Center of Excellence.

Motivation

The amount of information on a new topic is often too large for an analyst to identify all of its relevant aspects. Starting the analysis with internal systems, such as workspaces, data repositories, emails, and personal notes, is already a very time-intensive endeavor, even before external sources are considered. Typically, the most efficient way to gather pertinent data and expert contacts is to ask colleagues. Alternatively, those same questions could be posed to an LLM knowledge assistant that leverages all available internal materials.

Such an assistant could enable analysts to quickly locate experts, past articles, discussions, related concepts, data sources, and variable descriptions, all accompanied by concise summaries and references. The responses from the assistant might not be as insightful as those from an experienced colleague, but the ease of posing questions and the time saved on straightforward queries are significant advantages. Moreover, the assistant is less likely to overlook relevant aspects of a topic, which gives analysts a wider set of source materials and thereby improves not only efficiency but quality as well.

Materials

The materials used in the experiment consisted solely of public information to avoid concerns about data confidentiality while testing various technical solutions. In a production environment, the assistant would need to respect the confidentiality of documents, and tailor the available material set for each user.

The test involved context material comprising 700 BoF Bulletin articles, 10 reporting guidelines, and Bank of Finland open data. The technical solution in the experiment used ChatGPT as the foundational model, supplemented by Azure AI Search and a knowledge graph to identify relevant context within the materials.
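
To make the setup concrete, a minimal retrieval-augmented generation loop along these lines is sketched below. The endpoints, index name, deployment name, and the "content" field are hypothetical placeholders, not the actual configuration of the experiment.

```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

# Hypothetical endpoints and index name; the experiment's real
# configuration is not public.
search = SearchClient(
    endpoint=os.environ["SEARCH_ENDPOINT"],
    index_name="bulletin-articles",
    credential=AzureKeyCredential(os.environ["SEARCH_KEY"]),
)
llm = AzureOpenAI(
    api_key=os.environ["OPENAI_KEY"],
    api_version="2024-02-01",
    azure_endpoint=os.environ["OPENAI_ENDPOINT"],
)

def answer(question: str, top_k: int = 5) -> str:
    # Retrieve the most relevant passages from the indexed materials.
    hits = search.search(search_text=question, top=top_k)
    # Assumes each indexed document exposes a "content" field.
    context = "\n\n".join(hit["content"] for hit in hits)
    # Ask the model to answer strictly from the retrieved context.
    response = llm.chat.completions.create(
        model="gpt-4",  # placeholder deployment name
        messages=[
            {"role": "system",
             "content": "Answer using only the context below and cite "
                        "your sources.\n\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```

The value of this design is that the model only sees passages the search step returned, so statements in the answer can be traced back to indexed documents.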

Results

The knowledge assistant was evaluated using a series of typical queries an analyst might pose when exploring a new topic. For example:

- Please list experts, texts, data, and concepts related to the topic.
- Please give a short description of the concepts and datasets.
- Please describe a specific variable in detail.

The assistant showed promise in generating useful topic summaries and data descriptions, provided the search method successfully retrieved relevant context. At its best, the assistant was able to describe a concept, provide source links, and identify related data. However, performance lagged in pure search tasks, where the analyst sought a comprehensive listing of all pertinent information. For such tasks, improvements could include automatically adjusting the search tools, or their parameters, based on the input prompt, as sketched below.
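
One possible way to implement that adjustment is to let the model itself classify the query and widen the retrieval accordingly. The sketch below reuses the `llm` client from the earlier example; the two intent labels and the parameter values are illustrative assumptions, not tuned settings.

```python
def route_query(question: str) -> dict:
    """Pick search parameters based on query intent (illustrative sketch)."""
    reply = llm.chat.completions.create(
        model="gpt-4",  # placeholder deployment name
        messages=[{"role": "user", "content":
            "Classify this query as 'lookup' (one specific item) or "
            "'survey' (a comprehensive listing). Answer with one word.\n\n"
            + question}],
    ).choices[0].message.content.strip().lower()
    # Survey-style questions need broad recall; lookups need precision.
    return {"top": 30} if "survey" in reply else {"top": 5}

# The chosen parameters then feed the retrieval step, for example:
# hits = search.search(search_text=question, **route_query(question))
```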

The knowledge graph

Another challenge involves linking different materials effectively. Ideally, the assistant would connect data cited in an article to the correct variable in an open dataset and describe it using the reporting guidelines. This capability would be valuable for the analyst in addition to the default behavior, which searches each segment of the context materials separately for the best-fitting results.

Image 1. A knowledge graph constructed from a small subset of the materials. Nodes represent items and concepts, and links represent relations.

The knowledge graph approach showed potential in creating relevant links within the materials. It constructs a network graph that connects concepts and other items of interest across the materials. Creating such a knowledge graph can be a complex and tedious process, but it can also benefit from the new capabilities of LLMs.

In this case, ChatGPT was used to extract relevant concepts and their relations from the underlying articles and guidelines. The extracted concepts were then merged based on semantic proximity to create a knowledge graph that represents the connectedness of topics across different articles. At answer time, the graph facilitates retrieval of connected content for the LLM. This approach effectively builds a metadata structure on top of the underlying data, one that is unique to every organization and might be cumbersome to achieve otherwise.
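
A simplified sketch of this pipeline is shown below. It again reuses the `llm` client from the first example; the triple format, the `documents` mapping, and the example concept are assumptions, and the semantic merging of near-duplicate concepts (for instance via embeddings) is reduced here to plain lowercasing for brevity.

```python
import networkx as nx

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Ask the LLM for (concept, relation, concept) triples; format is assumed."""
    reply = llm.chat.completions.create(
        model="gpt-4",  # placeholder deployment name
        messages=[{"role": "user", "content":
            "List the key concepts and relations in this text, one per line, "
            "in the form 'concept | relation | concept':\n\n" + text}],
    ).choices[0].message.content
    triples = []
    for line in reply.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triples.append((parts[0], parts[1], parts[2]))
    return triples

# `documents` is a hypothetical mapping from article id to article text.
graph = nx.Graph()
for doc_id, text in documents.items():
    for subj, rel, obj in extract_triples(text):
        # Lowercasing stands in for proper semantic merging of concepts.
        graph.add_edge(subj.lower(), obj.lower(), relation=rel, source=doc_id)

# At query time, the neighbors of a matched concept point to related
# content across articles ("inflation" is just an example node).
if "inflation" in graph:
    related = list(graph.neighbors("inflation"))
```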

Conclusions

Testing the knowledge assistant use case in analyst work proved insightful, offering valuable perspectives on which technical solutions might best leverage internal materials. Moreover, it demonstrated the viability of developing internal solutions on top of foundational large language models, potentially meeting the unique needs of an organization.

Understanding the knowledge generation process is important not only for developers but also for analysts, as choices in design and training influence the output. In any case, the results may include hallucinations and bias, so analysts should treat them as preliminary ideas for further exploration rather than definitive conclusions.
