Small yet Mighty: Small Language Models, Big Solutions

Usen Osasu
5 min read · Feb 18, 2024


[Image: a small robot with books]

Introduction

Have you ever wondered how something small could wield such immense power? In the world of AI, the answer lies in the ingenious application of small language models. These compact yet formidable tools are revolutionising the way text is analysed, interpreted, and leveraged for decision-making. Welcome to the realm where size doesn’t dictate impact — where small models yield big solutions.

In this blog post, we take a journey into the world of small language models and their implications for finance. From unravelling complex corporate financial statements to extracting key insights efficiently, these diminutive models are reshaping the landscape of financial analysis. But how exactly do they accomplish such feats, and what advantages do they offer over their larger counterparts?

Join us as we explore the transformative potential of small language models in the realm of finance. Discover how these seemingly modest tools pack a punch, delivering actionable solutions that transcend their size. By the end of this post, you’ll gain a deeper understanding of how small models are unlocking new possibilities and driving innovations.

Small Language Models (SLMs)

Large language models have garnered significant attention in recent years for their remarkable ability to understand and generate human-like text. These models, such as GPT-4, are characterised by their vast size, comprising hundreds of billions of parameters. This extensive parameter count enables them to capture intricate patterns in language and produce nuanced outputs across a wide range of tasks.

However, the sheer size of these models comes with associated challenges, including high computational costs, substantial memory requirements, and environmental concerns due to their energy consumption. Additionally, deploying large models in resource-constrained environments can be impractical or inefficient.

Enter small language models, a compact alternative that addresses these challenges. Unlike their larger counterparts, small language models are designed to be lightweight, with fewer parameters and reduced computational demands. Despite their reduced size, they retain much of the functionality and versatility of large models, making them ideal for applications where efficiency and scalability are paramount.

Small language models typically have fewer parameters, ranging from tens of millions to a few billion, compared to the hundreds of billions found in large models. This reduction in size enables small language models to be deployed more efficiently on a variety of platforms, including mobile devices and edge computing environments.

In essence, small language models represent a more streamlined approach to natural language processing, offering a balance between performance and resource efficiency. By understanding the fundamental differences between large and small language models, we can better appreciate the unique capabilities and potential applications of these compact yet powerful tools in various domains, including finance.

How SLMs Work

Small language models work by leveraging innovative strategies to reduce model size and computational demands while maintaining effectiveness. One key approach involves optimising the architecture of the model to achieve a balance between efficiency and performance. This may entail reducing the number of parameters, optimising memory consumption, and implementing efficient training algorithms.
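To make the scale difference concrete, here is a rough back-of-the-envelope parameter count for a decoder-only transformer. This is a sketch: the formula ignores biases, layer norms, and positional embeddings, so treat the numbers as approximations.

```python
def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough transformer parameter count (ignores biases and layer norms).

    Per block: ~4*d^2 for the attention projections (Q, K, V, output)
    plus ~8*d^2 for a feed-forward layer with hidden size 4*d.
    """
    per_block = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_block + embeddings

# A GPT-2-small-like configuration vs. a much larger one.
small = approx_params(n_layers=12, d_model=768, vocab_size=50257)
large = approx_params(n_layers=96, d_model=12288, vocab_size=50257)
print(f"small: {small / 1e6:.0f}M parameters")  # → small: 124M parameters
print(f"large: {large / 1e9:.0f}B parameters")  # → large: 175B parameters
```

The quadratic dependence on the hidden size is why shrinking the model width and depth together yields such dramatic reductions in parameter count.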

Furthermore, small language models often utilise techniques such as knowledge distillation, where a larger pre-trained model serves as a teacher to train a smaller, more lightweight model. By transferring knowledge from the larger model to the smaller one, researchers can achieve comparable performance with fewer parameters, making the model more accessible and practical for a wide range of applications.
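The distillation objective can be sketched in a few lines. A common choice is the KL divergence between the teacher's and student's temperature-softened output distributions; the snippet below (plain NumPy, not a full training loop) illustrates the idea:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    A higher temperature smooths the teacher's distribution so the student
    also learns from the relative probabilities of the 'wrong' classes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * np.log(p_teacher / p_student)))

teacher = np.array([4.0, 1.0, 0.5])   # confident teacher logits
aligned = np.array([3.8, 1.1, 0.4])   # student that mimics the teacher
off     = np.array([0.2, 3.0, 1.0])   # student that disagrees

print(distillation_loss(aligned, teacher) < distillation_loss(off, teacher))  # → True
```

In practice this soft-label loss is usually blended with the ordinary cross-entropy on the ground-truth labels, but the core mechanism is the one shown here.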

Additionally, advancements in fine-tuning strategies play a crucial role in the effectiveness of small language models. Techniques such as adapter layers, low-rank adaptation (LoRA and its quantised variant, QLoRA), and prefix tuning enable researchers to tailor a model to specific tasks and domains without extensive retraining. This flexibility allows small language models to adapt quickly to new challenges and environments, making them highly versatile tools for natural language processing tasks.
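To see why LoRA is so parameter-efficient, the sketch below (plain NumPy, with illustrative sizes) adds a trainable low-rank update B·A to a frozen weight matrix W; during fine-tuning only A and B would be updated:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024       # hidden size of a single weight matrix
r = 8          # low-rank bottleneck dimension
alpha = 16     # LoRA scaling factor

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialised

# Effective weight during fine-tuning; at initialisation B @ A = 0,
# so the model starts out behaving exactly like the pretrained one.
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.2%}")  # → trainable fraction: 1.56%
```

Even for this single matrix, the trainable update is under 2% of the full weight's size; across a whole model the savings compound, which is what makes fine-tuning small models feasible on modest hardware.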

Overall, small language models represent a significant advancement in the field of natural language processing, offering efficient and accessible solutions to the challenges posed by large language models. By leveraging innovative techniques and optimisation strategies, researchers are paving the way for a new era of intelligent language processing technologies that are both powerful and practical.

SLMs in Action: Analysing Corporate Financial Statements

The Nigerian Stock Exchange (NSE) serves as the principal exchange for trading securities in Nigeria. It provides a platform for buying and selling stocks, bonds, and other financial instruments, facilitating capital formation and investment opportunities within the Nigerian economy.

In this section, we’ll delve into the practical application of small language models (SLMs) in the analysis of corporate financial statements, using the example of Abbey Mortgage Bank PLC.

Data Preprocessing

Before feeding the financial statements into the SLM, we need to preprocess the data. This involves extracting all the text and tables from each page of the PDF document containing the financial statements. While the specifics of this preprocessing step are not covered in this blog, it typically involves using tools such as PDF parsing libraries to extract the relevant information in a structured format.

process_pdf("path/to/pdf")
[Figure: the statement of profit and loss as it appears in the PDF, alongside the extracted result]
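The implementation of process_pdf is not shown in the post, but a minimal sketch, assuming the open-source pypdf library for text extraction (tables would need a dedicated parser on top), might look like this:

```python
def clean_page_text(raw: str) -> str:
    """Normalise whitespace and drop empty lines from extracted page text."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)

def process_pdf(path: str) -> list[str]:
    """Return cleaned text for each page of a PDF.

    Plain text extraction loses table structure, so a real pipeline
    would typically layer a table-extraction step on top of this.
    """
    from pypdf import PdfReader  # lazy import: only needed at call time
    reader = PdfReader(path)
    return [clean_page_text(page.extract_text() or "") for page in reader.pages]

print(clean_page_text("  Revenue   \n\n  1,234  \n"))  # → Revenue\n1,234
```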

Generating Analysis

Once we have the extracted text and tables in a suitable format, we pass this data through our SLM inference implementation. This implementation utilises Microsoft’s Phi 2 model for the analysis. Phi 2 is a state-of-the-art small language model developed by Microsoft Research’s Machine Learning Foundations team. With 2.7 billion parameters, Phi 2 demonstrates outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with less than 13 billion parameters.
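As an illustration (not the exact implementation used here), an inference helper built on the Hugging Face transformers library might look like the following. The prompt wording is an assumption, though the Instruct/Output format follows the Phi-2 model card:

```python
def build_prompt(statement_text: str) -> str:
    """Wrap extracted financial-statement text in an analysis instruction.

    The instruction wording below is illustrative, not the one used in the post.
    """
    return (
        "Instruct: You are a financial analyst. Summarise the key figures and "
        "trends in the following statement of profit and loss.\n\n"
        f"{statement_text}\n\nOutput:"
    )

def analyse_with_phi2(statement_text: str, max_new_tokens: int = 256) -> str:
    """Run the prompt through microsoft/phi-2 via Hugging Face transformers."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import
    tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
    model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
    inputs = tokenizer(build_prompt(statement_text), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(build_prompt("Revenue: 1,234")[:9])  # → Instruct:
```

Because Phi 2 has only 2.7 billion parameters, a call like this can run on a single consumer GPU, which is precisely the deployment advantage the post highlights.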

Model Output

[Figure: the analysis generated by the model]

Conclusion

In conclusion, small language models (SLMs) represent a transformative force in the realm of finance, offering efficient and accessible solutions to the challenges posed by large language models. By leveraging innovative techniques and optimisation strategies, SLMs like Microsoft’s Phi 2 are revolutionising the analysis of corporate financial statements, enabling stakeholders to extract key insights with unparalleled efficiency and accuracy. From unraveling complex data to driving informed decision-making, these compact yet powerful tools are reshaping the landscape, paving the way for a new era of intelligent language processing technologies.

Stay updated on the latest developments in AI and technology by following me on LinkedIn! Join the conversation and explore the transformative potential of small language models in shaping the future of AI and beyond.


Usen Osasu

Senior Data Scientist | Generative AI | Deep learning | Bringing data-driven strategies to the forefront of the fintech industry