LoRA: A Solution to the Lack of Computational Power in African ML Research

Published in ILLUMINATION, Jul 12, 2023

The field of machine learning (ML) is growing rapidly, and so is the demand for ML research in Africa. However, African researchers face a major barrier to entry: a lack of computational power.

In developed countries, researchers have access to powerful computers and large datasets. This allows them to train large language models (LLMs) that can be used for various tasks, such as machine translation, question answering, and text summarization.

However, in Africa, the situation is quite different. Researchers often do not have access to the same level of computational resources as researchers in developed countries. This means that they cannot train LLMs, which limits their ability to conduct ML research.

In my recent research work, I came across a technique called LoRA (Low-Rank Adaptation of Large Language Models) that can help to solve this problem facing ML researchers in Africa.

LoRA: A Solution?

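LoRA is a technique that can help to address the lack of computational power in African ML research. Instead of training an LLM from scratch, it lets researchers fine-tune an existing pretrained LLM on a smaller dataset while training only a small fraction of the parameters, which takes far less GPU memory and compute.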

1. How LoRA works

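LoRA starts from a large LLM that has already been trained on a large dataset. This LLM is called the “base model”, and its weights are frozen: they are never updated during fine-tuning. For selected weight matrices in the base model, LoRA adds a pair of small trainable matrices whose product forms a low-rank update to the frozen weight. The base model plus these updates is the “target model”, and only the small matrices are trained when it is fine-tuned on a smaller dataset.

The name comes from the rank of that update. Singular value decomposition (SVD) shows that any matrix can be written as a sum of rank-1 matrices, and that keeping only a few of them gives a low-rank approximation. LoRA does not compute an SVD of the base model. It instead assumes that the change needed to adapt the model has low rank, and it learns the two small factors of that change directly by gradient descent.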

Modified forward pass using low-rank decomposition. (Image credit)
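The modified forward pass can be sketched in a few lines of NumPy. This is a minimal illustration and not a training implementation: the matrix sizes, the rank `r`, the scaling value `alpha`, and the names `W0`, `A`, `B`, and `lora_forward` are all assumptions chosen for the example. It follows the LoRA paper in keeping the pretrained weight frozen, adding the product of two small matrices scaled by `alpha / r`, and starting `B` at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 6, 8, 2      # illustrative sizes; r is the LoRA rank
alpha = 4.0                   # scaling hyperparameter (assumed value)

W0 = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable, random init
B = np.zeros((d_out, r))                     # trainable, zero init


def lora_forward(x, W0, A, B, alpha, r):
    """Return W0 @ x plus the scaled low-rank update (B @ A) @ x."""
    return W0 @ x + (alpha / r) * (B @ (A @ x))


x = rng.normal(size=d_in)
h_base = W0 @ x
h_lora = lora_forward(x, W0, A, B, alpha, r)

# With B = 0 the adapter contributes nothing, so fine-tuning starts
# exactly from the base model's behaviour.
starts_at_base = np.allclose(h_base, h_lora)

# After training, the update can be folded into a single matrix,
# so inference costs the same as with the base model.
B_trained = rng.normal(size=(d_out, r))      # stand-in for learned values
W_merged = W0 + (alpha / r) * (B_trained @ A)
merged_matches = np.allclose(
    W_merged @ x, lora_forward(x, W0, A, B_trained, alpha, r)
)

# The update B @ A can never have rank above r.
update_rank = np.linalg.matrix_rank(B_trained @ A)
```

In a real fine-tuning run only `A` and `B` would receive gradients. Here `B_trained` is filled with random numbers purely to show that the merged matrix and the two-branch forward pass give the same output.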

2. Technical details

The following is a more technical explanation of how LoRA works.

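Let M be a weight matrix of the base model, with d rows and n columns, and let T be the corresponding weight matrix of the target model. A rank-k approximation of M can be written using its SVD as

$$M_k = \sum_{i=1}^{k} \sigma_i u_i v_i^{\top}$$

where σi are the singular values of M, ui are the left singular vectors of M, and vi are the right singular vectors of M.

LoRA applies the same low-rank idea to the update rather than to M itself. The target weight is

$$T = M + BA$$

where B is a d × k matrix and A is a k × n matrix, with k much smaller than d and n. M stays frozen, so only the k(d + n) entries of A and B are trained instead of the dn entries of M. The rank k is a hyperparameter that can be tuned to achieve the desired trade-off between accuracy and computational complexity.

The target model T can then be fine-tuned on a smaller dataset by updating only A and B. This can be done using various methods, such as supervised or reinforcement learning.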
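The sum of k singular triplets described above can be checked numerically. The sketch below is only an illustration of what "rank k" means: the 6 × 5 random matrix stands in for a real weight matrix, and the variable names are assumptions made for the example. It builds the rank-k sum from the SVD and compares the approximation error with the singular values that were dropped.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(6, 5))   # small stand-in for a weight matrix
k = 2                         # number of rank-1 terms to keep

# full_matrices=False gives U (6x5), s (5,), Vt (5x5);
# the singular values in s are sorted from largest to smallest.
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Sum of the first k rank-1 terms sigma_i * u_i * v_i^T
M_k = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))

rank_of_M_k = np.linalg.matrix_rank(M_k)

# Frobenius-norm error of the approximation ...
error = np.linalg.norm(M - M_k)
# ... equals the root of the sum of squared dropped singular values.
expected_error = np.sqrt(np.sum(s[k:] ** 2))
```

A larger k lowers the error but stores more numbers, which is the accuracy versus cost trade-off that the rank hyperparameter controls.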

The benefits of LoRA for African ML research

LoRA has several benefits for African ML researchers. Mainly, it reduces the number of trainable parameters, and with it the memory and compute needed to fine-tune large language models. The LoRA paper reports that, for GPT-3 175B, it cuts the number of trainable parameters by a factor of 10,000 and the GPU memory requirement by a factor of 3 compared with full fine-tuning. This is vital because many African researchers do not have access to the high-performance computing resources required to fine-tune these models in full. With LoRA, they can adapt pretrained models on smaller datasets and with fewer resources, which makes large AI models usable for a variety of tasks even in resource-constrained settings.
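To make the savings concrete, here is a rough count of trainable parameters for a single weight matrix. In the LoRA paper, the update to a weight matrix with `d_out` rows and `d_in` columns is the product of a `d_out` × `rank` matrix and a `rank` × `d_in` matrix. The layer size of 4096 and the rank of 8 below are illustrative assumptions, not figures from this article, and the function names are made up for the example.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable entries when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable entries in the two LoRA factors (d_out x rank, rank x d_in)."""
    return rank * (d_out + d_in)


d_out = d_in = 4096   # assumed layer size
rank = 8              # assumed LoRA rank

full = full_finetune_params(d_out, d_in)   # 16_777_216
lora = lora_params(d_out, d_in, rank)      # 65_536
reduction = full // lora                   # 256
```

For this one matrix, LoRA trains 256 times fewer parameters. The saving for a whole model depends on how many of its matrices receive adapters and on the rank chosen.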

In addition to its potential benefits for research, LoRA could also be used to develop commercial applications in Africa. For example, LoRA could be used to fine-tune large language models for chatbots, customer service applications, or even creative writing tools. The possibilities are endless, and LoRA has the potential to revolutionize the way that AI is used in Africa.

Conclusion

LoRA is a promising new technique that has the potential to revolutionize African ML research. It can help to bridge the gap between African researchers and researchers in developed countries, and it can enable African researchers to conduct cutting-edge research that would not be possible otherwise.

I believe that LoRA has the potential to make a significant impact on the field of African ML research. I am excited to see how it is used by other researchers in the future.

References

LoRA: Low-Rank Adaptation of Large Language Models: https://arxiv.org/abs/2106.09685
Singular value decomposition: https://en.wikipedia.org/wiki/Singular_value_decomposition