LLM & tooth brushing: a project using Generative AI on Raspberry Pi to educate and entertain children

Florian BERGAMASCO
9 min read · Jul 4, 2024


How to entertain children by answering their questions about science, art and other subjects during the 3 minutes it takes to brush their teeth.

Dall-E generation via Copilot

Introduction

This innovative project uses a Large Language Model on a Raspberry Pi to create an intelligent, fun storyteller for children. The idea is to take advantage of the 3 minutes a day children spend brushing their teeth to stimulate their curiosity and learning about a variety of subjects, including science, art, history, geography and so on.

What is a Large Language Model?

Generative AI is a fascinating area of AI that specialises in the automatic creation of content that could be considered creative or original. This technology learns from large amounts of data and is then able to produce new works, whether text, images, music or other forms such as video.
Generative AI can be used to design digital works of art, compose music, write poems or articles, and even generate 3D models. It opens up new horizons in the field of creativity and offers powerful tools for content creators.
In our case, we will be looking at a sub-category of generative AI, the Large Language Model (LLM).
An LLM is an AI system that can understand and generate natural-language text. It is a form of artificial intelligence that uses machine learning algorithms to learn the rules and structures of language from large quantities of text from a variety of sources, such as books, articles and websites. An LLM can then use this knowledge, based on a statistical approach, to answer questions, summarise texts, write stories, and much more.

Equipment

To carry out this project, you will need the following materials:
· A Raspberry Pi: to run the Large Language Model
· A speaker: to talk to the child

Prerequisites

The Raspberry Pi will be the brain of the device. It will generate the text that answers the children's questions. For that, we first need to install some tools:

sudo apt-get update
curl -fsSL https://ollama.com/install.sh | sh

Ollama is a cutting-edge technology solution designed for artificial intelligence and natural language processing enthusiasts. It provides a robust and flexible platform for running advanced language models locally, allowing users to benefit from the power of AI without the need for specialised hardware.
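For example, once the installation script has finished, downloading and trying a model takes just two commands (here with "phi3", the model used later in this article):

ollama pull phi3
ollama run phi3 "Bonjour !"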

sudo apt-get install mpg123

mpg123 is a fast, efficient audio player, renowned for its ability to play MP3 files from the command line. It is compatible with a variety of operating systems and is lightweight and easy to use. As an open-source project, mpg123 benefits from an active community that contributes to its development and continuous improvement.
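Playing a file from the terminal is then a single command (the file name here is just a placeholder):

mpg123 bonjour.mp3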

The project's scripts are written in Python, so you also need to install the following library:

pip install gTTS

There are several APIs available for converting text to speech in Python. One of them is the Google Text-to-Speech API, commonly known as gTTS. gTTS is a very simple-to-use tool that converts typed text into audio, which can be saved as an MP3 file. It supports several languages, including English, Hindi, Tamil, French and German, and speech can be delivered at either normal or slow speed.
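Here is a minimal sketch of gTTS in action, combined with mpg123 to play the result (the sentence and file name are just examples):

# Convert a short French sentence to speech and play it
from gtts import gTTS
import os

tts = gTTS(text="Bonjour les enfants !", lang="fr", slow=False)
tts.save("bonjour.mp3")          # save the synthesised speech as an MP3 file
os.system("mpg123 bonjour.mp3")  # play it through the speaker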

How to use Ollama?

Dall-E generation via Copilot

Among the solutions for running LLMs locally on a Raspberry Pi, I have been able to test a few, and the simplest, most powerful and easiest to use is Ollama.
Ollama hosts its own list of models to which you have access.

You can download these models locally on your Raspberry Pi, then interact with them via a command line. Alternatively, when you run the model, Ollama also runs an inference server hosted on port 11434 (by default) which you can interact with via APIs and other libraries.
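For example, here is a minimal sketch of querying that inference server from Python, assuming the default port 11434 and a model ("phi3" here) that has already been pulled:

# Ask the local Ollama server a question and print the full answer
import json
import urllib.request

payload = {
    "model": "phi3",
    "prompt": "Pourquoi le ciel est-il bleu ?",
    "stream": False,  # wait for the complete answer instead of streaming tokens
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as response:
    print(json.loads(response.read())["response"])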

How do you choose which model to use on your Raspberry Pi? As you will see, the characteristics of your Raspberry Pi will quickly limit the use of models with a large number of parameters.

Before choosing your model, here is a quick reminder of what a parameter and a weight are.

The differences between models

LLM models use internal parameters to learn natural language features and rules. These parameters are adjusted during training based on the data provided to the model. Parameters can take the form of weights, biases or scaling factors that influence the behaviour and output of the model. The number of parameters in an LLM model indicates its complexity and potential to produce quality text. Some LLM models have hundreds of millions or billions of parameters, enabling them to handle a wide variety of tasks and linguistic domains.
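To make this concrete, here is a toy illustration (not a real LLM): a single fully connected layer computing y = W·x + b has one weight per input/output pair plus one bias per output:

# Counting the parameters of one linear layer y = W.x + b
n_inputs, n_outputs = 512, 512
n_weights = n_inputs * n_outputs   # the weights W
n_biases = n_outputs               # the biases b
print(n_weights + n_biases)        # 262656 parameters for this single layer

An LLM stacks many much wider layers of this kind, which is how the totals reach hundreds of millions or billions of parameters.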

All the models are available here:
https://ollama.com/library

As part of my “LLM & tooth brushing” project, I tested various large language models (LLMs) available on Ollama, in order to assess their ability to generate quality text on a Raspberry Pi.
I started with the “qwen2:0.5b” model, which can be used on any type of Raspberry Pi, but whose text generation is limited and error-prone, particularly in the content-generation part. This model may still be enough to read out simple results and sensor data (weather, etc.).
The “tinyllama” model, which can also be used on all Raspberry Pi models, seems to be the largest a Raspberry Pi 3 can handle in terms of size and number of parameters. To use it there, go through the server version, which requires fewer resources than the interactive command line.
For the Raspberry Pi 4 and 5, the “Phi3” model seems to be the best, and it is the one I used to generate content in this use case. It offers more fluid, consistent and diverse text generation, while remaining fast and compact. It understands the prompt well, enabling it to generate content that meets the request.
Feel free to test other models!

How to interact with LLM models: the prompt

Dall-E generation via Copilot

A prompt is a written instruction given to the language model, regardless of the choice of model. The prompt is used to guide the LLM model in generating the desired content, indicating the purpose, subject, tone, style, format, etc. of the text to be produced.
A good prompt is therefore essential for obtaining a satisfactory response from the AI.
- A prompt must include clear and precise written instructions. Ambiguous wording, questions that are too broad or too vague, and contradictory or incomplete requests should be avoided. Simple, direct vocabulary should also be used, with no jargon or abbreviations.
- A prompt should provide sufficient context for the LLM model to understand the subject, audience, purpose and framework of the content to be generated. Context can include background information, reference sources, specific data, concrete examples, etc. Context helps the LLM to select relevant information, avoid factual errors, adapt tone and level of language, respect ethical or legal constraints, etc.
- A prompt must clearly define the task that the model is going to perform, i.e. the type of content to be generated, its length, structure, format, etc. The task must be formulated explicitly, in the form of an order, a question, a challenge, etc. The task must be achievable, i.e. within the capabilities of the LLM, which is not an expert in every field, nor a soothsayer, nor a magician.
- A prompt must indicate the format of the content to be generated, i.e. the way in which the text will be presented, organised, highlighted, etc. The format may include elements of typography, page layout, mark-up, punctuation, etc. The format must be consistent with the task, subject, audience and medium of the content to be generated. The format must be specified in the prompt, or deduced from the text preceding or following the prompt (for example, if the prompt is inserted into an existing document).
- A prompt should provide examples to the model, if possible, to illustrate the expected result, or to show what should be avoided. Examples can be extracts from similar texts, templates, samples, evaluations, etc.

These guidelines are not strictly sequential, and do not need to be applied systematically. Rather, they should be seen as best practice, which can vary according to the needs, preferences and creativity of the user. The key is to create a prompt that is adapted to the situation, that clearly expresses the request, and that enables the LLM to generate the most appropriate content possible.
Don’t hesitate to iterate on your prompt so that you end up with the best formulation.

For example, for our use case:
We want the language model to answer, in French, a question that 7-year-old children regularly ask.
A prompt might look something like this:
Explain to a 7-year-old in French in 500 words why the sky is blue?
1. Task = “Explain” […] “why is the sky blue”.
2. Persona = “to a 7-year-old child”.
3. Format = “in French in 500 words”.
I could also have added this context, but I'm happy with the result I get without it, hence the importance of iterating:
4. Context = “To tell a story while the child is brushing his teeth”.
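With Ollama, this full prompt can be tried directly from the command line, which is essentially what Script #1 below does through subprocess:

ollama run phi3 "Explique à un enfant de 7 ans en français en 500 mots pourquoi le ciel est bleu ?"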

How does the product work?

To optimise performance, I’ve designed the product in two separate scripts.
The first script runs the generative AI large language model: it takes the children's questions and generates appropriate answers. This script runs continuously in the background, using the resources of the Raspberry Pi.
The second script is launched when the Raspberry Pi starts up and uses a text-to-speech system to communicate with the children. It keeps tooth-brushing time lively by asking the children questions and reading out the pre-generated answers. Because the answers are prepared in advance, the product minimises latency and provides a fluid, entertaining experience for children.

Script #1: main_LLMgenerated.py

###
#
# Execute this file once: it generates and stores the LLM answers
#
###
import os
import subprocess

# Read the list of children's questions, one per line
lines = []
with open("questions_enfants.txt") as file_in:
    for line in file_in:
        lines.append(line)

# Create the output directory if it does not exist yet
if not os.path.exists("dir_questions/"):
    os.mkdir("dir_questions/")

i = 0
while i < len(lines):

    # Only generate an answer if it has not already been generated
    if not os.path.isfile("dir_questions/" + str(i) + ".txt"):
        question_children = lines[i]
        question_children = "".join(question_children.split('\n'))

        arg1 = "ollama"
        arg2 = "run"
        arg3 = "phi3"
        ## if the performance is low:
        # --> choose another model: qwen2:0.5b (low quality), tinyllama, etc.
        arg4 = "Explique à un enfant de 7 ans en français " + question_children

        # Run the model and capture the generated answer
        result = subprocess.run([arg1, arg2, arg3, arg4],
                                capture_output=True, text=True).stdout.strip("\n")
        print(result)

        # Save the answer so that Script #2 can read it aloud later
        f = open("dir_questions/" + str(i) + ".txt", "w")
        f.write(str(result))
        f.close()

    i = i + 1

Script #2: main_toothbrushing.py


# Import the required modules
import os
import random
import time
from datetime import datetime
from gtts import gTTS

# Read the questions and keep the ids of those whose answer
# has already been generated by Script #1
t_question = []
id_question = []
i = 0
with open("questions_enfants.txt") as file_in:
    for line in file_in:
        t_question.append(line)
        if os.path.isfile("dir_questions/" + str(i) + ".txt"):  # file has been generated by the LLM
            id_question.append(i)
        i = i + 1

# Ask the questions in a random order
random.shuffle(id_question)

# Language in which the text will be spoken
language = 'fr'

# Countdown before the brushing starts
mytext = 'Trois'
myobj = gTTS(text=mytext, lang=language, slow=False)
myobj.save("Trois.mp3")
os.system("mpg123 Trois.mp3")
print(3)

time.sleep(0.5)
mytext = 'Deux'
myobj = gTTS(text=mytext, lang=language, slow=False)
myobj.save("Deux.mp3")
os.system("mpg123 Deux.mp3")
print(2)

time.sleep(0.5)
mytext = 'Un'
myobj = gTTS(text=mytext, lang=language, slow=False)
myobj.save("Un.mp3")
os.system("mpg123 Un.mp3")
print(1)

time.sleep(0.5)
mytext = "C'est parti pour trois minutes de brossage de dents"
myobj = gTTS(text=mytext, lang=language, slow=False)
myobj.save("debut_brossage.mp3")
os.system("mpg123 debut_brossage.mp3")

# Start of the 3-minute timer
start_dateTime = datetime.now()

time.sleep(1)
i = 0
while i < len(id_question):

    if os.path.isfile("dir_questions/" + str(id_question[i]) + ".txt"):

        print("dir_questions/" + t_question[id_question[i]])

        # Read the question aloud
        mytext = t_question[id_question[i]]
        myobj = gTTS(text=mytext, lang=language, slow=False)
        myobj.save("dir_questions/" + str(id_question[i]) + "_question.mp3")
        os.system("mpg123 " + "dir_questions/" + str(id_question[i]) + "_question.mp3")
        time.sleep(0.2)

        # Read the answer generated by the LLM aloud
        f = open("dir_questions/" + str(id_question[i]) + ".txt", "r")
        mytext = f.read()
        f.close()

        myobj = gTTS(text=mytext, lang=language, slow=False)
        myobj.save("dir_questions/" + str(id_question[i]) + "_texte.mp3")
        os.system("mpg123 " + "dir_questions/" + str(id_question[i]) + "_texte.mp3")
        time.sleep(1)

        # Stop once 3 minutes (180 seconds) of brushing have elapsed
        maintenant_dateTime = datetime.now()
        print(maintenant_dateTime)
        difference = (maintenant_dateTime - start_dateTime).total_seconds()
        if difference >= 180:
            i = len(id_question)

    i = i + 1

time.sleep(0.2)
mytext = "C'est la fin du brossage de dents"
myobj = gTTS(text=mytext, lang=language, slow=False)
myobj.save("fin_brossage.mp3")
os.system("mpg123 fin_brossage.mp3")

Launching the scripts

To launch the two scripts without using an interface, either connect to your Raspberry Pi via SSH, or configure the two scripts to start when the Raspberry Pi is switched on, with the command “crontab -e”:

You can use a console-mode editor like nano.
Simply add the following lines with the @reboot option:

@reboot python3 /home/florianbergamasco_RaspPi/Desktop/LLM4kid_toothbrushing/main_LLMgenerated.py &
@reboot python3 /home/florianbergamasco_RaspPi/Desktop/LLM4kid_toothbrushing/main_toothbrushing.py &

Web Demo?

To demonstrate the possibilities offered by LLMs, I've created a web demonstrator in JavaScript, HTML and CSS, which shows the result. The aim is to entertain a child during the 3-minute tooth-brushing session, by offering answers to questions that children have about science, art, or any other subject that interests them. The web demonstrator also uses Text-to-Speech technology, which transforms the text generated by the LLM into a synthetic voice, making the experience more playful and immersive (the voice will have to be revised ^^).
Happy brushing!

https://riviera-digital.com/BrosseADents/llm/index.html
