Automate Flash Cards Creation for Language Learning with Python
My experience using Python to support my long journey in learning Mandarin as a native French speaker.
Do you need help with learning a new language?
Automate your flashcard creation with Python and data analytics tools to make the process easier and more efficient.
Read on to learn how I have improved my language skills and stayed motivated on my language-learning journey using Python.
Introduction
Learning a language can be a long journey, so staying motivated and having clear goals to aim for are essential.
Because Mandarin uses a pictorial system of writing words and sounds called hanzi 汉字, learning it can be even more challenging for learners without a background in a similar language.
In my quest for Chinese fluency, flashcards have been my best ally in improving my reading and pronunciation.
In this article, I will share my experience using data analytics tools with Python to automate the creation of flashcards to support my learning process.
Summary
I. Use Python to Support Language Learning
1. Lessons Learned: The importance of using flashcards
2. A personal teacher on your phone with Anki
II. Create Anki Flashcards with Python
1. Extracting keywords from Emails with pywin32
2. Extracting keywords from PDF reports with PyPDF2
3. Final Results with Vocabulary lists, including translation
III. Add phonetic transcription with Google Translate
Google Translate API to generate the pinyin of chinese words
IV. Add audio transcript with Google Text-To-Speech
Improve your pronouciation with the support of Text-To-Speech
V. Next Step: Boost your Learning Journey with GPT
Adaptative learning experience using GPT and custom visuals
1. Generative AI: Boost your Learning Experience with GPT
2. Automate the Design of Nice Visuals
3. Conclusion
If you prefer, you can check the video version of this tutorial
Use Python to Support Language Learning
I am a French guy who moved to China to study engineering for a two-year double degree program.
Finally, I stayed for more than six years, and my main challenge was learning Mandarin for daily life and work.
Lessons Learned: The importance of using flashcards
The main mistake I made when I started to learn Mandarin was not following the advice of intelligent people promoting the use of flash cards.
Do you remember as a kid when one of your parents or tutor was holding your book to help you prepare for tomorrow’s history test?
She was asking you questions related to the lesson:
- If you answer well, she can consider that you are ready for the test.
- If you make mistakes, she will ask you to read the lesson again and return when ready.
Now, there is an open-source app for this, and it’s called Anki.
A personal teacher on your phone with Anki
In the picture above, you can find an example of a card to learn how to say ‘Hello!’ in Mandarin.
Step 1: Shows you the word in the Chinese character Hanzi
Step 2: Show you the answer with the following:
- The pronunciation using the romanization system pinyin: nĭ hăo
- The translation in English: Hello!
- The oral pronunciation with an mp3 sound
Step 3: Perform your self-assessment
- If you guessed well, press ‘Good’: the card will reappear in 10 min
- If you think that it’s ‘Easy’, Anki will wait 4 days to ask you again
- If you did not guess well, press ‘Again’; the card will reappear shortly
Objective
To support your learning journey, you want to feed your Anki with thousands of cards and practise 2 hours per day during your commuting and dead times.
🏫 Discover 70+ case studies using data analytics for supply chain sustainability🌳and business optimization 🏪 in this: Cheat Sheet
Create Anki Flashcards with Python
In this section, I will explain how to use Python to build these cards with…
- Common words or sentences for daily life or work
- Add the phonetic transcription using a Python library
- Add an audio transcription using Google TTS API
This framework can be applied to any language, not only Mandarin Chinese.
As a foreigner working in China, my main priority was to have a basic vocabulary to communicate with my colleagues.
Extracting Keywords from Emails with pywin32
Because my first objective was to read emails in Mandarin, I planned to extract the most frequently used words in the emails in my Outlook mailbox.
Using the code below, you can extract the body of all your emails and store them in a list.
Extracting Keywords from pdf reports with PyPDF2
Some reports and documentation I received from suppliers can be a good source of technical words.
Therefore, I have built this simple code to extract the text from any PDF report.
Extracting Keywords from Excel Files with Pandas
Another main source was the monthly financial reports in Excel that can be processed using the Pandas library.
Final Results with Vocabulary lists, including translation
After processing, I get a list of words like the one below
Add phonetic transcription with Google Translate
You need a phonetic transcription to practise your pronunciation and get the right use of the tones.
I use the jieba library for Mandarin, which takes the Chinese characters and returns the phonetics transcription (pinyin).
You can find a library for your language.
For instance, you have fonem for French and epitran for Italian.
Add audio transcript with Google Text-To-Speech.
You want to add the pronunciation to each card to improve your speaking ability.
There is a solution for this using the GTTS library.
This Python library and CLI tool interface with Google Translate’s text-to-speech API.
You can find more details and instructions on using it in the official documentation.
Next Step: Boost your Learning Journey with GPT
Generative AI: Boost your Learning Experience with GPT
In November 2022, OpenAI released the first version of ChatGPT.
Generative AI is an opportunity to bring additional intelligence to manage flash cards' creation and order of appearance.
Let’s imagine a learner that interacts with a GPT agent
- User: I would like to improve my vocabulary skills for accounting.
- Agent: Generates flashcards using the scripts designed in this article and follows the user's progress.
Instead of relying on the hard-coded logic of Anki, we can exploit the intelligence of LLMs to adapt the learning path to the student’s level.
💡 If you want to boost your learning skills with GPT,
Automate the Design of Nice Visuals
Use Python Pillow to automate the creation of graphs, visuals or illustrations to feed your report.
For example, the labels above have been generated automatically with a Python script.
This method can create illustrations of words to boost your memorization process.
💡 If you want to create your own visuals,
Conclusion
Now you have a list of words or sentences with the translation in English, the phonetics transcription, and a short mp3 audio with the pronunciation.
These cards can be used to practise your…
- Reading Comprehension using the translation
- Pronunciation using the phonetics transcription
- Oral Comprehension using the short audio
Apply the process presented in the visual above, and I promise you will see improvements in your language mastery with Python!
About Me
Let’s connect on Linkedin and Twitter; I am a Supply Chain Engineer using data analytics to improve logistics operations and reduce costs.
For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.
💌 New articles straight in your inbox for free: Newsletter
📘 Boost your Productivity with Data Analytics: Productivity Cheat Sheet