Compare Daily Search Trends in Two Countries with a Streamlit App

Pavlo Sydorenko
Stop russian aggression against Ukraine
5 min read · Mar 24, 2022

Two nations sharing the same tragedy must have similar concerns, unless …

Since the beginning of the unprovoked Russian aggression against Ukraine on 24 February 2022, Ukrainians have been showing fierce resistance on multiple battlefields, including the cyber and information fronts. I have often used Google Trends to monitor daily search trends and related news in both Ukraine and Russia, so I created a Streamlit app that shows this data side by side, using pytrends, GoogleNews, googletrans, and Python 3.8. Given the layout, it's more of a desktop app.

Methodology: I assume that two nations sharing the same tragedy must have similar concerns if they have equal access to information and similar values. Unfortunately, the tragedy of one nation appears to be less tragic for the nation that actually caused it. While searches in Ukraine are dominated by war-related topics, Russians have been more preoccupied with sport (soccer) and shortages of office paper, although the latter is also one of the consequences of russian aggression. However, trends are changing, albeit slowly, which may mean that more russians are looking for information outside the habitual channels of state-sponsored propaganda and misinformation.

1. Installing required libraries

To reproduce this app, you will need the following libraries. Note the pinned version of googletrans: this release has the fixes we need for this application.

pip install streamlit
pip install pytrends
pip install GoogleNews
pip install googletrans==3.1.0a0

2. Required imports and functions

Let's build a get_top() function to fetch the top 10 daily trends on Google by country and a get_related() function to retrieve related searches. Of course, you can change the number of trends to be retrieved, as well as other parameters, according to your preferences.
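The original code for these two functions is not reproduced here, so the following is only a minimal sketch of how they might look. It assumes the pytrends calls trending_searches(), build_payload(), and related_queries(); the parameter names n, kind, and client, and the "now 1-d" timeframe, are illustrative choices rather than the app's actual settings.

```python
def get_top(pn, n=10, client=None):
    """Return the top-n daily trending searches for a country.

    pn is the country name in the form pytrends expects, e.g. "ukraine".
    """
    if client is None:
        # Imported lazily so the helper can also be run with a stand-in client.
        from pytrends.request import TrendReq
        client = TrendReq(hl="en-US", tz=0)
    # trending_searches() returns a one-column DataFrame of search terms.
    table = client.trending_searches(pn=pn)
    return table[0].head(n).tolist()


def get_related(term, kind="rising", client=None):
    """Return related queries for a term; kind is "rising" or "top"."""
    if kind not in ("rising", "top"):
        raise ValueError("kind must be 'rising' or 'top'")
    if client is None:
        from pytrends.request import TrendReq
        client = TrendReq(hl="en-US", tz=0)
    # Assumption: look at the last day only, to match "daily" trends.
    client.build_payload(kw_list=[term], timeframe="now 1-d")
    table = client.related_queries().get(term, {}).get(kind)
    # pytrends yields None when Google has no related data for the term.
    if table is None or len(table) == 0:
        return []
    return table["query"].tolist()
```

Calling get_top("ukraine") would then return a plain list of up to ten strings, and each of those strings can be passed to get_related(). Passing the client in as an argument keeps one session for the whole app instead of creating a new one per call.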

It's important to note that pytrends raises an error if you enter a two-word country name (e.g., United States or Saudi Arabia) with a space. An underscore is required, i.e., Saudi_Arabia. Since this convention is not intuitive, let's spare our users the trouble and irritation of guessing the correct input format (or reading specific guidelines, which could be shown with st.info). The workaround is to split the input string into a list with the split() method and join the elements of that list with an underscore.
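The split-and-join workaround can be sketched as a small helper. The function name normalize_country is hypothetical, and the lowercasing step is an assumption based on pytrends examples such as "united_states"; it is not something the text above requires.

```python
def normalize_country(name):
    """Turn free-form input such as "Saudi Arabia" into "saudi_arabia"."""
    # split() with no arguments also drops leading, trailing,
    # and repeated spaces, so sloppy input is handled as well.
    parts = name.split()
    # Assumption: lowercase the result, as in pytrends' own examples.
    return "_".join(parts).lower()
```

With this in place, the user can type the country name naturally, and the app passes normalize_country(user_input) to the trends lookup.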

Another function, get_news(), should detect the language of the term (i.e., each item in the output of get_top()) and search for relevant news. It's worth mentioning that this language detection tool is not always accurate. Adding some text cleaning and normalization might improve the results. You can also try a more accurate solution, e.g., spacy_langdetect, which proved more accurate in my tests in Colab.

The output should be the title of one article and a clickable link. Again, you can change this by, e.g., adding the source (media) or increasing the number of retrieved articles.
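A possible shape for get_news() is sketched below. It assumes googletrans' Translator().detect() for language detection and the GoogleNews class with its search() and results() methods; the detect and news_factory parameters and the one-day period are illustrative additions, not part of the original app.

```python
def get_news(term, detect=None, news_factory=None):
    """Return (title, link) of the first article found for term, or None."""
    if detect is None:
        # Imported lazily so the function can be run with stand-ins.
        from googletrans import Translator

        def detect(text):
            return Translator().detect(text).lang

    if news_factory is None:
        from GoogleNews import GoogleNews

        def news_factory(lang):
            # Assumption: restrict results to the last day.
            return GoogleNews(lang=lang, period="1d")

    lang = detect(term)  # e.g. "uk", "ru", "en"; not always accurate
    client = news_factory(lang)
    client.search(term)
    results = client.results()  # list of dicts with "title" and "link" keys
    if not results:
        return None
    first = results[0]
    return first["title"], first["link"]
```

In the app, the returned pair could be rendered as a clickable link with st.markdown(f"[{title}]({link})"). Returning more than one article would only require slicing results instead of taking the first element.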

3. Key features

To enable a visual comparison of two countries' search data, we can insert two side-by-side containers with st.columns (although this layout may be less convenient for mobile devices). Given that I predominantly use this app for the same two countries, I leave them as the default choice. You can change these defaults or leave the fields blank; the default value is optional.

The output, i.e., search trends in the selected countries, is placed within expanders (st.expander), and inside each expander I insert two checkboxes (st.checkbox) to look up related searches and news, if needed. I don't retrieve this additional data by default to avoid any decrease in performance, especially in the case of news searches.

Given that multiple checkboxes with the same label will be created, we need to use st.checkbox with both the label and key parameters, because Streamlit requires a unique key to tell such widgets apart. Also, please note that when searching for related queries this app retrieves rising queries, but you can change this to ‘top’ queries if you want.
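The layout described in this section could be sketched as follows. The streamlit module is passed in as the st argument, and top_fn, related_fn, and news_fn stand for the lookup functions from section 2; the function names, the key format, and the ("Ukraine", "Russia") defaults are assumptions made for illustration.

```python
def widget_key(column, rank, kind):
    """Build a key unique per column, trend position, and checkbox type."""
    return f"{column}-{rank}-{kind}"


def render_column(st, column, country, trends, related_fn, news_fn):
    """Draw one country's trends; st is the imported streamlit module."""
    st.subheader(country)
    for rank, term in enumerate(trends, start=1):
        with st.expander(f"{rank}. {term}"):
            # Extra data is fetched only when the user ticks a box.
            if st.checkbox("Related searches",
                           key=widget_key(column, rank, "related")):
                st.write(related_fn(term))
            if st.checkbox("News", key=widget_key(column, rank, "news")):
                st.write(news_fn(term))


def render_app(st, top_fn, related_fn, news_fn,
               defaults=("Ukraine", "Russia")):
    """Two side-by-side columns, each with a country input and its trends."""
    columns = st.columns(2)
    for index, (col, default) in enumerate(zip(columns, defaults)):
        with col:
            country = st.text_input("Country", value=default,
                                    key=f"country-{index}")
            if country:  # a blank field simply shows nothing
                render_column(st, index, country, top_fn(country),
                              related_fn, news_fn)
```

In a script started with streamlit run, this would be invoked after import streamlit as st by calling render_app(st, ...) with the three lookup functions. Building each key from the column index, the trend's rank, and the checkbox type keeps all keys distinct even when the same term trends in both countries.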

4. Performance check

Although this app was created for a non-commercial purpose and billions of users are not expected, its performance should still be acceptable, given its practical application. Moreover, I think it's good practice to have this kind of step in your pipeline, no matter what.

There are multiple tools that can check your app's performance (as well as tutorials on the topic, so I will omit details here and provide you with some useful links at the end). My choice is Lighthouse from Google for the performance test and Locust for load testing. For example, I've tested the app with 1000 users with no failures.

Some insights from the native Google Trends: interest in the search term “война” (i.e., “war” in Russian) by Russian subregions.

Google Trends, search term “война” (i.e., “war” in Russian), past 7 days, as of 09:07 AM 24 March 2022

Judging by searches for the term “war” in Russia within the past 7 days, the regions located far away from the battlefield are the most concerned or curious about it, with the exception of Belgorod oblast. Wondering why?

First, it's safer for the Kremlin's regime to use its far east soldiers (e.g., indigenous people of Siberia) as “cannon fodder”. Thus, these regions are often the first ones to start receiving body bags. Second, after its failure to achieve an early victory in Ukraine (or any significant victory at all) and bearing heavy losses (over 15,000 Russian soldiers dead and many more injured one month after the invasion, according to Ukrainian authorities), the Kremlin started dragging in military reserves from all possible sources, even recruiting in Syria.

References


Pavlo Sydorenko

Head of Legal Ops & Analytics for an in-house team of over 500 lawyers | 15 + years of overall experience in Analytics | Ph.D. in International Economics