How I Developed a Fully Functional COVID19 Resources Search Engine
My journey towards creating a fast, comprehensive, and usable COVID19 Resources Search Engine
I lost someone close to me recently. He was not a blood relative but I had known him since I was a child. He was that fun uncle I knew. Yes, knew. I lost him to Coronavirus.
Coronavirus, or COVID19, has been looming over our heads like an apocalyptic nightmare since last year. With more than 3 million dead and counting, it is one of the biggest biological disasters in modern history.
With the world slowly releasing itself from its clutches, my country, India, has suddenly fallen prey to its deadly second wave. Thousands have died within weeks and many are barely holding on to life. But is it solely because of the virus? No, it is also because of the severe shortage of resources relative to the caseload. With each passing day, as the cases rise, the number of available resources depletes, and getting essential items for loved ones in distress has become a haunting treasure hunt for citizens.
Some of the scarcest resources are oxygen, essential medicines, hospital beds, and life support services. While many are able to get their hands on some of these, not everyone is so lucky. Many people cannot get the resources they need simply for lack of leads. They don’t know whom to reach out to or which lead to pursue.
Something similar happened to that uncle of mine. Maybe he could have survived if we had gotten our hands on some leads. But we didn’t. Seeing it from such close proximity kindled something inside me. I had to do something. I had to at least try to do my bit during this pandemic.
So, I created a search engine for COVID19 resources. To access it, please visit:
I will be dividing this article into the following sections for ease of reading:
- How to use the engine
- How it works
- An appeal
How to use the engine
Note: I have slightly changed the UI since I made the video. The functionality remains exactly the same.
Using the app is simple:
- Select the resource from the dropdown
- Enter the name of the place where you are searching for the resource
- Enter the maximum age of the tweet (in hours)
I have attached a silent video that highlights the steps of searching for a resource. The engine also allows you to go to the original tweet to verify the lead.
How it works
Now, let’s talk about how it works. The source for this search engine is Twitter. I observed that a lot of good Samaritans post tweets about their leads on Twitter for people to use. However, not everyone in need has a Twitter account, and combing through a huge Twitter feed for leads on a specific resource can be cumbersome. This search engine is designed to do that heavy lifting.
I sat through and scanned many tweets to see whether there was any discernible pattern in them. Indeed, there was. Here are the patterns I observed:
- Mostly, the leads used the #verified hashtag in their tweets
- Most tweets were posted for a specific resource, and whenever a lead was posted, it carried a hashtag related to that resource. For example, many plasma leads had #plasma in them, and the oxygen leads had #oxygen or #oxygencylinder in them
- Very few leads had a location provided. But, interestingly, or I should say luckily, many of them had a location-based hashtag. For example, #bangalore marked leads in Bangalore
- There were a lot of retweets in the results, and those had to be eliminated
Using these simple observations, I ran a few ad-hoc queries. The results were very fruitful: with the right combination of hashtags, many useful leads started appearing in the search results. So, I decided to use this searching technique in my engine.
First things first, before we get started, you can clone or fork the implementation from here:
Now, while implementing, the challenge was to achieve the above search mechanism in the simplest way possible and to make the results available to the general public, with or without a Twitter account, with the least amount of hassle. I have divided my implementation into two sections: backend and frontend.
To implement the search, I used the Twitter API. Twitter provides nice APIs to search for tweets using the same kind of query you would use on the website. Here are the steps I followed:
- First, I created a developer account. Go to https://developer.twitter.com for the same. The steps are very simple. Once approved, you are good to go!
- After that, I created a developer app, which gave me an API key and secret for OAuth2 authentication. To know more, visit here: https://developer.twitter.com/en/docs/apps/overview
- Once I was done with that, I also created a developer project. A project allows 500,000 tweets to be searched per month and is required for using the new Twitter 2.0 APIs. More info here: https://developer.twitter.com/en/docs/projects/overview
- To search, I used the Twitter 2.0 recent tweets search API. The API needs a mandatory query string and can search tweets from the last 7 days. It also has other parameters that can be used. More information here: https://developer.twitter.com/en/docs/twitter-api/tweets/search/api-reference/get-tweets-search-recent
- Now, to search for the tweets, I created a function that accepted the user input and converted it into hashtags. Those hashtags were then joined together to form a query string for the API. A typical query string looks like this:
#plasma #verified #kolkata -is:retweet
The above searches for plasma-related leads in Kolkata. In the Twitter search syntax, operators separated by spaces are implicitly ANDed together, and -is:retweet filters out any retweets from the results.
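As a rough sketch, the conversion from user input to such a query string could look like this. The function name and the simple one-to-one hashtag mapping are illustrative, not the exact code; the real implementation may map one resource to several related hashtags:

```python
def build_query(resource: str, place: str) -> str:
    """Convert free-text user input into a hashtag-based Twitter search query.

    Illustrative sketch: the real app may expand a resource into several
    hashtags (e.g. oxygen -> #oxygen #oxygencylinder).
    """
    # Strip spaces so "Oxygen Cylinder" becomes the single tag #oxygencylinder
    resource_tag = "#" + resource.replace(" ", "").lower()
    place_tag = "#" + place.replace(" ", "").lower()
    # Space-separated operators are implicitly ANDed; -is:retweet drops
    # retweets and #verified narrows the results to vetted leads
    return f"{resource_tag} #verified {place_tag} -is:retweet"

print(build_query("Plasma", "Kolkata"))
# → #plasma #verified #kolkata -is:retweet
```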
Now, let’s talk about the twitterapi code. You can find it in the repository linked above.
The twitterdata class is responsible for generating the OAuth access token using the generatetoken function.
The main function is gettweets, which takes the query, the number of hours since the oldest tweet was posted, the maximum number of results at a time, and the next page token for pagination.
The hours parameter helps determine the oldest date from which tweets have to be fetched, using the generatestarttime function.
The generateheader and generateparams functions are helpers for gettweets. In generateparams, query defines the search query and tweet.fields defines what data is fetched from the API.
Finally, the parsetweets function generates a dictionary with the tweets and their relevant details. It uses a function called generateuserid, which derives user details from the author_id fetched with each tweet.
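Since embedded snippets don’t always render, here is a condensed sketch of what a class like that could look like. The endpoint URLs and the app-only OAuth2 flow come from Twitter’s documentation, but the function bodies are my reconstruction of the description above, not the original code:

```python
import datetime

import requests  # third-party: pip install requests

TOKEN_URL = "https://api.twitter.com/oauth2/token"
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


class twitterdata:
    def __init__(self, api_key, api_secret):
        self.api_key = api_key
        self.api_secret = api_secret
        self.token = None

    def generatetoken(self):
        # OAuth2 app-only flow: exchange the key/secret for a bearer token
        resp = requests.post(
            TOKEN_URL,
            auth=(self.api_key, self.api_secret),
            data={"grant_type": "client_credentials"},
        )
        resp.raise_for_status()
        self.token = resp.json()["access_token"]

    @staticmethod
    def generatestarttime(hours):
        # Oldest timestamp to fetch, in the RFC 3339 format the API expects
        start = datetime.datetime.utcnow() - datetime.timedelta(hours=hours)
        return start.strftime("%Y-%m-%dT%H:%M:%SZ")

    def generateheader(self):
        return {"Authorization": f"Bearer {self.token}"}

    def generateparams(self, query, hours, max_results, next_token=None):
        params = {
            "query": query,
            "start_time": self.generatestarttime(hours),
            "max_results": max_results,  # recent search accepts 10-100
            # Extra per-tweet fields to pull back from the API
            "tweet.fields": "created_at,author_id",
        }
        if next_token:
            params["next_token"] = next_token  # pagination cursor
        return params

    def gettweets(self, query, hours=24, max_results=50, next_token=None):
        resp = requests.get(
            SEARCH_URL,
            headers=self.generateheader(),
            params=self.generateparams(query, hours, max_results, next_token),
        )
        resp.raise_for_status()
        # Shape: {"data": [...tweets...], "meta": {"next_token": ...}}
        return resp.json()
```

The parsetweets and generateuserid helpers are omitted here; they walk the returned JSON and join each tweet to its author via author_id.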
This is the easiest part of the whole project, thanks to an awesome library I have recently started using named Streamlit. More about this library and its uses can be found here: http://streamlit.io/
Now, to create the interface, I thought of going minimal. As I had no intention of collecting any user information, it was an easy task to do. Here are the steps:
- I created a UI that asked the users to put in 3 values: the resource they want, the place where they want it, and the maximum age of the tweet (in hours), since the Twitter API returns at most 7 days of data
- Next, I added a button called ‘Search’ which triggered an integration function named searchresources that converted the user input into a search query and passed it to the twitterapi function gettweets to get the results
- The same integration function then cleaned the tweet data obtained as a dictionary, de-duplicated it, and created a pandas dataframe
- Finally, the UI presented the data in that dataframe in the form of search results
For reference, here are the required code snippets:
I did not include a section on deploying the app, as right now I am using the free Heroku tier to share it. You can get more information about deployment with Heroku here: https://devcenter.heroku.com/articles/getting-started-with-python
Also, here is an awesome video on deploying Streamlit apps on Heroku:
Now that we have come to the end of this article, I want to make a small appeal to all my readers:
I understand that this is not a perfect implementation, and I am currently working on making the experience better by improving search efficiency and speed, getting a custom domain, and adding new resources. However, the driving force and the main cause behind this implementation has been the loss of someone close to me.
If I am able to help and save even one person with this tool, I will consider it a roaring success. Hence, I request everyone reading this to consider sharing this tool with anyone who needs help with COVID19 resources. If we spread the word to enough people, there is a growing chance that we might be able to save someone together.
Thank you for reading and sharing. Let’s fight this battle together. Cheers!