Making queries to the Twitter API with tweepy

Fetch the tweets you didn’t want to see

Roberto Aguilar
5 min read · Nov 7, 2021

The Twitter API gives us several endpoints once our App access is approved. Three of them are search methods, which return samples of the tweets we want according to specific criteria.

What’s the difference between those 3 methods?

Two of them can be upgraded by purchasing a Premium membership, and their usage is limited under the Sandbox membership, which is the default. To see the upgrade options, check your subscription panel.

api.search_30_day()

  • Premium Search for tweets from the last 30 days.
  • Rate limit of 4500 tweets per minute, capped at 25K tweets per month (Sandbox).

api.search_full_archive()

  • Premium Search for tweets dating back to March 2006.
  • Rate limit of 3000 tweets per minute, capped at 5K tweets per month (Sandbox).

api.search_tweets()

  • Regular Search for tweets from the last 6–9 days at most.
  • Rate limit of 3000 tweets per minute (Sandbox). This is the one we are going to use here.

How to make queries with tweepy?

Both .search_30_day() and .search_full_archive() take a parameter named “query”, which is intuitive enough, I guess.

In the case of .search_tweets() we have to use “q”, which receives a string with the search operators we need, following the Developer Platform syntax.
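As a rough sketch of that difference in parameter names, assuming `api` is an already authenticated `tweepy.API` instance (tweepy 4.x) and that "dev" is just a placeholder label for the premium dev environment you configured:

```python
def search_premium_30_day(api, text, label="dev"):
    """Premium 30-day search: the search string goes in `query`.

    `label` is the dev environment name from the developer portal;
    "dev" is only a placeholder.
    """
    return api.search_30_day(label=label, query=text)


def search_standard(api, text):
    """Standard search: the same kind of string goes in `q`."""
    return api.search_tweets(q=text)
```

The two helper names are mine, not part of tweepy; they only wrap the calls so the `query` versus `q` difference is easy to see.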

How can I make a query using the Dev Syntax operators?

There are a lot of operators, much like in SQL, but don’t confuse the two. Here we’ll use just the basics. (Important: we won’t cover authentication here, so if you haven’t set it up yet, please take a look at this quick tutorial.)

Now we store the tweets in a list and pass a query string. We limit the tweets to English only, ask for a mix of popular and recent results, and set the page size to 100, the maximum.

This returns a tweepy object, which needs to be converted with the json library into a dictionary, as we did before.
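A minimal sketch of that call, assuming `api` is an authenticated `tweepy.API` instance (tweepy 4.x). The helper names `fetch_tweets` and `to_dicts` are mine, and reading the raw payload from each status through its `_json` attribute is one common way to do the conversion, not necessarily the only one:

```python
import json


def fetch_tweets(api, query, count=100):
    """Standard search: English only, mixed popular/recent, `count` per page."""
    results = api.search_tweets(
        q=query, lang="en", result_type="mixed", count=count
    )
    return list(results)


def to_dicts(tweets):
    """Turn tweepy Status objects into plain dictionaries."""
    # Each Status keeps the raw API payload in its `_json` attribute;
    # the dumps/loads round trip yields an independent plain dict.
    return [json.loads(json.dumps(tweet._json)) for tweet in tweets]
```

Usage would look like `tweets = to_dicts(fetch_tweets(api, "Donkey Nintendo"))`.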

AND

For tweets containing both words “Donkey” and “Nintendo”
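The query for this case could be built like the sketch below. The commented call assumes an authenticated `tweepy.API` instance named `api`:

```python
# A space between terms works as an implicit AND in the standard search syntax.
words = ["Donkey", "Nintendo"]
query = " ".join(words)

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```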

Exact Match

For tweets containing the exact phrases “Chris Pratt” and “Mister Chief”. I guess someone would like to see Chris Pratt as the voice of Master Chief in Halo Infinite.

Well, I didn’t believe such people existed, but let’s leave that aside for humanity’s sake…
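A small sketch of an exact-match query: each phrase goes inside double quotes. The helper `exact` is a name I made up for illustration:

```python
def exact(phrase):
    """Wrap a phrase in double quotes so it is matched as written."""
    return f'"{phrase}"'


# Both quoted phrases must appear in the tweet.
query = " ".join([exact("Chris Pratt"), exact("Mister Chief")])

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```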

OR

Tweets containing either “Seth” or “Donkey Kong”. Why? I don’t know.
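Something along these lines should work for the OR case. The operator is written in uppercase, and I quote “Donkey Kong” so the two words stay together as one term:

```python
terms = ["Seth", '"Donkey Kong"']

# OR goes between the alternatives, in uppercase.
query = " OR ".join(terms)

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```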

Hashtags or Mentions

In the same way we search for a specific word, we can search for mentions or hashtags. Here we search for tweets that mention Tobey Maguire and use the #spiderman hashtag (sure that the next leak is 100% official).
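A sketch of such a query. The account handle below is a placeholder I am assuming for illustration, so swap in the real one before running it:

```python
mention = "@TobeyMaguire"  # hypothetical handle, replace with the real account
hashtag = "#spiderman"

# Mention and hashtag are plain terms, so a space combines them with AND.
query = f"{mention} {hashtag}"

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```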

Filter Options

We can fetch tweets according to specific properties of their content. This returns tweets mentioning Travis Pastrana that are not marked as sensitive.

Or we can exclude some kind of content by adding “-” before the operator.
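Here is a sketch of both variants using standard filter operators. Excluding retweets is just my example of a negated filter, and `negate` is a made-up helper name:

```python
def negate(operator):
    """Prefix an operator or keyword with '-' to exclude matching tweets."""
    return f"-{operator}"


person = '"Travis Pastrana"'

# Keep only tweets not marked as potentially sensitive.
safe_query = f"{person} filter:safe"

# Exclude a kind of content instead, here retweets.
no_retweets_query = f"{person} {negate('filter:retweets')}"
```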

To User and From User

The names say what these do. FYI: this requests tweets from the PlayStation Indies boss to Phil Spencer. Note that it is NOT necessary to add the @ to the user name.
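A sketch of the from/to query. The two handles are my assumption of the accounts meant here, so double-check them, and `from_to` is a hypothetical helper:

```python
def from_to(sender, recipient):
    """Build a from:/to: query; a leading @ is dropped since it is not needed."""
    return f"from:{sender.lstrip('@')} to:{recipient.lstrip('@')}"


# Assumed handles for illustration only.
query = from_to("@yosp", "XboxP3")

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```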

URL

We can also select tweets that contain URLs by matching words inside the hyperlink. For example, with this query you will be able to see all Dr. Disrespect Twitch URLs…
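A sketch of a URL query. The `url:` operator matches a word inside the link, while the quoted name is only my guess at how to narrow it to this streamer:

```python
keyword = '"Dr Disrespect"'  # assumed keyword, adjust to taste
link_word = "twitch"

# url: matches tweets whose links contain the given word.
query = f"{keyword} url:{link_word}"

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```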

Since and Until a date

We can specify dates within the fetching range by setting the since or until operators in the query.
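A sketch of a date-bounded query, with dates written as YYYY-MM-DD. The keyword and the dates are illustrative, and `between` is a made-up helper. Keep in mind the regular search only reaches back a few days, so older dates return nothing:

```python
from datetime import date


def between(term, since, until):
    """Append since:/until: operators, formatted as YYYY-MM-DD."""
    return f"{term} since:{since.isoformat()} until:{until.isoformat()}"


query = between("Nintendo", date(2021, 11, 1), date(2021, 11, 7))

# With an authenticated tweepy.API instance named `api` (assumed):
# tweets = api.search_tweets(q=query, lang="en", result_type="mixed", count=100)
```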

Attitudes

Yeah, it sounds strange, but we can subset tweets by the attitude they express. For positive tweets we can use a happy face, and for negative ones a sad face.
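A sketch of both attitude queries. The topic is just an example keyword:

```python
topic = "Halo"  # illustrative keyword

# A happy face asks for tweets with a positive attitude.
positive_query = f"{topic} :)"

# A sad face asks for tweets with a negative attitude.
negative_query = f"{topic} :("
```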

There are many ways to build queries, but keep in mind that some operators are only available to premium users. To know more about this, please check this table.

Best practices?

Just take this into consideration:

  • Limit your searches to 10 keywords/operators (remember, this is not the full archive endpoint), or the API can send you an error like this:

{"error": "Sorry, your query is too complex. Please reduce complexity and try again."}
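One way to catch this before sending the request is a rough local check. This is only a sketch: the function names are mine, the limit of 10 is the rule of thumb from this post rather than a documented constant, and the count is approximate (a quoted phrase counts as one term):

```python
import re

MAX_TERMS = 10  # rule of thumb from this post, not an official limit


def count_terms(query):
    """Roughly count keywords/operators; a quoted phrase counts once."""
    return len(re.findall(r'"[^"]*"|\S+', query))


def check_query(query, max_terms=MAX_TERMS):
    """Return the query unchanged, or raise if it looks too complex."""
    n = count_terms(query)
    if n > max_terms:
        raise ValueError(f"query has {n} terms, more than {max_terms}")
    return query
```

Calling `check_query(query)` right before `api.search_tweets(q=query)` gives a clearer error than waiting for the API to reject the request.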

Thanks for reading. I hope this helps you as a guide through the tweet forest. In my next posts we’ll go deep into how to manage tweepy objects. See you soon!

Roberto Aguilar

Data Scientist @ McKinsey & Company | Generative AI | AI & Gaming