Customise your own Twitter notification with Telegram Bot

Linus Ng
Published in The Startup
5 min read · May 27, 2020

Telegram has a lot of potential as a messaging app. Its Bot API lets bots act as assistants that perform many functions in chats and channels. At the same time, Twitter provides easy-to-use APIs for developers to publish, manage, and analyse tweets. With the flexibility these APIs offer, Twitter and Telegram together open up a lot of possibilities for data science.
During the coronavirus pandemic, I wanted to keep myself up to date with the daily figures. In the UK, the figures are released every day, but at a different time each day.

Figure 1. The official daily tweet on coronavirus in the UK

What I would like is a notification when the figures are released, together with the data that I am interested in.

Figure 2. The automatic Telegram notification

In this tutorial, I will walk through how to create your own Twitter news alert. This will include:
1. Scrape tweets using Tweepy
2. Extract numeric data from tweets
3. Create a Telegram bot, and have the bot send messages to a channel/chat using Python
4. Create a crontab job that runs the script on a schedule

1. Scrape tweets using tweepy

Before we start collecting data from Twitter, we need to become Twitter developers.

  1. Log in to the Twitter developer section.
  2. Click ‘Apps’, then ‘Create an app’.
  3. Complete the application.
  4. You can now see the app that you just created, along with the consumer API key and the consumer secret key.
  5. Under ‘Create my access token’, you can get the access token and the access token secret.
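
Hard-coding the four values from steps 4 and 5 in a script makes them easy to leak by accident. One alternative is to read them from environment variables. The sketch below assumes variable names such as `TWITTER_CONSUMER_KEY`; these names and the helper `load_twitter_keys` are my own choice for illustration, not something Twitter or Tweepy requires.

```python
import os

# Hypothetical environment variable names, mapped to the dictionary keys
# used for authentication in the mining script.
ENV_NAMES = {
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
    'access_token_key': 'TWITTER_ACCESS_TOKEN',
    'access_token_secret': 'TWITTER_ACCESS_TOKEN_SECRET',
}


def load_twitter_keys(environ=None):
    """Build the auth dictionary from environment variables.

    Raises KeyError naming the first variable that is missing.
    """
    environ = os.environ if environ is None else environ
    keys = {}
    for key, env_name in ENV_NAMES.items():
        if env_name not in environ:
            raise KeyError('Missing environment variable: ' + env_name)
        keys[key] = environ[env_name]
    return keys
```

The returned dictionary has the same four keys as the `auth` dictionary in the script, so it could be passed to the miner as `keys_dict`.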

You will need the script below to mine the tweets. Save the auth keys as a dictionary.

import datetime

import tweepy

# Replace the placeholders with the keys and tokens from your Twitter app.
auth = {
    'consumer_key': 'XXXXXX',
    'consumer_secret': 'XXXXXX',
    'access_token_key': 'XXXXXX',
    'access_token_secret': 'XXXXXX',
}


class TweetMiner(object):

    result_limit = 20
    data = []
    api = False

    def __init__(self, keys_dict=auth, api=api, result_limit=20):
        self.twitter_keys = keys_dict

        # Authenticate with the consumer keys, then attach the access token.
        auth = tweepy.OAuthHandler(keys_dict['consumer_key'], keys_dict['consumer_secret'])
        auth.set_access_token(keys_dict['access_token_key'], keys_dict['access_token_secret'])

        self.api = tweepy.API(auth)
        self.result_limit = result_limit

    def mine_user_tweets(self, user="linusnhh", mine_retweets=False, max_pages=5):
        # Note: mine_retweets is accepted but not used; retweets are always requested.
        data = []
        last_tweet_id = False
        page = 1

        while page <= max_pages:
            if last_tweet_id:
                # Later pages: only request tweets older than the last one seen.
                statuses = self.api.user_timeline(screen_name=user,
                                                  count=self.result_limit,
                                                  max_id=last_tweet_id - 1,
                                                  tweet_mode='extended',
                                                  include_rts=True)
            else:
                # First page: request the most recent tweets.
                statuses = self.api.user_timeline(screen_name=user,
                                                  count=self.result_limit,
                                                  tweet_mode='extended',
                                                  include_rts=True)

            for item in statuses:
                mined = {
                    'tweet_id': item.id,
                    'name': item.user.name,
                    'screen_name': item.user.screen_name,
                    'retweet_count': item.retweet_count,
                    'text': item.full_text,
                    'mined_at': datetime.datetime.now(),
                    'created_at': item.created_at,
                    'favourite_count': item.favorite_count,
                    'hashtags': item.entities['hashtags'],
                    'status_count': item.user.statuses_count,
                    'location': item.place,
                    'source_device': item.source
                }

                # These attributes only exist on retweets and quote tweets.
                try:
                    mined['retweet_text'] = item.retweeted_status.full_text
                except AttributeError:
                    mined['retweet_text'] = 'None'
                try:
                    mined['quote_text'] = item.quoted_status.full_text
                    mined['quote_screen_name'] = item.quoted_status.user.screen_name
                except AttributeError:
                    mined['quote_text'] = 'None'
                    mined['quote_screen_name'] = 'None'

                last_tweet_id = item.id
                data.append(mined)

            page += 1

        return data

With the above TweetMiner class, we can start collecting tweets with just two lines of code.

miner = TweetMiner(result_limit=10)
uk_tweets = miner.mine_user_tweets(user='DHSCgovuk', max_pages=1)
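
The paging inside `mine_user_tweets` relies on `max_id`: each later request asks only for tweets whose ID is below the oldest one already collected, so pages never overlap. The idea can be illustrated without any network access. In the sketch below, `fake_timeline` is a hypothetical stand-in for the timeline call, not part of Tweepy, and the IDs are plain integers.

```python
def fake_timeline(all_ids, count, max_id=None):
    """Stand-in for a timeline call: newest first, at most `count` IDs,
    optionally restricted to IDs less than or equal to max_id."""
    newest_first = sorted(all_ids, reverse=True)
    eligible = [i for i in newest_first if max_id is None or i <= max_id]
    return eligible[:count]


def page_through(all_ids, count, max_pages):
    """Collect IDs page by page, moving max_id below the last ID seen."""
    collected = []
    last_id = None
    for _ in range(max_pages):
        # Subtract 1 because max_id is inclusive; otherwise the last
        # tweet of one page would be repeated on the next page.
        max_id = None if last_id is None else last_id - 1
        page = fake_timeline(all_ids, count, max_id=max_id)
        if not page:
            break  # nothing older is left
        collected.extend(page)
        last_id = page[-1]
    return collected
```

With ten IDs, a page size of three, and two pages, `page_through(range(1, 11), 3, 2)` collects the six newest IDs with no duplicates.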

2. Extract numeric data from tweets

The collected tweets come back as a list of dictionaries. To make them easier to work with and visualise, we convert the list to a dataframe.

import pandas as pd

tweets_df = pd.DataFrame(uk_tweets)

The dataframe will contain all the tweets that we scraped, but we are only interested in the tweets that include the daily figures, so we need to do some filtering. I chose the word ‘died’ as a distinctive string to pick out those tweets.

tweets_df = tweets_df[tweets_df['text'].str.contains("died")]

To make the data easier to work with, I created a date column, kept only the columns I need, and converted columns to numeric types where possible.

tweets_df['date'] = tweets_df.created_at.dt.strftime('%Y-%m-%d')
tweets_df = tweets_df[['screen_name', 'date','text']].reset_index(drop=True)
tweets_df = tweets_df.apply(pd.to_numeric, errors='ignore')

There are a lot of figures we can extract from the tweet. Let’s try to get the death rate as an example. The way that I calculate the death rate is (cumulative deaths) / (the number of positive cases).

You can find a lot of regex patterns online. I used the following pattern to match all the numbers within the tweet, then picked out the figures by position as two new columns, ‘death_total’ and ‘positive_total’.

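number_regex = r'\d+(?:,\d+)*'
# findall returns every number in tweet order, so the positions (8 and 5) depend on the tweet's wording.
tweets_df['death_total'] = tweets_df.text.str.findall(number_regex).str[8]
tweets_df['positive_total'] = tweets_df.text.str.findall(number_regex).str[5]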
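
The values that the pattern matches are strings such as '20,000', so they need their thousands separators removed before any arithmetic. The sketch below shows the matching, the conversion, and the death rate formula on a made-up sentence. The text and figures are invented for illustration, and the helper names `parse_figure` and `death_rate` are hypothetical.

```python
import re

NUMBER_REGEX = r'\d+(?:,\d+)*'

# Made-up text for illustration only; it is not a real tweet and the
# figures are not real data.
SAMPLE_TEXT = ('As of 9am on 1 May, 100,000 tests have been carried out '
               'and 20,000 were positive. 1,000 have died.')


def parse_figure(raw):
    """Convert a matched string such as '20,000' to an integer."""
    return int(raw.replace(',', ''))


def death_rate(death_total, positive_total):
    """Cumulative deaths divided by the number of positive cases."""
    return death_total / positive_total


numbers = re.findall(NUMBER_REGEX, SAMPLE_TEXT)
# Every number is matched, including the time and the day of the month,
# so the figures of interest sit at these positions for this wording only.
positive_total = parse_figure(numbers[3])
death_total = parse_figure(numbers[4])
rate = death_rate(death_total, positive_total)
```

On the dataframe, the same conversion could be applied to each extracted column with `.apply(parse_figure)` before dividing `death_total` by `positive_total`.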

We can now create a column called ‘death_rate’ by putting the figures into the formula. Other figures can be extracted in a similar way. For now, let’s add data transparency by extracting the URL with another regex.

url_regex = r'(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-?=%.]+'
tweets_df['url'] = tweets_df.text.str.findall(url_regex).apply(''.join)

Your dataframe should look like this.

To draft the Telegram message, we need to pull the values out of the dataframe.

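# Row 0 is the most recent tweet, because the timeline is returned newest first.
latest_date = tweets_df['date'].iloc[0]
death_total = format(tweets_df.death_total.iloc[0], ',')
death_rate = round(tweets_df.death_rate.iloc[0]*100, 1)
url = tweets_df['url'].iloc[0]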

With the variables, we can start putting them into a template.

msg = 'On {0}, the death toll has reached {1}, the death rate thus far is {2}%. {3}'.format(
    latest_date, death_total, death_rate, url)

3. Send telegram message via Python

We have the message now! Next, we need to figure out how to send it to Telegram with Python. To start, we need to create a Telegram bot.

  1. On Telegram, search for ‘BotFather’ and send a ‘/start’ message.
  2. To create a new bot, send a ‘/newbot’ message, then follow the instructions to configure a name and a username for the bot.
  3. Send ‘/token’ to get the token that gives access to the HTTP API, and save it as bot_token.

What we are doing here is using the bot to send a message to a target channel, so bot_chatID should be the name of the channel you created, starting with ‘@’. Remember to make sure your bot is an admin of the channel. Otherwise, the bot will not be allowed to publish content to the channel.

import requests

def telegram_bot_sendtext(bot_token, bot_chatID, bot_message):
    send_text = 'https://api.telegram.org/bot' + bot_token + '/sendMessage?chat_id=' + bot_chatID + '&parse_mode=Markdown&text=' + bot_message
    response = requests.get(send_text)
    return response.json()  # Telegram's JSON reply, useful for checking for errors

Run the function, and voila! You will see the message arrive in your Telegram channel.
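
Concatenating the message straight into the URL can go wrong when the text contains characters such as `&` or `#`, which have a special meaning in a URL. One way around this is to percent-encode the query values. The sketch below builds the same `sendMessage` URL with the standard library only. The function name `build_send_message_url`, the token, and the channel name are placeholders of mine.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_send_message_url(bot_token, bot_chatID, bot_message):
    """Return a sendMessage URL with every query value percent-encoded."""
    query = urlencode({
        'chat_id': bot_chatID,
        'parse_mode': 'Markdown',
        'text': bot_message,
    })
    return 'https://api.telegram.org/bot' + bot_token + '/sendMessage?' + query


# Placeholder token and channel, for illustration only.
url = build_send_message_url('TOKEN', '@my_channel', 'Deaths & cases: 50% #update')
# Decoding the query string gives back the original values unchanged.
decoded = parse_qs(urlsplit(url).query)
```

`requests` can do the same encoding if the values are passed through its `params` argument instead of being joined into the URL by hand.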

4. Running the script recurringly

You can run the script on a daily basis by setting up a crontab. You can customise the schedule to suit you; there is a very good tutorial on how to do that here.

00 21 * * * /path/to/script/.sh

For example, the above crontab is set to automatically run the script every day at 9pm.

Alternatively, you can run the script every five minutes and send out a message as soon as the daily figures are released. That would need another tutorial, but you can check out how I do it on my GitHub.
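
As a rough idea of what the five-minute variant involves, a crontab entry starting with `*/5 * * * *` runs the script every five minutes, and the script then has to decide whether today's figures are out and whether a message has already been sent. The sketch below is a simplified illustration of that check. The names `should_notify` and `mark_notified` and the state file are hypothetical and not taken from the repository.

```python
import datetime
from pathlib import Path


def should_notify(latest_tweet_date, state_file, today=None):
    """Return True if today's figures are out and no message was sent yet.

    latest_tweet_date and today are 'YYYY-MM-DD' strings, the same format
    as the 'date' column built earlier.
    """
    today = today or datetime.date.today().isoformat()
    if latest_tweet_date != today:
        return False  # today's figures have not been tweeted yet
    path = Path(state_file)
    if path.exists() and path.read_text().strip() == today:
        return False  # a message for today has already been sent
    return True


def mark_notified(state_file, today=None):
    """Record that today's message has been sent."""
    today = today or datetime.date.today().isoformat()
    Path(state_file).write_text(today)
```

The script would call `should_notify` after mining the latest tweet, send the Telegram message only when it returns True, and then call `mark_notified` so that later runs on the same day stay silent.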

That’s it! You have now written a Python script that scrapes tweets, analyses the data, and automatically sends a message to a Telegram channel.
