Creating an automated news roundup

Using Python on Linux to generate an automated news roundup email

Ion Ioannidis
5 min read · Apr 14, 2023

Intro

For this project I will be using News API’s free developer account. You will need to sign up with News API and get an API key to be able to query its database. Once I have the API key, I use a Python script and a cron job on Linux to automate the emailing of a weekly news roundup to an email account.

Creating the Python script

I am using VSCodium to create this Python script locally on my computer.

1. Importing libraries

import requests # to make the News API request 
import smtplib # to send emails
from email.mime.text import MIMEText # to create emails
from datetime import datetime, timedelta # to manipulate dates

2. Calculating the start and end dates for the past week

This needs to run as a fully automated script, so I don’t want to input the dates manually. Instead, the script calculates the date one week before datetime.now().

end_date = datetime.now().strftime('%Y-%m-%d')
start_date = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
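
The same date arithmetic can be wrapped in a small function so the window can be checked against a fixed date. This is a minimal sketch: the name `past_week_window` and its `now` and `days` parameters are illustrative assumptions, not part of the original script.

```python
from datetime import datetime, timedelta


def past_week_window(now=None, days=7):
    """Return (start_date, end_date) as 'YYYY-MM-DD' strings.

    `past_week_window` is a hypothetical helper name. `now` defaults to the
    current local time, which mirrors the datetime.now() call in the script.
    """
    now = now or datetime.now()
    end_date = now.strftime('%Y-%m-%d')
    start_date = (now - timedelta(days=days)).strftime('%Y-%m-%d')
    return start_date, end_date
```

Calling `past_week_window(datetime(2023, 4, 14))` returns `('2023-04-07', '2023-04-14')`. Passing the reference time in, rather than calling `datetime.now()` twice, also guarantees both dates come from the same instant.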

3. Making the API request for the dates

The news roundup will be about domestic violence, so the search terms are Family Violence, Domestic Violence and Gendered Violence. You can pick any search terms, then test and fine-tune them. The articles are currently sorted by popularity, which the News API documentation describes as placing articles from popular sources and publishers first. The other documented sort options are relevancy and publishedAt; refer to the News API documentation for details.

# Creating a list to store the articles
articles = []

# Looping through each day of the past week and making an API request for that day
current_date = datetime.strptime(start_date, '%Y-%m-%d')
while current_date <= datetime.strptime(end_date, '%Y-%m-%d'):
    # Build the API request parameters for the current day
    q = '"Family Violence" OR "Domestic Violence" OR "Gendered Violence"' # enter your search terms here
    sort_by = 'popularity' # sorting by popularity
    api_key = 'your API key' # enter your API key as obtained at API subscription or signup
    day = current_date.strftime('%Y-%m-%d')
    url = 'https://newsapi.org/v2/everything'
    # Passing params lets requests URL-encode the quotes and spaces in the query
    params = {'q': q, 'from': day, 'to': day, 'sortBy': sort_by, 'apiKey': api_key}

    # Make the API request and parse the response data
    response = requests.get(url, params=params)
    data = response.json()

    # Appending the articles for the current day to the list of articles
    # (an error response has no 'articles' key, so fall back to an empty list)
    articles.extend(data.get('articles', []))

    # Incrementing the current date by one day
    current_date += timedelta(days=1)

# Printing the total number of articles retrieved
article_count = len(articles)
print(f"Total articles retrieved: {article_count}\n")
print("-" * 80)
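
The loop assumes every request succeeds. As a hedged sketch, a response could be checked before its articles are used. This relies on my understanding of the News API response format: a `status` field that is `ok` on success, with `code` and `message` fields on an error. The helper name `extract_articles` is hypothetical and not part of the original script.

```python
def extract_articles(data):
    """Return the article list from a decoded News API response.

    Assumes the response shape described in the News API documentation:
    'status' is 'ok' on success; error responses carry 'code' and 'message'.
    `extract_articles` is a hypothetical helper name.
    """
    if data.get('status') != 'ok':
        code = data.get('code', 'unknown')
        message = data.get('message', 'no message provided')
        raise RuntimeError(f'News API error ({code}): {message}')
    return data.get('articles', [])
```

Inside the loop this would replace the direct lookup, for example `articles.extend(extract_articles(response.json()))`, so an invalid key or an exhausted request quota stops the script with a readable message instead of producing an empty email.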

4. Creating the output list

I want the list of articles to display the author, title, source and publication date, and to include the URL link for each article. The News API documentation has more in-depth information on the article fields available.

# Looping through the articles and building the output text
output = ''
for i, article in enumerate(articles, start=1):
    author = article['author']
    title = article['title']
    source = article['source']['name']
    url = article['url']
    published_at = article['publishedAt']

    output += f"{i}. {title}\n"
    output += f"Author: {author}\n"
    output += f"Source: {source}\n"
    output += f"URL: {url}\n"
    output += f"Published at: {published_at}\n"
    output += "-" * 80 + '\n'

# Writing the output to a file
with open('output.txt', 'w') as f:
    f.write(output)
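
Some articles may come back with a missing author, which would show up as "Author: None" in the email. The sketch below shows one way to build the same text with a fallback value. The function name `format_articles` and the 'Unknown' placeholder are my own assumptions, not part of the original script.

```python
def format_articles(articles):
    """Build the plain-text roundup from a list of article dicts.

    `format_articles` is a hypothetical helper; missing or empty author and
    source values are replaced with the placeholder 'Unknown'.
    """
    lines = []
    for i, article in enumerate(articles, start=1):
        author = article.get('author') or 'Unknown'
        source = (article.get('source') or {}).get('name') or 'Unknown'
        lines.append(f"{i}. {article.get('title')}")
        lines.append(f"Author: {author}")
        lines.append(f"Source: {source}")
        lines.append(f"URL: {article.get('url')}")
        lines.append(f"Published at: {article.get('publishedAt')}")
        lines.append('-' * 80)
    if not lines:
        return ''
    return '\n'.join(lines) + '\n'
```

Collecting the lines in a list and joining them once produces the same layout as the repeated `+=` on a string, and keeps the formatting separate from the file writing and emailing steps.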

5. Sending the email

In this case I will be sending an email to myself using Gmail. To use Gmail from a script you will need to create an app password, which lets the script sign in without exposing your main account password. See Google Account Help for more details.

# Sending the output as an email
msg = MIMEText(output)
msg['Subject'] = 'newsletter' # enter subject line for email
msg['From'] = 'sender@gmail.com' # your gmail address
msg['To'] = 'recipient@gmail.com' # recipient's address, or yours to test; separate multiple addresses with commas

smtp_server = 'smtp.gmail.com'
smtp_port = 587
smtp_username = 'yourusername@gmail.com' # your gmail address
smtp_password = 'yourapppassword' # your gmail app password

with smtplib.SMTP(smtp_server, smtp_port) as server:
    server.starttls() # upgrade the connection to TLS before logging in
    server.login(smtp_username, smtp_password)
    server.send_message(msg) # reads the sender and recipients from the message headers
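
For more than one recipient, a hedged alternative is to build the message with the standard library's `EmailMessage` class and join the addresses with commas. The function names `build_message` and `send_via_gmail` are hypothetical, and the addresses in the usage note are placeholders.

```python
import smtplib
from email.message import EmailMessage


def build_message(body, sender, recipients, subject='newsletter'):
    """Create a plain-text email addressed to a list of recipients.

    `build_message` is a hypothetical helper name.
    """
    msg = EmailMessage()
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)  # email headers separate addresses with commas
    msg.set_content(body)
    return msg


def send_via_gmail(msg, username, app_password):
    """Send the message through Gmail's SMTP server (hypothetical helper)."""
    with smtplib.SMTP('smtp.gmail.com', 587) as server:
        server.starttls()
        server.login(username, app_password)
        server.send_message(msg)  # recipients are taken from the To header
```

A call might look like `send_via_gmail(build_message(output, 'sender@gmail.com', ['a@example.com', 'b@example.com']), 'sender@gmail.com', app_password)`. Reading `app_password` from an environment variable (the variable name is up to you) rather than hard-coding it keeps the secret out of the script file.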

Running this script will send the email. Since the goal is to automate the send, run it once as a trial, then save your Python file with a .py extension, for example News_roundup.py.

Automating

To automate this task I will be creating a cron job on MX Linux to run the Python script periodically (weekly). You can also automate tasks on Windows with various tools such as Power Automate Desktop.

In your Linux terminal, enter:

$ crontab -e

If you haven’t created a cron job before, this will open a prompt asking you to select a text editor. Select ‘nano’, which is the easiest text editor for beginners.

Once in the Cron editor, arrow down past the introductory notes and input:

40 21 * * 5 /usr/bin/python3 "/home/path/News_roundup.py"

This Cron job is set up as follows:

  • 40 specifies that the script should run at 40 minutes past the hour.
  • 21 specifies that the script should run at 9 PM (21:00).
  • * for the day of the month field means the script can run any day of the month.
  • * for the month field means the script can run any month.
  • 5 for the day of the week field means the script should run on Fridays (where Sunday is 0, Monday is 1, and so on).
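
To double-check how these five fields map onto a date, here is a small Python sketch that tests whether a given moment matches this particular schedule. It is not how cron itself is implemented, and the function name `matches_schedule` is hypothetical. The one subtlety is the weekday numbering: cron counts Sunday as 0, while Python's `isoweekday()` counts Monday as 1 and Sunday as 7.

```python
from datetime import datetime


def matches_schedule(dt, minute=40, hour=21, cron_weekday=5):
    """Return True if `dt` matches the cron entry '40 21 * * 5'.

    The day-of-month and month fields are '*', so they are not checked.
    `matches_schedule` is a hypothetical helper for illustration only.
    """
    # Convert Python's ISO weekday (Mon=1 .. Sun=7) to cron's (Sun=0 .. Sat=6)
    weekday = dt.isoweekday() % 7
    return dt.minute == minute and dt.hour == hour and weekday == cron_weekday
```

For example, `matches_schedule(datetime(2023, 4, 14, 21, 40))` is `True` because 14 April 2023 was a Friday, while the same time on the following day returns `False`.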

Hit Ctrl+O, Enter, Ctrl+X to save and exit.

Your computer will need to be turned on at the set time for the cron job to run, but you don’t need to be logged in. Alternatively, you can run this on a virtual server. To execute the script manually, just run it in your Python environment (VSCodium, Google Colab, etc.).

Output

This is an extract of the domestic violence news roundup email. The full email is 377 articles long, which suggests a busy news week for this search term.

Conclusion

This project uses automation and curated API content to deliver a news roundup directly to individual inboxes at scheduled intervals. Although News API caters to a wide range of topics, there are more niche APIs available for specialised fields, such as the Intrinio API for financial data and the OpenFDA API for drug and medical device information. One of my favourites is the Europeana API, a digital platform that provides access to millions of digitized items from European museums, galleries, libraries, and archives. Europeana’s collection includes items from the medieval period, such as manuscripts, books, maps, artwork, and more.

By incorporating a specialized API of your preference into an automated email roundup, you can receive or provide tailored, valuable news and insights that are specific to your professional or personal interests.

Ion Ioannidis

Public Sector Reporting & Analytics | Program Evaluation | Research