Instant Insights — PROJECT SAADHNA

Polani Keerthi Varshini
Google Cloud - Community
7 min read · Aug 5, 2024

Title: INSTANT INSIGHTS: Real-Time News Summarizer

Introduction

Instant Insights is a web application that keeps users up to date by providing concise, easy-to-read summaries of news articles in real time. It uses Google Cloud Platform (GCP) services for efficient data processing and summarization.

Elements

  • Impact: In today’s fast-paced world, staying informed without getting overwhelmed by the sheer volume of information is challenging. Instant Insights addresses this problem by delivering brief, accurate summaries of the most recent news, helping users stay informed without spending excessive time reading full articles.
  • Problem: Many individuals struggle to keep up with the constant influx of news due to time constraints and information overload. Traditional news platforms often present lengthy articles, making it difficult for users to quickly grasp the essentials of a story.
  • Solution: Instant Insights offers a streamlined news consumption experience by fetching the latest news articles from various sources, generating concise summaries using natural language processing, and providing users with quick, digestible updates on current events.

Design

High-Level Technical and Functional Design:

  1. Data Ingestion:
       • News articles are fetched from various sources using APIs.
       • Cloud Pub/Sub handles the incoming stream of data, so each article is processed in real time.
  2. Backend Processing:
       • Cloud Functions are triggered by new messages in Pub/Sub.
       • These functions extract the relevant information from each article and send it to the Natural Language API for summarization.
  3. Summarization:
       • The Natural Language API processes the articles and generates concise summaries.
       • Summaries are stored in Firestore along with the original articles for reference.
  4. Frontend Interface:
       • Users open the Instant Insights web application to view the latest news summaries.
       • The interface displays summaries in a user-friendly format, so users can quickly grasp the key points of each story.
       • Users can click a summary to read the full article if they want more information.
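The ingestion and processing steps above can be sketched as a Pub/Sub-triggered function. This is a minimal sketch, assuming a background-style Cloud Function whose event carries the article as base64-encoded JSON in event["data"]. The field names (title, url, text), the placeholder summarize helper, and the in-memory STORE standing in for Firestore are all hypothetical and are not taken from the original project.

```python
import base64
import json

# Hypothetical in-memory stand-in for a Firestore collection.
STORE = {}


def summarize(text, max_chars=200):
    """Placeholder summarizer: keeps the first sentence, capped at max_chars."""
    first_sentence = text.split(". ")[0].strip()
    return first_sentence[:max_chars]


def decode_article(event):
    """Decode the base64-encoded JSON payload of a Pub/Sub message."""
    payload = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(payload)


def process_article(event, context=None):
    """Entry point shaped like a Pub/Sub-triggered background function."""
    article = decode_article(event)
    record = {
        "title": article["title"],
        "url": article["url"],
        "text": article["text"],
        "summary": summarize(article["text"]),
    }
    # A deployed function would write this record to Firestore instead.
    STORE[article["url"]] = record
    return record
```

Keeping the decoding, summarizing, and storing steps in separate functions makes each one easy to test locally before anything is deployed.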

Prerequisites

Before starting with Instant Insights, ensure you have the necessary software and basic knowledge in place.

Software and Tools:

  • Integrated Development Environment (IDE): PyCharm or VS Code
  • Languages and Frameworks: Python (version 3.7 or higher) and Streamlit
  • GCP Services: Natural Language API, Cloud Functions
  • Version Control: Git and GitHub

Assumed Prior Concepts:

  • Basic Python Programming: Familiarity with Python syntax, data structures, and standard libraries.
  • Web Development Basics: Understanding of web application frameworks, HTTP requests, and basic front-end development.
  • Version Control with Git: Basic commands for committing, pushing, and pulling code from a repository.
  • Cloud Services Usage: Basic knowledge of setting up and using cloud services, particularly GCP.

Ensure you have these tools installed and are comfortable with the basic concepts before starting with Instant Insights.

Step-by-step instructions

Step 1: Set Up Your Development Environment

First, you need to install Python and Streamlit. Next, set up an Integrated Development Environment (IDE) such as PyCharm or Visual Studio Code. Additionally, install Git for version control and create a GitHub account to manage your code repository.

Step 2: Set Up Google Cloud Platform (GCP)

Create a Google Cloud Platform (GCP) account and enable the necessary APIs: the Cloud Natural Language API, Cloud Functions, Pub/Sub, and Firestore.

gcloud services enable \
  language.googleapis.com \
  cloudfunctions.googleapis.com \
  pubsub.googleapis.com \
  firestore.googleapis.com

Create environment variables for REGION and PROJECT_ID.

export REGION=asia-south1
export PROJECT_ID={your_project_id}
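If your Python scripts need these values, they can read them from the environment rather than hard-coding them. The snippet below is a small sketch; the load_settings helper and the keys of the dictionary it returns are hypothetical names chosen for illustration.

```python
import os


def load_settings(env=None):
    """Read deployment settings from environment variables."""
    env = os.environ if env is None else env
    project_id = env.get("PROJECT_ID")
    if not project_id:
        raise RuntimeError("PROJECT_ID is not set; export it before running")
    return {
        "project_id": project_id,
        # Fall back to the region used in this walkthrough.
        "region": env.get("REGION", "asia-south1"),
    }
```

Calling load_settings() with no arguments reads from the real environment; passing a dictionary makes the function easy to test.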

Step 3: Fetch News Articles

Create a Python script to fetch the latest news articles from various sources. Write a script that sends a request to a news API and retrieves the top headlines. Make sure to replace placeholders with your actual API key from a news API provider.
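As a starting point, here is a minimal sketch that reads headlines from an RSS feed using only the standard library. It assumes the Google News RSS feed that app.py uses in Step 5, which needs no API key. A keyed news API would have a different URL and response format. The function names parse_headlines and fetch_top_headlines are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import islice
from urllib.request import urlopen

# Same top-news feed that app.py reads in Step 5.
TOP_NEWS_FEED = "https://news.google.com/news/rss"


def parse_headlines(rss_xml, limit=5):
    """Extract title, link, and publication date from RSS 2.0 XML."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        }
        for item in islice(root.iter("item"), limit)
    ]


def fetch_top_headlines(url=TOP_NEWS_FEED, limit=5, timeout=10):
    """Download the feed and return the first `limit` headlines."""
    with urlopen(url, timeout=timeout) as response:
        return parse_headlines(response.read(), limit=limit)
```

Separating parsing from downloading lets you check the parser against a saved feed without a network connection.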

Step 4: Summarize News Articles

Create another Python script to summarize the fetched news articles. Install the ‘google-cloud-language’ library to leverage Google’s Natural Language API.
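To illustrate what a summarization step does, the sketch below implements a simple frequency-based extractive summarizer in plain Python. It is a stand-in for illustration only: it does not call the Natural Language API, and it is not the method used by app.py in Step 5, which relies on the newspaper library's Article.nlp() call. The stop-word list and function names are hypothetical.

```python
import re
from collections import Counter

# Small, illustrative stop-word list; a real application would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
              "is", "are", "was", "were", "it", "that", "this", "with", "as", "by"}


def tokenize(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        # Average document-wide frequency of the sentence's words.
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Sentences whose words appear often across the whole article score higher, so off-topic sentences tend to be dropped first.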

Step 5: Build the Streamlit Web Application

Develop the main Streamlit application script. Use Streamlit to create a simple web interface that displays the latest news headlines and their summaries. You will need three files: app.py, requirements.txt, and app.yaml.

In your app.py, you can start with these lines to build a Streamlit application.

# app.py
from urllib.parse import quote
from urllib.request import urlopen

import nltk
import streamlit as st
from bs4 import BeautifulSoup as soup
from newspaper import Article

# Tokenizer data used by newspaper's Article.nlp() summarizer.
nltk.download('punkt')

st.set_page_config(page_title='Instant Insight: Real-Time News Summarizer')

DEFAULT_NEWS_COUNT = 20


def fetch_feed_items(site):
    """Download an RSS feed and return its <item> elements."""
    with urlopen(site) as op:
        rd = op.read()
    sp_page = soup(rd, 'xml')
    return sp_page.find_all('item')


def fetch_news_search_topic(topic):
    # URL-encode the query so multi-word topics are searched as typed.
    return fetch_feed_items(
        'https://news.google.com/rss/search?q={}'.format(quote(topic)))


def fetch_top_news():
    return fetch_feed_items('https://news.google.com/news/rss')


def fetch_category_news(topic):
    return fetch_feed_items(
        'https://news.google.com/news/rss/headlines/section/topic/{}'.format(topic))


def display_news(list_of_news, news_quantity=DEFAULT_NEWS_COUNT):
    for c, news in enumerate(list_of_news[:news_quantity], 1):
        st.write('**({}) {}**'.format(c, news.title.text))
        news_data = Article(news.link.text)
        try:
            news_data.download()
            news_data.parse()
            news_data.nlp()  # fills in news_data.summary
        except Exception as e:
            st.error(e)
        with st.expander(news.title.text):
            st.markdown(
                '''<h6 style='text-align: justify;'>{}</h6>'''.format(news_data.summary),
                unsafe_allow_html=True)
            st.markdown("[Read more at {}...]({})".format(news.source.text, news.link.text))
        st.success("Published Date: " + news.pubDate.text)


def run():
    st.title("Instant Insight: Real-Time News Summarizer")
    category = ['--Select--', 'Trending News', 'All Topics', 'Search🔍']
    cat_op = st.selectbox('Select your Category', category)
    if cat_op == category[0]:
        st.warning('Please select a category')
    elif cat_op == category[1]:
        st.subheader("Here is the trending news for you")
        news_list = fetch_top_news()
        display_news(news_list)
    elif cat_op == category[2]:
        av_topics = ['Choose Topic', 'WORLD', 'NATION', 'BUSINESS', 'TECHNOLOGY',
                     'ENTERTAINMENT', 'SPORTS', 'SCIENCE', 'HEALTH']
        st.subheader("Choose Any Topic")
        chosen_topic = st.selectbox("Choose your Topic here", av_topics)
        if chosen_topic == av_topics[0]:
            st.warning("Please choose a topic")
        else:
            news_list = fetch_category_news(chosen_topic)
            if news_list:
                st.subheader("Here is some {} news for you".format(chosen_topic))
                display_news(news_list)
            else:
                st.error("No news found for {}".format(chosen_topic))
    elif cat_op == category[3]:
        user_topic = st.text_input("Enter your Topic🔍")
        if st.button("Search") and user_topic != '':
            news_list = fetch_news_search_topic(topic=user_topic)
            if news_list:
                st.subheader("Here is some {} news for you".format(user_topic.capitalize()))
                display_news(news_list)
            else:
                st.error("No news found for {}".format(user_topic))
        else:
            st.warning("Please enter a topic name to search🔍")


run()

Set up the App Engine configuration for the Streamlit server in app.yaml. The entrypoint must start Streamlit itself and bind it to the port that App Engine provides:

# app.yaml
runtime: python
env: flex
entrypoint: streamlit run app.py --server.port=$PORT --server.address=0.0.0.0

runtime_config:
  operating_system: "ubuntu22"
  runtime_version: "3.12"

manual_scaling:
  instances: 1

# Additional configuration (optional)
handlers:
- url: /static
  static_dir: static
- url: /.*
  script: auto

Your requirements.txt should include the necessary dependencies for Streamlit, Google Cloud, and any other libraries you might use:

#requirements.txt
streamlit
Pillow
beautifulsoup4
newspaper3k
nltk
google-cloud-storage

Step 6: Initialize the Google Cloud SDK

Run the command below to authenticate your account and select the project, default region, and other settings for your account.

gcloud init

Step 7: Deploy with the Google Cloud SDK

Deploying to Google App Engine with the Google Cloud SDK lets you take advantage of App Engine's scalability, reliability, and developer-friendly features. Run the gcloud app deploy command from the root directory of your project. This deploys your Python application to Google App Engine.

gcloud app deploy

Step 8: Updating the Application

When you have made changes to your application, you can use this command to deploy the updated version.

gcloud app deploy --version=<new-version-id>

Run your application to test the entire workflow. Ensure that it correctly fetches the latest news, generates summaries using the Cloud Function, and displays them on the web interface.

Troubleshooting Tips:

  • Error in Fetching News: Ensure your API key is correct and you have a stable internet connection.
  • GCP Errors: Check that the GCP services are correctly set up and the necessary APIs are enabled.
  • Streamlit Issues: Verify that Streamlit is properly installed and your Python scripts are correctly referenced.

Result

By the end of this project, you’ll have a fully functional web application that provides concise, real-time summaries of the latest news articles. The application will be visually appealing and user-friendly, and it will deliver the key points of current events without overwhelming the reader with too much information.

[Screenshot: Real-Time Summarizer, INSTANT INSIGHT]
[Screenshot: Choose from a wide range of topics]
[Screenshot: Search for trending news]
[Screenshot: Search any news for summarization]

Let’s see a short demo of INSTANT INSIGHT:

What’s next?

  • Real-Time News Sentiment Analysis: Extend the application to analyze the sentiment of news articles and display a sentiment score or trend analysis over time.
  • Custom Recommendation System: Build a recommendation system that suggests articles to users based on their reading history and preferences, using collaborative filtering or content-based filtering techniques.
  • Mobile Version: Develop a mobile version of the Instant Insights application using frameworks like React Native or Flutter to reach a broader audience.

Call to action

To learn more about Google Cloud services and to create impact for the work you do, get around to these steps right away:
