I made a news aggregator using the OpenAI API!

Dimitri Rusin
12 min readMay 20, 2023

--

You know ChatGPT, right? Well, actually, you can build apps on top of ChatGPT. Not exactly ChatGPT, but on top of the same engine that powers ChatGPT, the OpenAI API. Wait, can we summarize news using this technique?

Yes, we can! Do you know the German news website: https://www.tagesschau.de/? It’s pretty well written in my opinion and it’s free, so I used to read it often. That was before I got tired of reading news, which led me to think: Can we summarize news to read it faster?

After the global event of ChatGPT, everyone is blown away by generative AI technology. So, let’s make it more accessible and actually build something with it. In today’s post, we will build a simple news aggregator that just summarizes an article of the above German news website. Or does whatever you want with it: We will include an opportunity for the user to change the exact prompt given to ChatGPT.

First of all, check out the ready-made website: https://news-aggregator-2000.netlify.app/. This website is done completely as a frontend website without any sort of AWS or anything in the background. With one notable exception: your own local web server.

“Wait, what? I need to host my own local web server just to use your site?”, is probably what you’re thinking. But this is for your own security. We will actually make a call to the OpenAI API. And for that call, we need an OpenAI account’s credentials. But the credentials are associated with a real-cash dollar account. ChatGPT isn’t free, you know. Well, ChatGPT is free. But the engine behind it isn’t free. That’s why you need to use your own OpenAI account and, either deposit some cash into it or use OpenAI’s free trial credits which they give you when you open a brand-new account with OpenAI using a telephone number from an eligible country.

Then, you will need to copy the OpenAI API key into the .env file of the web server. But before we lose ourselves in the details of the web server, let’s look at some sample summarization outputs:

Note that the original news is in German. Also note the simple language that the AI can use to summarize complex news.

Anyway, how can I try the news aggregator myself?

Go to: https://github.com/Habimm/openai-keymanager. Clone this repository onto your local storage. Then navigate to that directory, and create an .env file:

OPENAI_API_KEY=sk-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv

Take your own API key from here: https://platform.openai.com/account/api-keys. If you don’t have an OpenAI account, create one. But you probably have one, if you had used ChatGPT even once. If your credits are expired, create a new OpenAI account using another mobile number from one of the eligible countries listed here: https://platform.openai.com/docs/supported-countries. Then you’ll get a something like $5 credit, which was plenty for all of my testing during the development of this app. This is my usage chart:

Alright, so after you cloned the repository and filled in the .env file, it’s time to install the Python packages and run the Python server. These steps are necessary to run the summarization on the frontend: https://news-aggregator-2000.netlify.app/.

But you’ll need to first start the web server on your local system. So, use Anaconda3 to install the Python packages in a separate Anaconda3 environment and then run the flask server, that’s encoded in the repository’s only Python file. By the way, there are the INSTALL and RUN files in the repository that will help you setup an Anaconda3 environment. After closing the repository and filling in the .env file, just run:

./INSTALL
./RUN

Your console will look like this:

> ./INSTALL
[...]
> ./RUN
* Serving Flask app 'openai_forwarder'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit

Alright, now you have setup a local key management or proxy server. It’s a key management server, because it guards your OpenAI key. It’s a proxy server, because it makes requests to the OpenAI API on behalf of the Netlify frontend application. And that frontend will actually display the summarized news for you.

Now, just go to this website: https://news-aggregator-2000.netlify.app/. You can choose between a couple prompts in the above dropdown menu. Or you can enter a prompt yourself. The prompt will be inserted at the top of the news article’s text, when submitted to ChatGPT. Now, click on Add article + in the lower right corner. Magic!

If you don’t get the magic, you probably haven’t setup that key management server correctly. In case you need help with that, you can ask me for support at dimitri@habimm.com.

How can I make my own app building on top of ChatGPT?

Actually, almost all of the ChatGPT magic happens in that local web server’s only Python file. Let’s look inside:

import dotenv
import flask
import flask_cors
import os
import requests

dotenv.load_dotenv()

app = flask.Flask(__name__)
flask_cors.CORS(app)

@app.route('/', methods=['POST'])
def home():
request_data = flask.request.get_json()
openai_api_key = os.getenv("OPENAI_API_KEY")
if openai_api_key is None:
print()
print("NO OPENAI_API_KEY specified. Please get the key from: https://platform.openai.com/account/api-keys\nThen make an .env file and specify the key like so:\nOPENAI_API_KEY=sk-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv")
print()
headers = {
"Authorization": f"Bearer {openai_api_key}",
"Content-Type": "application/json"
}
response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=request_data)
return flask.jsonify(response.json()), response.status_code

if __name__ == '__main__':
app.run()

You see that we just make a request to the address: https://api.openai.com/v1/chat/completions. From there we will get our completion, which in our case will be a summarization of the news article.

Another important Open-AI-related piece of code is in the frontend:

const openaiBody = {
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: articleWithPrompt }],
};

This JSON string is put into the HTTPS request’s body. We select the name of the OpenAI model: gpt-3.5-turbo. And then just shove the entire article into it, together with the prompt. That articleWithPrompt is constructed as follows:

let articleWithPrompt = prompt;
articleWithPrompt += "\n\n";
articleWithPrompt += articleText;
articleWithPrompt += "\n";

Whereby, that prompt is just the text that you enter at the top of the website. And the articleText is the text of the article that we download from the news website using the npm package axios and analyze using the npm package cheerio.

No, how can I actually build a news aggregator using OpenAI API just like you did?

Alright, let’s go through it step by step. First of all, create a separate directory for the web server. Put all of the above code into it. Let’s call this file openai_forwarder.py.

Now, write down the requirements.txt:

flask
flask-cors
jupyter
python-dotenv
python-dotenv
requests

We will INSTALL these requirements in a separate conda environment:

#!/usr/bin/env fish

true
and conda activate base

and rm -rf (conda info --base)/envs/openai_forwarder/
and conda env remove --name openai_forwarder

and conda create --yes --name openai_forwarder python
and conda activate openai_forwarder
and pip install --upgrade --requirement requirements.txt

And then RUN the Python server:

#!/usr/bin/env fish

true
and conda activate openai_forwarder
and ipython3 openai_forwarder.py

Check out all of this code, frozen, at my GitHub: https://github.com/Habimm/openai-keymanager

When you use the web server, don’t forget to create an extra .env file in this directory. Then put the OpenAI API key into it in this format:

OPENAI_API_KEY=sk-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv

Then, if you RUN in the same directory as the openai_forwarder.py file, you should get up and ready to receive requests to forward to OpenAI.

What about the actual frontend?

What concerns the frontend, there are only four things that you really need to know about its build process:

  1. It’s built using create-react-app. Which means, it’s a React app. (I actually tried using Next.js for this project, until I figured out that this framework’s main advantage over React is server-side rendering which I didn’t really need for this project.)
  2. We use a lot of code to download and extract the article text from the news website’s HTML code. So, let’s look forward to that. Yay!
  3. We use a fancy feature called local storage. If you have already tried the news aggregator, you might have noticed that if you close the tab and open it up again, your summarized news is still there. That is because, we actually cache the summarization in the browser’s internal storage. I was excited to actually use this feature of the browser for the first time in my life!
  4. We also use Bootstrap to nicely present each article’s summarization to the user.

Alright, so instead of just diving right into the cold water, I will just leave a link to the code to study at your convenience: https://github.com/Habimm/news-aggregator/blob/main/src/App.js

That code is the entire frontend. You don’t really have to copy the code: instead just clone the repository.

We also use some cascaded style sheets: https://github.com/Habimm/news-aggregator/blob/main/src/App.css

And the last thing we really use is this Stability AI generated app icon:

The prompt was: make a logo for a news aggregator website, really informative, to the point precise

What are the exact steps to reproduce the news aggregator?

Run:

npx create-react-app front

Then replace the contents of the src/App.js file with:

import "./App.css";
import "bootstrap/dist/js/bootstrap.bundle.min"; // for the dropdown menu
import 'bootstrap/dist/css/bootstrap.min.css';
import axios from 'axios';
import cheerio from 'cheerio';
import { useEffect, useState, useRef } from "react";

async function getArticleText(articleUrl) {
let newsText = [];

const response = await axios.get(articleUrl);
const $ = cheerio.load(response.data);

const ancestor = $('article.container.content-wrapper__group');

const classes = ['seitenkopf__headline', 'meldung__subhead', 'textabsatz', 'tag-btn'];

ancestor.find('h1, h2, p, a').each((i, element) => {
const elClass = $(element).attr('class') || '';
if(classes.some(cls => elClass.includes(cls))) {
newsText.push($(element).text().trim());
}
});

ancestor.find('p, h2').each(() => {
newsText.push('');
});

return newsText.join('\n');
}

async function addArticle(articles, setArticles, promptRef) {
const baseUrl = 'https://www.tagesschau.de';
const response = await axios.get(baseUrl);
const $ = cheerio.load(response.data);

const teaserLinks = $('a.teaser__link').toArray();

const href = $(teaserLinks[articles.length]).attr('href');
var fullLink = (href.startsWith('http')) ? href : baseUrl + href;

// Fetch additional data for this link
const articleResponse = await axios.get(fullLink);
const article$ = cheerio.load(articleResponse.data);

const imageSrc = article$('img.ts-image').attr('src');
const date = article$('p.metatextline').text();

var article = {
url: fullLink,
imageSrc: imageSrc,
date: date,
summary: null,
};

// All of the following code is for
// summarizing the article's text, or doing whatever the prompt says.

const prompt = promptRef.current.value;

var articleUrl = article['url'];
var articleText = await getArticleText(articleUrl);
articleText = articleText.trim();
articleText = articleText.substring(0, 5500);

let articleWithPrompt = prompt;
articleWithPrompt += "\n\n";
articleWithPrompt += articleText;

// The newline at the end of the prompt is ESSENTIAL. Without it, the model might ignore the prompt
// at the beginning of the user content's message and just complete the last sentence
// in the scraped article.
articleWithPrompt += "\n";

document.body.classList.add("loading");
try {
const openaiBody = {
model: 'gpt-3.5-turbo',
messages: [{ role: 'user', content: articleWithPrompt }],
};

console.log("Sent to OpenAI:");
console.log(openaiBody);

var responseWithSummary = null;
try {
// On this local port, you should run the OpenAI forwarder,
// also introduced at the tutorial on Medium

responseWithSummary = await axios.post('http://localhost:5000/', openaiBody);

} catch (error) {
if (error.response) {
// The request was made and the server responded with a status code
// that falls out of the range of 2xx
console.error(`Error: ${error.response.data}`);
if (error.response.status === 401) {
// Handle 401 error here
console.error('Unauthorized request. Check your API key inside the OpenAI Forwarder. Then restart the OpenAI Forwarder.');
window.alert('Unauthorized request. Please check your API key inside the OpenAI Forwarder. Then restart the OpenAI Forwarder.');
} else {
// Handle other errors here
console.error(`Error status: ${error.response.status}`);
window.alert(`An error occurred. Please check your setup. Status code: ${error.response.status}`);
}
} else {
// The request was made but no response was received
console.error(`Error in setup: ${error.message}`);
window.alert(`No key manager found. Please setup the key manager from: https://github.com/Habimm/openai-keymanager`);
}
}

article['summary'] = responseWithSummary.data.choices[0].message.content;
const newArticles = [...articles, article];
setArticles(newArticles);

} catch (error) {
console.error(`An error occurred during the API call: ${error}`);
} finally {
document.body.classList.remove("loading");
}

return article;
}

function App() {
var stuffFromLocalStorage = JSON.parse(localStorage.getItem('articles')) || [];
const [articles, setArticles] = useState(stuffFromLocalStorage);
const promptRef = useRef(null);

useEffect(() => {
handleDefaultPromptClick("summarize this for a five-year old. keep it funny and teach the five-year old valuable life lessons related to the news.")
}, []);

useEffect(() => {
localStorage.setItem('articles', JSON.stringify(articles));
}, [articles]);

const handleDefaultPromptClick = (promptOption) => {
promptRef.current.value = promptOption;
};

return (
<div className="App night-sky-background">
<button
type="button"
className="btn btn-danger plus-button btn-lg"
onClick={() => addArticle(articles, setArticles, promptRef)}
>
Add article ➕
</button>

<div className="container">
<div className="row">
<div className="col">
<div className="row">
<div className="form-group">
<label htmlFor="prompt" className="custom-label">Prompt</label>
<input type="text" className="form-control" id="prompt" ref={promptRef} />
</div>
<div className="col">
<div className="dropdown">
<button className="btn btn-primary dropdown-toggle" type="button" id="dropdownMenuButton" data-bs-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
Select Prompt
</button>
<div className="dropdown-menu" aria-labelledby="dropdownMenuButton">
<button className="dropdown-item" onClick={() => handleDefaultPromptClick("FASSE IN EINEM EINZIGEN SATZ FÜR EINEN 12-JÄHRIGEN ZUSAMMEN. MIT HÖCHSTENS 25 WORTEN!")}>FASSE IN EINEM EINZIGEN SATZ FÜR EINEN 12-JÄHRIGEN ZUSAMMEN. MIT HÖCHSTENS 25 WORTEN!</button>
<button className="dropdown-item" onClick={() => handleDefaultPromptClick("summarize this for a five-year old. keep it funny and teach the five-year old valuable life lessons related to the news.")}>summarize this for a five-year old. keep it funny and teach the five-year old valuable life lessons related to the news.</button>
<button className="dropdown-item" onClick={() => handleDefaultPromptClick("Summarize in one word:")}>Summarize in one word:</button>
</div>
</div>
</div>
</div>
</div>
</div>
</div>

{articles.map((article, articleIndex) => (
<div key={articleIndex} className="card text-center" style={{margin: "90px"}}>
<div className="row no-gutters">
<div className="col-md-4">
<img src={article.imageSrc} className="card-img" alt="..." style={{margin: "20px"}} />
</div>
<div className="col-md-8">
<div className="card-body">
<p className="card-text">{article.summary}</p>
<p className="card-text"><a href={article.url} target="_blank" rel="noopener noreferrer" className="card-text">Full text</a></p>
<p className="card-footer text-muted" style={{"marginTop": "15%"}}>{article.date}</p>
</div>
</div>
</div>
</div>
))}
</div>
)
}

export default App;

Then replace the contents of the src/App.css file with:

body.loading {
cursor: wait;
}

.plus-button {
position: fixed;
bottom: 20px; /* Adjust the distance from the bottom as needed */
right: 20px; /* Adjust the distance from the right as needed */
z-index: 100;
}

.custom-label a {
color: yellow; /* Set the desired color */
font-weight: bold; /* Add any other desired styles */
}

.custom-label {
color: yellow; /* Choose the desired color */
font-weight: bold; /* Add any other desired styles */
}

.App.night-sky-background {
background-color: navy;
background-image: url("night-sky.jpg");
background-size: cover;
background-repeat: no-repeat;
background-position: center;
}

.App {
text-align: center;
}

Then, also put this image into the src/ folder:

Run:

npm install axios cheerio bootstrap react-bootstrap

Replace the public/favicon.ico file with:

Go to line 27 of public/index.html and write:

<title>News Aggregator</title>

Finally, from the root directory, start:

npm start

Why is ChatGPT and generative AI important?

About this topic: Generative AI. Is. Insane. It’s completely new. Everything about generative AI is new. This had never before in human history happened. It’s the process of thinking — automated. Done. All the inventions, all the artists, all the revolutionaries who ever lived — completely frozen in ChatGPT. Everything is from now on — trivial. Because a life-less thing can do it.

There is absolutely no precedence on this, except maybe the industrial revolution, which had just happened 200–300 years ago. There, the actual force of the human body was automated. But now… it’s the mind. The mind has been automated. Nowadays, nobody identifies with their ability to lift weights anymore. Tomorrow, noone will identify with their ability to think anymore. A think-less society — finally in reach.

What is a human being? Lifting weights — part of the outside world. Thinking — part of the outside world. What is left? Love? Compassion? Selfishness? Egoism?

We’re stripping away at our humanity piece by piece. Generative AI, and all future inventions building on it, are taking away more and more achievements that human beings are proud of. When all physical achievements are trivial, what is a human being?

--

--