NLP-News Data Collection

Sarang Mete
1 min read · Nov 1, 2022


Photo by Annie Spratt on Unsplash

In many use cases, we need news article data for building ML models or for other purposes.

There are many commercial and open source libraries available for getting news data.
Newsapi is one such option. It provides several APIs for retrieving news data across different categories, formats, etc.

However, it currently doesn’t return the complete text of a news article, so we’ve integrated newsapi with another excellent library, newspaper3k.

Basic Flow:

1. Get the news metadata using the newsapi library
2. Get the actual news article text with newspaper3k
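
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the project itself: the function names (`fetch_metadata`, `fetch_text`, `collect_news`), the `NEWSAPI_KEY` environment variable, and the output field names are all assumptions made for this example. It assumes the `newsapi-python` and `newspaper3k` packages are installed.

```python
import os
from typing import Callable, Dict, List


def fetch_metadata(query: str, page_size: int = 5) -> List[Dict]:
    """Step 1: ask NewsAPI for article metadata (title, url, publishedAt, ...)."""
    from newsapi import NewsApiClient  # pip install newsapi-python

    # NEWSAPI_KEY is an assumed environment variable name for this sketch.
    client = NewsApiClient(api_key=os.environ["NEWSAPI_KEY"])
    response = client.get_everything(q=query, language="en", page_size=page_size)
    return response.get("articles", [])


def fetch_text(url: str) -> str:
    """Step 2: download the page and extract the article body with newspaper3k."""
    from newspaper import Article  # pip install newspaper3k

    article = Article(url)
    article.download()
    article.parse()
    return article.text


def collect_news(
    query: str,
    get_metadata: Callable[[str], List[Dict]] = fetch_metadata,
    get_text: Callable[[str], str] = fetch_text,
) -> List[Dict]:
    """Combine both steps: metadata from NewsAPI, full text from newspaper3k."""
    records = []
    for item in get_metadata(query):
        url = item.get("url")
        if not url:
            continue  # nothing to download without a URL
        try:
            text = get_text(url)
        except Exception:
            text = ""  # some sites block scraping; keep the metadata anyway
        records.append(
            {
                "title": item.get("title"),
                "url": url,
                "published_at": item.get("publishedAt"),
                "text": text,
            }
        )
    return records
```

With a valid API key set, something like `collect_news("machine learning")` should return a list of dictionaries, one per article. Passing the two fetch functions as parameters is just a convenience here, so the flow can be tried out with stand-in functions without hitting the network.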

I’ve created a complete end-to-end project for news data collection. The project is production ready. You can refer to it here. It is an API application that returns news articles for a given search query.

If you liked the article or have any suggestions/comments, please share them below!

Let’s connect and discuss on LinkedIn
