How Does Python Help to Scrape News Articles from CNN?

  • A quick overview of web pages and HTML
  • Python web scraping with BeautifulSoup
! pip install beautifulsoup4 html5lib requests
  • find_all(element tag, attribute): Locates HTML elements on a page by their tag name and attributes. It returns every element that matches; to get only the first match, use find() instead.
  • get_text(): Retrieves the text from an element once it has been found.
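To make the difference between these methods concrete, here is a minimal sketch run against a small, made-up HTML snippet. The markup, the `headline` class name and the example.com links are illustrative assumptions, and the built-in `html.parser` is used so that nothing beyond BeautifulSoup needs to be installed.

```python
from bs4 import BeautifulSoup

# Made-up HTML used only for illustration (not taken from any real site)
html = """
<html><body>
  <h2 class="headline"><a href="https://example.com/a">First story</a></h2>
  <h2 class="headline"><a href="https://example.com/b">Second story</a></h2>
  <h2 class="other">Not a headline</h2>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns every element that matches the tag and class
headlines = soup.find_all("h2", class_="headline")

# find returns only the first match (or None if nothing matches)
first = soup.find("h2", class_="headline")

# get_text extracts the text inside each element that was found
titles = [h.get_text() for h in headlines]

print(len(headlines))
print(first.get_text())
print(titles)
```

The third `h2` is skipped because its class does not match, which is the same filtering the article relies on when it selects headlines by class.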
# importing the necessary packages
import requests
from bs4 import BeautifulSoup
r1 = requests.get(url)
coverpage = r1.content
soup1 = BeautifulSoup(coverpage, 'html5lib')
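The download and parsing steps above assume the request succeeds. As a rough sketch of a more defensive variant, the helper below adds a timeout and a status check. The function name `fetch_soup` and the 10-second default are assumptions made for illustration, and `html.parser` is swapped in for `html5lib` only to avoid the extra dependency.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Download a page and parse it (hypothetical helper, not from the original post)."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    # 'html.parser' ships with Python; 'html5lib' also works if it is installed
    return BeautifulSoup(response.content, "html.parser")
```

Calling `fetch_soup(url)` would then stand in for the separate request and parse lines, with failures surfacing as exceptions rather than as an empty list of headlines later on.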
coverpage_news = soup1.find_all('h2', class_='articulo-titulo')
coverpage_news[4].get_text()
coverpage_news[4].find('a')['href']
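Attribute values are read from a tag with dictionary-style access. The sketch below uses a made-up headline (the markup and URL are assumptions) to show that the `href` belongs to the nested `<a>` element rather than to the `<h2>`, and that `.get()` can be used when an attribute may be absent.

```python
from bs4 import BeautifulSoup

# Made-up headline markup used only for illustration
html = '<h2 class="headline"><a href="https://example.com/a">First story</a></h2>'
heading = BeautifulSoup(html, "html.parser").find("h2")

# The link lives on the <a> element nested inside the heading
anchor = heading.find("a")

# Square brackets raise KeyError if the attribute does not exist
link = anchor["href"]

# .get() returns None instead of raising when the attribute is missing
missing = anchor.get("title")

# In this snippet the <h2> itself carries no href attribute
heading_href = heading.get("href")

print(link)
```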
# Scraping the first 5 articles
number_of_articles = 5

# Empty lists for content, links and titles
news_contents = []
list_links = []
list_titles = []

# range() is used here so that no NumPy import is needed
for n in range(number_of_articles):

    # only news articles (there are also albums and other things)
    if "inenglish" not in coverpage_news[n].find('a')['href']:
        continue

    # Getting the link of the article
    link = coverpage_news[n].find('a')['href']
    list_links.append(link)

    # Getting the title
    title = coverpage_news[n].find('a').get_text()
    list_titles.append(title)

    # Reading the content (it is divided in paragraphs)
    article = requests.get(link)
    article_content = article.content
    soup_article = BeautifulSoup(article_content, 'html5lib')
    body = soup_article.find_all('div', class_='articulo-cuerpo')
    x = body[0].find_all('p')

    # Unifying the paragraphs
    list_paragraphs = []
    for p in range(len(x)):
        paragraph = x[p].get_text()
        list_paragraphs.append(paragraph)
    final_article = " ".join(list_paragraphs)

    news_contents.append(final_article)
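The paragraph-joining step in the loop can also be isolated into a small function, which makes it easy to try on sample HTML without any network access. This is only a sketch: the function name `extract_article_text` is hypothetical, the sample markup is made up, and unlike the loop above it returns an empty string when no article body is found and strips whitespace around each paragraph.

```python
from bs4 import BeautifulSoup


def extract_article_text(html, body_class="articulo-cuerpo"):
    """Join the paragraphs of the article body, or return "" if no body is found."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", class_=body_class)
    if body is None:
        # The page has no element with the expected class (e.g. an album page)
        return ""
    paragraphs = [p.get_text(strip=True) for p in body.find_all("p")]
    return " ".join(paragraphs)


# Made-up article markup used only for illustration
sample = """
<div class="articulo-cuerpo">
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</div>
"""

text = extract_article_text(sample)
empty = extract_article_text("<div class='other'><p>Ignored</p></div>")

print(text)
```

Returning an empty string keeps one unexpected page layout from stopping the whole run, at the cost of silently skipping that article, so logging the skipped link may be worth adding in practice.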

Scraping Intelligence


Scraping Intelligence provides all types of website scraper software, web scraping services, data extraction services, web data mining services, and web data scrapers.