Web Scrape News Articles: The ABC of Python’s Abstract Methods

How to use Python’s abstract class functionality to scrape news articles online.

Tom Sullivan
Código Ecuador
3 min readFeb 12, 2020

--

Many programmers complain that Python lacks abstract classes. This is not true. Python has an ABC module with which programmers can create broad-purpose, general classes to reuse later in more specific contexts using the ABC module.

By creating abstract classes, the programmer can later define more specific and nuanced child classes that inherit from the abstract classes. Abstract classes are helpful for modules that inherit from a common base class but have significantly different characteristics like different types of media.

Scrape news articles online with Python

For example, we could create an abstract class named Item to reuse throughout a Python program to scrape news articles online. The program could contain child classes for both articles and videos that inherit from the Item class.

The abstract class, item, inherits from the ABC module which you can import at the beginning of your Python file using the command from abc import ABC, abstractMethod. The ABC class is an abstract method that does nothing and will return an exception if called.

Think of the abstract method decorator as a prefix

Now that we’ve imported the abc library, we must decide which methods are abstract methods, and mark them with the @abstractmethod decorator.

A quick word here about decorators. Think of a decorator as a prefix that tells us how the function or class is to be called and used. We use basic decorators all the time when we create properties for a class using the @property decorator.

The property decorator tells us that this function method is a property, which means we can access it as though it were an attribute. The @abstractmethod decorator does just the opposite: it tells us we cannot access a method until it has been overridden by a subclass.

The child class must override all the methods in the abstract class using a line of code in the __init__ function: super().__init__(name). For each method bearing the @abstractmethod decorator, there must be a corresponding method with the same name in the child classes. If we were to make a class for short stories, and it did not have one of these methods, then it would not work.

Don’t forget to override the abstract class methods!

That’s it! The abstract class provides a template for the base class and shows us what methods we need to create for a group of similar classes. We override the display_info method to show a very simple message on the screen with the title and the author.

The key is remembering to override all of the methods in the class using super().__init__(name). If the child class does not override these methods, you will receive an error message. For example, removing the title_author method from our code would break it.

--

--