Abstraction in python OOP

Python for AI, data science and machine learning Day 8

Gianpiero Andrenacci
Data Bistrot
9 min readMay 2, 2024

--

Python for AI, data science and machine learning series

Understanding Python Abstraction

Abstraction is a fundamental concept in object-oriented programming (OOP) that plays a crucial role in building scalable, maintainable, and modular code, especially in Python. Abstraction, in its essence, is about hiding the complexity of a system by providing a simpler interface. This article will delve into the concept of abstraction in Python, highlighting its key features, benefits, and how to implement it through practical examples.

Key Features of Abstraction

Abstraction in Python focuses on two main aspects:

  • Simplifying client interactions: It offers a straightforward interface for clients (other pieces of code that use the class) to interact with class objects. Clients can call methods defined in the interface without needing to understand the intricate details of how the methods are implemented.
  • Hiding internal complexity: Abstraction masks the complexity of internal class implementations behind an interface. This ensures that the client code does not have to deal with the inner workings of a class, thus promoting a decoupled and more maintainable codebase.

Benefits of Using Abstraction

The use of abstraction in Python programming provides several benefits:

  • Enhanced modularity: By encapsulating complex logic within classes and exposing only what is necessary through interfaces, code becomes more modular. This modularity facilitates easier code management and updates.
  • Improved code readability: Abstraction makes the code more intuitive and easier to understand by highlighting the interactions over the implementations.
  • Increased maintainability: Changes to the internal implementation of a class do not affect the client code, as long as the interface remains unchanged. This makes the system more resilient to changes and easier to maintain.
  • Better code reusability: Abstracting common functionalities into interfaces allows for reusing code across different parts of a project or even across projects, reducing redundancy and effort.

Implementing Abstraction in Python

Python does not have built-in support for abstract classes in the way some other languages do, but it offers the abc module which stands for Abstract Base Classes. This module provides the infrastructure for defining custom abstract classes in Python.

Understanding the @abstractmethod Decorator

In Python, the @abstractmethod decorator is a tool used in abstract base classes (ABCs) to define methods that must be implemented by subclasses (to understand decorators see Introduction to Key Concepts in Python Classes in this series). This concept is crucial for enforcing a contract between the abstract class and its subclasses, ensuring that certain methods are implemented in a consistent manner across different implementations.

Abstract Base Classes (ABCs)

Before diving into the @abstractmethod decorator, it's important to understand what an abstract base class is. An ABC provides a blueprint for other classes to follow, defining methods that must be implemented by its subclasses. This is useful in scenarios where you want to set a common interface for a group of related classes.

Python’s abc module enables the creation of abstract base classes. To define an ABC, you inherit from ABC, which is provided by the abc module.

The @abstractmethod Decorator

The @abstractmethod decorator is used to indicate that a method is an abstract method. This means that the method does not need to be implemented in the abstract class itself, but must be implemented in any subclass that inherits from the abstract class.

The abc module formally define the abstract base classes. Abstract base classes exist to be inherited, but never instantiated.

Here’s a breakdown of the example provided in the Data Processing in Data Science with Polymorphism:

from abc import ABC, abstractmethod

class DataProcessor(ABC):
@abstractmethod
def process_data(self, data):
"""
Abstract method to process data.
Subclasses must override this method.
"""
pass
  • from abc import ABC, abstractmethod: This line imports the necessary components from Python's abc module. ABC is the base class for defining abstract base classes, and abstractmethod is the decorator for marking methods as abstract.
  • class DataProcessor(ABC):: Here, we define a new abstract base class called DataProcessor by inheriting from ABC. This class will serve as a template for any subclass that deals with data processing.
  • @abstractmethod: The @abstractmethod decorator is applied to the process_data method, indicating that this method is an abstract method.
  • def process_data(self, data):: This defines the signature of the abstract method. The method takes self and data as parameters. In the body of the method, we typically don't implement any functionality (hence the pass statement) because the implementation should be provided by the subclass.

Now image that you want to build a class that move a vehicle independently from the type of vehicle.

from abc import ABC, abstractmethod

class Vehicle(ABC):
@abstractmethod
def move(self):
pass

class Car(Vehicle):
def move(self):
return "The car moves on the road"

class Boat(Vehicle):
def move(self):
return "The boat sails on the water"

# Usage
car = Car()
print(car.move()) # Output: The car moves on the road

boat = Boat()
print(boat.move()) # Output: The boat sails on the water

In the example, Vehicle serves as an abstract class with a simple interface (move method). Car and Boat are concrete implementations of the Vehicle abstract class. Clients interacting with Car or Boat instances don't need to know how move works internally; they only need to know that the method exists.

They don't need to be concerned with how the move() method is implemented internally in each subclass. This abstraction allows clients to use Car and Boat instances interchangeably, relying on the common interface provided by the Vehicle abstract class without needing to understand the specific details of each vehicle's movement mechanism. This separation of concerns promotes encapsulation, modularity, and code reusability.

Data Science Example Using Abstract Methods: A Framework for Model Evaluation

In this example, we will design a framework for evaluating different machine learning models using Python’s abc module to implement abstraction. This framework will allow us to define a generic structure for model evaluation that can be applied to any machine learning model, ensuring consistency and reusability of our evaluation process.

The goal is to create an abstract class that outlines the steps for model training, prediction, and evaluation. We will then extend this class for specific models, allowing us to easily compare their performance on a given dataset.

Step 1: Defining the Abstract Base Class

We start by importing the necessary modules and defining an abstract base class for our model evaluation framework.

from abc import ABC, abstractmethod
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

class ModelEvaluator(ABC):
def __init__(self, X, y, test_size=0.2, random_state=42):
self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=test_size, random_state=random_state)
self.model = None

@abstractmethod
def train_model(self):
pass

@abstractmethod
def predict(self):
pass

def evaluate_model(self):
predictions = self.predict()
accuracy = accuracy_score(self.y_test, predictions)
print(f"Model Accuracy: {accuracy}")

Step 2: Implementing the Abstract Class for Specific Models

Now, we’ll implement this abstract class for two specific models: Logistic Regression and Decision Tree Classifier. For this example, we’ll use the Iris dataset.

Logistic Regression Evaluator

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

class LogisticRegressionEvaluator(ModelEvaluator):
def train_model(self):
self.model = LogisticRegression(max_iter=200)
self.model.fit(self.X_train, self.y_train)

def predict(self):
return self.model.predict(self.X_test)

Decision Tree Evaluator

from sklearn.tree import DecisionTreeClassifier

class DecisionTreeEvaluator(ModelEvaluator):
def train_model(self):
self.model = DecisionTreeClassifier()
self.model.fit(self.X_train, self.y_train)

def predict(self):
return self.model.predict(self.X_test)

Step 3: Evaluating the Models

After defining our model evaluators, we can use them to train, predict, and evaluate the performance of Logistic Regression and Decision Tree models on the Iris dataset.

iris = load_iris()
X, y = iris.data, iris.target

# Evaluate Logistic Regression
print("Evaluating Logistic Regression:")
logistic_evaluator = LogisticRegressionEvaluator(X, y)
logistic_evaluator.train_model()
logistic_evaluator.evaluate_model()

# Evaluate Decision Tree
print("\nEvaluating Decision Tree:")
decision_tree_evaluator = DecisionTreeEvaluator(X, y)
decision_tree_evaluator.train_model()
decision_tree_evaluator.evaluate_model()

Explanation

  1. An abstract base class (ModelEvaluator) is used to define a template for model evaluation. This ensures that any subclass must implement the specified abstract methods (train_model and predict), enforcing a consistent interface across different model evaluators.
  2. Abstraction of Common Functionality: The evaluate_model method is implemented in the base class, using the abstract methods to perform its function. This allows to write the evaluation logic once and reuse it across different model implementations.
  3. Flexibility and Extensibility: By defining a structure that can be extended for different models (like Logistic Regression and Decision Tree in the example), we’ve created a framework that is both flexible and easy to extend. Adding new models is straightforward and does not require duplicating code for model evaluation.

This example demonstrates how to use abstract methods in Python to create a flexible and reusable framework for evaluating machine learning models. By defining a generic structure for model training, prediction, and evaluation, we can easily extend our framework to include additional models and metrics, thereby streamlining the model evaluation process in data science projects.

Improving our Framework for Model Evaluation

To enhance the flexibility of the model evaluation framework by allowing different evaluation metrics, we can modify the abstract class and its implementation to accept a list of metric functions. Below is an example that demonstrates how to implement this feature:

Step 1: Modifying the Abstract Base Class

Here’s how you can modify the ModelEvaluator abstract class:

from abc import ABC, abstractmethod
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

class ModelEvaluator(ABC):
def __init__(self, X, y, test_size=0.2, random_state=42, metrics=None):
self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=test_size, random_state=random_state)
self.model = None
# Set default metrics if none are provided
if metrics is None:
metrics = [accuracy_score] # default to using accuracy if no metrics are specified
self.metrics = metrics

@abstractmethod
def train_model(self):
pass

@abstractmethod
def predict(self):
pass

def evaluate_model(self):
predictions = self.predict()
results = {}
for metric in self.metrics:
score = metric(self.y_test, predictions)
results[metric.__name__] = score
for name, score in results.items():
print(f"{name}: {score:.2f}")

We’ve added a parameter (metrics) to the constructor to accept metric functions. These functions will be used later to evaluate the model predictions.

Step 2: Implementing the Abstract Class for Specific Models

We can now implement specific models that use this modified abstract class.

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

# Logistic Regression Evaluator
class LogisticRegressionEvaluator(ModelEvaluator):
def train_model(self):
self.model = LogisticRegression(max_iter=200)
self.model.fit(self.X_train, self.y_train)

def predict(self):
return self.model.predict(self.X_test)

# Decision Tree Evaluator
class DecisionTreeEvaluator(ModelEvaluator):
def train_model(self):
self.model = DecisionTreeClassifier()
self.model.fit(self.X_train, self.y_train)

def predict(self):
return self.model.predict(self.X_test)

Step 3: Evaluating the Models with Custom Metrics

When we instantiate the evaluators, we can now specify which metrics to use. In this case we use accuracy, precision, and recall for evaluation

iris = load_iris()
X, y = iris.data, iris.target

# Using accuracy, precision, and recall for evaluation
metrics = [accuracy_score, precision_score, recall_score]

print("Evaluating Logistic Regression with multiple metrics:")
logistic_evaluator = LogisticRegressionEvaluator(X, y, metrics=metrics)
logistic_evaluator.train_model()
logistic_evaluator.evaluate_model()

print("\nEvaluating Decision Tree with multiple metrics:")
decision_tree_evaluator = DecisionTreeEvaluator(X, y, metrics=metrics)
decision_tree_evaluator.train_model()
decision_tree_evaluator.evaluate_model()

This modification makes the evaluation framework more versatile, allowing users to specify any number of metrics to evaluate the models. It also aligns with the principle of making software components open for extension (adding new metrics) but closed for modification (not modifying existing code to add support for new metrics).

Conclusion: Why Use @abstractmethod?

The use of @abstractmethod in Python, made possible by the abc module, offers several key advantages that make it an essential tool in the arsenal of object-oriented programming. This decorator not only enforces the implementation of methods in subclasses but also promotes a clear and structured approach to software design. Here’s a closer look at the benefits and other considerations:

Enforce Subclass Implementation

The primary function of @abstractmethod is to ensure that any subclass derived from an abstract base class (ABC) implements the abstract methods defined in the ABC. This enforcement guarantees that all subclasses adhere to a common interface, reducing the risk of runtime errors due to unimplemented methods and enhancing the robustness of the software.

Design Clarity

Using abstract methods clarifies the design intent. Developers and designers can outline a blueprint of essential operations without committing to the details of their implementation. This clarity helps in maintaining a clean architecture, where the responsibilities and interactions between different components are well-defined and transparent.

Flexibility and Structure

Abstract methods provide a structured yet flexible foundation for building complex systems. By defining a consistent interface in the base class, the system can evolve with new functionalities without affecting the existing implementation. It allows developers to extend the functionality of the system by adding new subclasses that implement these predefined interfaces, ensuring that new components seamlessly integrate with the established system.

Other Considerations

While the benefits are significant, it’s also important to consider when and where to apply @abstractmethod. Overusing it, especially in scenarios where flexibility is not required, can lead to unnecessary complexity. Furthermore, developers should be mindful of the following:

  • Performance Considerations: In some high-performance scenarios, the overhead of using abstract base classes might be a concern.
  • Design Overhead: The need to thoroughly understand the base class design before extending it can increase the learning curve for new developers.

In conclusion, the strategic use of @abstractmethod provides a balance between enforcing discipline in software development and offering the flexibility to extend functionalities. By carefully considering its application and potential impacts, developers can harness the full potential of abstract methods to build reliable, scalable, and maintainable software systems.

--

--

Gianpiero Andrenacci
Data Bistrot

Data Analyst & Scientist, ML enthusiast. Avid reader & writer, passionate about philosophy. Ex-BJJ master competitor, national & international title holder.