From Data Science to Production: Abstract Classes for Model Deployment

Wencong Yang, PhD
6 min read · Feb 8, 2024


Background

When deploying a model to production, it's easy to end up with repetitive or messy code, because you have to integrate multiple data sources and expose multiple service APIs. One way to avoid repeating yourself and keep your code organized is to keep the main pipeline logic in one place and make it reusable. In Python, abstract classes let you write generic code that works with different types of data and services, without rewriting the same logic for each case.


In this article, you will learn:

  • How to use abstract classes in Python
  • How to handle different data sources and create different service APIs elegantly by using abstract classes in a real-world project

I hope this helps!

Solutions for abstraction

In object-oriented programming, an abstract class provides a template for other classes. Abstract classes cannot be instantiated but can be inherited by concrete classes. All concrete classes should implement the abstract methods of the abstract class.

In Python, the abc module allows you to define abstract classes. In the example below, we define an abstract class called Animal, along with two abstract methods, eat and sleep, using the @abstractmethod decorator. These abstract methods have no implementation, but they must be implemented by any concrete subclass of Animal. You can then define concrete animal classes that follow the common interface of Animal. For example, you can define Dog and Cat, which implement the eat and sleep methods in their own way. After that, you can write generic code that works with both Dog and Cat.

from abc import ABC, abstractmethod

class Animal(ABC):

    @abstractmethod
    def eat(self):
        pass

    @abstractmethod
    def sleep(self):
        pass

class Dog(Animal):

    def eat(self):
        print("The dog is eating.")

    def sleep(self):
        print("The dog is sleeping.")

class Cat(Animal):

    def eat(self):
        print("The cat is eating.")

    def sleep(self):
        print("The cat is sleeping.")

Example

This is a real-world project that uses deep learning models to forecast streamflow for river gauge stations. The project aims to provide a web service that predicts the daily streamflow for the next few days. We encounter two challenges:

  • Multiple data sources. We expect the forecasting service to work for different clients who provide different APIs to obtain streamflow data. We also need to experiment with several meteorological datasets to optimize the forecasting model. Meanwhile, we hope to keep the pipelines of model training and inference unchanged across different input data sources.
  • Multiple service APIs. We provide an API to return all river sites where the forecasting model can be applied and another API to return the forecasted streamflow of a specific river site. We hope to write web framework code that is compatible with all services we develop.
Structure of abstractions in the project. Source: by author

To deal with these problems, we define abstract data readers in abstract_reader.py inside the adapter folder, which holds the scripts for the different data readers, as the file tree below shows. Similarly, in the service folder, which holds the scripts for the different services, we define abstract services in abstract_service.py. The concrete data readers and services are then defined in the other scripts, and each concrete class implements the abstract methods of its abstract class.

streamflow-forecasting-service/
├── adapter/
│   ├── abstract_reader.py
│   ├── streamflow_reader.py
│   ├── meteo_reader.py
│   └── reader_selector.py
├── service/
│   ├── abstract_service.py
│   ├── forecast_service.py
│   └── info_service.py
├── xxx/
│   └── xxx.py
......
├── config.yaml
├── main_train.py
├── main_service.py
├── requirements.txt
......
├── README.md
└── LICENSE

1. Abstracting data readers

Since the forecasting model uses two categories of data, i.e., streamflow data and meteorological data, we define two abstract classes: AbstractStreamflowReader and AbstractMeteoReader. Their abstract methods are the generic ways our model interacts with the data.

# @File: abstract_reader.py

from domain.data_schema import SiteModel, StreamflowModel, MeteoModel
from abc import ABC, abstractmethod
from pandera.typing import DataFrame

class AbstractStreamflowReader(ABC):
    '''Abstract class for the reader of streamflow data'''

    @abstractmethod
    def __init__(self):
        pass

    @abstractmethod
    def get_site_info(self, sites: list[str]) -> DataFrame[SiteModel]:
        pass

    @abstractmethod
    def get_daily_streamflow(self, sites: list[str], history_days: int) -> DataFrame[StreamflowModel]:
        pass

class AbstractMeteoReader(ABC):
    '''Abstract class for the reader of meteorological data'''

    @abstractmethod
    def __init__(self, **kwargs):
        pass

    @abstractmethod
    def get_site_history_daily_meteo(self, site_id: str, lat: float, lon: float, history_days: int) -> DataFrame[MeteoModel]:
        pass

    @abstractmethod
    def get_site_forecast_daily_meteo(self, site_id: str, lat: float, lon: float, forecast_days: int) -> DataFrame[MeteoModel]:
        pass

    @abstractmethod
    def get_site_daily_meteo(self, site_id: str, lat: float, lon: float, history_days: int,
                             forecast_days: int) -> DataFrame[MeteoModel]:
        pass

For each streamflow data source, we create a concrete reader class that inherits AbstractStreamflowReader and implements all the abstract methods, i.e., __init__, get_site_info, and get_daily_streamflow.

# @File: streamflow_reader.py

from domain.data_schema import SiteModel, StreamflowModel
from config.config_data import USGSDataConfig
from adapter.abstract_reader import AbstractStreamflowReader
from pandera.typing import DataFrame

class USGSStreamflowReader(AbstractStreamflowReader):
    '''Data reader for USGS streamflow data'''

    def __init__(self):
        self.config = USGSDataConfig

    def get_site_info(self, sites: list[str]) -> DataFrame[SiteModel]:
        ......

    def get_daily_streamflow(self, sites: list[str], history_days: int) -> DataFrame[StreamflowModel]:
        ......

class XXXXStreamflowReader(AbstractStreamflowReader):
    '''Data reader for another streamflow data source'''
    ......

Similarly, the reader for any meteorological data source should inherit AbstractMeteoReader and implement the corresponding abstract methods.

# @File: meteo_reader.py

from domain.data_schema import MeteoModel
from config.config_data import OpenMeteoDataConfig
from adapter.abstract_reader import AbstractMeteoReader
from pandera.typing import DataFrame

class OpenMeteoReader(AbstractMeteoReader):
    '''Data reader for Open-Meteo meteorological data'''

    def __init__(self):
        self.config = OpenMeteoDataConfig

    def get_site_history_daily_meteo(self, site_id: str, latitude: float, longitude: float, history_days: int) -> DataFrame[MeteoModel]:
        ......

    def get_site_forecast_daily_meteo(self, site_id: str, latitude: float, longitude: float, forecast_days: int) -> DataFrame[MeteoModel]:
        ......

    def get_site_daily_meteo(self, site_id: str, latitude: float, longitude: float, history_days: int,
                             forecast_days: int) -> DataFrame[MeteoModel]:
        ......

class XXXXMeteoReader(AbstractMeteoReader):
    '''Data reader for another meteorological data source'''
    ......

Finally, we create a script that selects the data sources for deployment by picking the corresponding data readers. In the pipelines of model training and inference, we access the input data in a unified way through ReaderRepository.streamflow_reader and ReaderRepository.meteo_reader. Abstraction reduces the coupling between data and model components, making a large number of functional units easier to manage and maintain.

# @File: reader_selector.py

from adapter import streamflow_reader, meteo_reader
from config.config_data import DataConfig

class ReaderRepository:

    def __init__(self):
        self._create_streamflow_reader()
        self._create_meteo_reader()

    def _create_streamflow_reader(self):
        if DataConfig.flow_data == "usgs":
            self.streamflow_reader = streamflow_reader.USGSStreamflowReader()
        elif DataConfig.flow_data == "xxxx":
            self.streamflow_reader = streamflow_reader.XXXXStreamflowReader()
        else:
            self.streamflow_reader = None

    def _create_meteo_reader(self):
        if DataConfig.weather_data == "open-meteo":
            self.meteo_reader = meteo_reader.OpenMeteoReader()
        elif DataConfig.weather_data == "xxxx":
            self.meteo_reader = meteo_reader.XXXXMeteoReader()
        else:
            self.meteo_reader = None
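
With the repository in place, the training and inference pipelines depend only on the abstract reader interfaces. The snippet below is a rough sketch of how the readers might be used; the site ID and coordinates are made-up values for illustration, not project code:

# Rough illustration: the pipeline calls the abstract interface, so swapping
# data sources in the config requires no change here.
readers = ReaderRepository()
sites = ["01234567"]  # placeholder site ID for illustration

site_info = readers.streamflow_reader.get_site_info(sites)
flow = readers.streamflow_reader.get_daily_streamflow(sites, history_days=365)
meteo = readers.meteo_reader.get_site_daily_meteo(
    sites[0], 40.0, -100.0, history_days=365, forecast_days=7
)
# site_info, flow, and meteo then feed the model training or inference step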

2. Abstracting services

The abstract service class, AbstractService, defines the common steps for executing model service code: every service validates request parameters with make_params, reads data with get_data, makes predictions with get_results, and wraps the results into a response with create_response.

# @File: abstract_service.py

from abc import ABC, abstractmethod
from pydantic import BaseModel

class AbstractService(ABC):

    @abstractmethod
    def __init__(self):
        pass

    @abstractmethod
    def execute(self, query: dict) -> dict:
        pass

    @abstractmethod
    def make_params(self, query: dict) -> dict:
        pass

    @abstractmethod
    def get_data(self, params: dict) -> dict:
        pass

    @abstractmethod
    def get_results(self, params: dict, data: BaseModel) -> dict:
        pass

    @abstractmethod
    def create_response(self, results: dict) -> dict:
        pass
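
In a concrete service, execute typically chains these steps in order. The body below is a hypothetical sketch of such an execute method, not code taken from the project:

# Hypothetical execute() body illustrating how the abstract steps fit together.
def execute(self, query: dict) -> dict:
    params = self.make_params(query)           # validate request parameters
    data = self.get_data(params)               # read input data via the data readers
    results = self.get_results(params, data)   # run the forecasting model
    return self.create_response(results)       # wrap results for the API response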

Then, we create the concrete service classes: InfoService, which returns all site names, and ForecastService, which returns the forecasting results. Abstraction makes it possible to handle different service requests in the same way.

# @File: info_service.py

from service.abstract_service import AbstractService
from domain.data_schema import SiteModel
from pandera.typing import DataFrame

class InfoService(AbstractService):
    '''Service for returning all river sites that the forecasting model can be applied to'''

    def __init__(self):
        ......

    def execute(self, query: dict) -> dict:
        ......

    def make_params(self, query: dict) -> dict:
        ......

    def get_data(self, params: dict) -> DataFrame[SiteModel]:
        ......

    def get_results(self, params: dict, data: DataFrame[SiteModel]) -> dict:
        ......

    def create_response(self, results: dict) -> dict:
        ......

# @File: forecast_service.py

from service.abstract_service import AbstractService
from domain.data_schema import CombinedDataModel

class ForecastService(AbstractService):
    '''Service for returning the forecasted streamflow of a specific river site'''

    def __init__(self):
        ......

    def execute(self, query: dict) -> dict:
        ......

    def make_params(self, query: dict) -> dict:
        ......

    def get_data(self, params: dict) -> CombinedDataModel:
        ......

    def get_results(self, params: dict, data: CombinedDataModel) -> dict:
        ......

    def create_response(self, results: dict) -> dict:
        ......
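
Because every service exposes the same execute interface, the web layer (main_service.py in the file tree) can route any request through a single code path. The sketch below assumes FastAPI, which the project does not necessarily use; the endpoint layout and the SERVICES registry are illustrative only:

# Hypothetical main_service.py sketch: one generic endpoint for all services.
from fastapi import FastAPI, Request

from service.info_service import InfoService
from service.forecast_service import ForecastService

app = FastAPI()

# Any class implementing AbstractService can be registered here without
# changing the routing code.
SERVICES = {
    "info": InfoService(),
    "forecast": ForecastService(),
}

@app.get("/{service_name}")
def handle(service_name: str, request: Request) -> dict:
    service = SERVICES[service_name]
    # All services follow the same execute(query) -> dict contract.
    return service.execute(dict(request.query_params))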

Summary

This article introduces the basics of abstract classes in Python and demonstrates typical scenarios where they are useful when deploying a model to production. It shows how abstraction makes code more modular and flexible in real-world projects.


Wencong Yang, PhD

PhD in geoscience, AI engineer. I write about AI4Science, climate change, and cloud computing. Twitter: https://twitter.com/San_Onion_Young