PART TEN: Data Object Layer (DOL) — Building a Production-Ready Algorithmic Trading Framework in Python

Joseph Edginton-Foy
6 min read · Jul 4, 2023



Hello there,

We continue the series on Building a Production-Ready Algorithmic Trading Framework in Python. This article is going to focus on our Data Object Layer. The prerequisites for this article can be found below. They contain the classes you will need to get up and running and, if you missed them, the inside knowledge to understand what is happening.

In the world of algorithmic trading, where automated systems make lightning-fast decisions to buy or sell financial instruments, having a well-organized and efficient framework is essential. One crucial component of such a framework is the Data Object Layer (DOL). In this article, we will delve into the concept of DOL and explore its significance in building a production-ready algorithmic trading framework using Python.

What is the Data Object Layer (DOL)?

The Data Object Layer (DOL) bridges your algorithmic trading strategy and the vast amounts of data it relies upon. It serves as a container for organising, accessing, and manipulating various data types crucial for making informed trading decisions. These data types can include historical price data, real-time market data, fundamental data, and any other relevant information required by your trading algorithm.

Benefits of the Data Object Layer:

  • Abstraction and encapsulation: The DOL provides a layer of abstraction that shields your trading strategy from the underlying complexities of data storage and retrieval. It encapsulates how data is stored, accessed, and transformed, allowing you to focus on developing and refining your trading algorithm.
  • Modularity and reusability: By employing a DOL, you can create modular components that can be reused across different areas of the system. This saves time and effort as you can leverage pre-existing functionality for handling data rather than starting from scratch for each new project.
  • Data integrity and validation: The DOL ensures that the data used by your algorithm is valid and consistent. It can include data validation and error-handling mechanisms, reducing the chances of erroneous trades based on inaccurate or incomplete data.
  • Performance optimisation: The DOL can implement various optimisation techniques to enhance the performance of your algorithmic trading framework. This may involve caching frequently accessed data, implementing efficient data retrieval algorithms, or leveraging parallel processing capabilities.
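To make the last point concrete, frequently requested data can be memoised so repeated lookups skip the expensive fetch entirely. A minimal sketch using Python's built-in `functools.lru_cache` (the function name and the bar values are hypothetical, not part of the framework):

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def load_daily_bars(instrument: str) -> tuple:
    # Hypothetical expensive fetch; in a real DOL this would query the database.
    return (("2023-07-03", 1.0910), ("2023-07-04", 1.0878))


first = load_daily_bars("EUR_USD")   # performs the "fetch"
second = load_daily_bars("EUR_USD")  # served straight from the cache
```

The same idea scales up to dedicated caching layers; `lru_cache` is just the cheapest way to get the behaviour for pure lookup functions.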

Implementing the Data Object Layer in Python

To build a Data Object Layer in Python, we will use a library called SQLAlchemy, which is built around the ORM principle. Object Relational Mapping is a technique that lets you work with a relational database through objects in your programming language, abstracting away hand-written SQL queries.
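To see what that abstraction buys you, here is a small self-contained sketch showing SQLAlchemy compiling an object expression into SQL on your behalf. The `Instrument` class here is a hypothetical minimal mapping for illustration, not one of the framework's tables:

```python
from sqlalchemy import Column, Integer, String, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Instrument(Base):
    # Hypothetical minimal mapping, for illustration only
    __tablename__ = 'instruments'
    id = Column(Integer, primary_key=True)
    name = Column(String)


# Build the query as Python objects; SQLAlchemy renders the SQL for you
stmt = select(Instrument).where(Instrument.name == "EUR_USD")
sql = str(stmt)
```

Printing `sql` shows a parameterised `SELECT ... FROM instruments WHERE ...` statement, generated entirely from the class definition.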

pip install SQLAlchemy

First Things First

You need to define your data objects as classes; it's good practice to store them in folders named after the target schema. I typically make a folder called ‘DOL’ in the root of my project. In the following examples I am targeting my ‘Trading’ schema, but yours can be anything you like.
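With a ‘Trading’ schema, the layout might look like this (the file names match the examples later in the article):

```text
DOL/
└── Trading/
    ├── __init__.py
    ├── Base.py
    ├── Dimensions/
    │   └── Dimension_Instruments.py
    └── Facts/
        └── Facts_Instruments.py
```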

Next, we need an ‘__init__.py’ file that imports all of the classes we put in the Facts and Dimensions folders, making our ‘Trading’ folder a Python module. You can put them all in one file, but I have found over the years that if your schema becomes massive it can be hard to find the class you are looking for.


# __init__.py

# Import Dimensions Tables
from .Dimensions.Dimension_Instruments import Dimension_Instruments

# Import Facts Tables
from .Facts.Facts_Instruments import Facts_Instruments

Finally, we need the schema base file to bring all the relationships and tables into our target schema. The ‘Base.py’ file looks like this:

# Base.py

from sqlalchemy.orm import declarative_base  # moved here from sqlalchemy.ext.declarative in SQLAlchemy 1.4

Base = declarative_base()

Every data-class file imports this shared Base, so all our tables register against the same schema metadata. I will only show two different files so you get the idea, but you can expand this to all the tables you need.

The other foreign keys and relationships you can see point to other files that enrich the data through one-to-many or many-to-one relationships. All of the objects I use will be available in the GitHub associated with this project. If you want to learn more about the possible relationships, check the documentation here. Many-to-many relationships can be set up too, but going into them would take us beyond the scope of this article.

The following example would sit in the Facts folder as ‘Facts_Instruments.py’. It is decorated with ‘@dataclass’, which layers dataclass conveniences (generated init and repr methods) on top of the ORM mapping the class inherits from Base.

# Facts/Facts_Instruments.py

from dataclasses import dataclass

from sqlalchemy import Column, Float, ForeignKey, Integer
from sqlalchemy.orm import relationship

from ..Base import Base


@dataclass
class Facts_Instruments(Base):
    __tablename__ = 'Facts_Instruments'
    __table_args__ = {'schema': 'Trading'}

    id: int = Column(Integer, primary_key=True, autoincrement=True)
    DateTimeKey: int = Column(Integer, nullable=False)
    DateKey: int = Column(Integer, ForeignKey('Trading.Dimension_Date.DateKey'), nullable=False)
    TimeKey: int = Column(Integer, ForeignKey('Trading.Dimension_Time.TimeKey'), nullable=False)
    GranularityKey: int = Column(Integer, ForeignKey('Trading.Dimension_Granularity.GranularityKey'), nullable=False)
    InstrumentKey: int = Column(Integer, ForeignKey('Trading.Dimension_Instruments.InstrumentKey'), nullable=False)

    # OHLCV price data
    Open: float = Column(Float)
    High: float = Column(Float)
    Low: float = Column(Float)
    Close: float = Column(Float, nullable=False)
    Volume: int = Column(Integer)

    # Many-to-one links to the dimension tables
    Date = relationship("Dimension_Date")
    Time = relationship("Dimension_Time")
    Granularity = relationship("Dimension_Granularity")
    Instrument = relationship("Dimension_Instruments")
    Indicator = relationship("Dimension_Indicators")

The following example would sit in the Dimensions folder as ‘Dimension_Instruments.py’. As you can see, the files are similar in structure; the only differences are the names of the attributes and tables.

# Dimensions/Dimension_Instruments.py

from dataclasses import dataclass

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import Mapped, mapped_column, relationship

from ..Base import Base


@dataclass
class Dimension_Instruments(Base):
    __tablename__ = 'Dimension_Instruments'
    __table_args__ = {'schema': 'Trading'}

    InstrumentKey: int = Column(Integer, primary_key=True, autoincrement=True)
    Name: str = Column(String, nullable=False)
    InstrumentTypeKey: Mapped[int] = mapped_column(Integer, ForeignKey('Trading.Dimension_InstrumentType.InstrumentTypeKey'))
    LineTypeKey: Mapped[int] = mapped_column(Integer, ForeignKey('Trading.Dimension_LineType.LineTypeKey'))

    # Many-to-one links to the lookup tables
    InstrumentType = relationship("Dimension_InstrumentType")
    LineType = relationship("Dimension_LineType")

Creating the Schema

Once you are happy with your schema and its relationships, it's time to create the tables using a database engine. This is a quick-and-dirty example; please see the PostgreSQL section of this article series, which provides error handling and a configuration interface.

if __name__ == '__main__':

    # Import required libraries
    from sqlalchemy import create_engine

    # Import our schema base
    from DOL.Trading.Base import Base

    # Create a database engine (format: postgresql://user:password@host:port/database)
    engine = create_engine("postgresql://username:password@localhost:5432/postgres", echo=True)

    # Drop the old tables while in dev mode so stale definitions don't persist
    Base.metadata.drop_all(engine)

    # Create the new tables
    Base.metadata.create_all(engine)

If all goes well and you have set the relationships up correctly, you will see a beautiful schema and an ER diagram in your database viewer of choice. DBeaver is a great tool, or PgAdmin if you are using Postgres.
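Once the tables exist, a quick sanity check is to round-trip a row through a session. The sketch below is deliberately self-contained: it uses an in-memory SQLite engine and a cut-down, schema-less copy of the instruments table (SQLite has no schemas), so swap in your PostgreSQL engine and the real classes in practice:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Dimension_Instruments(Base):
    # Cut-down copy of the dimension table; schema omitted so it runs on SQLite
    __tablename__ = 'Dimension_Instruments'
    InstrumentKey = Column(Integer, primary_key=True, autoincrement=True)
    Name = Column(String, nullable=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
with Session() as session:
    session.add(Dimension_Instruments(Name="EUR_USD"))
    session.commit()

    # Read the row back through the ORM, no hand-written SQL required
    stored = session.query(Dimension_Instruments).filter_by(Name="EUR_USD").one()
    name, key = stored.Name, stored.InstrumentKey
```

The autoincremented key comes back populated after the commit, which confirms the mapping, the engine, and the table creation are all wired up correctly.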

Example Entity Relationship Diagram from PgAdmin4

From here the only way is up. You can make as many schemas and tables as needed, and best of all, with the method above everything stays organised and manageable.


That’s all she wrote, folks. I hope you learnt some things and will use my tools; hopefully, it will help you along the way. Peace.

If you want to help buy me a coffee that fuels these articles’ late nights of research and development, please consider donating to my PayPal link below. Thanks so much, and have a great day.

paypal.me/JEFSBLOG

Conclusion: The Data Object Layer (DOL) plays a vital role in building a production-ready algorithmic trading framework in Python. By providing a structured approach to managing and accessing data, it simplifies the development process and enhances the reliability and performance of your trading algorithms. Through abstraction, modularity, and encapsulation, you can harness the DOL to create robust, efficient trading systems that adapt to the ever-changing financial markets.
