Object-relational Mapping with SQLAlchemy

Published in

Analytics Vidhya

9 min readJan 12, 2020

What is Object-relational Mapping?

Object-relational Mapping (ORM) is a technique used to map objects to a database. With ORMs, we can directly create tables with classes and data in the tables with the objects (instance of the class) created. This programming technique provides a standard interface that allows developers to create code that can be used to communicate with a wide variety of database engines, rather than customizing for each single database engine.

SQLAlchemy

SQLAlchemhy, one of the ORM implementations in Python, is a library that facilitates the communication between Python programs and various database engines and it will be used in this article to persist and query data. For me, the two major advantages for using an ORM is that it allows me to entirely stay away from writing sql queries in my code (not because I hate sql) and it also ensures that I can write a generic code that can be used with various relational database engines (with little or no tweaks, if we need to change the database engine). Some of the features that ensures this ability to work with several database engines will be discussed in the following sections.

SQLAlchemy Data Types

SQLAlchemy provides support to implicitly and explicitly state a column’s data type. It does this by allowing a general mapping of Python data types to the most common data types found in relational databases column, such that SQLAlchemy will choose the best database column type available on the target database and map to the particular Python data type as below;

Integer() — INT
String() — ASCII strings — VARCHAR
Unicode() — Unicode string — VARCHAR or NVARCHAR depending on database
Boolean() — BOOLEAN, INT, TINYINT depending on db support for boolean type
DateTime() — DATETIME or TIMESTAMP returns Python datetime() objects.
Float() — floating point values
Numeric() — precision numbers using Python Decimal()

Also, it allows to explicitly state the data type by either using the SQL standard syntax or syntax common to a subset of database backends for data types. Then, it also provides a way to specify vendor-specific types like BIGINT for MySQL and INET for PostgreSQL.

SQLAlchemy Engine

SQLAlchemy Engine is the starting point for any SQLAlchemy application as we need to create an Engine whenever we want to use SQLAlchemy to interact with a database. Engines are created by running the create_engine function imported from sqlalchemy. This function takes a database URI (dialect+driver://username:password@host:port/database) to create the Engine object. The engine once created can be used to either directly interact with the database or can be passed to a Session object to work with the ORM. In this article, we will be passing our engine to the Session object since we are discussing about ORMs. Note that, the create_engine function does not establish any actual DBAPI connection directly. The Engine refers to a connection pool, which means under normal circumstances, there are open database connections present while the Engine object is still resident in memory. A connection pool is a way to maintain a pool of active database connections in memory. So in order to actually connect to the database, the connect method of the Engine object or an operation which is dependent on the method needs to be called. This method leads to the use one of the database connections in the pool to connect to the database.

SQLAlchemy Connection Pools

A connection pool is an implementation of the object pool design pattern. Object pool is a way of caching objects so as to be reused. Object pool design pattern offers a significant performance boost especially in situations where the cost of initializing a class instance is high. Also, the pool can automatically grow by creating new objects when the pool is empty and a request is sent to the pool for an object, or pools can be restricted to the number of objects created. Hence, a connection pool is a standard technique used to maintain long running connections in memory for efficient re-use, as well as to provide management for the total number of connections an application might use simultaneously. SQLAlchemy implements various connection pool patterns. The Engine returned with the create_engine function in most cases has a QueuePool which has reasonable pooling defaults.

SQLAlchemy Dialects

SQLAlchemy Dialects is used by SQLAlchemy to communicate with the various implementations of the Python Database API Specification (DBAPI). The DBAPI defines a standard interface to access databases for modules/libraries developed in Python. SQLAlchemy Dialects describes how to talk to a specific kind of database/DBAPI combination. Some implementations of the DBAPI are pyscopg2 and mysql-connector-python for PostgreSQL and MySQL resoectively. NOTE: All Dialects require that an appropriate DBAPI is installed.

SQLAlchemy dialects that are out-of-the-box and supported are;

PostgreSQL
MySQL
SQLite
Oracle
Microsoft SQL Server

NOTE: There are more SQLAlchemy Dialects.

SQLAlchemy Basic Relationship Patterns

We can now use all what we have learnt so far to map relationships between classes to relationships between tables. SQLAlchemy supports four types of relationships: One-to-One, One To Many, Many To One, and Many To Many. These relationships will be implemented next. But before then, we need to install some libraries and run an actual database in order to query data on SQLAlchemy. As we already know, SQLAlchemy provides support for many different databases engines, but we will be working with the PostgreSQL. But first, let’s create a new directory and a virtual environment for our project as follows;

>> mkdir object-relational-mapping 
>> cd object-relational-mapping 
>> python3 –m venv orm-venv 
>> venv/Scripts/activate

The first and second commands above creates the project directory (object-relational-mapping) and changes to the directory respectively while the third and fourth commands creates a virtual environment and activates the virtual environment respectively.

Installing SQLAlchemy and its Dependencies

Firstly, remember that all Dialects require that an appropriate DBAPI is installed and since we will be working with the PostgreSQL database engine, then we need to install psycopg2 which is the most common DBAPI implementation to connect to a PostgreSQL engine. Also, SQLAlchemy is not part of the Python standard library and so we need to install this also. These are the only two libraries that will be needed actually and so to install these libraries into our virtual environment, we will use pip as follows:

>> pip install sqlalchemy psycopg2

Starting PostgreSQL

There are several ways to get an instance of the PostgreSQL database engine. In this tutorial, we will be making use of the RDS service on Amazon Web Services (AWS). Another option will be to install PostgreSQL locally on our current environment. So in order to create an instance of PostgreSQL on AWS, you need to create an account on AWS if you don’t have one already. NOTE: For this example, we will be staying within the Free-tier limit on AWS. So once we are logged into our AWS account, click on services in the navigation bar and then click on RDS under Database. On the new page that opens, click on “create database”. Then follow the following steps to create the DB instance;

Fill the following fields; DB instance identifier, Master Username, Master password and Confirm password
Next, scroll down up to “Additional connectivity configuration” and click the arrow to show the settings under this category, then click on “Yes” under “Publicly accessible” in order to make it possible for us to connect to the database externally.
Also, scroll down up to “Additional configuration” and click the arrow to show all the settings under this category and then fill the “Initial database name” field
Finally, click on create database and wait a few minutes until the status of your database changes to “Available”.
Once the database becomes available, you can click on it to find the “endpoint” that will be used to connect to the database. This endpoint is under the “Connectivity & Security” tab after clicking on the newly created database.

NOTE: Please remember to delete the database after the example in order not to incur any charges.

Mapping Classes to create Relationships

Here, we will be implementing the One-to-Many relationship and ideas on how to extend the example in order to implement the other relationships will be mentioned for anyone who wishes to do that. This example is based on creating two Database Table; clubs and players. The clubs table will consist of football clubs while the players table will contain football players and the relationship is such that the football players can only play for a club while a club can consist of several players. Firstly, let’s create a base file to create our engine and then a session in order to be able to work with the ORM. Also, we will define a base class in the base file from which our Python Classes will inherit from in order to produce the appropriate Database Tables. The content of the base file is as shown below;

base.py

from sqlalchemy import create_engine 
from sqlalchemy.ext.declarative import declarative_base 
from sqlalchemy.orm import sessionmaker #creating our engine 
engine = create_engine('postgresql+psycopg2://yinka:password@orm.cwmprma8kzpp.us-east-1.rds.amazonaws.com:5432/orm') #remember that the engine needs to be passed to a Session object in order to be able to work with the ORM 
Session = sessionmaker(bind=engine) #the base class for defining our classes in order to produce the appropriate Tables 
Base = declarative_base()

Please note the general format for the database uri is;

>> dialect+driver://username:password@host:port/database

and then based on the credentials we used “Master username”, “Master password” and “Initial database name” while creating our database in AWS and the “endpoint” to the created database;

>> postgresql+psycopg2://Master username:Master password@endpoint:5432/Initial database name

So please edit the database uri in the code in order to able to connect to your database as I would have already deleted this database by the time you are seeing this. Thank you :)

club.py

Next, let’s create the clubs table as follows;

from sqlalchemy import Column, String, Integer, Date 
from base import Base class Club(Base): 
    __tablename__ = ‘clubs’     id = Column(Integer, primary_key=True) 
    club_name = Column(String) 
    club_stadium = Column(String) 
    date_founded = Column(Date)     def __init__(self, club_name, club_stadium, year_founded):
        self.club_name = club_name 
        self.club_stadium = club_stadium 
        self.year_founded = year_founded

This inherits from the Base class to create the clubs table. This table has four columns;

the primary key
name of the club
name of the club’s stadium
the date the club was founded

Note that we are using Python data types (Integer, String, Date etc) which SQLAlchemy maps to the most appropriate database column type depending on the target database engine.

player.py

Next, let’s create the players table as follows;

from sqlalchemy import Column, String, Integer, Date, Table, ForeignKey 
from sqlalchemy.orm import relationship 
from base import Base class Player(Base): 
    __tablename__ = 'players' 
    
    id = Column(Integer, primary_key=True) 
    player_name = Column(String) 
    player_number = Column(Integer) 
    club_id = Column(Integer, ForeignKey('clubs.id')) 
    club = relationship('Club', backref='players')     def __init__(self, player_name, player_number, club):
        self.player_name = player_name 
        self.player_number = player_number 
        self.club = club

The players table also has four columns;

the primary key
name of the player
shirt number of the player
information about the club the player plays for which is inherited from the clubs table.

The “club” variable is not a column but what defines the relationship between the players and the clubs table.

update.py

It’s time to actually create our database schema and insert some data into it. This is done as follows;

from player import Player 
from base import Session, engine, Base 
from club import Club 
from datetime import date #create database schema 
Base.metadata.create_all(engine) #create a new session 
session = Session() #create clubs 
manchester_united = Club('Manchester United', 'Old Trafford', date(1878, 1, 1)) 
chelsea = Club('Chelsea', 'Stamford Bridge', date(1905, 3, 10))
juventus = Club('Juventus', 'Allianz Stadium', date(1897, 11, 1)) #create players de_gea = Player('David de Gea', 1, manchester_united) 
pogba = Player('Paul Pogba', 6, manchester_united) 
kante = Player("N'Golo Kante", 7, chelsea) 
ronaldo = Player('Cristiano Ronaldo dos Santos', 7, juventus) #persist the data 
session.add(manchester_united) 
session.add(chelsea) 
session.add(juventus) session.add(de_gea) 
session.add(pogba) 
session.add(kante) 
session.add(ronaldo) #commit and close session 
session.commit() 
session.close()

Firstly, we created the database schema and then a session that will be used to persist the data. Next, we created some club and player objects that will be converted into rows of data in our clubs and players tables respectively. Then, we persisted the data and finally committed the data before closing the session. With this, the data now sits in our database.

query.py

Finally, let’s query our database to confirm that the data we persisted in the last section actually has been saved. This is achieved with the following code;

from club import Club 
from base import Session 
from player import Player #creates a session 
session = Session() #extracts all the players 
players = session.query(Player).all() for player in players: 
    print(f’{player.player_name} plays for {player.club.club_name} and wears shirt number {player.player_number}’)

The output of this is as follows;

David de Gea plays for Manchester United and wears shirt number 1 Paul Pogba plays for Manchester United and wears shirt number 6 N'Golo Kante plays for Chelsea and wears shirt number 7 Cristiano Ronaldo dos Santos plays for Juventus and wears shirt number 7

SWEET!!!

FINALLY

Thank you for coming and I hope this has been helpful. As promised, in order to complete this example to implement the other relationships;

One-to-One: A new table with information about players that are the all-time highest goal scorers for a club. Since a club can only have only one all-time highest goal scorer and its very rare to have a player be the all-time highest goal scorer in more than a club, then a one-to-one relationship can be established with the clubs table.

Many-to-Many: A new table with information about sponsors. Since a club can have more than a sponsor and a sponsor can sponsor more than a club, a many-to-many relationship can be established with the clubs table.