Dynamic Models In SQL

Rakshit Choudhary
4 min readMar 15, 2020

--

We at Grofers started using segment as our solution for enabling us to create different user segments for running analytics. It exports the segments created on the dashboard to our self hosted database to which our internal services can query from. This integration was adding new columns to our table every time a new segment was created on the dashboard. Now in order to query these columns we had to update the model in our codebase every time a new column was added. This turned out to be quite cumbersome for us to manage.

We could have gone with an integration with a NoSQL database like MongoDB but segment currently doesn’t support it.

The solution we ended up going up was for creating dynamic models in SQL using SQLAlchemy’s Inspector class.

Introduction

This is a quick tutorial on how to create dynamic models in SqlAlchemy by getting the latest table metadata from the database.

Model :

A model is the single, definitive source of information about your data. It contains the essential fields and behaviors of the data you're storing. Generally, each model maps to a single database table.

The basics:

  • Each model is a Python class.
  • Each attribute of the model represents a database field.
  • With all of this, Flask gives you an automatically-generated database-access API; see Making queries.
class User(db.model):
__tablename__ = "users"

id = db.Column(db.BigInteger, primary_key=True)
name = db.Column(db.Text, index=True)
email = db.Column(db.Text, index=True)
segment_A = db.Column(db.Boolean, index=True)
segment_B = db.Column(db.Boolean, index=True)

But the problem arises when the columns in your table change very frequently.

In such a scenario you need to update your model every time you are adding a new column to your table. Which is very tedious if you are dealing with production code.

Enter the SQLAlchemy Inspector :

So we can overcome this by dynamically fetching the metadata about the database and then creating the model on the fly.

SQLAlchemy provides Inspector which is a low-level interface that provides a backend-agnostic system of loading lists of schema, table, column, and constraint descriptions from a given database are also available.

A Inspector an object is usually created via the inspect() function:

from sqlalchemy import inspect, create_engine
engine = create_engine('SQLALCHEMY_DATABASE_URI')
insp = inspect(engine)

Inspector returns the following things from the database

  • Schema
  • Tables
  • Columns and constraints.

Inspector object has different methods that we can use to extract out information that we need about the DB like schema name, foreign keys, indexes etc . You can read more about these here. In this blog, we will be using the get_columns method of the inspector object which gives us information about columns.

def refresh_user_model():
# getting the schema and extracting the required column names
inspector = Inspector.from_engine(engine)
db_columns = inspector.get_columns('users',schema=DB_SCHEMA)
new_user_model_columns = []
for column in db_columns:
new_user_model_columns.append(tuple([column['name'],
column['type']]))

get_columns return column information as a list of dicts with these keys:

  • name - the column’s name
  • type - the type of this column; an instance of TypeEngine
  • nullable - boolean flag if the column is NULL or NOT NULL
  • default - the column’s server default value - this is returned as a string SQL expression.
  • attrs - dict containing optional column attributes

Primary Key Requirement in Models

One important point to note is that while defining a model in SQLAlchemy we need to assign a primary key to every model. Because in order to map to a particular table, SQLAlchemy ORM needs at least one column denoted as a primary key column.

So creating a new model we need to manually make a key as a primary key. Here I am mapping ‘id’ as a primary key because most of the tables have a unique identifier ‘id’ present in them.

Now in order to map a new model with an existing model, we can use mapper() function. A mapper appends the new table model with another model.

Because I have manually added id as a primary key I need to remove it from my new_user_model_columns in order to avoid mapping the same column twice.

Finally, before mapping a table again we need to clear the previous mapping as I am calling this method repeatedly at regular intervals. For that, we are using clear_mappers() which clears all the previous mappings.

# remaking the entire model
new_user_model_columns.remove('id','BigInteger')
metadata = MetaData(bind=engine, schema=DB_SCHEMA)
new_user_model = Table('users', metadata,
Column('id', BigInteger, primary_key=True),
*(Column(col, type) for col, type in
new_user_model_columns))
clear_mappers()
mapper(User, new_user_model)

Periodic refreshing the model :

For this, we can use threading where a method will keep on calling the refresh_user_model() method at regular intervals.

def start_user_model_sync():
threading.Timer(MODEL_REFRESH_TIME, start_user_model_sync).start()
refresh_user_model()

It takes around .03-.04 seconds to refresh the model.

Thanks for reading! If you have any questions or comments, please leave a response below.

--

--