EAV: Flexible RDBMS Schema

Kamalmeet Singh
3 min readFeb 4, 2022

--

Problem Statement

Allow the end-user to add custom columns on the fly to the tables. For example, a Lead table has 10 fixed columns like id, name, email, phonenumber, status, etc. But a user can add additional columns say leadprovider which is not needed by any other companies.

Option 1: EAV

Implementation

EAV or Entity Attribute Value is an industry wide accepted solution to manage flexible columns in a RDBMS based system. This helps extending tables tenant wise based on each tenant need. This approach does not have a limitation on the scale as data is being stored in a separate table.

Pros

  1. Space efficient
  2. Scalable
  3. Industry Best Practice

Cons

  1. Complexity
  2. Performance

Option 2: EAV — Multiple Level

Implementation

A variation of the EAV model, where we will have an Attribute and Value table per entity that we want to keep customizable.

Example

Entity: Lead

Attribute: Lead_Attribute

Value: Lead_Value

Entity: Product

Attribute: Product_Attribte

Value: Product_Value

Pros

  1. Space efficient
  2. Scalable
  3. Better performance than EAV Single level

Cons

  1. More tables to manage
  2. Higher Complexity
  3. Performance

Option 3: Nullable Attributes

Implementation

In this approach, instead of having additional tables, for values, we will add extra generic attribute columns.

Example

Entity: Lead

Columns

Id

Name

Email

Phone

..

Attribute1

Attribute2

Attribute3

Attribute4

Entity

Lead_Attribute

companyID

AttributeName

Pros

  1. Easier implementation than EAV

Cons

  1. Extra columns are always there even if they are not needed
  2. Performance impact
  3. Storage Impact

Option 4: Dynamic Attributes

Implementation

Various databases support this particular use case with a special form of columns. For example, MariaDB has the option to create Dynamic columns.

https://mariadb.com/kb/en/dynamic-columns/

This solves the very specific use case we have, that one can create columns on the go.

Pros

  1. In-Built support for dynamic columns

Cons

  1. This makes us dependent on MariaDB
  2. It will be challenging to integrate Analytics tools easily as there is a specific way to fetch dynamic columns.

Option 5: JSON based storage for additional attributes

Implementation

Another way to store additional columns is to add them as a single column in JSON. This will help keep all the additional information in a single attribute.

Pros

  1. Performance
  2. Low Complexity

Cons

  1. Using additional columns for analytics purposes might be difficult

Recommendation

Option 1: EAV

As this is a tried and tested solution accepted throughout the industry, this looks like the best option we need to go ahead with. It has certain drawbacks like additional coding complexity and performance impact due to additional tables and joins. But as this is an Industry accepted solution, it will be easier to implement and handle any corner cases.

Option 5: JSON based storage

The advantage of JSON-based storage is simplicity in implementation. The challenge with this approach might be towards analytics as the data is stored in single column.

References

https://mariadb.com/kb/en/dynamic-columns/

https://inviqa.com/blog/understanding-eav-data-model-and-when-use-it

https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

https://db.in.tum.de/research/publications/conferences/sigmod2009-mtd.pdf

--

--

Kamalmeet Singh

Tech Leader - Building scalable, secured, cloud-native, state of the art software products | Mentor | Author of 3 tech books |