The Maximum Width of MySQL Database Tables
One of ACL’s cloud-based products stores customer data in a user-defined table format. Tables have rows and columns, with each column having data types (such as characters, numbers, or dates) that can be specified by the customer themselves. Although most tables are less than 100 columns wide, we’ve recently seen examples that approach 1000 columns. In order to provide a quality product, we wanted to understand the theoretical table limits.
As it happens, the underlying MySQL database server doesn’t support a large number of columns. Our initial findings indicate that only 197
textcolumns can be added to a MySQL table. For us to support 1000 columns, some creative thinking is needed.
If you’re interested in jumping directly to the proposed solutions, please see this follow-up blog post.
Understanding the Current MySQL Limits
We first discovered the wide table problem by trying to import existing tables from another database into MySQL. For example, we used the following type of query to create a 534-column table:
c3 DECIMAL(38, 6),
CREATE TABLE queries of this size, we consistently saw the following cryptic error message:
(42000) at line 2: Row size too large (> 8126). Changing some columns to TEXT
may help. In current row format, BLOB
prefix of 0
bytes is stored inline.
Unfortunately, unless you have an in-depth knowledge of how MySQL works, you’ll have no hope of understanding what this message means. After searching the Internet, a few blog posts offered some helpful suggestions, but none of them really explained why there was a limit, nor how we could work around this restriction.
Reading the MySQL Documentation
If in doubt, read the manual. We took a deep dive into the MySQL documentation for the InnoDB storage engine (the default engine for MySQL 5.7) and learned quite a few things.
There was no clear answer on the maximum number of columns supported, except for the theoretical maximum of 1017 columns. Throughout the documentation it became clear that the actual maximum depends on both the underlying MySQL storage engine (such as InnoDB or MyISAM) as well as the type of data stored in each row.
Here are some important facts about our environment that could impact our choice of wide-table solution.
- We’re using the InnoDB engine within MySQL since we need transaction support (the MyISAM storage engine doesn’t support transactions).
- We’re restricted to using the MySQL storage engines that are supported by AWS’s Relational Database Service (RDS). This means we can’t use any non-standard solutions.
- In our product, we make use of the
DECIMAL(38,6)data types. These were carefully chosen to support the field length and numeric precision we require.
- We’re using MySQL 5.7.19 — different versions have different behaviours.
After reading through the MySQL documentation, the following configuration settings seemed related to our wide-table problem. Each of them could conceivably impact the number of supported columns.
Parameter: Database Block Size
All data in a table is stored within InnoDB disk blocks. The default size is 16KB per block, although this can be decreased to 1KB or increased to 64KB. However, the block size does not directly translate to the amount of data that can be stored in each row. The following quote from the Limits on Table Column Count and Row Size page stood out:
InnoDBrestricts row size (for data stored locally within the database page) to slightly less than half a database page for 4KB, 8KB, 16KB, and 32KB
innodb_page_sizesettings, and to slightly less than 16KB for 64KB pages.
Therefore, the value of 8126 bytes we saw in the error message seems justified (slightly half of our 16KB block). Although we didn’t attempt it, increasing to 64KB blocks would merely double that space.
Parameter: Row Storage Format
The InnoDB storage engine has several different approaches for storing row data within the disk blocks. The particular storage format can be selected by adding the
ROW_FORMAT= flag to the end of the
CREATE TABLE SQL statement. See the InnoDB Row Storage and Row Formats page for more detail.
- COMPACT — As the name suggests, MySQL tries to store row data as compactly as possible. It does this by storing the data in the main B-Tree index for fast access. However, for large fields (such as long
TEXTfields) it'll only store the first 768-bytes inline, with the remainder of the data being stored in separate overflow pages (requiring additional disk access to retrieve).
- COMPRESSED — In this format, the data is compressed (using standard compression algorithms) before being written to disk. This speeds up query processing by requiring that fewer disk blocks be read or written. The one caveat of this technique is that the MySQL system tablespace can not support compression so we must either use the file-per-table tablespace or create a general tablespace.
- DYNAMIC — This row format provides a compromise solution where data is stored inline in the B-Tree if it’s small, else the entire field (not just the overflow data) is stored in overflow pages. This suggests that it’s possible to have a minimally-sized pointer to the overflow page be stored in our precious 8126 bytes, rather than storing actual text data.
Experimental Column Maximums for InnoDB
Armed with the above information, it was now time to determine the actual maximum limit of InnoDB, since the theoretical maximum of 1017 columns clearly wasn’t going to work. For each of the data types used in our product, and for each of the available row formats, we attempted to create a wide table.
Ideally, the maximum width could be determined by whether the
CREATE TABLE command is successful. However, for the
MEDIUMTEXT data type, it’s also important to add data into each column of the table, rather than defaulting to a
NULL value, or using just a short string. Only with realistic-looking data can we be sure it works.
The results are as follows:
- COMPACT format = 10 columns — Due to each
MEDIUMTEXTfield storing the first 768-bytes directly in the B-Tree, our 8126 bytes were quickly used up. Not a great solution for wide tables.
- DYNAMIC format = 197 columns — In this case, each
MEDIUMTEXTfield was stored in a separate overflow disk page, with roughly 40-bytes (8126/197) used to store a pointer to that page from the B-Tree index. Note that having all data on a separate page will slow down query performance, but at least we're able to have more columns per table.
- COMPRESSED format = 186 columns — With compression, we can almost get the same number of columns as with
DYNAMIC. Text usually compresses well, so it's unclear why there are fewer (not more) columns.
- COMPACT/DYNAMIC/COMPRESSED formats = 1016 columns — For
DATETIMEtypes, regardless of the row format, we're able to reach the theoretical maximum number of columns. This implies that a
DATETIMEvalue is stored using something around 8 bytes each (8126/1016).
DECIMAL(38, 6) Columns
- COMPACT/DYNAMIC formats = 447 columns — Both
DYNAMICrow formats require about 18 bytes per column (8126 / 447), although this will likely change if we modify the
- COMPRESSED format = 399 columns — With
COMPRESSEDformat, the number of columns is slightly less than the other two formats. This shows that compression doesn't always provide a better result, especially for binary-like data.
In summary, the maximum number of columns supported by MySQL (with InnoDB) ranges from 197 columns to 1016 columns, based on data types used. Given that our customers decide on the column data types, we’re forced to assume the worst case of
MEDIUMTEXT. In theory we could increase the disk block size to 32KB and we might see 197 x 2 = 394 columns, but that testing wasn’t attempted. In any case, it’s still far beneath our 1000 column expectation.
In a follow-up blog post, we’ll examine several approaches to simulating tables of up to 1000 columns in MySQL. We’ll then study how MySQL performs when these tables are placed under a realistic query workload.