Supporting Wide Tables in MySQL

In a recent blog post, we saw that MySQL (with the InnoDB engine) has an upper limit on the number of columns allowed in a single table. This limit depends on the data types used, ranging from 197 columns of type mediumtext (assuming 16KB disk blocks) up to 1016 columns of type datetime. Attempting to define wider tables results in a somewhat obscure error message.

In this follow-up post, we’ll examine a couple of techniques for simulating wide tables of 1000 columns. For these techniques to be useful, they must support the same query operations we use with narrow (non-wide) tables.

Proposed Solutions

After considering our options, we came up with two solutions for supporting wide tables:

  1. Join multiple sub-tables — For example, join together 10 x 100 column tables, giving a total of 1000 columns.
  2. Use JSON-valued columns to create virtual columns — That is, store 100 JSON key/value pairs in each of 10 different JSON-typed columns.

Although these ideas are fairly straightforward, at least on the surface, it’s unclear how they’ll perform in practice. Let’s start by learning more about the two solutions.

Solution 1 — Joined Tables

The first choice is to divide each wide table into multiple narrow tables, then join them together based on their id column (the primary key). For example, columns 0 to 99 of table 17 would be stored in table17_0, columns 100 to 199 would be stored in table17_1, all the way through to columns 900 to 999 being in table17_9. As a general rule, column N will be stored in the sub-table numbered floor(N / 100).
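
As a sketch, the first two sub-tables of table 17 might be defined as follows (the column names and types shown here are illustrative, following the alternating MEDIUMTEXT/DATETIME/DECIMAL pattern described in the test setup below):

mysql> create table table17_0 (
  id int not null auto_increment primary key,
  c0 MEDIUMTEXT,
  c1 DATETIME,
  c2 DECIMAL(38,6)
  -- ... and so on, up to c99
);
mysql> create table table17_1 (
  id int not null primary key,  -- shares its id values with table17_0
  c100 MEDIUMTEXT,
  c101 DATETIME,
  c102 DECIMAL(38,6)
  -- ... and so on, up to c199
);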

To run a query, we join all of the sub-tables to create a wide table of 1000 columns. Operations such as avg and sum can then be used in the normal way.

mysql> select avg(c42), sum(c134)
from table17_0 join (table17_1, ..., table17_9)
on (table17_0.id = table17_1.id
and ...
and table17_0.id = table17_9.id);

One initial concern with this approach is that joining tables can be CPU and RAM intensive, but without experimentation it’s unclear what the impact will actually be. It’s also unclear whether a 1000-column temporary table can be stored within MySQL. It might be possible in RAM, but if a temporary table was spilled to disk, would it suffer from InnoDB’s column limitations?

Observant readers will realize that selecting columns c42 and c134 doesn’t require all 10 sub-tables to be joined. Instead, we could join only the sub-tables containing columns mentioned in the query. This includes the select portion of the query, as well as columns mentioned in the where, group by, and order by clauses (amongst others). For example, the following query only requires sub-tables 1, 3, and 7 to be joined.

mysql> select avg(c134) 
from ... on ... where c763 > 5
order by c357;
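
Written out in full, a query of this shape might look something like the following sketch (a group by is added so that the ordering has something to act on):

mysql> select c357, avg(c134)
from table17_1 join (table17_3, table17_7)
on (table17_1.id = table17_3.id
and table17_1.id = table17_7.id)
where c763 > 5
group by c357
order by c357;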

In our testing, we’ll evaluate both the join all sub-tables and join sub-tables when needed approaches.

Solution 2 — JSON Tables

Another possible solution to the wide-table problem is to use JSON-valued columns. The JSON column type stores JSON objects, allowing for an extremely large number of virtual columns.

As an example, the following JSON object stores virtual columns c100 through to c199.

{
"c100": "2017-10-12",
"c101": "123.456",
...
"c199": "My Text"
}

As an optimization, we’ve considered using hard-typed columns for the first 100 columns, but virtual columns for the remaining 900. This general solution allows narrow tables to benefit from the performance and compactness of regular columns, while still allowing wide tables to exist, albeit less efficiently.

To create a JSON-based wide table, we use the following syntax:

mysql> create table table0 (
  id int not null auto_increment primary key,
  c0 MEDIUMTEXT, ..., c99 DATETIME,
  j1 JSON, j2 JSON, j3 JSON, ..., j9 JSON
);
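
As a sketch of how a row would then be stored (only a few of the 1000 columns are shown, and the values are invented for illustration), an insert might look like this:

mysql> insert into table0 (c0, c99, j1, j2)
values ('My Text', '2017-10-12 00:00:00',
  '{"c100": "2017-10-12", "c101": "123.456"}',
  '{"c200": "More Text", "c234": "42.5"}');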

To compute the value of a field, we use the following algorithm:

For column N, where N < 100, directly access the column via its name cN.

For column N, where N ≥ 100 (such as 234 — stored in JSON field j2):

  • For a MEDIUMTEXT column, use: JSON_EXTRACT(j2, '$.c234')
  • For a DECIMAL column, use: CAST(JSON_EXTRACT(j2, '$.c234') AS DECIMAL(38,6))
  • For a DATETIME column, use: STR_TO_DATE(JSON_EXTRACT(j2, '$.c234'), "%Y-%m-%d")
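
For example, here’s a sketch of averaging the virtual DECIMAL column c234 alongside a hard-typed column, using the expressions above (depending on how the values were stored, JSON_UNQUOTE may also be needed before the conversion):

mysql> select avg(c42) from table0;   -- hard-typed column (N < 100)
mysql> select avg(cast(json_extract(j2, '$.c234') as decimal(38,6)))
from table0;                          -- virtual column (N >= 100)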

For virtual columns, clearly there’ll be extra CPU work to parse the JSON string and to locate the appropriate key/value pair. Next, there’ll be work to convert the string into the native data type. Only then can arithmetic operations be performed.

The advantage of this JSON approach is that no table joins are required, since all data is held within a single table. However, there’s extra CPU burden for locating keys within the JSON string, as well as converting strings into native data values. Experimentation is still needed to measure the impact.

Measuring Actual Performance

To evaluate each of our wide-table solutions, we devised a set of experiments. Tests were performed on MySQL 5.7.19 via Amazon RDS (Relational Database Service), using a db.m4.xlarge instance (4 vCPUs and 16GB of RAM).

For each of the Joined Tables and JSON Tables approaches, the database was pre-populated with appropriately-formatted tables:

  • We created 25 test tables, populated with test data. For joined wide-tables, multiple sub-tables were created (for example: table25_0, table25_1, table25_2, etc.). For JSON wide-tables, a single table was created (for example: table25).
  • Each table held 100,000 rows of randomly-generated data. All tables had 1000 columns, except for the base-case testing of 100 columns (see later).
  • Column types alternated between MEDIUMTEXT, DATETIME, and DECIMAL in a predictable fashion, allowing our tests to choose a random column of a specific data type.

For each table configuration, a series of performance tests were run:

  • Each test run consisted of 1000 individual queries, with each query’s duration recorded in milliseconds. Queries (representative of our typical workload) included aggregations (such as sum or avg) over a full column, grouping rows into sub-groups and counting the members of each group, returning a randomly-selected group of 500 full rows, and filtering rows that match a specific regular expression (regex); each of these query shapes is sketched after this list.
  • At the end of each test run (of 1000 queries), the average query duration was computed. We kept all the raw data and could go back to look for outliers, or other interesting patterns obscured by taking the average.
  • Each individual query was executed on a randomly-chosen column (or columns) on a randomly-chosen table. This gave a fairly realistic workload, assuming multiple customers are accessing multiple tables at any given time. Although the goal was to randomly select from one of 25 different wide tables, many of the experiments were limited to using only one or two tables, for reasons we’ll see later.
  • All test runs were repeated with 1, 2, 5, 10, 15, and 20 concurrently active users.
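
As a rough sketch of those four query shapes (the table and column names below are invented for illustration; the real driver chose them at random), the narrow-table variants looked something like this:

mysql> select avg(c14) from table3;                             -- full-column aggregation (c14 assumed DECIMAL)
mysql> select c27, count(*) from table3 group by c27;           -- group rows and count the members of each group
mysql> select * from table3 where id between 41500 and 41999;   -- return a block of 500 full rows
mysql> select count(*) from table3 where c15 regexp 'abc.*xyz'; -- filter rows matching a regular expression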

Experiment: Base Case — 100 Columns

Before we could measure the performance of wide tables, we first needed an understanding of the typical performance for narrow tables (100 columns). This should have been simple, but it uncovered an important fact about database performance. The following figure shows the average time for randomly selected queries on 100-column tables. The x-axis varies from a single simulated database user, all the way up to 20 concurrent users (threads).

The first scenario (the purple line) illustrates the performance trend for up to 20 concurrent users, reading from 20 possible tables. There’s no assignment of users to tables, so any of the simulated users can perform a random query on any of the 20 tables. The trend for this first scenario should not be surprising. Queries take less than a second (on average), with the average query duration increasing slightly as we add concurrent users.

The second scenario (the green line) illustrates the same test cases, but with a total population of 25 tables (5 more than the first case). In this situation, the average query response time is significantly worse — up to 3 seconds.

The explanation here is that MySQL’s in-memory buffer pool can only hold around 20 of our narrow tables (100,000 rows of 100 columns each). With InnoDB, the buffer pool is used for caching data and indexes in memory, making access to tables significantly faster than reading from disk. On the MySQL 5.7 server used for testing, the buffer pool was sized at about 13GB of RAM.
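
For reference, the configured buffer pool size (reported in bytes) can be checked on the server:

mysql> show variables like 'innodb_buffer_pool_size';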

The challenge in the second scenario is that the buffer pool isn’t large enough to store all 25 tables. Instead, many of the queries require data to be loaded from disk, replacing other disk blocks that might currently be cached. Due to tables being swapped in and out of memory frequently, the query times were significantly higher.

(Note: it’s unclear why the query time peaks at 10 concurrent users — this would take more research to understand).

It’s interesting to note that many of the remaining test cases (to be discussed later) are heavily impacted by this performance characteristic of the buffer pool. Therefore, it’s not just necessary to find a wide-table solution that works correctly, but also to understand how the solution interacts with the buffer pool. Too much disk access simply slows down queries, negatively impacting all users of the database, not just those who are using wide tables.

Experiment: Joined Tables

With the base-case understood, let’s compare the performance of both the Join all sub-tables and Join sub-tables when needed approaches. In both cases, we use a collection of 10 sub-tables, each with 100 columns, joined to create a 1000-column wide table. What differs between the two approaches is the number of sub-tables joined for each query.

There are several things we can learn from this graph:

First, we learn that the sub-tables approach (whether joining all, or joining when needed) works reliably. This is good news, eliminating the concern that materializing a 1000-column join result might not be feasible in MySQL, especially when temporary tables are spilled to disk.

Next, we learn that joining all 10 sub-tables (the green line) provides significantly worse performance than only joining the tables required for the specific query. The difference isn’t too significant for a small number of concurrent threads (simulated users), but grows out of control as the number of threads increases. This should be no surprise, since the join operation is quite expensive on CPU and RAM.

Reading between the lines, it’s clear that MySQL doesn’t do anything to optimize away unnecessary table joins. Perhaps there are configuration options we’re unaware of, but the optimizer would need confidence that the sub-tables have an exact 1:1 relationship on their id columns, and therefore that rows would never be discarded or duplicated by the join. Instead, our application must optimize the query itself, joining only the sub-tables required to satisfy it.

Impact of Disk IOPS

Observant readers will notice that our previous tests were executed with only two active tables (instead of 20). In reality, each of our wide tables holds as much data as 10 narrow tables, so we already have the equivalent of 20 narrow tables in the database. Based on what we know about the MySQL buffer pool, it’s reasonable to assume that having more than two active tables negatively impacts query performance.

Clearly, this large working set of database blocks is fairly limiting. Even if we use the join-when-needed approach, there’ll still be significant delays as sub-tables are loaded into RAM. The next question is how much we can speed up the disk-to-RAM transfer. To do this, we must increase the disk IOPS (input/output operations per second), a configurable parameter in AWS RDS.

In the next graph, the blue line shows the steep gradient associated with having a working set of database blocks (20 active wide tables) that is significantly larger than the buffer pool can hold. Most of the query time was dedicated to reading disk blocks into memory before the query could be completed. For some queries the necessary blocks were already in RAM, but for most queries this was not the case.

The yellow line illustrates the impact of increasing the database’s maximum IOPS from 2500 (the blue line) up to 7500. We see a significant reduction in the time required to load disk blocks into the MySQL buffer pool, but performance is still nowhere near our base case. Note that these test runs were aborted after measuring 5 concurrent users, simply due to the time required.

In summary, our join sub-tables when needed approach is feasible to use in practice. However, due to the number of disk blocks required to store a wide table, it’s unreasonable to access more than a couple of 100,000 row by 1000 column tables at one time.

Experiment: JSON Tables

For our second proposed wide-table solution, using JSON-typed columns, we performed a similar set of tests. In this case, we used two active tables, with 100 hard-typed columns (standard MySQL columns), and an additional 9 JSON-typed columns each storing 100 virtual columns. Similar test data was loaded into the tables, and the same randomly-generated queries were executed (although modified to use JSON_EXTRACT to read JSON-typed columns).

As the following graph shows, the performance impact of JSON tables was immediately obvious. Even for a single thread, queries were taking over 10 seconds. There seemed no point in investigating further.

The main reason for this performance degradation is that JSON tables are enormous, and won’t easily fit into the buffer pool. Given that numeric and date types will be encoded as strings (rather than their native data format), there’ll be a significantly larger number of disk blocks used for storage. Additionally, it’s likely (although not confirmed) that MySQL reads all JSON columns into memory, even if they’re not required for the query.

Although it’s not the key bottleneck, there’ll also be a non-trivial amount of overhead to parse a JSON column (containing 100 virtual columns), extract the specific column we’re interested in, and then convert that column’s value into a native number or date. The impact of this slowdown was anecdotally experienced when first constructing the JSON test driver on much smaller tables.

The obvious conclusion is that JSON tables are far less performant than joined tables, and are therefore not worth considering further.

Conclusion

In summary, we can support 1000-column tables in MySQL, although with a few caveats:

  • Joining narrow sub-tables to make a single wide-table is the best approach. The alternative of using JSON objects to create virtual columns was far less efficient.
  • Application code must only join the sub-tables required for the query, rather than joining all sub-tables. This reduces the CPU and RAM needed by join operations, and reduces the working set of database blocks in the buffer pool.
  • Since wide-tables require a larger number of database blocks than narrow tables, the impact of cache-misses in the MySQL buffer pool becomes a lot more apparent.