SQL Server query mode execution and columnstore indexes
A magic trick to activate batch mode execution on SQL Server 2017 “without” columnstore indexes!
This article covers two SQL Server query processing methods known as row mode execution and batch mode execution. We will also describe the unique mechanism to enable batch mode execution on SQL Server 2017 even when you cannot create columnstore indexes.
Row mode execution
Row mode execution is a query processing method used with traditional disk-based tables, where data is stored in row format. Tables that use this traditional storage format are also referred to as “rowstore tables”. When a query is executed and accesses data in row store tables, the execution tree operators and child operators read each required row, across all the columns specified in the table schema. From each row that is read, SQL Server then retrieves the columns that are required for the result set, as referenced by a SELECT statement, JOIN predicate, or filter predicate.
Row mode execution is very efficient for OLTP scenarios, but can be less efficient when scanning large amounts of data, for example in Data Warehousing scenarios.
Batch mode execution
Batch mode execution is a query processing method used to process multiple rows at the same time. This method is based on the use of vectors, each column within a batch is stored as a vector in a separate memory area. Batch mode execution uses algorithms optimized for multi-core CPUs ensuring higher memory throughput. The first implementation of the batch processing method was tightly integrated and optimized with the columnstore storage format, however these are two independent features.
When a query is executed in batch mode, and accesses data in columnstore indexes, the execution tree operators and child operators read multiple rows together in column segments. SQL Server reads only the columns required for the result, as referenced by a SELECT statement, JOIN predicate, or filter predicate. Batch mode execution is very efficient Data Warehousing scenarios, where large amounts of data are read and aggregated.
Batch mode execution has been extended over time so starting from SQL Server 2019 and Azure SQL Database it does not require the presence of columnstore indexes to be used. This is the new batch execution mode on rowstore that allows you to process multiple rows simultaneously in analytical workloads without columnstore indexes. Batch mode execution on rowstore, in SQL Server 2019 and Azure SQL Database, enables batch mode for all execution plan operators that support this feature.
Some of you are wondering if it is possible to enable batch execution mode on rowstore also in SQL Server 2017 without adding columnstore indexes … what you are thinking is possible with a trick, let’s see how!
The database on which we will run the example query is the AdventureWorks2017 with compatibility level set to SQL Server 2017 (140). This copy of the AdventureWorks2017 database contains the dbo.bigTransactionHistory table that was added thanks to the script developed by Adam Machanic and published in his article Thinking Big (Adventure). The table contains a lot of transactions of the managed products, it contains 31.263.601 rows!
Consider the following query that extracts the dynamic stock level of products in the warehouse and returns the running-total calculated over the warehouse quantity. The returned data-set is sorted by product identifier and transaction identifier.
,SUM(Quantity) OVER(PARTITION BY ProductID ORDER BY TransactionID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)FROM
The execution plan generated for the query uses parallelism, the IX_ProductId_TransactionDate index is read entirely (Index Scan) because there aren’t filter predicates. The Sort operator sorts the data by TransactionID, ProductID and Quantity.
The sorting operation (Sort Task) represents the most expensive operation in the execution plan and it is also highlighted with the warning symbol (Sort Warning) because it uses the tempdb system database (spill on tempdb). Afterwards, in the execution flow, data are partitioned by ProductID (Segment) and aggregated (Stream Aggregate) to calculate the running-total over the Quantity column.
As you can see in the following picture, all tasks in the query plan are executed in “Row mode”.
In the test environment, the execution of this query takes 40 seconds (without returning data). This query would have a huge benefit if it were run in batch mode but we don’t have SQL Server 2019 and we don’t want to add a columnstore index to the dbo.bigTransactionHistory table because it is affected by an OLTP workload.
There is a trick we can use: creating a filtered columnstore index whose filter predicate contains a contradiction such as the request to have the TransactionID column equal to -1 and equal to -2 at the same time! The filtered columnstore index will be empty so it will not add any overhead to the DML queries and, at the same time, it will activate the batch mode processing!
The following T-SQL code implements the creation of the filtered columnstore index on the dbo.bigTransactionHistory table.
CREATE NONCLUSTERED COLUMNSTORE INDEX
ON dbo.bigTransactionHistory (TransactionID)
WHERE (TransactionID = -1 AND TransactionID = -2);
Because we have now a columnstore index (even if empty) the batch mode execution is enabled, the execution plan continues to be characterized by parallelism but the operators, except the Index Seek, work in batch mode. The Sort operator continues to have the highest cost but provides data to the Window Aggregate operator which has taken the place of the Segment, Sequence Project, Window Spool and Stream Aggregate operators as shown in the following picture.
In the test environment, the execution of the query takes about 20 seconds (without returning data). Half time compared to the previous run (row mode execution).
The performance of the query can be improved by creating the following index on the ProductID and TransactionID columns with the Quantity column included (in the INCLUDE clause).
CREATE NONCLUSTERED INDEX
ON dbo.bigTransactionHistory (ProductID, TransactionID)
Because there is now a covered index for the query we have the best plan ever, batch mode execution without sorting tasks as you can see in the following picture.
Batch mode execution in SQL Server 2019 and Azure SQL Database does not require columnstore indexes to be used. In SQL Server 2017 the batch mode execution can be activated by creating a filtered columnstore index in which a contradictory condition is specified such as (TransactionID = -1 AND TransactionID = -2). The filtered columnstore index created in this way will have no maintenance costs but it will activate batch mode execution for operators who support this type of processing method!
Enjoy SQL Server and T-SQL language!