Why is SQL Server doing an Index Scan instead of Index Seek?
--
This article explains how implicit conversion affects SQL query performance and causes the query optimizer to perform an index scan instead of index seek while searching for a specific value.
Implicit conversion… A quick reminder
In SQL, implicit conversion is an automatic data type conversion done by the compiler within an operation where different data types are compared or integrated. For instance, if a user is looking to insert an integer value within a VARCHAR column or even when a user tries to compare VARCHAR and NVARCHAR values.
SELECT * FROM Table WHERE 1 = '1'
Implicit conversion is not supported between all data types. Sometimes data types cannot be compared or may require some explicit functions. The diagram below summarizes the supported data type conversion in SQL Server.
As the name implies, implicit conversion is a kind of invisible data processing to the end user, increasing resource consumption and query execution time.
When working with a small volume of data, implicit conversions could be acceptable. In contrast, these operations become harmful while querying a large volume of data.
Index Scan vs. Index Seek
Index Scan and Index Seek are two SQL Server query execution plan operators that indicate that the query compiler decided to utilize an index (Clustered or Non-Clustered) during the execution.
- Index Scan: It indicates that the SQL engine will read the whole index looking for matches except if it is used with a TOP operator. The time of a scan operation is usually proportional to the size of the index.
- Index Seek: It indicates that the SQL engine will propagate the index to seek directly to matching records. The time taken depends on the number of matching records.
Usually, a seek operation is more performant than a scan operation, unless scanning all values is required.
Implicit conversion causing Index Scan
One of the common implicit conversion effects in SQL Server is causing the query optimizer to perform an index scan instead of an index seek operation, even if the query is matching rows against a constant value.
SELECT * FROM TABLE WHERE [Column] = 'Some Value'
This mainly occurs when comparing some values to a VARCHAR column.
Querying an NVARCHAR column
To illustrate this behavior, we ran the following experiment using the StackOverflow2010 database provided by Brent Ozar.
In the Users table, the DisplayName column type is NVARCHAR. We created another Column encDisplayName, using the following SQL Command:
ALTER TABLE dbo.Users ADD encDisplayName VARCHAR(40);
UPDATE dbo.Users SET encDisplayName = DisplayName;
Now, Let’s visualize the query execution plan while trying to filter rows using the DisplayName and encDisplayName columns.
Let’s start by filtering the DisplayName (NVARCHAR) column. In the First command, we tried to filter using a VARCHAR value. The query execution plan shows an Index Seek operation, meaning that the implicit conversion did not affect the query execution.
SELECT *
FROM [StackOverflow2010].[dbo].[Users]
WHERE DisplayName = 'Hadi'
The same execution plan is used when filtering using an NVARCHAR value.
SELECT *
FROM [StackOverflow2010].[dbo].[Users]
WHERE DisplayName = N'Hadi'
If we try to filter using an integer value, we can note that the query optimizer no longer chooses an index seek operation.
Moreover, the following warning is visualized in the query execution plan.
Type conversion in expression (CONVERT_IMPLICIT(int,[StackOverflow2010].[dbo].[Users].[DisplayName],0)) may affect “CardinalityEstimate” in query plan choice. Type conversion in expression (CONVERT_IMPLICIT(int,[StackOverflow2010].[dbo].[Users].[DisplayName],0)) may affect “SeekPlan” in query plan choice…
This means that the query optimizer decided to convert the values of the DisplayName column to compare it with the integer value specified in the WHERE clause instead of converting the integer value (1) to NVARCHAR.
This problem can be solved by explicitly converting this value to NVARCHAR.
SELECT *
FROM [StackOverflow2010].[dbo].[Users]
WHERE DisplayName = CAST(1 as NVARCHAR(40))
Querying a VARCHAR column
Now, let’s repeat the same experiment over the VARCHAR column.
Filtering using a VARCHAR value results in an index seek operation.
SELECT *
FROM [StackOverflow2010].[dbo].[Users]
WHERE encDisplayName = 'Hadi'
While filtering using an NVARCHAR operation, use an index scan operation. Since the query optimizer decided to convert all values in the encDisplayName column to NVARCHAR instead of just converting the NVARCHAR constant value to VARCHAR.
This behavior may look strange! Especially since we are comparing two string values. Still, the SQL engine considers converting an NVARCHAR value to VARCHAR unsafe and may lose some Unicode characters if the default collation does not support them. This is why it prefers converting the whole column rather than just converting a single value.
As stated before, using an explicit conversion will solve this issue.
SELECT *
FROM [StackOverflow2010].[dbo].[Users]
WHERE encDisplayName = CAST(N'Hadi' as varchar(40))
Conclusion
This article explained implicit conversions and the difference between an index scan and an index seek operation. Then, it illustrated how implicit conversion could affect the query plan estimation and let the query optimizer decide to use an index scan rather than an index seek. Finally, it explained how this behavior could be solved using explicit conversion.