We still need indexes, and I will continue to commute by train even when self-driving cars become reality…
When the Autonomous Data Warehouse Cloud Service was announced, Oracle came up with a surprising idea: we do not need to create indexes, partitions, or materialized views for our analytic queries. Index creation was even blocked in ADW, and was recently allowed again, but it is still not recommended. Automatic Indexing is for ATP, and the message for ADW is: you don’t need indexes for your analytic queries.
In my opinion, even with the best performance from non-index access, we will still need index range scans, even when they are not the fastest access path, because the fastest response time is not the end-user’s first performance criterion.
Full Table Scan vs. Index access
Here is a diagram I used to explain the cost of Full Table Scan vs. Index Access, depending on the number of rows to return.
Full Table Scan depends only on the size of the table: it has the same cost to return zero, one, half, or all of the rows. Index Access cost is linearly proportional to the number of rows. The optimizer has a hard job determining the inflection point, and anyway, around the inflection point, neither method is really optimal. I always recommend providing structures that make one of those two access paths obviously the best.
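To make the diagram concrete, here is a toy cost model in Python. The formulas and numbers (multiblock read count, leaf density, clustering factor) are illustrative assumptions, not the real CBO costing: the Full Table Scan cost is flat, the index access cost grows per row, and the inflection point is where the two curves cross.

```python
# Toy cost model (illustrative formulas and numbers, not the real CBO).

def full_table_scan_cost(table_blocks, multiblock_read=128):
    # Depends only on the table size: all blocks read with large I/O calls.
    return table_blocks / multiblock_read

def index_access_cost(rows, blevel=2, rows_per_leaf=200, clustering=1.0):
    # Branch levels + leaf blocks + one table block visit per row
    # (pessimistic, scaled by a clustering factor between 0 and 1).
    return blevel + rows / rows_per_leaf + rows * clustering

def inflection_point(table_blocks, **index_params):
    # First cardinality where the Full Table Scan becomes the cheaper path.
    rows = 0
    while index_access_cost(rows, **index_params) < full_table_scan_cost(table_blocks):
        rows += 1
    return rows

print(inflection_point(100_000))  # → 776 rows with these toy numbers
```

Lowering the Full Table Scan cost (a larger `multiblock_read`, in this sketch) moves that crossing point down; improving the clustering factor moves it up.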
The idea of the Autonomous Data Warehouse is to lower the cost of the Full Table Scan, thanks to a combination of hardware and software optimizations. Of course, it cannot be lowered enough to compete with an index unique access retrieving one row, like an OLTP ‘select * where PK=’. But the inflection point can be moved down enough so that Full Table Scan response time is acceptable for any analytic query. Then, with only one access method, the optimizer’s job is easier and less prone to errors. We don’t have to design indexes, and all of this can be branded as autonomous tuning.
Before getting to the problem of having no indexes to range scan, I’ll quickly summarize the properties of Full Table Scan and Index Access. There’s also an excellent read that goes into detail: https://use-the-index-luke.com/ by Markus Winand.
There are two access paths to read table rows:
- Full Table Scan, which is always available (with the exception of IOTs - Index Organized Tables, where there’s no heap table), where all the formatted blocks of a heap table are read, in no specific order.
- Access by ROWID, where a sorted structure (such as an index) or a hash function provides the physical address of the rows we need, without having to scan all the blocks.
Full Table Scan is efficient because it reads rows in the physically most efficient way: contiguous blocks, with large I/O calls, in parallel, asynchronously, bypassing all caches. However, it has to read a lot more than needed, in no determined order. The rows then have to be filtered out later, and often sorted or hashed later to be joined, ordered, or deduplicated. Many features have improved Full Table Scan to keep it efficient even when we do not need a lot of rows: Storage Indexes, Zone Maps, SmartScan, Compression, Column Store, In-Memory, HCC,… Those features keep Full Table Scans efficient for analytic queries where we query only a subset of rows and columns.
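A toy sketch of the two access paths, using a Python list as the heap table and a sorted copy as the index (the table name, dates, and row counts are made up for illustration): the full scan touches every row and filters afterwards, while the range scan jumps to the first matching key, stops after the last one, and returns rows already sorted.

```python
import bisect

# Hypothetical orders table: (order_date, row_id) pairs, stored unordered
# in a heap; the "index" is simply a sorted copy of the same keys.
table = [("2019-01-%02d" % (i % 28 + 1), i) for i in range(10_000)]
index = sorted(table)

def full_scan(table, day):
    # Reads every row, filters afterwards: cost depends on the table size.
    return [rid for d, rid in table if d == day]

def index_range_scan(index, day):
    # Jumps straight to the first matching key and stops after the last:
    # cost proportional to the rows returned, and the result is sorted.
    lo = bisect.bisect_left(index, (day,))
    hi = bisect.bisect_right(index, (day, float("inf")))
    return [rid for d, rid in index[lo:hi]]
```

Both return exactly the same rows; the difference is how much of the structure had to be read to get them.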
Yes, Full Table Scan on ADW is fast. But in my experience, ‘fast’ response time is not the most important criterion for the end-user. They want predictable response time.
Users prefer predictability over fast response time
You may think that Full Table Scan is predictable. That’s because you know the size of the table. But the end-user doesn’t know whether the table is compressed, or full of free space. They do not realize how much longer it can take when requesting the result to be ordered. They do not understand that querying today’s orders is not faster than querying the whole week. They cannot predict the response time because it is not proportional to the result they expect.
And there’s more: when many people run the same Full Table Scan, even when expecting different data, they are in competition, and all response times will be longer because of buffer busy waits. When you query at a time when there’s a lot of activity on the storage, the response time will also be longer. None of this can be predicted by the end-user. And improvements such as Storage Indexes, In-Memory, Compression,… are not predictable by nature, as they depend on the activity: they adapt to previous queries and the concurrent workload.
Even when Index Access response time is higher, it stays more predictable because it processes only the required rows. The time will be the same when the table grows by 200%, or when 5 users are running similar queries. The cost is proportional to the number of rows returned: the end-user will easily accept that querying one week takes 7x longer than querying one day. If it takes 5 minutes, they will manage, because they expect this longer duration. They don’t want the fastest response time; they just want a predictable time.
The features of the Oracle Autonomous Database are presented with a car analogy, to push the self-driving idea. For OLTP, with Autonomous Transaction Processing, Oracle develops Automatic Indexing, and the analogy is: indexes are like new roads. The optimizer evaluates the execution plans (plans are like driving directions) with the goal of choosing the one with the fastest response time. There’s no decision about the most predictable plan, the one least subject to contention with others, or the one with a response time proportional to the result set. The CBO chooses the lowest cost, and the CBO cost is the estimated response time. With this goal, an optimized table scan looks the best.
But my goal, keeping the response time predictable by the end-user, is different. I’ll continue the analogy with the cars. Here is what Google Maps estimates for my morning commute to work:
Car: 35 min — 1 h 5 min
This is how ‘predictable’ the commute by car is at that time - peak hour around Geneva: typically between 35 and 65 minutes. It is, on average, faster than the train. But the time is not linearly proportional to the kilometers: most of the time is spent in the few ‘red’ kilometers. That’s my analogy with Full Table Scan: we can drive fast, very fast, but the time it takes to reach the destination is uncertain.
And here is what Google Maps estimates for the same commute by train:
Train: 1 h — 1 h 1 min
This is completely predictable, proportional to the distance, and with nearly no variations. Index Access is like my commute by train: most of the time not faster than driving, but finally preferable because of its stability and predictability. And with good design, Index Access can be as precise as Swiss trains ;)
You can guess that I take the train to go to work. I don’t care that it takes longer, because I know it and can manage it (like opening my laptop and writing a blog post), so that it is not wasted time. And I know exactly when I’ll arrive at work for a meeting, or at home to pick up the kids.
Index, partitions, materialized views
If predictability is your first criterion, then your optimization should not rely on caches, parallelism, compression, storage indexes, in-memory population, result cache,… Of course, all these features are welcome when they kick in at execution time, but they will not help you forecast a predictable response time. The predictable optimizations are those whose cost is proportional to the expected result, where we can forecast performance:
- Partitions clearly reduce the cost of Full Table Scan in a predictable way. Partition pruning has a business meaning: if the user queries a specific time range or geographical area, she knows that the retrieval time will be proportional to it. Partitions, when properly designed, also keep the cost independent of table growth.
- Materialized views give this end-user predictability for aggregations. When a user queries the monthly amount of sales for a year, she expects a 12-row result, and she may not imagine that millions of rows have to be scanned for it. With a materialized summary, the response time will match the volume of the result.
- Indexes, when properly designed, will range scan exactly the values required by the query, and they will not add extra time for sorting or deduplicating the rows. Index access gives the best user experience because the retrieval cost is proportional to the cost imagined by the user. And Top-N queries are an important part of analytic queries: only an index range scan answers a Top-N query without reading all the rows.
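The Top-N point can be sketched with a plain Python list standing in for a descending index (the data is random and purely illustrative): without an index, every row must be read before the top rows are known; with a structure already sorted on the value, the first N entries are the answer.

```python
import heapq
import random

random.seed(0)
# Hypothetical fact table: 100,000 sale amounts, in no particular order.
amounts = [random.randrange(1_000_000) for _ in range(100_000)]

def top_n_without_index(rows, n):
    # Every row has to be read; a heap keeps the N largest seen so far.
    return heapq.nlargest(n, rows)

# A descending index on the amount is an already-sorted structure:
index_desc = sorted(amounts, reverse=True)

def top_n_with_index(index, n):
    # A Top-N query reads only the first n index entries.
    return index[:n]

# Same answer, but the index version touched 10 entries instead of 100,000.
assert top_n_without_index(amounts, 10) == top_n_with_index(index_desc, 10)
```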
Index access is preferable when we query a limited number of rows. Full Table Scan is better when we need to read a large part of the table. The optimizer can choose the best one, but it is based on estimations. If you are in the area around the inflection point, you run the risk of execution plan changes between two plans that are not optimal. This is where Cardinality Feedback can be good or bad. You should never leave critical queries in this zone. In the diagram at the top of this post, there’s a dotted line for ‘covering index’, which pushes the inflection point to a cardinality where the index is still an alternative to FTS for a high number of rows. This is when the index contains all the required columns, so there’s no need to go to the table (and no dependency on the clustering factor, which is difficult for the end-user to predict).
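The covering-index effect can be shown with the same kind of back-of-the-envelope arithmetic (illustrative cost units only, not real CBO figures): removing the per-row table visit is what pushes the inflection point far to the right.

```python
# Illustrative cost units only (assumed values, not real CBO arithmetic).
TABLE_BLOCKS = 100_000
FTS_COST = TABLE_BLOCKS / 128            # large multiblock reads

def index_cost(rows, covering=False):
    leaf_cost = rows / 200               # ~200 entries per leaf block
    table_cost = 0 if covering else rows # pessimistic: 1 block visit per row
    return 2 + leaf_cost + table_cost    # 2 branch levels

# At 100,000 rows the plain index is hopeless, but the covering index
# still beats the Full Table Scan: the inflection point moved far right.
print(round(index_cost(100_000)))                 # → 100502
print(round(index_cost(100_000, covering=True)))  # → 502
print(round(FTS_COST))                            # → 781
```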
In my opinion
At the price of some limited human labor for index design, we can provide both speed and predictability to our users. Expensive hardware and software can compete on the former, but not on the latter. And this is why I think we will always need indexes. The reason is the same as why I prefer to commute by train: predictability of the time.
By the way, there’s another reason besides predictability and speed: green computing. Look at the resource (CPU and I/O) usage efficiency and guess which execution plan has the lower carbon footprint. Full Table Scan in Parallel Query, like all the trendy Big Data solutions where you scale by adding more nodes, is awesome and fast, but extremely expensive in resources. Indexes, when correctly designed, will read only what is needed to get the result, with a predictable response time.