Apache Phoenix Tuning | TrueCar Inc.
TrueCar Inc. is an automotive pricing and information company with data at its core. The Data Engineering team aggregates large, complex automotive datasets to build products that serve our consumers’ demands. We use HBase, a distributed, column-oriented NoSQL database that provides scalability along with fast, random access to large volumes of data. However, we needed a SQL layer over HBase so we would not have to handle and manage lower-level HBase operations. Apache Phoenix fits our requirements perfectly, as it provides a set of capabilities on top of HBase while maintaining HBase performance and throughput.
“Apache Phoenix — We Put the SQL Back in NoSQL.” This tagline defines Apache Phoenix: SQL access to HBase. Apache Phoenix provides a query abstraction layer over HBase, increasing performance and adding functionality while simplifying access to the NoSQL database. In this blog, we will take a deep dive into the various techniques we use at TrueCar Inc. to tune Apache Phoenix.
We have an internal VIN-decoding service that handles 20K requests per second and performs multiple Phoenix lookups as it applies business logic to determine vehicle data for a given VIN. Each time the service is called, it creates a new PreparedStatement, and each statement compilation queries the SYSTEM.CATALOG table for the table’s most recent metadata. Since the SYSTEM.CATALOG table does not span multiple regions, this leads to hotspotting.
To resolve this issue, we use the UPDATE_CACHE_FREQUENCY option, which determines how often the client checks the server for metadata updates. There are three possible values for this option: ALWAYS, NEVER, or a millisecond interval. The frequency can be set per table, or configured at the cluster level by setting the following property to a millisecond value in hbase-site.xml.
phoenix.default.update.cache.frequency = 180000 (180 seconds)
This indicates that Phoenix should cache the queried table’s metadata on the client side and refresh it only after 180 seconds. Setting this frequency led to a 5x performance improvement, as it avoids pinging the region hosting the SYSTEM.CATALOG table every time a query is compiled.
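The same frequency can also be set on a per-table basis. A minimal sketch (the table name FOO and its columns are hypothetical):

```sql
-- Cache this table's metadata on the client for 180 seconds
CREATE TABLE FOO (
    ID BIGINT NOT NULL PRIMARY KEY,
    COL_1 VARCHAR
) UPDATE_CACHE_FREQUENCY = 180000;

-- Or change it on an existing table
ALTER TABLE FOO SET UPDATE_CACHE_FREQUENCY = 180000;
```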
Without pre-splitting, a Phoenix table is seeded with a single region. When records are written to an underlying HBase table that is not split across regions, all writes hit one region on a particular RegionServer.
Consider a job that spins up 300 mappers in parallel, all writing to such an HBase table. They will start populating data in a single RegionServer. Each region has a predefined maximum size; once a region reaches that limit, it splits into two smaller regions on the same RegionServer. This limits write throughput to the capacity of a single server instead of making use of all the nodes in the HBase cluster.
This necessitates pre-splitting the table, which leads to better distribution of the table’s data across the cluster.
In addition, if there is a global secondary index created on the Phoenix table, then Phoenix stores the secondary index data in another table.
It is equally important to pre-split the index table, as a hotspot on the index table can slow down the writes to the main table even if it is pre-split.
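A sketch of pre-splitting both tables at creation time (table, index, and split points are all hypothetical; real split points should be drawn from the actual key distribution):

```sql
-- Pre-split the main table on illustrative row-key boundaries
CREATE TABLE FOO (
    PK VARCHAR NOT NULL PRIMARY KEY,
    COL_1 VARCHAR,
    COL_2 VARCHAR
) SPLIT ON ('g', 'n', 't');

-- Pre-split the global secondary index table as well, using
-- boundaries drawn from the distribution of the indexed column
CREATE INDEX FOO_IDX ON FOO (COL_1) SPLIT ON ('g', 'n', 't');
```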
A Bloom filter is a lightweight, in-memory probabilistic structure that can be enabled on a column family.
The two possible values for the Bloom filter are:
ROW (default): when the table does not have a lot of dynamic columns.
ROWCOL: when the table consists of large sets of dynamic columns.
A Bloom filter lets HBase skip HFiles that definitely do not contain a given row-column combination, reducing disk reads to only the files likely to contain it. Taking this step led to a 2x performance improvement in HBase read operations on a table with 40K dynamic columns.
Bloom Filter can be configured from the HBase shell using the following command:
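The command itself did not survive in this version of the post; a representative HBase shell invocation would look like the following (the table name FOO is hypothetical, and '0' is Phoenix's default column family):

```
hbase> alter 'FOO', {NAME => '0', BLOOMFILTER => 'ROWCOL'}
```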
At TrueCar, we perform multiple ad hoc queries and a programmatic export of data by selecting certain columns from a given table in order to generate reports, thus performing a full table scan on certain datasets.
The NO_CACHE hint in Apache Phoenix prevents the query results from populating the HBase block cache, thereby reducing unnecessary churn in the LRU cache. An example of how to use the NO_CACHE hint is as follows:
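A sketch of the hint on a full-scan export query (table and column names hypothetical):

```sql
-- Full-table export; NO_CACHE keeps the scanned blocks
-- out of the HBase block cache
SELECT /*+ NO_CACHE */ COL_1, COL_2, COL_3
FROM FOO;
```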
The SMALL hint reduces the number of RPC calls between client and server; it is useful when the query does a point lookup or retrieves less data than one block size.
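For a point lookup, the hint might be applied like this (table, column, and key are hypothetical):

```sql
-- Point lookup returning less than one block of data;
-- SMALL tells Phoenix to use a small scan, saving RPC round trips
SELECT /*+ SMALL */ COL_2
FROM FOO
WHERE PK = 'some-key';
```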
The following three cases illustrate the advantages of a covered index combined with the SMALL hint.
Dataset: A Phoenix table with 50 columns and 2M records.
Requirement: Retrieve 2 columns — COL_2, COL_3 based on a given column COL_1.
Case 1: Create a global secondary index.
A SELECT query on the index table FOO_IDX will do a reverse lookup on the main table FOO to retrieve the two columns, COL_2 and COL_3.
Case 2: Create a covered index
This covered index will prevent reverse lookup to the main table FOO since we have included the required columns as part of the index table FOO_IDX.
Case 3: Covered index with SMALL Hint
This further improves the query performance as compared to just creating a covered index.
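Under the stated assumptions (main table FOO, index FOO_IDX), the three cases might be sketched as:

```sql
-- Case 1: plain global secondary index; SELECTs still have to
-- hop back to FOO to fetch COL_2 and COL_3
CREATE INDEX FOO_IDX ON FOO (COL_1);

-- Case 2: covered index; COL_2 and COL_3 are stored in the
-- index table itself, so no reverse lookup to FOO is needed
CREATE INDEX FOO_IDX ON FOO (COL_1) INCLUDE (COL_2, COL_3);

-- Case 3: the same covered index queried with the SMALL hint
SELECT /*+ SMALL */ COL_2, COL_3
FROM FOO
WHERE COL_1 = 'some-value';
```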
This graph demonstrates the advantages of creating a covered index as opposed to a secondary index and further highlights the improvement in query performance realized by using a SMALL Hint.
The overall performance improvement is 9x, as reflected in the above graph.
The IN_MEMORY option allows us to cache small datasets in-memory. Setting IN_MEMORY=true provides fast reads (in single digit milliseconds). This option can be set to true with CREATE/ALTER statements.
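A sketch of both forms (the table name LOOKUP_TABLE is hypothetical):

```sql
-- Small lookup table pinned in the in-memory portion of the block cache
CREATE TABLE LOOKUP_TABLE (
    K VARCHAR NOT NULL PRIMARY KEY,
    V VARCHAR
) IN_MEMORY = true;

-- Or enable it on an existing table
ALTER TABLE LOOKUP_TABLE SET IN_MEMORY = true;
```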
At TrueCar, we have a web service that spins up 600 mappers, each loading 1GB of data into memory, occupying 600GB of RAM in total. Using the IN_MEMORY=true option frees up that memory space and provides centralized storage instead.
INCREMENTAL PRIMARY KEYS
The records in HBase are sorted lexicographically based on the row-key columns. This enables faster read operations and fast fetching of a range of data given start and stop keys. At TrueCar, we have multiple use cases where row keys are in an incremental sequence like 1, 2, 3…99998, 99999. This leads to an uneven write-load distribution and causes hotspots, because one particular region will be overwhelmed with all the traffic during read/write operations, leading to performance degradation.
To solve this issue, we reverse the primary key and pre-split the table based on the reversed primary key.
Reversing the primary key was preferred over salting because salting requires specifying the number of buckets at table-creation time. The salt-bucket count then stays fixed even as the data keeps growing, which might again lead to uneven data distribution.
Reversing the primary key scatters the row-key prefixes and helps distribute the traffic across all the RegionServers.
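A minimal sketch of the idea in pure Python (the zero-padded key format is an assumption for illustration): reversing a sequential key moves the fastest-changing digit to the front, so consecutive writes land in different pre-split key ranges.

```python
def reverse_key(seq_id: int, width: int = 8) -> str:
    """Zero-pad the sequential id, then reverse it so the
    fastest-changing digit becomes the row-key prefix."""
    return str(seq_id).zfill(width)[::-1]

# Sequential ids all share the same '00000...' prefix and would
# hotspot a single region:
sequential = [str(i).zfill(8) for i in range(101, 104)]

# The reversed keys start with different characters, spreading
# writes across regions pre-split on the leading digit:
reversed_keys = [reverse_key(i) for i in range(101, 104)]
print(sequential)
print(reversed_keys)
```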
READ LATEST DATA
For some of our data pipelines, there are use cases where we have to retrieve the most recent data for a given row key. Since HBase is essentially a sorted map, it stores the records for the same leading row-key element in ascending order of the remaining key.
Consider the following example, where we want to retrieve the latest transaction for a given vehicle. A query based on a given VIN will provide all the records for that VIN. However, the most recent transaction for the VIN is the last record. Thus, in order to retrieve the latest record, we have to ORDER BY SALE_DATE in descending order.
In order to retrieve the latest record for a given VIN, we can add another row-key element.
MILLIS_UNTIL_EPOCH = FUTURE_DATE - CURRENT_EPOCH_TIME
This places the latest record at the top, so a LIMIT 1 query for a given VIN reads the latest data.
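The reverse-timestamp trick can be sketched as follows (the far-future anchor date is an arbitrary assumption for illustration):

```python
from datetime import datetime, timezone

# Arbitrary far-future anchor (assumption for illustration)
FUTURE_DATE_MILLIS = int(
    datetime(2100, 1, 1, tzinfo=timezone.utc).timestamp() * 1000
)

def millis_until_epoch(event_millis: int) -> int:
    """Reverse timestamp: newer events get SMALLER values, so they
    sort first under HBase's ascending row-key ordering."""
    return FUTURE_DATE_MILLIS - event_millis

older_sale = millis_until_epoch(1_600_000_000_000)  # earlier SALE_DATE
newer_sale = millis_until_epoch(1_700_000_000_000)  # later SALE_DATE

# The newer transaction sorts before the older one, so a LIMIT 1
# scan on (VIN, MILLIS_UNTIL_EPOCH) returns the latest record
# without any ORDER BY.
print(newer_sale < older_sale)
```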
In this post, we have addressed the cluster-level settings, table-level settings, and data-modeling techniques that we follow at TrueCar to achieve performance optimization and efficiency across all data pipelines. Some of this content was also presented at Dataworks Summit 2018.
Video of talk — https://www.youtube.com/watch?v=lAy-xjRtU6Y