Designing Bigtable Schemas
Bigtable is a NoSQL database service provided by Google Cloud Platform. In Bigtable, each row represents a single entity and is identified by a unique row key. Each column stores an attribute value for the row, and column families group related columns. The intersection of a row and a column can hold multiple cells, each representing a different version of the data at a given timestamp.
To design a schema and choose a row key, start with the following questions:
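To make the row / column family / column / cell layering concrete, here is a minimal in-memory sketch in plain Python. It is only a toy model, not the Bigtable client library: the names `write_cell` and `read_latest`, the `user#1001` row key, and the `profile` family are all hypothetical and chosen for illustration.

```python
# Toy model of the data layout described above (NOT the real Bigtable API).
# table[row_key][column_family][column_qualifier] -> list of (timestamp, value)

def write_cell(table, row_key, family, qualifier, value, timestamp):
    """Add a new version of a cell; earlier versions are kept alongside it."""
    columns = table.setdefault(row_key, {}).setdefault(family, {})
    cells = columns.setdefault(qualifier, [])
    cells.append((timestamp, value))
    # Keep the newest version first, ordered by timestamp.
    cells.sort(key=lambda cell: cell[0], reverse=True)


def read_latest(table, row_key, family, qualifier):
    """Return the most recent value of a column, or None if it was never written."""
    cells = table.get(row_key, {}).get(family, {}).get(qualifier, [])
    return cells[0][1] if cells else None


table = {}
# Two writes to the same row and column create two versions (cells).
write_cell(table, "user#1001", "profile", "email", "old@example.com", timestamp=1)
write_cell(table, "user#1001", "profile", "email", "new@example.com", timestamp=2)

latest_email = read_latest(table, "user#1001", "profile", "email")
version_count = len(table["user#1001"]["profile"]["email"])
print(latest_email, version_count)
```

In this sketch the second write does not overwrite the first; both cells remain at the same row and column, distinguished only by their timestamps.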
- What does an individual row represent? (Identifying the row structure)
- What will be the most common queries to this data? (Creating a row key)
- What values are collected for each row? (Identifying the column qualifiers)
- Are there related columns that can be grouped or organized together? (Identifying the column families)
Best practices for designing a Bigtable schema:
- Store data with similar schemas in the same table, rather than in separate tables
- Treat column qualifiers as data, so that you do not repeat the value in every row
- Organize related columns in the same column family
- Choose short but meaningful names for your column families
- Design your row key based on the queries you will use to retrieve the data
- Avoid row keys that start with a timestamp or a sequential numeric ID, or that prevent related data from being grouped together
- Design row keys that start with the most common (general) value and end with the most granular value
- Store multiple delimited values in each row key, using human-readable string values
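The row key guidelines above can be sketched in plain Python. Bigtable keeps rows sorted lexicographically by row key, so a key that runs from general to granular keeps related rows adjacent and lets a prefix select them. The `#` delimiter, the region / device / timestamp layout, and the helper names `make_row_key` and `prefix_scan` are assumptions made for this example, not part of any Bigtable API.

```python
DELIMITER = "#"  # assumed delimiter; any character absent from the parts works


def make_row_key(*parts):
    """Join human-readable parts, ordered from most general to most granular."""
    for part in parts:
        if DELIMITER in str(part):
            raise ValueError(f"key part {part!r} contains the delimiter")
    return DELIMITER.join(str(part) for part in parts)


def prefix_scan(sorted_keys, prefix):
    """Simulate a prefix read over keys that are already in sorted order."""
    return [key for key in sorted_keys if key.startswith(prefix)]


# Hypothetical sensor readings: (region, device, ISO 8601 timestamp).
readings = [
    ("asia", "device-17", "2022-08-27T10:00:00Z"),
    ("europe", "device-02", "2022-08-27T09:00:00Z"),
    ("asia", "device-17", "2022-08-27T09:00:00Z"),
    ("asia", "device-03", "2022-08-27T09:30:00Z"),
]

# The timestamp goes last, so the key does not start with a timestamp.
# Sorting mimics the lexicographic row order used by Bigtable.
row_keys = sorted(make_row_key(region, device, ts) for region, device, ts in readings)

# All rows for one device sit next to each other and share a prefix.
asia_device_17 = prefix_scan(row_keys, make_row_key("asia", "device-17") + DELIMITER)
print(asia_device_17)
```

Putting the timestamp first instead would interleave rows from every region and device, so a query for a single device could no longer be served by one contiguous range.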
References
Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., & Gruber, R. E. (2006). Bigtable: A distributed storage system for structured data. Google Research. Retrieved August 27, 2022, from https://research.google/pubs/pub27898/
Google. (n.d.). Schema design best practices | Cloud Bigtable documentation | Google Cloud. Google. Retrieved August 27, 2022, from https://cloud.google.com/bigtable/docs/schema-design