UUID Alternatives for Cloud Apps

When UUIDs are not the best solution…

Chris St. John
12 min read · Oct 16, 2024

UUIDs have long been the favorite identifier format in many cloud and big data applications. These identifiers are used in many apps requiring unique identification of data records, resources, and entities: databases, resource ids, session and transaction IDs, object storage etc.

For non-devs: a UUID is a long string of numbers and letters with dashes (pattern: 8-4-4-4-12) that makes your eyes glaze over, like 1e7cacb7-af9f-4f07-88fd-45370c25ab62

Devs 🛠️ 🚀 are now using alternative ID generation methods that may better suit their needs.

While UUIDs have been, and continue to be, widely adopted (still the most used ID type), their large size and lack of natural ordering (in some versions at least — see below) can lead to inefficiencies in performance, storage, and indexing — at scale.

Additionally, the growing popularity of systems that rely on timestamped or lexicographically sortable IDs, such as KSUID or ULID, has made these options more appealing.

We’re going to look at some of the alternatives available, their usage, and their pros and cons. I’m not going to tell you what to use, because it really depends on your use case. For a simple internal POC, even sequential IDs might be adequate (though normally not recommended).

ARTICLE UPDATE! (Nov 4, 2024)

Note: After initially publishing, some users (see comments) have pointed out you can use other alternative/newer versions of UUID which may resolve previous critiques of it. It’s a fair point.

I have added some additional info about other UUID versions.

Of course it still depends on your use case. A major goal was just to introduce non-UUID alternatives, not so much to insist one is better than another (which I had also mentioned in the article). But the great feedback was very useful and informative, much appreciated!

Here are some helpful comments (see comment section for author’s full comments), and I’ll update some areas of the article:

  • “if a specific DB system has a specific ID for that system, USE IT!” (Jarrod Roberson)
  • “change uuid v4 with v7 or v8 it is more performant and sortable :)” (Jeremy Teichmann)
  • “Or just use UUIDv1, UUIDv6, or (best) UUIDv7 and call it a day.” (Miles Elam)
  • “what would have been good is to include perhaps the different versions of UUID like v7” (Jean-Paul de Jong)
  • “CUID has been deprecated for some very valid reasons and CUID2 is the best choice for opaque identifiers now.”… (Jarrod Roberson)
  • “Just a tip for anyone using MongoDB. Use the ObjectIds please. Do NOT swap the ids out for anything else, just because you might be familiar with something different (especially not sequential ids)” (Scott Molinari)

More discussion of UUID on StackOverflow: Which UUID version to use?

To be clear, there are different versions of UUID.

I’m not going into detail on each one, but it’s good to know alternatives exist.

  • v1 (Time-based): A combination of the current timestamp and the MAC address of the machine, unique but may expose hardware information.
  • v2 (DCE Security): Similar to Version 1 but includes POSIX UID/GID information, for applications requiring user or group identification.
  • v3 (Name-based, MD5): Hashes a namespace identifier and a name using the MD5 algorithm, producing consistent UUIDs for the same input data.
  • v4 (Random): Uses random numbers, offering simplicity and a low probability of duplication.
  • v5 (Name-based, SHA-1): Similar to Version 3 but uses the SHA-1 hashing algorithm, providing a more secure hash function for generating UUIDs from names.
  • v6 (Ordered Time-based): A reordering of Version 1 UUIDs to improve database indexing by placing the timestamp in the most significant bits, facilitating chronological ordering. May be suitable for some use cases of database keys needing ordering.
  • v7 (Unix Epoch Timestamp): Encodes a Unix timestamp with millisecond precision in the most significant 48 bits, followed by random data, ensuring uniqueness and time-ordering.
  • v8 (Custom): Reserved for custom implementations, allowing for the inclusion of application-specific data within the UUID structure.
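To make the v7 layout in the list above more concrete, here is a minimal Python sketch that assembles a UUIDv7-style value by hand: a 48-bit millisecond timestamp in the most significant bits, then the version and variant bits, with random data filling the rest. The function name `uuid7_sketch` is hypothetical and only for illustration; for production use, a maintained library is the safer choice.

```python
import os
import time
import uuid


def uuid7_sketch(timestamp_ms=None):
    """Build a UUIDv7-style value by hand (illustrative sketch only)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = rand >> 68                           # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 random bits

    value = (timestamp_ms & ((1 << 48) - 1)) << 80  # 48-bit Unix ms timestamp
    value |= 0x7 << 76                              # 4-bit version field = 7
    value |= rand_a << 64                           # 12 random bits
    value |= 0b10 << 62                             # 2-bit RFC variant field
    value |= rand_b                                 # 62 random bits
    return uuid.UUID(int=value)


earlier = uuid7_sketch(timestamp_ms=1_700_000_000_000)
later = uuid7_sketch(timestamp_ms=1_700_000_000_001)
print(earlier, later)
```

Because the timestamp sits in the leading bits, the string forms of two IDs created in different milliseconds sort in creation order, which is what makes this layout friendlier to database indexes than v4.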

As a reference, let’s start with UUIDv4, the random-based version of the UUID standard. It is a 128-bit value (16 bytes), where 122 bits are randomly generated and the remaining 6 bits contain version and variant information.

Example: 1e7cacb7-af9f-4f07-88fd-45370c25ab62

https://www.npmjs.com/package/uuid (also can use node.js crypto)
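For comparison with the npm package, here is a short sketch using Python's built-in `uuid` module. It generates a UUIDv4 and inspects the properties described above (the version field and the 16-byte size).

```python
import uuid

new_id = uuid.uuid4()      # 122 random bits plus 6 fixed version/variant bits

print(new_id)              # canonical 8-4-4-4-12 hexadecimal form
print(new_id.version)      # 4
print(len(new_id.bytes))   # 16 (bytes, i.e. 128 bits)
```

Storing the 16 raw bytes (`new_id.bytes`) rather than the 36-character string is one way to reduce the storage overhead discussed in the cons below.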

Pros of UUIDv4:

  • High degree of uniqueness due to its large 122-bit random space, making collisions highly unlikely.
  • Universally recognized standard, making it widely supported across many platforms, databases, and programming languages.
  • No need for coordination between systems, meaning UUIDv4 can be generated independently across distributed systems without risking collisions
  • Ideal for environments where many different machines or services are generating IDs independently.
  • UUIDs are easy to pass in URLs or files without encoding issues.
  • No extra dependency needed in many languages: Python’s standard library (the uuid module) and Node.js’s built-in crypto module (crypto.randomUUID) can both generate them.

Cons of UUIDv4:

  • Much larger than necessary for certain use cases, leading to increased storage and bandwidth usage.
  • UUIDv4 is entirely random and has no inherent ordering (no timestamp or sequence component). This can be inefficient for database indexes, because new keys land at random positions rather than at the end of the index. Update: You can get ordering with UUIDv6 or UUIDv7.
  • UUIDs are long, random-looking strings which are not easily readable or memorable for humans.
  • Though UUIDv4 is highly random, if the random number generator is weak or predictable, the uniqueness could be compromised, leading to potential collisions.
  • In systems where uniqueness is easier to guarantee (such as within a single database), UUIDv4 may be overkill and inefficient compared to simpler IDs like auto-incrementing integers.

Here are 9+ alternatives to using UUIDs for generating unique identifiers in software systems, along with their pros and cons:

  1. Auto-Incrementing IDs (Sequential IDs)
  2. Snowflake ID (Twitter Snowflake)
  3. KSUID (K-Sortable Unique Identifier)
  4. ULID (Universally Unique Lexicographically Sortable Identifier)
  5. NanoID
  6. Random Hash-Based ID (SHA-256 or MD5 Hashing)
  7. ObjectID (MongoDB ObjectID)
  8. CUID2 (Collision-Resistant Unique Identifier)
  9. Others (that are less common)

1. Auto-Incrementing IDs (Sequential IDs)

Auto-incrementing IDs are numeric values that increment by one each time a new record is added, typically used in relational databases.

This alternative is included as a reference, since it is the simple implementation that many people learn with. It is most likely not what you want: outside of specialized use cases, it is normally considered bad practice.

The cons below underscore why many serious production systems avoid sequential IDs.

Example: 56482
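As a minimal sketch of how this works in practice, the snippet below uses Python's built-in `sqlite3` module with an in-memory database. The table name `users` and the helper `add_user` are made up for illustration; other relational databases offer equivalent features (for example serial or identity columns).

```python
import sqlite3

# In-memory database, just for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)


def add_user(name):
    """Insert a row and return the ID the database assigned to it."""
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


ids = [add_user(name) for name in ("ada", "grace", "linus")]
print(ids)  # [1, 2, 3]
```

The database is the single source of truth for the next value, which is exactly why this approach is simple on one node and awkward across many.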

Pros:

  • Simple to implement and easy to understand.
  • Efficient in terms of storage (smaller numeric data types).
  • Can be indexed easily for better performance in databases.
  • Works for small-scale systems or databases where order matters and there is no need for globally unique identifiers, such as primary keys in relational databases.

Cons:

  • Not suitable for distributed systems as it can lead to conflicts.
  • Predictable, which could pose a security risk (guessable IDs).
  • Requires database coordination to avoid duplicates in sharded systems.


2. Snowflake ID (Twitter Snowflake)

Snowflake is a distributed ID generation algorithm that produces 64-bit unique IDs by combining a timestamp, a machine ID, and a sequence number.

Example: 5643574219214851220
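The following is a minimal sketch of a Snowflake-style generator, assuming the commonly cited layout from Twitter's original design: a 41-bit millisecond timestamp (relative to a custom epoch), a 10-bit machine ID, and a 12-bit per-millisecond sequence. The class and function names are hypothetical, and the clock-rollback handling is deliberately simplistic.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657  # custom epoch used by Twitter's original layout


class SnowflakeGenerator:
    """Illustrative Snowflake-style generator (not a production implementation)."""

    def __init__(self, machine_id, epoch_ms=TWITTER_EPOCH_MS):
        if not 0 <= machine_id < 1024:          # must fit in 10 bits
            raise ValueError("machine_id must be in [0, 1023]")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = time.time_ns() // 1_000_000
            if now < self._last_ms:
                now = self._last_ms             # clock moved backwards: reuse last tick
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:         # 4096 IDs used up in this millisecond
                    while now <= self._last_ms: # wait for the next millisecond
                        now = time.time_ns() // 1_000_000
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.epoch_ms) << 22) | (self.machine_id << 12) | self._sequence


def decode(snowflake_id, epoch_ms=TWITTER_EPOCH_MS):
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    return (snowflake_id >> 22) + epoch_ms, (snowflake_id >> 12) & 0x3FF, snowflake_id & 0xFFF


gen = SnowflakeGenerator(machine_id=7)
ids = [gen.next_id() for _ in range(5000)]
print(ids[0], decode(ids[0]))
```

Uniqueness across machines depends entirely on every generator being given a distinct `machine_id`, which is the configuration burden mentioned in the cons.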

Pros:

  • Distributed and scalable, suitable for distributed systems.
  • Timestamp component provides approximate ordering.
  • Generates IDs that are unique across systems, as long as every generator is assigned a distinct machine ID.
  • A good fit if you need scalable, time-ordered unique IDs, especially in social media or messaging apps.

Cons:

  • Slightly more complex to implement than simple IDs.
  • Timestamp-based IDs can leak timing information.
  • Requires careful configuration of machine identifiers to avoid collisions.

Wikipedia: https://en.wikipedia.org/wiki/Snowflake_ID

NPM: https://www.npmjs.com/package/snowflake-id

PyPI: https://pypi.org/project/snowflake-id/


3. KSUID (K-Sortable Unique Identifier)

KSUIDs are an alternative to UUIDs that embed a timestamp, making them roughly sortable by creation time (k-sortable).

A KSUID is 20 bytes (a 32-bit timestamp in seconds plus 128 bits of random data), usually written as a 27-character base62 string.

NPM: https://www.npmjs.com/package/ksuid

PyPI: https://pypi.org/project/svix-ksuid/ or https://pypi.org/project/ksuid/

Example: 1avvTqCSFGnD5LDc4hN6GFFCAXD
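Here is a minimal sketch of how a KSUID-style value can be assembled, assuming the layout used by the original KSUID design: a 4-byte timestamp in seconds counted from a custom epoch (1400000000), followed by 16 random bytes, base62-encoded and left-padded to 27 characters. The function name and the `now` parameter are illustrative; the npm and PyPI packages above are the practical choice.

```python
import os
import string
import time

KSUID_EPOCH = 1_400_000_000  # custom epoch (seconds) used by the KSUID format
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def ksuid_sketch(now=None):
    """Return a 27-character KSUID-style string (illustrative sketch)."""
    seconds = int(time.time() if now is None else now) - KSUID_EPOCH
    raw = seconds.to_bytes(4, "big") + os.urandom(16)   # 20 bytes in total

    number = int.from_bytes(raw, "big")
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    # Fixed width keeps string order identical to numeric (and thus time) order.
    return "".join(reversed(chars)).rjust(27, "0")


print(ksuid_sketch())
```

Since the timestamp occupies the leading bytes and the alphabet is in ASCII order, sorting the strings sorts the IDs by creation second.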

Pros:

  • K-sorted (sortable by time) while being globally unique.
  • Compact and efficient representation.
  • A good fit if you need unique, time-ordered IDs that sort chronologically, for example identifiers for user-generated content.

Cons:

  • Still larger than basic numeric IDs.
  • A little more complex to work with than sequential IDs.
  • May expose timestamp information if privacy is a concern.

🥰 Thanks for reading my article… please clap, follow and share this article if you like it, thanks! 🚀

4. ULID (Universally Unique Lexicographically Sortable Identifier)

Similar to KSUID, ULID is a lexicographically sortable ID format that combines a timestamp with random data for uniqueness.

A ULID is a 26-character string in Crockford’s Base32: a 48-bit millisecond timestamp followed by 80 bits of randomness, which makes it sort lexicographically by creation time.

NPM: https://www.npmjs.com/package/ulid

PyPI: https://pypi.org/project/python-ulid/

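Example: 01ARZ3NDEKTSV4RRFFQ69G5FAV (the example from the ULID specification; a valid ULID has exactly 26 characters and never contains I, L, O, or U)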
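A minimal sketch of ULID-style generation, assuming the layout from the ULID specification: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 characters of Crockford's Base32. The function name and the `timestamp_ms` parameter are illustrative, and this sketch skips the spec's optional monotonic mode.

```python
import os
import time

# Crockford's Base32 alphabet: no I, L, O, or U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_sketch(timestamp_ms=None):
    """Return a 26-character ULID-style string (illustrative sketch)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000

    randomness = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (timestamp_ms << 80) | randomness            # 128 bits in total

    # 26 characters x 5 bits = 130 bits; the top 2 bits are always zero.
    return "".join(CROCKFORD[(value >> shift) & 0x1F] for shift in range(125, -1, -5))


print(ulid_sketch())
```

The first 10 characters carry the timestamp and the last 16 carry the randomness, so two ULIDs from different milliseconds compare correctly as plain strings.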

Pros:

  • Sortable based on creation time.
  • More human-readable than UUID.
  • Suitable for high-scale distributed systems.
  • A good fit where lexicographical sorting is required along with globally unique identifiers, such as e-commerce or document management systems.

Cons:

  • Similar to KSUID, the timestamp is exposed, which might not be ideal for privacy.
  • Slightly more complex to generate than UUIDs.

5. NanoID

NanoID is a small, fast, and secure alternative to UUID, designed to be URL-friendly and customizable in terms of size.

A NanoID is a short, random, URL-friendly string with customizable length and alphabet (21 characters by default).

NPM: https://www.npmjs.com/package/nanoid

PyPI: https://pypi.org/project/nanoid/

Example: E9SxJKL8_K5emHi2B-noZ
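The idea can be sketched in a few lines with Python's built-in `secrets` module. This is not the NanoID library itself; it assumes the same default shape (21 characters drawn from the 64 URL-safe symbols A-Z, a-z, 0-9, `_` and `-`), and the function name `nano_id_sketch` is made up for illustration.

```python
import secrets
import string

# 64 URL-safe symbols, the same character set NanoID uses by default.
URL_SAFE_ALPHABET = string.ascii_letters + string.digits + "_-"


def nano_id_sketch(size=21, alphabet=URL_SAFE_ALPHABET):
    """Return a random ID using a cryptographically secure random source."""
    return "".join(secrets.choice(alphabet) for _ in range(size))


print(nano_id_sketch())                                # 21 URL-safe characters
print(nano_id_sketch(size=10, alphabet="0123456789"))  # shorter, digits only
```

Shrinking `size` or the alphabet makes IDs shorter but also shrinks the ID space, which is the trade-off listed in the cons.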

Pros:

  • Compact size and customizable length.
  • URL-safe and non-sequential, so IDs are hard to guess.
  • Fast generation and requires less storage than UUIDs.
  • Great for frontend applications or situations where short, URL-friendly, and unique IDs are needed, such as in public-facing URLs or session identifiers.

Cons:

  • Customizable size can lead to a smaller namespace and potential collisions.
  • Not widely adopted in enterprise systems compared to UUID.
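To put a number on the collision risk of shorter IDs, the standard birthday approximation can be used: for n random IDs drawn from a space of N possible values, the probability of at least one collision is roughly 1 - exp(-n(n-1) / (2N)), where N is the alphabet size raised to the ID length. The sketch below assumes uniformly random IDs; the function name is illustrative.

```python
import math


def collision_probability(num_ids, alphabet_size, length):
    """Approximate chance of at least one collision (birthday bound)."""
    space = alphabet_size ** length
    exponent = -num_ids * (num_ids - 1) / (2 * space)
    return -math.expm1(exponent)   # 1 - e^exponent, accurate for tiny values


# One million IDs with a 64-symbol alphabet:
short_ids = collision_probability(1_000_000, 64, 8)     # about 0.18%
default_ids = collision_probability(1_000_000, 64, 21)  # vanishingly small
print(f"{short_ids:.4%} vs {default_ids:.3e}")
```

With 8-character IDs, a million records already carry a noticeable chance of a duplicate, while 21 characters keep the risk negligible at that scale.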

6. Random Hash-Based ID (SHA-256 or MD5 Hashing)

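Hash-based IDs are strings produced by running input data through a hash function such as SHA-256 or MD5. The same input always yields the same ID, so they are only random if the input includes random data.

The result is a fixed-length hexadecimal string (32 characters for MD5, 64 for SHA-256) generated by hashing data, such as a combination of timestamp and user data.

Node.js: built-in crypto module https://nodejs.org/api/crypto.html

Python: built-in hashlib library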

Example (SHA-256): 1d214892da28032151d0e36c2dc6291673603d1d61ab3dd32a11e2321d1542f2
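A short sketch with Python's built-in `hashlib` shows the deterministic behavior: the same input parts always produce the same 64-character ID. The helper name `hash_id` and the sample inputs are made up for illustration, and the unit-separator character is just one simple way to keep ("ab", "c") distinct from ("a", "bc").

```python
import hashlib


def hash_id(*parts):
    """Derive a deterministic SHA-256 hex ID from one or more string parts."""
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    joined = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


first = hash_id("user@example.com", "2024-10-16")
second = hash_id("user@example.com", "2024-10-16")
print(first)
print(first == second)  # True
```

This determinism is what makes hash-based IDs useful for deduplication, and also why guessable inputs produce guessable IDs.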

Pros:

  • Can generate very large unique spaces, almost collision-proof.
  • Can use input data like user details to generate unique IDs deterministically.
  • Secure when using cryptographic hash functions (SHA-256).
  • A good fit if you need content-derived uniqueness, such as file hashes for deduplication, or when an identifier must remain constant across systems.

Cons:

  • Large storage footprint compared to numeric IDs.
  • Slower to generate and verify due to computational complexity.
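  • Depending on the hash function, may not be collision-proof: MD5, for example, has known practical collision attacks and should not be relied on where collisions matter.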

7. ObjectID (MongoDB ObjectID)

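MongoDB’s ObjectID (part of BSON, MongoDB’s binary JSON format) is a 12-byte unique identifier. It was originally built from a timestamp, machine identifier, process identifier, and a counter; current MongoDB versions use a 4-byte timestamp, a 5-byte random value generated once per process, and a 3-byte counter.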

NPM: https://www.npmjs.com/package/bson-objectid

PyPI: https://pypi.org/project/pymongo/

Example: 102e1b71bcd16cd721434331

This is the 24-character hexadecimal form of the 12 bytes: the first 8 hex characters are the timestamp, followed by the machine and process (or random) value and the counter.

Pros:

  • Provides unique, distributed IDs with minimal coordination.
  • Includes a timestamp, allowing for sorting by creation time.
  • Efficient in terms of storage (12 bytes, smaller than UUID).
  • Optimized for use in NoSQL databases like MongoDB, particularly for document storage systems where a timestamped, unique identifier is required.

Cons:

  • Exposes creation time, which may not be ideal for privacy.
  • Tied to MongoDB; might require some adaptation to use in other systems.
  • Slightly more complex than basic numeric IDs.

8. CUID2 (Collision-Resistant Unique Identifier)

Cuid2 is designed to minimize the likelihood of collision in distributed systems, providing a URL-safe, human-readable, and collision-resistant ID.

https://www.npmjs.com/package/@paralleldrive/cuid2

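Example: skbcvmzbk02217a1ad0m5qhc2

A Cuid2 is a string that begins with a random lowercase letter, followed by base-36 characters derived from a hash of several entropy sources (time, a counter, a host fingerprint, and random data). The original CUID, by contrast, began with "c" followed by a base-36 encoded timestamp and randomness.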
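As a loose illustration of the general idea (hash several entropy sources together, then emit an opaque base-36 string), here is a standard-library sketch. To be clear, this is NOT the Cuid2 algorithm and will not produce IDs compatible with the library; the function names, the choice of inputs, and the default length of 24 are all assumptions made for this example. Use the official package linked above for real IDs.

```python
import hashlib
import itertools
import secrets
import string
import time

BASE36 = string.digits + string.ascii_lowercase
_counter = itertools.count(secrets.randbelow(2**24))   # counter with a random start


def _to_base36(number):
    chars = []
    while number:
        number, remainder = divmod(number, 36)
        chars.append(BASE36[remainder])
    return "".join(reversed(chars)) or "0"


def opaque_id_sketch(length=24):
    """Hash time + counter + random salt into an opaque ID (illustration only)."""
    material = f"{time.time_ns()}:{next(_counter)}:{secrets.token_hex(16)}"
    digest = hashlib.sha3_512(material.encode("utf-8")).digest()
    body = _to_base36(int.from_bytes(digest, "big")).rjust(length, "0")
    # Start with a letter so the ID is usable where a leading digit is not allowed.
    return secrets.choice(string.ascii_lowercase) + body[: length - 1]


print(opaque_id_sketch())
```

Because the inputs are hashed, nothing about the creation time can be read back from the result, which is the opposite trade-off from the timestamp-prefixed formats earlier in this article.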

Pros:

  • High collision resistance, even across distributed systems.
  • Human-readable and URL-friendly.
  • Mixes a timestamp, a counter, and random data into a hash to ensure uniqueness.
  • “Horizontally scalable: Generate ids on multiple machines without coordination.” — npm library linked above
  • “Offline-compatible: Generate ids without a network connection.” — npm library linked above
  • A good fit when you need a robust, unique identifier with a lower risk of collisions, such as in distributed databases or microservices.

Cons:

  • Larger than simpler ID formats like NanoID or Snowflake.
  • Not sortable by creation time, because its inputs are hashed. (The original CUID could expose details such as timestamps and machine information.)
  • More complex to generate compared to sequential or numeric IDs.

9. Others (less common)

Here are some others I found in my research, but these are less common. If you have not found the best ID for your use case, it may be worth checking out:

Flake ID: Includes a timestamp, machine identifier, and sequence number. Useful if you require unique, time-ordered identifiers for easier sorting and debugging.
Example: 304857642123456

Time-based IDs: Often a Unix timestamp concatenated with random bits or other unique data. Useful where chronological order is important, such as logging or event tracking.
Example: 16972283871234567890 (combines timestamp 1697228387 with a random suffix)

Sequential GUIDs (SQL Server): A GUID optimized for indexing, often beginning with a time-ordered segment. Useful where GUIDs are necessary but ordered indexing is needed for better database performance.
Example: 6E4F6A80-4F64-11EE-B4FA-0242AC120002

ShortID: Generates a compact, unique alphanumeric string, often used in URLs. Useful for URL shortening or user-friendly identifiers in public URLs.
Example: 2K5czP8

ZUID (Zero-width Unique Identifier): Uses zero-width characters (e.g., zero-width spaces) that are invisible but can be parsed for uniqueness. Useful if you need invisible identifiers for tracking or metadata without impacting visual layout.
Example: Internally may look like \u200B\u200C\u200D

That’s a wrap! We covered a lot of detail on UUID alternatives and I hope this gives you some more background on an obscure topic (for some) that may help you create better apps.


Article Image credits: main featured image AI-generated, other repo/site screenshots

About me

I’m a cloud architect, senior developer and tech lead who enjoys solving high-value challenges with innovative solutions.

I’m always open to discussing projects. If you need help, have an opportunity or simply want to chat, you can reach me at csjcode at gmail.

I’ve worked 20+ years in software development, both in an enterprise setting such as NIKE and the original MP3.com, as well as startups like FreshPatents, SystemsArchitect.io, API.cc, and Instantiate.io.

My experience ranges from cloud ecommerce, API design/implementation, serverless, AI integration for development, content management, frontend UI/UX architecture and login/authentication. I give tech talks, tutorials and share documentation of architecting software. Also previously held AWS Solutions Architect certification.

Cloud Ebook Store — check for cloud architect and engineering books at a great value, “Cloud Metrics” (800 pages+) and “Cloud Audit” (800 pages+) and more — https://store.systemsarchitect.io


Recently I’m working on Instantiate.io, a value creation experiment tool to help startup planning with AI. I wrote a reference manual on cloud metrics.

I’m also a blockchain enthusiast, actively working on applications in the innovative Solana blockchain ecosystem.
