Distributed Unique ID(Java Version)

George Chou
Javarevisited
Published in
5 min readFeb 11, 2024

In distributed systems, some scenarios require the use of globally unique IDs, both as business identifiers and to satisfy the idempotency design of interfaces.
In the case of a single table, we can directly use the database’s self-incremented ID, but after the library and table can not meet the demand, we need to find ways to achieve this through other means.

For a globally unique ID, the characteristics needed are:

  • Globally unique
  • Incremental
  • Highly available
  • Autonomous
  • Security (will not be exposed)

Database AUTO_INCREMENT id

Implementation

1. Use the MySQL database’s AUTO_INCREMENT id feature to get a sequence.
2. When the server gets the sequence, it sets a threshold of `max=sequence * step`.
3. Immediately after obtaining an ID, it is accumulated until it reaches the maximum threshold.

private static final Long STEP = 1000L;

private Long curId = 0L;

public synchronized Long generateId() {
if (curId >= max) {
Sequence sequence = new Sequence();
mapper.sequence(sequence);
curId = sequence.getId();

max = id * STEP;;
}

return curId++;
}

Pros

  • id incremental
  • simple to implement

Cons

  • Database dependency required
  • Server reboot will lead to id discontinuity

UUID

Implementation

UUID uuid = UUID.randomUUID();
String uuidAsString = uuid.toString();

Pros

  • No need to rely on middleware or third-party services can be generated independently
  • Highly concurrent
  • There are implementation libraries for each language that can be used directly

Cons

  • String storage, space-consuming, poor DB query, and indexing efficiency.
  • Not self-incrementing IDs, poor readability.
  • Different implementations may leak information

Redis

Implementation

Use incr or increby to generate IDs using the single-threaded nature of Redis.

Dependency

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
<version>2.7.2</version>
</dependency>

Java Configuration

@Bean
JedisConnectionFactory jedisConnectionFactory() {
JedisConnectionFactory jedisConFactory = new JedisConnectionFactory();
jedisConFactory.setHostName("localhost");
jedisConFactory.setPort(6379);
return jedisConFactory;
}

@Bean
public RedisTemplate<String, Object> redisTemplate() {
RedisTemplate<String, Object> template = new RedisTemplate<>();
template.setConnectionFactory(jedisConnectionFactory());
return template;
}

Generate Id

@Autowired
private RedisTemplate<String, Object> redisTemplate;

private static final String ID_KEY = "key:id";

public Long generateId() {
if (!redisTemplate.hasKey(ID_KEY)) {
redisTemplate.opsForValue().set(ID_KEY, 0);
}

return redisTemplate.opsForValue().increment(ID_KEY, 1);
}

Pros

  • Much better performance than databases
  • Can be deployed in clusters
  • Generated IDs can have time information

Cons

  • Requires the introduction of middleware, which increases the complexity of the system
  • The service that sends the IDs is also a single point.

Twitter Snowflake

Twitter Snowflake is a dedicated network service for generating 64-bit unique IDs at high scale. The IDs generated by this service are roughly time-sorted.

The IDs are made up of the following components:

  • Epoch timestamp in millisecond precision — 41 bits (gives us 69 years with a custom epoch)
  • Configured machine ID — 10 bits (gives us up to 1024 machines)
  • Sequence number — 12 bits (A local counter per machine that rolls over every 4096)

Implementation
Here is the Java version of the snowflake, from github.

import java.net.NetworkInterface;
import java.security.SecureRandom;
import java.time.Instant;
import java.util.Enumeration;

/**
* Distributed Sequence Generator.
* Inspired by Twitter snowflake: https://github.com/twitter/snowflake/tree/snowflake-2010
*
* This class should be used as a Singleton.
* Make sure that you create and reuse a Single instance of Snowflake per node in your distributed system cluster.
*/
public class Snowflake {
private static final int UNUSED_BITS = 1; // Sign bit, Unused (always set to 0)
private static final int EPOCH_BITS = 41;
private static final int NODE_ID_BITS = 10;
private static final int SEQUENCE_BITS = 12;

private static final long maxNodeId = (1L << NODE_ID_BITS) - 1;
private static final long maxSequence = (1L << SEQUENCE_BITS) - 1;

// Custom Epoch (January 1, 2015 Midnight UTC = 2015-01-01T00:00:00Z)
private static final long DEFAULT_CUSTOM_EPOCH = 1420070400000L;

private final long nodeId;
private final long customEpoch;

private volatile long lastTimestamp = -1L;
private volatile long sequence = 0L;

// Create Snowflake with a nodeId and custom epoch
public Snowflake(long nodeId, long customEpoch) {
if(nodeId < 0 || nodeId > maxNodeId) {
throw new IllegalArgumentException(String.format("NodeId must be between %d and %d", 0, maxNodeId));
}
this.nodeId = nodeId;
this.customEpoch = customEpoch;
}

// Create Snowflake with a nodeId
public Snowflake(long nodeId) {
this(nodeId, DEFAULT_CUSTOM_EPOCH);
}

// Let Snowflake generate a nodeId
public Snowflake() {
this.nodeId = createNodeId();
this.customEpoch = DEFAULT_CUSTOM_EPOCH;
}

public synchronized long nextId() {
long currentTimestamp = timestamp();

if(currentTimestamp < lastTimestamp) {
throw new IllegalStateException("Invalid System Clock!");
}

if (currentTimestamp == lastTimestamp) {
sequence = (sequence + 1) & maxSequence;
if(sequence == 0) {
// Sequence Exhausted, wait till next millisecond.
currentTimestamp = waitNextMillis(currentTimestamp);
}
} else {
// reset sequence to start with zero for the next millisecond
sequence = 0;
}

lastTimestamp = currentTimestamp;

long id = currentTimestamp << (NODE_ID_BITS + SEQUENCE_BITS)
| (nodeId << SEQUENCE_BITS)
| sequence;

return id;
}


// Get current timestamp in milliseconds, adjust for the custom epoch.
private long timestamp() {
return Instant.now().toEpochMilli() - customEpoch;
}

// Block and wait till next millisecond
private long waitNextMillis(long currentTimestamp) {
while (currentTimestamp == lastTimestamp) {
currentTimestamp = timestamp();
}
return currentTimestamp;
}

private long createNodeId() {
long nodeId;
try {
StringBuilder sb = new StringBuilder();
Enumeration<NetworkInterface> networkInterfaces = NetworkInterface.getNetworkInterfaces();
while (networkInterfaces.hasMoreElements()) {
NetworkInterface networkInterface = networkInterfaces.nextElement();
byte[] mac = networkInterface.getHardwareAddress();
if (mac != null) {
for(byte macPort: mac) {
sb.append(String.format("%02X", macPort));
}
}
}
nodeId = sb.toString().hashCode();
} catch (Exception ex) {
nodeId = (new SecureRandom().nextInt());
}
nodeId = nodeId & maxNodeId;
return nodeId;
}

public long[] parse(long id) {
long maskNodeId = ((1L << NODE_ID_BITS) - 1) << SEQUENCE_BITS;
long maskSequence = (1L << SEQUENCE_BITS) - 1;

long timestamp = (id >> (NODE_ID_BITS + SEQUENCE_BITS)) + customEpoch;
long nodeId = (id & maskNodeId) >> SEQUENCE_BITS;
long sequence = id & maskSequence;

return new long[]{timestamp, nodeId, sequence};
}

@Override
public String toString() {
return "Snowflake Settings [EPOCH_BITS=" + EPOCH_BITS + ", NODE_ID_BITS=" + NODE_ID_BITS
+ ", SEQUENCE_BITS=" + SEQUENCE_BITS + ", CUSTOM_EPOCH=" + customEpoch
+ ", NodeId=" + nodeId + "]";
}
}

Pros

  • Local generation, no network consumption, no need to introduce middleware
  • Highly concurrent
  • Contains time information

Cons

  • Depending on the machine clock, there is a risk of duplicate IDs being generated for the same machine if the time is dialed back

These schemes do not satisfy all the global ID characteristics, just like the CAP characteristics of a distributed system, we can only try to satisfy as many of them as possible.

There is no best solution, only the most suitable one.

--

--