Distributed Unique ID(Java Version)

Published in

Javarevisited

5 min readFeb 11, 2024

In distributed systems, some scenarios require the use of globally unique IDs, both as business identifiers and to satisfy the idempotency design of interfaces.
In the case of a single table, we can directly use the database’s self-incremented ID, but after the library and table can not meet the demand, we need to find ways to achieve this through other means.

For a globally unique ID, the characteristics needed are:

Globally unique
Incremental
Highly available
Autonomous
Security (will not be exposed)

Database AUTO_INCREMENT id

Implementation

1. Use the MySQL database’s AUTO_INCREMENT id feature to get a sequence.
2. When the server gets the sequence, it sets a threshold of `max=sequence * step`.
3. Immediately after obtaining an ID, it is accumulated until it reaches the maximum threshold.

private static final Long STEP = 1000L;

private Long curId = 0L;

public synchronized Long generateId() {  
    if (curId >= max) {  
        Sequence sequence = new Sequence();  
        mapper.sequence(sequence);  
        curId = sequence.getId();  
  
        max = id * STEP;;  
    }  
  
    return curId++;  
}

Pros

id incremental
simple to implement

Cons

Database dependency required
Server reboot will lead to id discontinuity

UUID

Implementation

UUID uuid = UUID.randomUUID();
String uuidAsString = uuid.toString();

Pros

No need to rely on middleware or third-party services can be generated independently
Highly concurrent
There are implementation libraries for each language that can be used directly

Cons

String storage, space-consuming, poor DB query, and indexing efficiency.
Not self-incrementing IDs, poor readability.
Different implementations may leak information

Redis

Implementation

Use incr or increby to generate IDs using the single-threaded nature of Redis.

Dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
    <version>2.7.2</version>
</dependency>

Java Configuration

@Bean
JedisConnectionFactory jedisConnectionFactory() {
    JedisConnectionFactory jedisConFactory = new JedisConnectionFactory();
    jedisConFactory.setHostName("localhost"); 
    jedisConFactory.setPort(6379); 
    return jedisConFactory;
}

@Bean
public RedisTemplate<String, Object> redisTemplate() {
    RedisTemplate<String, Object> template = new RedisTemplate<>();
    template.setConnectionFactory(jedisConnectionFactory());
    return template;
}

Generate Id

@Autowired
private RedisTemplate<String, Object> redisTemplate;

private static final String ID_KEY = "key:id";

public Long generateId() {
 if (!redisTemplate.hasKey(ID_KEY)) {
  redisTemplate.opsForValue().set(ID_KEY, 0);
 }
 
 return redisTemplate.opsForValue().increment(ID_KEY, 1);
}

Pros

Much better performance than databases
Can be deployed in clusters
Generated IDs can have time information

Cons

Requires the introduction of middleware, which increases the complexity of the system
The service that sends the IDs is also a single point.

Twitter Snowflake

Twitter Snowflake is a dedicated network service for generating 64-bit unique IDs at high scale. The IDs generated by this service are roughly time-sorted.

The IDs are made up of the following components:

Epoch timestamp in millisecond precision — 41 bits (gives us 69 years with a custom epoch)
Configured machine ID — 10 bits (gives us up to 1024 machines)
Sequence number — 12 bits (A local counter per machine that rolls over every 4096)

Implementation
Here is the Java version of the snowflake, from github.

import java.net.NetworkInterface;
import java.security.SecureRandom;
import java.time.Instant;
import java.util.Enumeration;

/**
 * Distributed Sequence Generator.
 * Inspired by Twitter snowflake: https://github.com/twitter/snowflake/tree/snowflake-2010
 *
 * This class should be used as a Singleton.
 * Make sure that you create and reuse a Single instance of Snowflake per node in your distributed system cluster.
 */
public class Snowflake {
    private static final int UNUSED_BITS = 1; // Sign bit, Unused (always set to 0)
    private static final int EPOCH_BITS = 41;
    private static final int NODE_ID_BITS = 10;
    private static final int SEQUENCE_BITS = 12;

    private static final long maxNodeId = (1L << NODE_ID_BITS) - 1;
    private static final long maxSequence = (1L << SEQUENCE_BITS) - 1;

    // Custom Epoch (January 1, 2015 Midnight UTC = 2015-01-01T00:00:00Z)
    private static final long DEFAULT_CUSTOM_EPOCH = 1420070400000L;

    private final long nodeId;
    private final long customEpoch;

    private volatile long lastTimestamp = -1L;
    private volatile long sequence = 0L;

    // Create Snowflake with a nodeId and custom epoch
    public Snowflake(long nodeId, long customEpoch) {
        if(nodeId < 0 || nodeId > maxNodeId) {
            throw new IllegalArgumentException(String.format("NodeId must be between %d and %d", 0, maxNodeId));
        }
        this.nodeId = nodeId;
        this.customEpoch = customEpoch;
    }

    // Create Snowflake with a nodeId
    public Snowflake(long nodeId) {
        this(nodeId, DEFAULT_CUSTOM_EPOCH);
    }

    // Let Snowflake generate a nodeId
    public Snowflake() {
        this.nodeId = createNodeId();
        this.customEpoch = DEFAULT_CUSTOM_EPOCH;
    }

    public synchronized long nextId() {
        long currentTimestamp = timestamp();

        if(currentTimestamp < lastTimestamp) {
            throw new IllegalStateException("Invalid System Clock!");
        }

        if (currentTimestamp == lastTimestamp) {
            sequence = (sequence + 1) & maxSequence;
            if(sequence == 0) {
                // Sequence Exhausted, wait till next millisecond.
                currentTimestamp = waitNextMillis(currentTimestamp);
            }
        } else {
            // reset sequence to start with zero for the next millisecond
            sequence = 0;
        }

        lastTimestamp = currentTimestamp;

        long id = currentTimestamp << (NODE_ID_BITS + SEQUENCE_BITS)
                | (nodeId << SEQUENCE_BITS)
                | sequence;

        return id;
    }


    // Get current timestamp in milliseconds, adjust for the custom epoch.
    private long timestamp() {
        return Instant.now().toEpochMilli() - customEpoch;
    }

    // Block and wait till next millisecond
    private long waitNextMillis(long currentTimestamp) {
        while (currentTimestamp == lastTimestamp) {
            currentTimestamp = timestamp();
        }
        return currentTimestamp;
    }

    private long createNodeId() {
        long nodeId;
        try {
            StringBuilder sb = new StringBuilder();
            Enumeration<NetworkInterface> networkInterfaces = NetworkInterface.getNetworkInterfaces();
            while (networkInterfaces.hasMoreElements()) {
                NetworkInterface networkInterface = networkInterfaces.nextElement();
                byte[] mac = networkInterface.getHardwareAddress();
                if (mac != null) {
                    for(byte macPort: mac) {
                        sb.append(String.format("%02X", macPort));
                    }
                }
            }
            nodeId = sb.toString().hashCode();
        } catch (Exception ex) {
            nodeId = (new SecureRandom().nextInt());
        }
        nodeId = nodeId & maxNodeId;
        return nodeId;
    }

    public long[] parse(long id) {
        long maskNodeId = ((1L << NODE_ID_BITS) - 1) << SEQUENCE_BITS;
        long maskSequence = (1L << SEQUENCE_BITS) - 1;

        long timestamp = (id >> (NODE_ID_BITS + SEQUENCE_BITS)) + customEpoch;
        long nodeId = (id & maskNodeId) >> SEQUENCE_BITS;
        long sequence = id & maskSequence;

        return new long[]{timestamp, nodeId, sequence};
    }

    @Override
    public String toString() {
        return "Snowflake Settings [EPOCH_BITS=" + EPOCH_BITS + ", NODE_ID_BITS=" + NODE_ID_BITS
                + ", SEQUENCE_BITS=" + SEQUENCE_BITS + ", CUSTOM_EPOCH=" + customEpoch
                + ", NodeId=" + nodeId + "]";
    }
}

Pros

Local generation, no network consumption, no need to introduce middleware
Highly concurrent
Contains time information

Cons

Depending on the machine clock, there is a risk of duplicate IDs being generated for the same machine if the time is dialed back

These schemes do not satisfy all the global ID characteristics, just like the CAP characteristics of a distributed system, we can only try to satisfy as many of them as possible.

There is no best solution, only the most suitable one.

Distributed Unique ID(Java Version)

Database AUTO_INCREMENT id

Implementation

UUID

Implementation

Redis

Implementation

Twitter Snowflake

Written by George Chou