Generating unique IDs in a distributed environment at high scale

Solution from Twitter Snowflake

Twitter Snowflake is a dedicated network service for generating 64-bit unique IDs at high scale. The IDs it generates are roughly time sortable, because the timestamp occupies the most significant bits.

The IDs are made up of the following components:

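  • Sign bit - 1 bit (unused, always 0, so the ID stays positive as a signed 64-bit value)
  • Epoch timestamp in millisecond precision - 41 bits (gives us about 69 years with a custom epoch)
  • Configured machine id - 10 bits (gives us up to 1024 machines)
  • Sequence number - 12 bits (a local counter per machine that rolls over after 4096 IDs within the same millisecond)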
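
The capacities quoted in the list follow directly from the field widths, and the same widths determine how an ID is packed and unpacked. The sketch below is only an illustration under stated assumptions: the class name SnowflakeLayout and its methods are hypothetical, not part of Snowflake, and the layout assumes the top bit is left as an unused sign bit.

```java
// Hypothetical helper class illustrating the bit layout; not part of the original service.
class SnowflakeLayout {
    static final int TIMESTAMP_BITS = 41;
    static final int NODE_ID_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    static final long MAX_NODE_ID = (1L << NODE_ID_BITS) - 1;   // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    // Whole 365-day years that fit into a 41-bit millisecond counter.
    static long yearsCovered() {
        long millisPerYear = 365L * 24 * 60 * 60 * 1000;
        return (1L << TIMESTAMP_BITS) / millisPerYear;
    }

    // Timestamp goes into the high bits, node id in the middle, sequence in the low bits.
    static long pack(long timestamp, long nodeId, long sequence) {
        return (timestamp << (NODE_ID_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }

    static long timestampOf(long id) {
        return id >>> (NODE_ID_BITS + SEQUENCE_BITS);
    }

    static long nodeIdOf(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_NODE_ID;
    }

    static long sequenceOf(long id) {
        return id & MAX_SEQUENCE;
    }

    public static void main(String[] args) {
        System.out.println(yearsCovered()); // prints 69
        long id = pack(1000L, 5L, 7L);
        // prints "1000 5 7"
        System.out.println(timestampOf(id) + " " + nodeIdOf(id) + " " + sequenceOf(id));
    }
}
```

Because the timestamp sits above the other two fields, an ID from a later millisecond always compares greater than any ID from an earlier one, whatever the machine id and sequence are. IDs created within the same millisecond on different machines are ordered by machine id rather than by creation time, which is why the ordering is only "roughly" time based.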

Core part of the code

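// 64-bit layout: [unused sign bit + 41-bit timestamp | 10-bit node id | 12-bit sequence]
private static final int TOTAL_BITS = 64;
// Counts the unused sign bit together with the 41 timestamp bits.
private static final int EPOCH_BITS = 42;
private static final int NODE_ID_BITS = 10;
private static final int SEQUENCE_BITS = 12;

// lastTimestamp, sequence, maxSequence and nodeId are fields of the enclosing class (not shown).
public synchronized long nextId() {
    long currentTimestamp = timestamp();

    // The clock moved backwards; refuse to generate IDs that could collide.
    if (currentTimestamp < lastTimestamp) {
        throw new IllegalStateException("Invalid System Clock!");
    }

    if (currentTimestamp == lastTimestamp) {
        // Same millisecond: advance the counter, wrapping to zero after maxSequence.
        sequence = (sequence + 1) & maxSequence;
        if (sequence == 0) {
            // Sequence exhausted, wait until the next millisecond.
            currentTimestamp = waitNextMillis(currentTimestamp);
        }
    } else {
        // Reset the sequence to zero for the new millisecond.
        sequence = 0;
    }

    lastTimestamp = currentTimestamp;

    // Timestamp in the high bits, node id in the middle, sequence in the low 12 bits.
    long id = currentTimestamp << (TOTAL_BITS - EPOCH_BITS);
    id |= (nodeId << (TOTAL_BITS - EPOCH_BITS - NODE_ID_BITS));
    id |= sequence;
    return id;
}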
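
The excerpt above relies on fields and helpers that it does not show (lastTimestamp, sequence, maxSequence, nodeId, timestamp() and waitNextMillis()). A minimal self-contained sketch of how they might fit together is given below. Everything outside the body of nextId() is an assumption made for illustration: the class name SnowflakeSketch, the constructor, the custom epoch value (2015-01-01T00:00:00Z) and the busy-wait in waitNextMillis() are hypothetical choices, not necessarily what the original post or Twitter's service uses.

```java
// Hypothetical, minimal completion of the excerpt; names outside nextId() are assumptions.
class SnowflakeSketch {
    private static final int NODE_ID_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_NODE_ID = (1L << NODE_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // Assumed custom epoch: 2015-01-01T00:00:00Z in Unix milliseconds.
    private static final long CUSTOM_EPOCH = 1420070400000L;

    private final long nodeId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long nodeId) {
        if (nodeId < 0 || nodeId > MAX_NODE_ID) {
            throw new IllegalArgumentException("nodeId must be between 0 and " + MAX_NODE_ID);
        }
        this.nodeId = nodeId;
    }

    public synchronized long nextId() {
        long currentTimestamp = timestamp();
        if (currentTimestamp < lastTimestamp) {
            throw new IllegalStateException("Invalid System Clock!");
        }
        if (currentTimestamp == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 sequence values used in this millisecond.
                currentTimestamp = waitNextMillis(currentTimestamp);
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = currentTimestamp;
        return (currentTimestamp << (NODE_ID_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }

    // Milliseconds elapsed since the assumed custom epoch.
    private static long timestamp() {
        return System.currentTimeMillis() - CUSTOM_EPOCH;
    }

    // Spin until the clock reports a millisecond later than the last one used.
    private long waitNextMillis(long currentTimestamp) {
        while (currentTimestamp <= lastTimestamp) {
            currentTimestamp = timestamp();
        }
        return currentTimestamp;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(7);
        System.out.println(generator.nextId());
        System.out.println(generator.nextId());
    }
}
```

Within a single generator, each call returns a value strictly greater than the previous one as long as the system clock does not move backwards. Uniqueness across machines depends on every machine being configured with a distinct node id.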

The original post can be found at:

https://www.callicoder.com/distributed-unique-id-sequence-number-generator/