The problem with random IDs
The first time a database index quietly betrayed me, the symptom was boring: a table that inserted at a few thousand rows a second when it was empty was crawling at a few hundred once it had a few tens of millions of rows. No schema change, no new query, no lock contention I could find. The only thing that table had done was grow. The culprit turned out to be the most innocent-looking column in the whole design — the primary key, a UUIDv4.
A UUIDv4 is 122 bits of randomness wearing a 128-bit costume. That randomness is exactly the point: collisions are astronomically unlikely and you can mint one anywhere — client, server, edge — without coordinating with anyone. But "random" has a cost that doesn't show up until your data outgrows RAM, and it's a cost that lives in the B-tree behind your primary key.
A B-tree keeps its keys in sorted order. When a new key is random, it doesn't land at the end — it lands somewhere in the middle, in whichever leaf page happens to own that slice of the keyspace. If that page is full, the database splits it: allocates a new page, moves half the rows, rewrites the parent. Do that for every insert and you get page splits, parent rewrites, and a working set that no longer fits in memory because every write touches a different, cold part of the index. On MySQL's InnoDB, where the primary key is the clustered index, random keys have been measured to bloat on-disk size by 30–60% versus sequential keys. My slowdown wasn't a mystery. It was physics.
The fix is almost insultingly simple: make the ID sort by the time it was created. If new IDs are always a little bigger than old ones, every insert lands at the right edge of the tree, on the same hot leaf page, which stays in cache and fills up before it ever splits. That single property — sortability — is the whole reason ULID exists, and it's why I now reach for it (or its cousin UUIDv7) by default.
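To make the difference concrete, here is a toy model of the leaf level of a B-tree. Treat it as a sketch under loud assumptions: leaf pages only, a made-up `PAGE_CAPACITY` of 64 keys, a naive 50/50 split, and no buffer pool. The `simulate` function and its counters are hypothetical names for illustration, not any database's internals. It counts how many inserts land on the right-most page.

```typescript
// Toy model of B-tree leaf pages. Assumptions: fixed capacity, naive 50/50
// split, no parent levels, no cache. Real engines are far more sophisticated.
const PAGE_CAPACITY = 64;

function simulate(keys: number[]): { pages: number; rightmostHits: number } {
  const pages: number[][] = [[]];
  let rightmostHits = 0;

  for (const key of keys) {
    // The owning page is the first one whose largest key is >= the new key;
    // if there is none, the key belongs on the right-most page.
    let idx = pages.findIndex((p) => p.length > 0 && p[p.length - 1] >= key);
    if (idx === -1) idx = pages.length - 1;
    if (idx === pages.length - 1) rightmostHits++;

    // Insert in sorted position within the page.
    const page = pages[idx];
    let pos = page.findIndex((k) => k > key);
    if (pos === -1) pos = page.length;
    page.splice(pos, 0, key);

    // Overflow: move the upper half into a new page to the right.
    if (page.length > PAGE_CAPACITY) {
      const upperHalf = page.splice(page.length >> 1);
      pages.splice(idx + 1, 0, upperHalf);
    }
  }
  return { pages: pages.length, rightmostHits };
}

const N = 5000;
const sequentialKeys = Array.from({ length: N }, (_, i) => i);
const randomKeys = Array.from({ length: N }, () => Math.random());

const sequential = simulate(sequentialKeys);
const random = simulate(randomKeys);
console.log("sequential:", sequential);
console.log("random:    ", random);
```

With sequential keys every insert hits the right-most page; with random keys the writes scatter across the whole tree. Because this model splits 50/50 even at the right edge, it says nothing about fill factor or on-disk size. It only illustrates where the writes go.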
What a ULID actually is
ULID stands for Universally Unique Lexicographically Sortable Identifier. It's the same size as a UUID — 128 bits — but those bits are arranged for a job instead of scattered at random:
- 48 bits of timestamp — Unix time in milliseconds, most-significant first.
- 80 bits of randomness — cryptographically secure, freshly drawn per ID.
Put the time first and a beautiful thing falls out for free: because the most significant bits are the timestamp, sorting the IDs as plain strings sorts them by creation time. No separate created_at column to order by, no composite index — the key is the clock.
Figure 1 — the timestamp goes first, so the leftmost characters carry the most weight; sorting the string sorts by time.
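One way to see that the key really is the clock is to read the timestamp back out. The sketch below is a hypothetical `decodeTime` helper, not part of the generator shown later: it treats the first ten characters as a base32 number and returns milliseconds since the Unix epoch.

```typescript
const CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Read the first 10 characters back into milliseconds since the Unix epoch.
function decodeTime(id: string): number {
  let ms = 0;
  for (const ch of id.slice(0, 10).toUpperCase()) {
    const value = CROCKFORD_ALPHABET.indexOf(ch);
    if (value === -1) throw new Error(`invalid ULID character: ${ch}`);
    ms = ms * 32 + value; // shift left by 5 bits, then add the next symbol
  }
  return ms;
}

const exampleId = "01ARZ3NDEKTSV4RRFFQ69G5FAV";
const createdAt = new Date(decodeTime(exampleId));
console.log(createdAt.toISOString());
```

The multiplication stays exact because a 48-bit timestamp fits comfortably inside a JavaScript number's 53 bits of integer precision.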
The string form is where ULID quietly out-classes UUID. A UUID is 36 characters with its four hyphens (550e8400-e29b-41d4-a716-446655440000). A ULID is 26 characters, no hyphens (01ARZ3NDEKTSV4RRFFQ69G5FAV), because it's encoded in Crockford's base32, five bits per character. Crockford's alphabet deliberately drops the letters I, L, O, and U: I and L so nobody confuses them with 1, O so nobody confuses it with 0, and U to make accidental obscenities less likely. The whole thing is case-insensitive and URL-safe: no escaping, no + or / surprises like base64.
The headline numbers: 80 random bits give you 2⁸⁰ ≈ 1.21 × 10²⁴ unique ULIDs per millisecond, and since 48 bits of milliseconds runs out at epoch 2⁴⁸ − 1, you won't exhaust the timestamp until the year 10889 AD. The largest legal ULID is 7ZZZZZZZZZZZZZZZZZZZZZZZZZ: 26 base32 characters could hold 130 bits, so the first character carries only three of them and tops out at 7. Anything above that should be rejected.
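That rejection rule fits in a single regular expression. This is a minimal sketch; `isValidUlid` and `ULID_PATTERN` are names made up for this example, and it checks shape only, not whether the timestamp is plausible.

```typescript
// First character 0-7 (it carries only 3 bits), then 25 Crockford symbols.
// The character class skips I, L, O and U; the `i` flag accepts lowercase.
const ULID_PATTERN = /^[0-7][0-9A-HJKMNP-TV-Z]{25}$/i;

function isValidUlid(candidate: string): boolean {
  return ULID_PATTERN.test(candidate);
}

console.log(isValidUlid("01ARZ3NDEKTSV4RRFFQ69G5FAV"));
console.log(isValidUlid("8ZZZZZZZZZZZZZZZZZZZZZZZZZ"));
```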
Building one in ~40 lines
ULID isn't a library you have to take on faith — the generator I keep in my utils folder is about forty lines, and reading it is the fastest way to understand the format. It starts with the alphabet:
```typescript
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"; // 32 symbols: no I, L, O, U
const TIME_LEN = 10; // 48 bits of time → 10 base32 chars
const RAND_LEN = 16; // 80 bits of random → 16 base32 chars
```

Ten plus sixteen is twenty-six: the full ULID length, accounted for before we write a single function. Encoding the timestamp is just repeated division: peel off five bits at a time (one base32 symbol), least-significant first, and prepend each symbol so the string is built right-to-left.
```typescript
function encodeTime(now: number): string {
  let n = now;
  let out = "";
  for (let i = TIME_LEN - 1; i >= 0; i--) {
    const mod = n % 32; // low 5 bits → one Crockford symbol
    out = CROCKFORD[mod] + out;
    n = (n - mod) / 32; // shift right by 5 bits
  }
  return out;
}
```

The random half pulls 16 bytes from the platform CSPRNG and maps each into the alphabet:
```typescript
function encodeRandom(): string {
  const bytes = crypto.getRandomValues(new Uint8Array(RAND_LEN));
  let out = "";
  for (let i = 0; i < RAND_LEN; i++) {
    out += CROCKFORD[bytes[i] % 32]; // 256 = 8×32 → no modulo bias
  }
  return out;
}
```

That % 32 deserves a second look, because modulo-into-an-alphabet is a classic place to introduce bias. Here it's safe by arithmetic: a byte is 0..255, and 256 = 8 × 32 exactly, so each of the 32 symbols is hit by precisely eight byte values. The distribution stays uniform. (If the alphabet size didn't divide 256 evenly, say base 31, this would skew toward the low symbols and you'd need rejection sampling.) The whole public surface is two functions:
```typescript
/** Generate a ULID. `now` is injectable so tests are deterministic. */
export function ulid(now: number = Date.now()): string {
  return encodeTime(now) + encodeRandom();
}

/** `<prefix>_<ULID>`, e.g. prefixedId("user") → "user_01ARZ3NDEK..." */
export function prefixedId(prefix: string, now?: number): string {
  return `${prefix}_${ulid(now)}`;
}
```

That prefixedId helper is a small habit I'd recommend to anyone: a user_…, order_…, inv_… prefix turns an opaque ID into a self-describing one. You can tell at a glance whether an ID in a log line is a user or an order, and passing an order ID where a user ID belongs becomes hard to miss, because the prefix screams the mistake. Stripe built half its developer-experience reputation on exactly this.
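If you want the compiler to scream too, TypeScript's template literal types can carry the prefix. The sketch below is one possible approach, not something the generator above requires; `UserId`, `isUserId`, and `loadUser` are hypothetical names for illustration.

```typescript
// A template literal type: any string that starts with "user_".
type UserId = `user_${string}`;

// Runtime guard for values arriving from outside (URLs, JSON, log lines).
function isUserId(value: string): value is UserId {
  return /^user_[0-7][0-9A-HJKMNP-TV-Z]{25}$/.test(value);
}

function loadUser(id: UserId): string {
  return `loading ${id}`;
}

// Calling loadUser("order_…") with a literal would be a compile-time error.
const incoming: string = "order_01ARZ3NDEKTSV4RRFFQ69G5FAV";
if (isUserId(incoming)) {
  console.log(loadUser(incoming));
} else {
  console.log("rejected: not a user ID");
}
```

The type catches mix-ups between literals and already-typed values; the guard covers strings whose shape is only known at runtime.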
Why sortable IDs are kind to your database
Back to the slowdown that started this. Swap the random key for a ULID and the access pattern inverts. New IDs are always slightly larger than the ones before them, so every INSERT targets the right-most leaf of the B-tree. That one page stays hot in the buffer pool, fills up sequentially, and only splits when it's genuinely full — no scattered writes into cold pages, no thrashing the cache with parts of the index you'll never read again.
Figure 2 — same tree, two key strategies. The right-hand pattern is what keeps inserts fast as the table grows.
This isn't hand-waving. A 2025 comparative study of identifier schemes and a pile of independent Postgres and MySQL benchmarks all land in the same place: time-ordered keys (ULID and UUIDv7) deliver meaningfully faster inserts — often quoted at 2–5× over UUIDv4 under write load — and smaller indexes, precisely because they cut the random disk I/O that page splits cause. You get a few more wins for free:
- Range scans by time become cheap. "Give me the 100 newest orders" is a tail scan of the primary key, not a sort over a secondary created_at index.
- Pagination is stable. Keyset pagination (WHERE id > $last ORDER BY id) walks the data in creation order without a separate sort key.
- Debugging gets easier. Eyeball two IDs and the bigger one was created later. That's a surprisingly nice property at 2 a.m.
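The keyset idea is easy to sketch without a database. The example below runs over an in-memory array purely for illustration; the `Row` shape, the `pageAfter` helper, and the sample IDs are all assumptions, and a real system would push the same comparison into the SQL query.

```typescript
interface Row {
  id: string; // a ULID
  total: number;
}

// Equivalent in spirit to: WHERE id > $last ORDER BY id LIMIT $limit
function pageAfter(rows: Row[], last: string | null, limit: number): Row[] {
  return rows
    .filter((row) => last === null || row.id > last)
    // Plain code-unit comparison, not localeCompare: ULID order is byte order.
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .slice(0, limit);
}

// Made-up IDs whose timestamp prefixes differ in the 10th character.
const orders: Row[] = [
  { id: "01ARZ3NDENBBBBBBBBBBBBBBBB", total: 30 },
  { id: "01ARZ3NDEKTSV4RRFFQ69G5FAV", total: 10 },
  { id: "01ARZ3NDEMAAAAAAAAAAAAAAAA", total: 20 },
];

const firstPage = pageAfter(orders, null, 2);
const secondPage = pageAfter(orders, firstPage[firstPage.length - 1].id, 2);
console.log(firstPage.map((r) => r.total), secondPage.map((r) => r.total));
```

The cursor is just the last ID of the previous page, so there is no offset to drift when new rows arrive.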
ULID vs UUIDv4 vs UUIDv7
Here's the honest comparison, because ULID is not automatically the right answer in 2026.
| | UUIDv4 | UUIDv7 | ULID |
|---|---|---|---|
| Bits | 128 (122 random) | 128 (48 time + 74 random) | 128 (48 time + 80 random) |
| Time-sortable | ❌ | ✅ | ✅ |
| String length | 36 (hyphenated) | 36 (hyphenated) | 26 (base32) |
| Standardized | RFC 9562 | RFC 9562 | community spec |
| Native DB type | ✅ uuid | ✅ uuid | ⚠️ store as text/bytes |
| Ecosystem support | universal | broad, growing | library-level |
Figure 3 — the only bits that matter for sort order are the first 48. UUIDv7 spends a little on version/variant; ULID spends nothing.
The plot twist is UUIDv7. When ULID was created, the UUID standard had no time-ordered option: v1 leaked your MAC address, v4 was pure random. ULID filled a real gap. But RFC 9562 (2024) standardized UUIDv7, which puts the same 48-bit millisecond timestamp up front and gets the same index-friendly behavior, while staying inside the 128-bit UUID format that databases and ORMs already handle. Postgres stores it in its native uuid type; MySQL has no dedicated UUID type, but stores one compactly as BINARY(16) via its built-in UUID_TO_BIN() and BIN_TO_UUID() functions. If you're starting fresh and your database has a first-class UUID type, UUIDv7 is the lower-friction default.
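For comparison with the ULID generator, here is a rough sketch of the UUIDv7 bit layout. It is an illustration of the RFC 9562 field positions, not a production library: the `uuidv7` function name and its injectable `random` parameter are choices made for this example, and it ignores the RFC's optional monotonicity methods.

```typescript
// Layout: 48-bit ms timestamp | 4-bit version | 12 random bits |
//         2-bit variant | 62 random bits  (74 random bits in total).
function uuidv7(
  now: number = Date.now(),
  random: Uint8Array = crypto.getRandomValues(new Uint8Array(16)),
): string {
  const bytes = new Uint8Array(random); // copy, so the caller's buffer is untouched

  // Bytes 0-5: timestamp, big-endian, so byte order matches sort order.
  let t = now;
  for (let i = 5; i >= 0; i--) {
    bytes[i] = t % 256;
    t = Math.floor(t / 256);
  }

  bytes[6] = 0x70 | (bytes[6] & 0x0f); // version 7 in the high nibble
  bytes[8] = 0x80 | (bytes[8] & 0x3f); // variant: top two bits are 10

  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

console.log(uuidv7());
```

The six bits spent on version and variant are the whole difference between the 74 random bits of UUIDv7 and the 80 of ULID.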
So when does ULID still win? When the string is what you handle most: a 26-character, hyphen-free, case-insensitive token is nicer in URLs, log lines, and copy-paste than a 36-character UUID with four hyphens. ULIDs also slot cleanly into systems that store IDs as strings anyway — many document stores, key-value layers, and edge databases — where the "native uuid type" advantage of v7 evaporates. Same core idea, different ergonomics.
The gotchas nobody mentions
Every ID scheme has sharp edges, and time-ordered ones have a specific set. I'd rather you learn them from a blog post than from production.
The big one in my own generator: it's not monotonic. The ULID spec defines an optional monotonic mode — if you mint several IDs in the same millisecond, each one increments the random component of the previous instead of redrawing it, so they still sort in creation order within that millisecond. My forty-line version skips that: it draws fresh randomness every call. The consequence is real and worth stating plainly:
Gotcha: with non-monotonic generation, two ULIDs created in the same millisecond sort in random order relative to each other. Across milliseconds the order is perfect; within one it's a coin flip. For "newest first" feeds this is invisible. If you depend on strict insertion order for IDs minted in a tight loop, you need monotonic mode (or a real sequence).
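For reference, monotonic mode is not much more code. The sketch below is one way to implement the increment-on-same-millisecond rule, under stated assumptions: `monotonicFactory`, `encodeTimePart`, and the injectable `randomDigits` source are names chosen for this example, the random part is held as sixteen base32 digits, and a clock that steps backwards is treated like a repeated millisecond.

```typescript
const ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

function encodeTimePart(ms: number): string {
  let out = "";
  let n = ms;
  for (let i = 0; i < 10; i++) {
    out = ALPHABET[n % 32] + out;
    n = Math.floor(n / 32);
  }
  return out;
}

// Returns a generator that remembers the last timestamp and random digits.
function monotonicFactory(
  randomDigits: () => number[] = () =>
    Array.from(crypto.getRandomValues(new Uint8Array(16)), (b) => b % 32),
): (now?: number) => string {
  let lastTime = -1;
  let lastRandom: number[] = [];

  return (now: number = Date.now()): string => {
    if (now > lastTime) {
      // New millisecond: draw fresh randomness.
      lastTime = now;
      lastRandom = randomDigits();
    } else {
      // Same (or earlier) millisecond: add 1 to the 80-bit random part.
      if (lastRandom.every((d) => d === 31)) {
        throw new Error("ULID random component overflow");
      }
      let i = lastRandom.length - 1;
      while (lastRandom[i] === 31) {
        lastRandom[i] = 0; // carry into the next digit to the left
        i--;
      }
      lastRandom[i]++;
    }
    return encodeTimePart(lastTime) + lastRandom.map((d) => ALPHABET[d]).join("");
  };
}

const nextId = monotonicFactory();
const burst = [nextId(1000), nextId(1000), nextId(1000)];
console.log(burst);
```

The trade-off is state: the generator must remember its last output, so each process (or each factory instance) has its own monotonic sequence, and IDs from different machines in the same millisecond still interleave arbitrarily.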
The second edge is privacy, and it's the same for ULID and UUIDv7: the timestamp is right there in the ID. Anyone holding one can read, to the millisecond, when it was created. That's a feature for your database and a leak for your users — it can expose signup times, order volumes (mint two IDs, diff the timestamps, estimate throughput), and growth rates.
Security note: a ULID is an identifier, not a secret. The 80 random bits make guessing a specific valid ID hard, but the embedded timestamp leaks metadata and a monotonic stream is partially predictable. Never use a ULID as a password-reset token, session token, or capability URL. For secrets, generate dedicated high-entropy tokens.
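A dedicated token can be as small as the sketch below. It assumes a runtime that exposes the Web Crypto `crypto.getRandomValues` global, as the generator above does; `secretToken` is a hypothetical helper name, and how you store, expire, and compare such tokens is out of scope here.

```typescript
// 32 bytes = 256 bits straight from the platform CSPRNG, hex-encoded.
// No timestamp and no structure, so there is nothing to infer from it.
function secretToken(byteLength: number = 32): string {
  const bytes = crypto.getRandomValues(new Uint8Array(byteLength));
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

const resetToken = secretToken();
console.log(resetToken.length);
```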
And the theoretical one, for completeness: if you somehow mint more than 2⁸⁰ ULIDs in a single millisecond (or overflow the random field in monotonic mode), generation is defined to fail rather than hand back a duplicate. You will never hit this — 2⁸⁰ is a trillion trillion — but it's why a correct implementation can return an error here instead of silently wrapping.
When I reach for ULID (and when I don't)
After all of that, my actual decision rule is short.
✅ Reach for a time-ordered ID (ULID or UUIDv7) when:
- It's a primary key on a table that will see heavy inserts and grow large.
- You want cheap "newest first" queries and keyset pagination without a separate sort column.
- You generate IDs in a distributed system and can't lean on a central auto-increment sequence.
✅ Prefer ULID specifically when:
- The ID lives in URLs, logs, or anywhere humans read and copy it — 26 chars beats 36.
- Your storage treats IDs as strings anyway (document/KV/edge stores), so UUIDv7's native-type edge doesn't apply.
- You like prefixed IDs (user_…, order_…) for self-documenting keys.
❌ Don't use either (or be careful) when:
- The value is a secret or capability token — use real high-entropy tokens instead.
- The creation time is sensitive and the ID is exposed to untrusted parties.
- You're on a stack with a first-class uuid type and no string-handling pain: then plain UUIDv7 is the lower-friction pick.
- You truly need zero correlation between ID and time: then UUIDv4 is still the right, deliberate choice.
Takeaways
Three ideas worth carrying to your own systems, whatever ID you land on:
- Your primary key has a performance profile, not just a uniqueness guarantee. Random keys fragment B-trees; time-ordered keys append. The cost is invisible until your data outgrows RAM, and then it's the whole game.
- Sortability is a feature you can bake into the identifier itself. Putting a 48-bit timestamp in the high bits turns "order by creation time" from a query concern into a property of the data.
- An identifier is not a secret. The same timestamp that makes ULIDs fast also leaks when they were made. Pick the tool for the job — sortable IDs for keys, dedicated tokens for secrets.
ULID was the right answer to a problem the UUID standard hadn't solved yet; UUIDv7 has since caught up inside the standard. But the lesson outlives the turf war: if your IDs already know what time it is, your database doesn't have to work nearly as hard.
