What is a UUID? A Complete Guide to Universally Unique Identifiers

Every record in a database needs an identifier. For decades, auto-incrementing integers were the default choice — simple, small, and fast. Then distributed systems arrived and exposed the fundamental flaw: an integer counter only works when there is a single authority issuing numbers. UUIDs solve this problem elegantly. This guide explains exactly how.

What is a UUID?

A UUID (Universally Unique Identifier) is a 128-bit identifier designed to be unique across space and time without requiring a central authority to issue it. UUIDs were standardized in RFC 4122, published in 2005 and superseded by RFC 9562 in 2024, though the concept originates from the Apollo NCA and later the Open Software Foundation’s Distributed Computing Environment.

The key property is in the name: universally unique. Any system anywhere in the world can generate a UUID independently and be statistically certain it will not collide with a UUID generated by any other system. This is what enables distributed systems, offline-first applications, and multi-region databases to create records without coordination.

UUID Format

A UUID is 128 bits (16 bytes) of data, conventionally displayed as 32 hexadecimal digits grouped by hyphens in the pattern 8-4-4-4-12:

xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx

Example:

550e8400-e29b-41d4-a716-446655440000
  • Total characters: 36 (32 hex digits + 4 hyphens)
  • M position (13th hex digit): encodes the UUID version (1–8)
  • N position (17th hex digit): encodes the UUID variant (RFC 4122 uses 8, 9, a, or b here)

The variant field indicates which layout the UUID follows. Nearly all UUIDs in use today are the RFC 4122 variant: the two most significant bits of the octet that starts the fourth group are 10, which is why the N digit is always 8, 9, a, or b.
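To make the M and N positions concrete, here is a minimal Python sketch that reads the version and variant bits straight from the raw bytes. The helper name version_and_variant is made up for this illustration; only the standard uuid module is used.

```python
import uuid


def version_and_variant(text: str) -> tuple[int, int]:
    """Return (version, top two variant bits) read directly from the raw bytes."""
    raw = uuid.UUID(text).bytes
    version = raw[6] >> 4        # high nibble of octet 6 is the M digit
    variant_bits = raw[8] >> 6   # top two bits of octet 8 (the start of the N digit)
    return version, variant_bits


version, variant_bits = version_and_variant("550e8400-e29b-41d4-a716-446655440000")
print(version, bin(variant_bits))  # 4 0b10
```

In everyday code the version and variant attributes of uuid.UUID give the same information; the bit shifts are only there to show where the fields live.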

UUID Versions Explained

RFC 4122 originally defined versions 1, 2, 3, 4, and 5. RFC 9562 (2024) added versions 6, 7, and 8. Here are the versions you will actually encounter:

Version 1: Timestamp + MAC Address

UUID v1 combines the current timestamp (in 100-nanosecond intervals since October 15, 1582) with the MAC address of the generating machine’s network interface.

time_low-time_mid-1xxx-Nxxx-node
(where 1xxx is the version digit plus the high timestamp bits, and Nxxx is the variant plus the clock sequence)

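Pros: Embeds creation time, extremely low collision risk.

Cons: Leaks the machine’s MAC address (a privacy concern) and is partly predictable. Embedded timestamps are only comparable across machines if their clocks are synchronized. It is also not sortable as a string or byte sequence: the low bits of the timestamp come first, so lexicographic order does not follow creation order (UUID v6 reorders the same fields to fix this).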

Use when: You need embedded timestamps and operate in a trusted, controlled environment.
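As a rough illustration of the embedded timestamp, the sketch below converts the 60-bit v1 timestamp (100-nanosecond intervals since October 15, 1582) into a UTC datetime. The function name uuid1_datetime and the constant name are chosen for this example; the offset value is the number of 100-ns intervals between the Gregorian epoch and the Unix epoch.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-ns intervals between 1582-10-15 and 1970-01-01 (the Unix epoch).
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000


def uuid1_datetime(u: uuid.UUID) -> datetime:
    """Recover the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("expected a version 1 UUID")
    unix_100ns = u.time - GREGORIAN_TO_UNIX_100NS
    # Integer microseconds avoid float rounding on large timestamps.
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
        microseconds=unix_100ns // 10
    )


generated = uuid.uuid1()
print(uuid1_datetime(generated))  # close to the current UTC time
```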

Version 4: Random

UUID v4 is 122 bits of cryptographically random data (the remaining 6 bits are used for the version and variant flags). It is the most widely used version by far.

xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
(where y is 8, 9, a, or b)

Pros: No information leakage, simple to generate, universally supported, practically zero collision probability.

Cons: Not sortable (random order degrades B-tree index performance over time), slightly larger than integer IDs.

Use when: You need unique identifiers and don’t require ordering or embedded metadata.
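To show where the 6 fixed bits go, here is a minimal sketch that builds a v4 UUID by hand from 16 random bytes. In practice uuid.uuid4() does this for you; the name make_uuid4 is only for this illustration.

```python
import os
import uuid


def make_uuid4() -> uuid.UUID:
    """Build a version 4 UUID manually from OS-provided random bytes."""
    raw = bytearray(os.urandom(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # set the 4 version bits to 0100
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the 2 variant bits to 10
    return uuid.UUID(bytes=bytes(raw))


example = make_uuid4()
print(example, example.version)
```

Everything other than those two masked positions stays random, which is where the 122-bit figure comes from.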

Version 5: Namespace + Name (SHA-1)

UUID v5 generates a deterministic UUID from a namespace UUID and a name string, using SHA-1 hashing. Given the same namespace and name, you always get the same UUID v5.

import uuid
# Always produces the same result:
ns = uuid.NAMESPACE_URL
result = uuid.uuid5(ns, "https://example.com/article/42")
# '8edb5c1b-4ddc-5b5f-8b71-4fcf05462be5' (always)

RFC 4122 defines four standard namespaces:

  • NAMESPACE_DNS — domain names
  • NAMESPACE_URL — URLs
  • NAMESPACE_OID — ISO OIDs
  • NAMESPACE_X500 — X.500 distinguished names

Pros: Deterministic — same input always produces the same UUID. No storage required to look up whether something has been seen.

Cons: SHA-1 is considered cryptographically weak (not a security concern for UUIDs, but something to be aware of). UUID v3 (MD5) and v5 (SHA-1) are both deterministic; prefer v5 over v3.

Use when: You need stable, reproducible identifiers for named resources (content-addressable IDs, deduplication keys).
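The determinism is easier to trust once you see the derivation. The sketch below recomputes a v5 UUID by hand (SHA-1 over the namespace bytes plus the name, truncated to 16 bytes, then stamped with the version and variant bits) and compares it with the standard library. The helper name manual_uuid5 is made up for this example.

```python
import hashlib
import uuid


def manual_uuid5(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Derive a version 5 UUID step by step."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])     # SHA-1 yields 20 bytes; keep the first 16
    raw[6] = (raw[6] & 0x0F) | 0x50  # version 5
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits 10
    return uuid.UUID(bytes=bytes(raw))


name = "https://example.com/article/42"
print(manual_uuid5(uuid.NAMESPACE_URL, name) == uuid.uuid5(uuid.NAMESPACE_URL, name))  # True
```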

Version 7: Timestamp + Random (Sortable)

UUID v7, defined in RFC 9562 (2024), is the modern replacement for v1. It combines a Unix millisecond timestamp in the most significant bits with random data in the remaining bits.

01951234-abcd-7xxx-yxxx-xxxxxxxxxxxx
^^^^^^^^ ^^^^
Unix ms timestamp (48 bits)

Pros: Lexicographically sortable by creation time (unlike v4), embeds a Unix timestamp (compatible with modern tooling), does not leak MAC addresses (unlike v1), suitable for database primary keys.

Cons: Newer — library support is less universal than v4, though adoption is growing rapidly.

Use when: You need UUIDs as database primary keys and want index-friendly ordering. UUID v7 is increasingly the recommended default for new projects.
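Here is a minimal sketch of the v7 layout, assuming the simplest variant: a 48-bit Unix millisecond timestamp followed by random bits, with no extra counter for ordering within a single millisecond. The names make_uuid7 and uuid7_unix_ms are invented for this example; it is hand-rolled so that it does not depend on a particular library version.

```python
import os
import time
import uuid
from typing import Optional


def make_uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 UUID: 48-bit ms timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70  # version 7
    raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits 10
    return uuid.UUID(bytes=bytes(raw))


def uuid7_unix_ms(u: uuid.UUID) -> int:
    """Read the millisecond timestamp back out of the top 48 bits."""
    return u.int >> 80


earlier = make_uuid7(1_700_000_000_000)
later = make_uuid7(1_700_000_000_001)
print(str(earlier) < str(later))  # True: string order follows timestamp order
```

Two IDs created in the same millisecond are ordered only by their random bits in this sketch, so it gives time ordering at millisecond granularity rather than strict monotonicity.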

UUID vs Auto-Increment IDs

| Property | UUID (v4) | Auto-Increment Integer |
|---|---|---|
| Generation | Any node, offline | Requires database coordination |
| Collision risk | Negligible | None (sequential) |
| Size | 16 bytes (128 bits) | 4–8 bytes (32–64 bits) |
| Readability | Opaque | Sequential, predictable |
| Enumeration risk | Not guessable | Trivially guessable (/users/1, /users/2) |
| Index performance | Poor (random inserts) | Excellent (sequential inserts) |
| Sortable | No (v4), Yes (v7) | Yes |
| Distributed systems | Excellent | Requires coordination |
| Exposing record count | No | Yes |

The enumeration risk of sequential IDs is often underappreciated. An API using /users/1234 makes it trivially easy for anyone to scrape all user records by incrementing the ID. UUIDs provide free security-by-obscurity — not a substitute for proper authorization, but a useful additional layer.

When to Use UUIDs

Distributed Systems and Microservices

When multiple services or database nodes can independently create records, UUIDs allow each service to generate IDs without communicating with a central sequence generator. A mobile app can create a record offline, assign it a UUID, and sync it to the server later with zero risk of ID collision.

API Resource Identifiers

Public-facing APIs should generally use UUIDs instead of sequential IDs in their URLs and responses. This prevents enumeration attacks and avoids leaking business metrics (how many users you have, how many orders were placed today).

# Avoid (leaks record count, guessable):
GET /api/orders/1042

# Better (opaque, not guessable):
GET /api/orders/f47ac10b-58cc-4372-a567-0e02b2c3d479

Database Primary Keys

UUIDs work well as primary keys, especially in distributed databases (CockroachDB, PlanetScale, distributed PostgreSQL). The trade-off: random UUIDs (v4) cause index fragmentation in B-tree indexes because inserts land at random positions rather than always at the end. UUID v7 solves this by being time-ordered.
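The insert-position argument can be checked with a toy model. In the sketch below a sorted Python list stands in for a B-tree index (a simplification, not a real storage engine), and we count how many inserts land at the very end. The time-ordered keys use a hand-built v7-style layout with one distinct millisecond per key; all function names are made up for this example.

```python
import bisect
import os
import uuid


def random_key() -> bytes:
    """16-byte key from a random v4 UUID."""
    return uuid.uuid4().bytes


def time_ordered_key(unix_ms: int) -> bytes:
    """16-byte key in a v7-style layout: 48-bit ms timestamp, then random bits."""
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70
    raw[8] = (raw[8] & 0x3F) | 0x80
    return bytes(raw)


def count_tail_inserts(keys) -> int:
    """Insert keys into a sorted list; count inserts that append at the end."""
    index = []
    tail = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos == len(index):
            tail += 1
        index.insert(pos, key)
    return tail


n = 2000
random_tail = count_tail_inserts(random_key() for _ in range(n))
ordered_tail = count_tail_inserts(time_ordered_key(1_700_000_000_000 + i) for i in range(n))
print(random_tail, ordered_tail)  # a handful for v4 keys, all 2000 for ordered keys
```

Appending at the end is the cheap case for a B-tree; inserts scattered across the key space are what lead to page splits and fragmentation.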

Idempotency Keys

Payment processors (Stripe, Adyen) and messaging systems use UUIDs as idempotency keys — you generate a UUID per operation and include it in the request header. If the request is retried, the server recognizes the UUID and returns the original result instead of processing twice.

const idempotencyKey = crypto.randomUUID();
await fetch("/api/payments", {
  method: "POST",
  headers: {
    "Idempotency-Key": idempotencyKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ amount: 9900, currency: "usd" }),
});
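The receiving side of this pattern can be sketched in a few lines of Python. PaymentService and its fields are hypothetical and kept in memory for illustration; this is not how Stripe or Adyen implement it, and a real service would store keys durably and handle concurrent retries.

```python
import uuid


class PaymentService:
    """Hypothetical in-memory service that deduplicates by idempotency key."""

    def __init__(self) -> None:
        self._results = {}      # idempotency key -> stored response
        self.charges_made = 0   # how many times we actually charged

    def charge(self, idempotency_key: str, amount: int, currency: str) -> dict:
        # A retry with a known key returns the original result untouched.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": str(uuid.uuid4()), "amount": amount, "currency": currency}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())
first = service.charge(key, 9900, "usd")
retry = service.charge(key, 9900, "usd")
print(first == retry, service.charges_made)  # True 1
```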

Generate UUIDs for any of these purposes instantly with our UUID Generator tool.

Collision Probability

The birthday paradox tells us that collision probability grows with the number of IDs generated. For UUID v4 (122 random bits), the numbers are remarkable:

  • Generating 1 billion UUIDs per second, it would take about 86 years (roughly 2.7 × 10^18 UUIDs) for the probability of at least one collision to reach 50%.
  • For a 1 in a billion chance of a collision, you would need to generate roughly 103 trillion UUIDs.
  • The total number of possible UUID v4 values is 2^122 ≈ 5.3 × 10^36.

For virtually all practical applications, UUID v4 collision probability is effectively zero. It only becomes a real concern when generating hundreds of trillions of identifiers or more, at which point a more structured scheme is warranted: UUID v7, where only IDs created in the same millisecond can collide, or a custom scheme.
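These figures follow from the standard birthday approximation, p ≈ 1 − exp(−n² / 2N), where N = 2^122 is the number of possible v4 values and n is the number of IDs generated. The sketch below evaluates it in both directions; the function names are chosen for this example.

```python
import math

V4_SPACE = 2 ** 122  # number of distinct version 4 UUIDs


def collision_probability(n: float, space: int = V4_SPACE) -> float:
    """Approximate probability of at least one collision among n random IDs."""
    return -math.expm1(-(n * n) / (2 * space))


def ids_for_probability(p: float, space: int = V4_SPACE) -> float:
    """Approximate number of IDs needed to reach collision probability p."""
    return math.sqrt(-2 * space * math.log1p(-p))


print(f"{ids_for_probability(0.5):.2e}")   # 2.71e+18
print(f"{ids_for_probability(1e-9):.2e}")  # 1.03e+14
```

expm1 and log1p are used instead of plain exp and log so that very small probabilities do not get lost to floating-point rounding.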

The caveat: UUID v4 relies on a cryptographically secure random number generator (CSPRNG). A poor RNG (broken PRNG, insufficient entropy at system boot) can dramatically increase collision probability. Always use platform-provided CSPRNGs (crypto.randomUUID(), os.urandom(), crypto/rand) rather than math-based random functions.
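A small anti-example makes the risk visible. Below, a v4-shaped UUID is built from Python’s non-cryptographic random module; two processes that happen to start from the same seed produce the same "unique" ID. The helper weak_uuid4 is deliberately bad and exists only for this demonstration.

```python
import random
import uuid


def weak_uuid4(rng: random.Random) -> uuid.UUID:
    """Anti-example: a v4-shaped UUID drawn from a non-cryptographic PRNG."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)


# Two generators with the same seed stand in for two badly seeded machines.
a = weak_uuid4(random.Random(42))
b = weak_uuid4(random.Random(42))
print(a == b)  # True: identical state, identical "unique" ID

# uuid.uuid4() draws from the operating system's CSPRNG instead.
c, d = uuid.uuid4(), uuid.uuid4()
print(c == d)  # False
```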

Code Examples

JavaScript

// Modern browsers and Node.js 19+ (Web Crypto API)
const id = crypto.randomUUID();
console.log(id); // "f47ac10b-58cc-4372-a567-0e02b2c3d479"

// Node.js (built-in crypto module, v15.6+), written as an ES module import
// so it can sit in the same file as the imports below
import { randomUUID } from "node:crypto";
const id2 = randomUUID();

// npm package 'uuid' for all versions and namespaces
import { v4 as uuidv4, v5 as uuidv5, v7 as uuidv7 } from "uuid";

const randomId = uuidv4();
const sortableId = uuidv7(); // time-ordered, great for DB PKs

// Deterministic v5 UUID
const NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"; // NAMESPACE_DNS
const articleId = uuidv5("example.com", NAMESPACE);

// Validate a UUID string
import { validate, version } from "uuid";
validate("550e8400-e29b-41d4-a716-446655440000"); // true
version("550e8400-e29b-41d4-a716-446655440000");  // 4

Python

import uuid

# UUID v4 (random)
random_id = uuid.uuid4()
print(repr(random_id))  # e.g. UUID('f47ac10b-58cc-4372-a567-0e02b2c3d479')
print(random_id)        # e.g. f47ac10b-58cc-4372-a567-0e02b2c3d479 (print() uses str())

# UUID v1 (timestamp + node)
time_id = uuid.uuid1()

# UUID v5 (deterministic)
article_id = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/article/42")
print(article_id)  # Always the same value for the same input

# Parse a UUID string
parsed = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(parsed.version)  # 4
print(parsed.bytes)    # Raw 16 bytes

# Get just the hex string without hyphens
print(random_id.hex)   # 'f47ac10b58cc4372a5670e02b2c3d479'

Go

package main

import (
    "fmt"

    "github.com/google/uuid"
)

func main() {
    // UUID v4 (random)
    id, err := uuid.NewRandom()
    if err != nil {
        panic(err)
    }
    fmt.Println(id.String()) // e.g. f47ac10b-58cc-4372-a567-0e02b2c3d479

    // UUID v7 (time-ordered); needs a recent release of google/uuid
    v7, err := uuid.NewV7()
    if err != nil {
        panic(err)
    }
    fmt.Println(v7.String())

    // UUID v5 (deterministic: same namespace and name give the same UUID)
    ns := uuid.NameSpaceURL
    articleID := uuid.NewSHA1(ns, []byte("https://example.com/article/42"))
    fmt.Println(articleID.String())

    // Parse and validate
    parsed, err := uuid.Parse("550e8400-e29b-41d4-a716-446655440000")
    if err != nil {
        fmt.Println("invalid UUID")
        return
    }
    fmt.Println(parsed.Version()) // VERSION_4 (Version has a String method)
}

Best Practices

Use lowercase. RFC 4122 specifies that UUID hex digits should be output in lowercase. While comparison is typically case-insensitive, lowercase is the canonical form: f47ac10b-58cc-4372-a567-0e02b2c3d479, not F47AC10B-58CC-4372-A567-0E02B2C3D479.

Store as binary when possible. A UUID string is 36 characters (288 bits). Storing it as a 16-byte binary value (BINARY(16) in MySQL, uuid type in PostgreSQL) cuts storage by more than half and makes index entries smaller, which speeds up lookups.
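The size difference is easy to confirm. This short sketch converts a UUID to the 16 raw bytes you would put in a binary column and back again, using only the standard uuid module.

```python
import uuid

u = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")

as_text = str(u)     # what a CHAR(36) column would hold
as_binary = u.bytes  # what a BINARY(16) column would hold
print(len(as_text), len(as_binary))  # 36 16

# Converting back for display is lossless.
restored = uuid.UUID(bytes=as_binary)
print(restored == u)  # True
```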

Prefer UUID v7 for database primary keys. The time-ordered property keeps B-tree indexes efficient and allows you to sort records by insertion order using the ID alone. UUID v4 causes page splits in B-tree indexes as inserts land at random positions.

Don’t use UUIDs as security tokens. UUIDs are for uniqueness, not secrecy. A 128-bit UUID v4 has 122 bits of randomness — strong enough for IDs, but purpose-built security tokens (session IDs, API keys, password reset tokens) should use a dedicated token generation library that generates at least 160 bits of entropy and encodes as base64url or hex.
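For comparison, here is one way to mint a purpose-built token with Python’s standard secrets module. The function name new_session_token and the 32-byte default (256 bits, comfortably above the 160-bit floor mentioned above) are choices made for this sketch, not a prescribed standard.

```python
import secrets


def new_session_token(num_bytes: int = 32) -> str:
    """Return a base64url-encoded token carrying num_bytes of CSPRNG output."""
    return secrets.token_urlsafe(num_bytes)


token = new_session_token()
print(len(token))  # 43 characters for 32 random bytes
```

Unlike a UUID, every bit of the token is random: there are no version or variant bits, and no timestamp to give an attacker a head start.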

Include hyphens in display. The hyphenated format (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) is the canonical representation and what most systems expect. Strip hyphens only when storing in a compact binary column, and re-add them when exposing to clients.
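If your system receives UUIDs from several sources, a small normalizer keeps the canonical form consistent. The sketch below leans on Python’s uuid.UUID parser, which accepts uppercase, unhyphenated, braced, and urn:uuid: inputs; the helper name canonical_uuid is made up for this example.

```python
import uuid


def canonical_uuid(text: str) -> str:
    """Return the lowercase, hyphenated form; raises ValueError on bad input."""
    return str(uuid.UUID(text))


print(canonical_uuid("F47AC10B58CC4372A5670E02B2C3D479"))
# f47ac10b-58cc-4372-a567-0e02b2c3d479
```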

Use our UUID Generator tool to generate v4 and v7 UUIDs instantly.

FAQ

Can two independently generated UUID v4s ever be the same?

Theoretically yes, but the probability is so low it is treated as impossible in practice. With a high-quality CSPRNG, you would need to generate approximately 2.7 × 10^18 UUIDs before having a 50% chance of a single collision. At one billion UUIDs per second, that would take about 86 years. For all practical purposes, UUIDs are collision-free.

Should I use UUID or ULID?

ULID (Universally Unique Lexicographically Sortable Identifier) is a community alternative to UUID that is 26 characters (Crockford Base32), URL-safe, and lexicographically sortable by creation time. It predates UUID v7. Both solve the same problem — sortable unique IDs. UUID v7 is now the RFC-standardized answer to this need, so prefer v7 for new projects for broader ecosystem compatibility.

Why does PostgreSQL have a native uuid type but MySQL doesn’t?

PostgreSQL has a native uuid data type that stores the value as 16 bytes and provides efficient indexing. MySQL lacks a native UUID type; the common workaround is BINARY(16) with application-level conversion, or the UUID_TO_BIN() / BIN_TO_UUID() functions available since MySQL 8.0. UUID_TO_BIN() also accepts an optional swap flag that exchanges the time-low and time-high fields, which improves index locality for version 1 UUIDs.
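If you do the conversion in the application instead, the swapped layout can be sketched in Python as below. This mirrors my reading of the documented behaviour of UUID_TO_BIN(uuid, 1), so treat it as an assumption and check it against your MySQL version before relying on it; the function names are invented for this example.

```python
import uuid


def to_bin_swapped(text: str) -> bytes:
    """Reorder a UUID so the time-high group comes first, then mid, then low."""
    b = uuid.UUID(text).bytes
    # Original layout: time_low = b[0:4], time_mid = b[4:6], time_hi = b[6:8]
    return b[6:8] + b[4:6] + b[0:4] + b[8:]


def from_bin_swapped(blob: bytes) -> str:
    """Undo to_bin_swapped and return the canonical string form."""
    return str(uuid.UUID(bytes=blob[4:8] + blob[2:4] + blob[0:2] + blob[8:]))


original = "6ccd780c-baba-1026-9564-5b8c656024db"
stored = to_bin_swapped(original)
print(stored.hex())  # 1026baba6ccd780c95645b8c656024db
print(from_bin_swapped(stored) == original)  # True
```

The reordering only helps for version 1 UUIDs, where it moves the slowly changing high timestamp bits to the front of the key.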

Is a UUID the same as a GUID?

Essentially yes. GUID (Globally Unique Identifier) is Microsoft’s term for what RFC 4122 calls a UUID. The formats are compatible: a GUID has the same 8-4-4-4-12 structure. One minor difference is the binary layout. The .NET Guid type stores its first three fields in little-endian order (mixed-endian overall), which shows up in Guid.ToByteArray(), so raw bytes may need reordering when exchanged with other systems. The string representation is the same, and the two terms are used interchangeably in most contexts.
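Python’s uuid module exposes both byte orders, which makes the difference easy to see. In this sketch, bytes is the big-endian layout used by RFC 4122 and bytes_le is the layout with the first three fields little-endian.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

print(u.bytes.hex())     # 00112233445566778899aabbccddeeff
print(u.bytes_le.hex())  # 33221100554477668899aabbccddeeff

# Reading mixed-endian bytes back yields the same UUID.
print(uuid.UUID(bytes_le=u.bytes_le) == u)  # True
```

Only the first three groups are byte-swapped; the last eight bytes are identical in both layouts.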
