Almanac of UUIDs · Part II

The Historical Tapestry of the UUID

2.1 Origins: Apollo Computer and the NCS (1980s)

The UUID did not begin as an internet standard but as a proprietary solution within Apollo Computer, a pioneer in the workstation market of the 1980s. Apollo developed the Network Computing System (NCS), one of the first implementations of an object-oriented Remote Procedure Call (RPC) mechanism.

In NCS, objects could reside on any machine in a local network. To invoke a method on a remote object, the system needed a way to address it uniquely, regardless of where it moved or when it was created. Apollo engineers devised the "NCS UUID," a 128-bit value containing a timestamp and the host's network address. The host address kept IDs unique across machines, and the timestamp kept them unique over time on any one machine.

Legacy Artifacts: The "Variant" field in modern UUIDs still reserves the bit pattern 0xxx for backward compatibility with these original Apollo NCS identifiers, so that a modern system can distinguish a 1980s Apollo ID from a 2024 RFC 9562 ID.
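The variant check can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name variant_name is hypothetical, and the 110x and 111x patterns are the remaining reserved values listed in the RFCs rather than anything stated above.

```python
import uuid

def variant_name(u: uuid.UUID) -> str:
    """Classify a UUID by the high bits of octet 8, where the variant lives."""
    octet = u.bytes[8]
    if octet & 0x80 == 0x00:      # 0xxx: reserved for Apollo NCS compatibility
        return "NCS (0xxx)"
    if octet & 0xC0 == 0x80:      # 10xx: the DCE / RFC 4122 / RFC 9562 layout
        return "DCE/RFC (10xx)"
    if octet & 0xE0 == 0xC0:      # 110x: reserved for Microsoft compatibility
        return "Microsoft (110x)"
    return "reserved (111x)"      # 111x: reserved for future definition

print(variant_name(uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")))
```

The standard library exposes the same classification through the variant attribute of uuid.UUID, so the helper is only useful for seeing which bits are inspected.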

2.2 The OSF DCE Era (1990s)

The Open Software Foundation (OSF), a consortium whose founding sponsors included DEC, IBM, and HP and which was formed to compete with Sun Microsystems and AT&T, adopted the Apollo NCS design for its Distributed Computing Environment (DCE).

The DCE specification formalized the UUID layout as the "DCE Variant" (bit pattern 10xx), which remains the standard layout today. Its fields, in order, are:

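  • Time-Low (32 bits)
  • Time-Mid (16 bits)
  • Time-Hi (16 bits, the top 4 of which hold the Version)
  • Clock-Seq (16 bits, the top bits of which hold the Variant)
  • Node (48 bits)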

It was during this era that the "Version" field was introduced to allow for different generation algorithms within the same layout structure.
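As a rough sketch of that layout, the following Python splits a 16-byte UUID into the fields listed above and pulls out the version number. The function name split_fields is hypothetical, and the clock-sequence mask assumes the 10xx variant, which occupies the top two bits of that field.

```python
import struct
import uuid

def split_fields(u: uuid.UUID) -> dict:
    """Unpack the DCE field layout from the 16 big-endian bytes of a UUID."""
    # ">IHHBB6s": 32-bit, 16-bit, 16-bit, two single bytes, then 6 raw node bytes
    time_low, time_mid, time_hi_version, cs_hi, cs_low, node = struct.unpack(
        ">IHHBB6s", u.bytes
    )
    return {
        "time_low": time_low,
        "time_mid": time_mid,
        "time_hi": time_hi_version & 0x0FFF,          # low 12 bits of the field
        "version": time_hi_version >> 12,             # top 4 bits of the field
        "clock_seq": ((cs_hi & 0x3F) << 8) | cs_low,  # drop the 2 variant bits
        "node": int.from_bytes(node, "big"),
    }

fields = split_fields(uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846"))
print(fields["version"], hex(fields["node"]))
```

The same values are available as attributes on uuid.UUID (time_low, clock_seq, node, version); unpacking by hand simply makes the field boundaries visible.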

2.3 The Microsoft Fork: COM and GUIDs

Microsoft adopted the DCE RPC standard for its own middleware, initially for OLE (Object Linking and Embedding) and later for COM (Component Object Model). Microsoft referred to these identifiers as Globally Unique Identifiers (GUIDs).

While functionally identical to UUIDs, Microsoft's implementation introduced a persistent source of confusion: endianness. The original NCS/DCE specification relied on "Network Byte Order" (Big-Endian). However, the Windows ecosystem is built on Intel x86 architecture, which is Little-Endian. Microsoft's GUID structure treats the first three fields as native integers (DWORD, WORD, WORD), storing them in Little-Endian format in memory and on disk, while treating the remaining eight bytes as a plain byte array whose order matches the Big-Endian layout.

Impact

The 16 bytes that a standard POSIX system writes for the UUID 00112233-4455-6677-8899-aabbccddeeff are read back as 33221100-5544-7766-8899-aabbccddeeff by code that assumes the Microsoft layout: the first three fields are byte-swapped and the last eight bytes are unchanged. This discrepancy requires careful handling in cross-platform binary data migrations.
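The mismatch can be reproduced with Python's standard uuid module, which exposes both byte orders as bytes (network order) and bytes_le (first three fields little-endian). The helper names below are hypothetical and the snippet is only a sketch of the conversion.

```python
import uuid

def misread_as_guid(u: uuid.UUID) -> uuid.UUID:
    """Take network-order bytes and interpret them with the Microsoft layout."""
    return uuid.UUID(bytes_le=u.bytes)

def guid_bytes_to_uuid(raw: bytes) -> uuid.UUID:
    """Correctly decode 16 bytes that were written in the Microsoft layout."""
    return uuid.UUID(bytes_le=raw)

original = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
print(misread_as_guid(original))              # 33221100-5544-7766-8899-aabbccddeeff
print(guid_bytes_to_uuid(original.bytes_le))  # round-trips to the original value
```

In a migration, the important design choice is to record which layout a binary column uses and convert once at the boundary, rather than guessing from the data.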

2.4 Standardization: IETF RFC 4122 (2005)

As the internet expanded, the need for a non-proprietary specification became clear. The Internet Engineering Task Force (IETF) published RFC 4122 in July 2005. This document brought the various implementations (NCS, DCE, Microsoft) under a single specification and registered the URN namespace urn:uuid for them.

RFC 4122 defined the "Classic" versions:

  • Version 1: Time-based (Gregorian) + MAC Address.
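  • Version 2: DCE Security (embeds POSIX UIDs/GIDs); the RFC reserves the version number and defers the algorithm to the DCE specification.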
  • Version 3: Name-based (MD5).
  • Version 4: Random.
  • Version 5: Name-based (SHA-1).
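For illustration, the classic versions other than Version 2 can be generated with Python's standard uuid module, which has no Version 2 generator. The function name classic_examples and the name "example.com" are placeholders chosen for this sketch.

```python
import uuid

def classic_examples(name: str = "example.com") -> dict:
    """Generate one UUID for each classic version the standard library supports."""
    return {
        1: uuid.uuid1(),                          # timestamp + node identifier
        3: uuid.uuid3(uuid.NAMESPACE_DNS, name),  # MD5 of namespace + name
        4: uuid.uuid4(),                          # random
        5: uuid.uuid5(uuid.NAMESPACE_DNS, name),  # SHA-1 of namespace + name
    }

for version, value in classic_examples().items():
    # .urn renders the value in the urn:uuid namespace
    print(version, value.urn)
```

Versions 3 and 5 are deterministic for a given namespace and name, while Versions 1 and 4 differ on every call.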

For nearly 20 years, RFC 4122 was the definitive standard. However, it did not anticipate the rise of high-throughput distributed databases (such as Cassandra, DynamoDB, and partitioned PostgreSQL), where the randomness of UUIDv4 scatters consecutive inserts across an ordered index and causes severe indexing inefficiencies.

2.5 The Modern Era: RFC 9562 (2024)

In May 2024, the IETF published RFC 9562, formally obsoleting RFC 4122. This new standard addresses the performance and privacy limitations of the earlier versions. It standardizes the kind of time-ordered identifiers that developers had been creating ad hoc (e.g., ULID, KSUID, COMB GUID).

RFC 9562 introduces:

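  • Version 6: A field-compatible variant of v1 with the timestamp reordered most-significant-first, so that values sort by creation time.
  • Version 7: A Unix-epoch millisecond timestamp followed by random bits, optimized for database locality.
  • Version 8: A customizable format for experimental or vendor-specific logic.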
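A Version 7 value can be assembled by hand to show why it sorts by time. The sketch below assumes the RFC 9562 layout of a 48-bit millisecond timestamp, the 4-bit version, 12 random bits, the 2-bit variant, and 62 more random bits; the function name uuid7_sketch is hypothetical, and the code omits the monotonicity safeguards a production generator would want within a single millisecond.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-shaped value: timestamp first, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 64) & 0x0FFF                # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant
    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: Unix time in ms
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64
    value |= 0b10 << 62                           # bits 63..62: variant 10xx
    value |= rand_b                               # bits 61..0
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier < later)  # True: the timestamp occupies the most significant bits
```

Because the timestamp leads, values created later compare greater, which is what keeps new rows clustered at the end of an ordered index.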