Title: | Generate Universally Unique 'Lexicographically' 'Sortable' Identifiers |
---|---|
Description: | Universally unique identifiers ('UUIDs') can be sub-optimal for many use cases because they are not the most character-efficient way of encoding 128 bits of randomness; v1/v2 versions are impractical in many environments, as they require access to a unique, stable MAC address; v3/v5 versions require a unique seed and produce randomly distributed IDs, which can cause fragmentation in many data structures; v4 provides no information other than randomness, which can cause fragmentation in many data structures. As an alternative, 'ULIDs' (<https://github.com/ulid/spec>) have 128-bit compatibility with 'UUID', allow 1.21e+24 unique 'ULIDs' per millisecond, support standard (text) sorting, are canonically encoded as a 26-character string (as opposed to the 36-character 'UUID'), use 'base32' encoding for better efficiency and readability (5 bits per character), are case insensitive, have no special characters (i.e. are 'URL' safe) and have a monotonic sort order (correctly detecting and handling IDs within the same millisecond). |
Authors: | Bob Rudis [aut], Suyash Verma [aut] (ULID C++ <https://github.com/suyash/ulid/>), Dirk Eddelbuettel [aut, cre] |
Maintainer: | Dirk Eddelbuettel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4.0.1 |
Built: | 2024-10-29 04:46:07 UTC |
Source: | https://github.com/eddelbuettel/ulid |
generate() generates a new Universally Unique Lexicographically Sortable Identifier. Several aliases are available for convenience and backwards compatibility.
ts_generate() generates a new Universally Unique Lexicographically Sortable Identifier from a vector of POSIXct timestamps.
As described in the ulid specification repo (https://github.com/ulid/spec), and slightly edited here, UUID can be suboptimal for many use cases because:
It isn't the most character efficient way of encoding 128 bits of randomness
UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address
UUID v3/v5 requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures
UUID v4 provides no other information than randomness which can cause fragmentation in many data structures
Instead, an alternative is proposed in ULID:
ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV
with the following properties:
128-bit compatibility with UUID
1.21e+24 unique ULIDs per millisecond
Lexicographically sortable!
Canonically encoded as a 26 character string, as opposed to the 36 character UUID
Uses Crockford's base32 for better efficiency and readability (5 bits per character)
Case insensitive
No special characters (URL safe)
Monotonic sort order (correctly detects and handles the same millisecond)
     01AN4Z07BY      79KA1307SR9X4MV3

    |----------|    |----------------|
     Timestamp          Randomness
       48bits             80bits
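The layout above can be illustrated with a minimal base R sketch that splits a ULID into its two fields and decodes the timestamp. It does not use the package's own functions; the helper names `decode_crockford()` and `split_ulid()` are hypothetical and introduced only for illustration. The sketch also assumes strict Crockford input and does not map the look-alike characters I, L and O.

```r
# Crockford base32 alphabet used by ULID (omits I, L, O and U).
crockford <- strsplit("0123456789ABCDEFGHJKMNPQRSTVWXYZ", "")[[1]]

# Hypothetical helper: decode a Crockford base32 string into a number.
# Doubles are exact up to 2^53, which covers the 48-bit timestamp.
decode_crockford <- function(s) {
  chars <- strsplit(toupper(s), "")[[1]]   # ULIDs are case insensitive
  values <- match(chars, crockford) - 1    # character -> 5-bit value
  stopifnot(!anyNA(values))                # reject characters outside the alphabet
  Reduce(function(acc, v) acc * 32 + v, values, 0)
}

# Hypothetical helper: split a 26-character ULID into its two fields.
split_ulid <- function(id) {
  stopifnot(nchar(id) == 26)
  list(
    timestamp_ms = decode_crockford(substr(id, 1, 10)),  # first 10 chars, 48 bits
    randomness   = substr(id, 11, 26)                    # last 16 chars, 80 bits
  )
}

parts <- split_ulid("01ARZ3NDEKTSV4RRFFQ69G5FAV")
parts$randomness
as.POSIXct(parts$timestamp_ms / 1000, origin = "1970-01-01", tz = "UTC")
```

Only the timestamp is converted to a number here, because the 80-bit randomness field exceeds what a double can represent exactly.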
Components
Timestamp
48 bit integer
UNIX-time in milliseconds
Will not run out of space until the year 10889 AD.
Randomness
80 bits
Cryptographically secure source of randomness, if possible
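The capacity figures quoted above follow from the two field widths. The short base R calculation below is a sketch of that arithmetic; the variable names are illustrative, and the year estimate assumes a mean Gregorian year of 365.2425 days rather than exact calendar arithmetic.

```r
# 80 random bits give the number of distinct ULIDs per millisecond.
ids_per_ms <- 2^80
ids_per_ms_text <- format(ids_per_ms, digits = 3)
ids_per_ms_text

# 26 characters at 5 bits each hold 130 bits, enough for the 128-bit value.
bits_available <- 26 * 5
bits_available

# A 48-bit millisecond counter that starts at the UNIX epoch (1970).
# Approximation: mean Gregorian year of 365.2425 days.
ms_per_year <- 1000 * 60 * 60 * 24 * 365.2425
last_year <- floor(1970 + 2^48 / ms_per_year)
last_year
```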
Sorting
The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.
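The sorting rule can be checked with a small base R sketch: encoding millisecond timestamps as 10 Crockford base32 characters and sorting the strings in ASCII order should reproduce the numeric order of the timestamps. The helper `encode_ts()` is hypothetical and only covers the timestamp prefix; it is not the package's implementation.

```r
# Crockford base32 alphabet; its characters are already in ascending ASCII order.
crockford <- strsplit("0123456789ABCDEFGHJKMNPQRSTVWXYZ", "")[[1]]

# Hypothetical helper: encode a millisecond count as 10 Crockford characters.
encode_ts <- function(ms) {
  digits <- character(10)
  for (i in 10:1) {                        # fill from the right-most character
    digits[i] <- crockford[ms %% 32 + 1]   # lowest 5 bits -> one character
    ms <- ms %/% 32
  }
  paste(digits, collapse = "")
}

# Three timestamps in milliseconds, deliberately out of order.
ms <- c(1509548400000, 1469922850259, 1509548400001)
prefixes <- vapply(ms, encode_ts, character(1))

# method = "radix" collates strings in the C locale, i.e. plain ASCII order,
# regardless of the session locale.
sorted_prefixes <- sort(prefixes, method = "radix")
identical(sorted_prefixes, prefixes[order(ms)])
```

Using `method = "radix"` matters because the default `sort()` collates according to the session locale, which is not guaranteed to match ASCII order.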
    generate(n = 1L)
    unmarshal(ulids)
    ts_generate(tsv)
    ulid(n = 1L)
    ulid_generate(n = 1L)
    ULIDgenerate(n = 1L)
n | number of IDs to generate (default = 1L) |
ulids | character ULIDs (e.g. created with generate()) |
tsv | vector of POSIXct timestamps |
Note that up to and including release 0.3.1, the implementation had limitations that resulted in second rather than millisecond resolution. This was addressed in release 0.4.0, and millisecond resolution is now supported as expected.
A data.frame with two columns, ts and rnd.
Bob Rudis ([email protected]) wrote the package based on the ulid C++ library by Suyash Verma.
Dirk Eddelbuettel now maintains the package.
The ulid specification provides the reference.
    ULIDgenerate()
    unmarshal(generate())
    ts_generate(as.POSIXct("2017-11-01 15:00:00", origin="1970-01-01"))