Web Design/IDs

How to generate resource IDs

Resource IDs are strings that uniquely identify an item such as a social media post, comment, playlist, tweet or video.

They allow software to distinguish between items, even those with identical user-specified names or titles.

A resource ID works like a “number plate” for each resource and is included in its URL.

Current examples

One can take a look at how existing websites generate resource identifiers:

Website name	Character count	Character range	Combinations per character	Total combinations
Twitter tweet IDs	≤19	[0-9] (numeric)	10	10¹⁹
YouTube video IDs	11	[-_0-9A-Za-z]	64	73.786.976.294.838.206.464 (73.7×10¹⁸)
Dailymotion public video IDs	≤8	[0-9a-z]	36	2.821.109.907.456 (2.82×10¹²)
Archive.Today	4 (early 2012) 5 (late 2012 - today)	[0-9A-Za-z]	62	930.909.168 (62⁴+62⁵)
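The totals in the table follow from raising the number of combinations per character to the power of the character count. The following is a minimal Python sketch of that arithmetic; the names `schemes` and `total_combinations` are chosen for illustration only, and the lengths are taken directly from the table (for Archive.Today, both lengths are summed).

```python
# Each entry: alphabet size and the ID lengths listed in the table above.
schemes = {
    "Twitter tweet IDs": (10, [19]),
    "YouTube video IDs": (64, [11]),
    "Dailymotion public video IDs": (36, [8]),
    "Archive.Today": (62, [4, 5]),
}

def total_combinations(alphabet_size, lengths):
    """Sum alphabet_size ** length over every length that is in use."""
    return sum(alphabet_size ** n for n in lengths)

totals = {name: total_combinations(*spec) for name, spec in schemes.items()}

for name, total in totals.items():
    # Use a dot as the thousands separator, as in the table.
    print(f"{name}: {total:,}".replace(",", "."))
```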

How to do it correctly

Avoid dash characters

A dash character (“-”) is considered a separation character.

In order to make it more convenient for users to highlight strings, avoid using any dashes in IDs.

In addition, dashes impede cursor navigation using ctrl + ← and ctrl + →.

Test highlighting here (double-click on desktop, hold on mobile):

4gSOMba1UdM (none)
4gSOM-a1UdM (with dash)
4gSOM_a1UdM (with underscore)

Some browsers (e.g. Mozilla Firefox) and text editors might also treat an underscore as a separation character.

It is recommended to only use numbers and/or letters to avoid these problems.
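As an illustration, a site could check candidate IDs against this recommendation before issuing them. The sketch below assumes Python and its standard `re` module; the function name `is_selection_friendly` is hypothetical.

```python
import re

# Only digits and ASCII letters, so no dash, underscore or other separator.
SAFE_ID = re.compile(r"[0-9A-Za-z]+")

def is_selection_friendly(resource_id: str) -> bool:
    """Return True if the whole ID consists of digits and letters only."""
    return SAFE_ID.fullmatch(resource_id) is not None

for candidate in ("4gSOMba1UdM", "4gSOM-a1UdM", "4gSOM_a1UdM"):
    print(candidate, is_selection_friendly(candidate))
```

Only the first of the three test strings from above passes this check.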

Case-insensitive

Random identifiers that include letters should be case-insensitive. Case sensitivity only increases the number of combinations per character, which can be compensated for by making the ID slightly longer.

In addition, case-insensitive identifiers are much easier to read aloud and to write down on paper.
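One possible way to make lookups case-insensitive is to fold every ID to a single case before storing or comparing it. The sketch below is only an assumption about how a site might do this; `normalize_id`, `find_video` and the in-memory `videos` dictionary are hypothetical stand-ins for a real database.

```python
def normalize_id(resource_id: str) -> str:
    """Fold an ID to lower case so that lookups ignore case."""
    return resource_id.lower()

# Hypothetical in-memory store; keys are always stored in normalized form.
videos = {normalize_id("4GSOMBA1UDM"): "Example video"}

def find_video(resource_id: str):
    """Return the stored title, or None if the ID is unknown."""
    return videos.get(normalize_id(resource_id))

print(find_video("4gSoMbA1uDm"))
```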

Confusing characters

The characters “i”, “I”, “l” and “j” look too similar to each other in many fonts.

If possible, they should be excluded from the identifier: [^jiIl].
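Combining the recommendations so far, an ID generator could draw from digits and lower-case letters with the confusing characters removed. This is a sketch under assumptions: the source does not prescribe a random source or a length, so the use of Python's `secrets` module, the default length of 13 and the names `ALPHABET` and `generate_id` are illustrative choices. A real system would also need to check that a new ID is not already taken.

```python
import secrets
import string

# Characters excluded because they are easily confused ("I" is already
# gone because only lower-case letters are used).
EXCLUDED = set("ilj")

# Digits plus lower-case letters, minus the excluded characters.
ALPHABET = "".join(
    c for c in string.digits + string.ascii_lowercase if c not in EXCLUDED
)

def generate_id(length: int = 13) -> str:
    """Return a random ID of the given length drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(len(ALPHABET), generate_id())
```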

Combinations

With the recommendations above applied, each character has 33 possible values: 26 letters plus 10 digits, minus “i”, “l” and “j” (“I” is already covered by case insensitivity).

Each character of a YouTube video ID currently has 64 possible values (sometimes called “Base64”, not to be confused with Base64 encoding). Making a Base33 video ID just two characters longer (13 instead of 11) already nearly compensates for the smaller number of values per character.
Three additional characters (14 instead of 11) overcompensate by far, while making the ID more convenient overall.

64¹¹= 73.786.976.294.838.206.464 (current)
33¹⁰= 1.531.578.985.264.449 (reference)
33¹¹= 50.542.106.513.726.817 (reference)
33¹²= 1.667.889.514.952.984.961 (reference)
33¹³= 55.040.353.993.448.503.713 (reference)
33¹⁴= 1.816.331.681.783.800.622.529 (reference)

In addition, even 33¹⁰ would already offer an abundant number of combinations.
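The comparison above can be checked with a few lines of Python. The helper `min_length` is a hypothetical name; it searches for the shortest ID length at which an alphabet offers at least a given number of combinations.

```python
def min_length(alphabet_size: int, target: int) -> int:
    """Smallest length n such that alphabet_size ** n >= target."""
    length = 0
    while alphabet_size ** length < target:
        length += 1
    return length

current = 64 ** 11  # combinations of an 11-character YouTube video ID

# Shortest Base33 length that offers at least as many combinations.
needed = min_length(33, current)

# Share of the current ID space that 13 Base33 characters would cover.
share_13 = 33 ** 13 / current

print(needed, round(share_13, 2))
```

Thirteen Base33 characters cover roughly three quarters of the current ID space, and fourteen are the first length to exceed it.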

Calculations

Assuming that the total number of videos ever uploaded to YouTube reaches 20.000.000.000 in the near future, that is still only about 1/2.527.105 of 33¹¹.
Even 33¹⁰ is still abundant, because only about 1/76.578 of the possible combinations would be occupied.

Also, the 20 billion video count is a future projection. YouTube is not close to it at the moment.

33¹⁴ exceeds the total number of combinations of 64¹¹ by a factor of more than 24.
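These ratios can be reproduced as follows. The figure of 20 billion videos is the assumed projection from above, not a measured value, and the variable names are illustrative.

```python
projected_videos = 20_000_000_000  # assumed future total, as stated above

# How many possible IDs exist per projected video (rounded down).
ids_per_video_11 = 33 ** 11 // projected_videos
ids_per_video_10 = 33 ** 10 // projected_videos

# How many times larger the 14-character Base33 space is than 64 ** 11.
factor_14 = 33 ** 14 / 64 ** 11

print(ids_per_video_11, ids_per_video_10, round(factor_14, 1))
```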

YouTube's comment IDs, comment reply IDs (formerly only numbers) and playlist IDs are also already long enough (>30 characters) to be practically inexhaustible.

Should those limits ever be reached, one can always make the resource ID one character longer.

External links

Video by Tom Scott: “Will YouTube Ever Run Out Of Video IDs?”