RadixSort was developed in 1887 by Herman Hollerith and was used for sorting numbered punched cards. The algorithm is [w:numerical stability|stable] and fast.

BucketSort in its natural variant is the predecessor of RadixSort. If there were for example bills to be sorted by date, one started by building stacks with bills from the same year. Lets assume there were bills from ten years so we have ten stacks with bills now. Then each of these stacks was subdivided into twelve stacks representing the month (we have 120 stack now on the table). Now each of them was subdivided into 30/31 stacks representing the days of the months. We end up with about 3.600 stacks that have to be recollected in correct order (nobody will ever do it in this way; it shall only show the complexity).

The brilliant idea of Hollerith was to reverse the sorting order. In order to show what his idea was, we sort 20 three digit numbers, presented here.

```387    310     529     174     011     227     020     901     522     771
993    348     974     213     226     004     918     098     107     109
```

In the first run the numbers are sorted by the last digit; since there are 10 different digits we use ten buckets. The result of this sort we have here:

```B-0    B-1     B-2     B-3     B-4     B-5     B-6     B-7     B-8     B-9
--------------------------------------------------------------------------
310    011     522     993     174             226     387     348     529
020    901             213     974                     227     918     109
771                     004                     107     098
```

Now we recollect the numbers from the buckets without changing their order.

```310    020     011     901     771     522     993     213     174     974
004    226     387     227     107     348     918     098     529     109
```

Sorting by the middle digit gives this result:

```B-0    B-1     B-2     B-3     B-4     B-5     B-6     B-7     B-8     B-9
--------------------------------------------------------------------------
901    310     020             348                     771     387     993
004    011     522                                     174             098
107    213     226                                     974
109    918     227
529
```

It is obvious that the content of the buckets is still sorted ascending by the last digit. We recollect again the content of the buckets without changing their order.

```901    004     107     109     310     011     213     918     020     522
226    227     529     348     771     174     974     387     993     098
```

The third and last run gives this distribution to the buckets:

```B-0    B-1     B-2     B-3     B-4     B-5     B-6     B-7     B-8     B-9
--------------------------------------------------------------------------
004    107     213     310             522             771             901
011    109     226     348             529                             918
020    174     227     387                                             974
098                                                                    993
```

After recollecting the numbers the sort is finished.

```004    011     020     098     107     109     174     213     226     227
310    348     387     522     529     771     901     918     974     993
```

The result of Hollerith idea was, that one only needs as many buckets as there are different symbols. Instead of one thousand buckets for a naive BucketSort a set of ten buckets is used three times in order to get the same result.

But what exactly is the trick for this enormous reduction? We have a look at the stack of numbers after the first recollection. We see in the stack a layer with numbers ending on 0, followed by a layer ending on 1 and so on. After the second recollection we have a layer where the second last digit is 0 and this layer could have 10 sub-layers with the last digit being 0, 1 etc. The trick of Hollerith was to substitute the sub-buckets by sub-layers which are automatically in perfect order all the time. And there is no effort necessary for administration of sub-layers. If a sub-layer exists, fine, if not, fine as well; it is not necessary for the algorithm to care about it.

And we see directly the reason why RadixSort is stable. When a layer is divided into sub-layers, the order of the data records is not changed. Maybe they do not have the same neighbors any more, but the position numbers (from the original stack) are still ascending.

We go back to the example with the bills. We want them to be sorted by year, within the years sorted by month and within the months by day. Since RadixSort is stable it becomes very easy. We first sort (in two runs) by days, then we sort (in two runs) by month and at the end (in two or four runs) by year. Sorting done!

## Timing

It is obvious that any digit of any key is only processed once. Hence the number of key handlings is

A = n * d

where n is the number of data records and d is the maximum number of digits in the key. Since the processing of any digit of any key takes the same time RadixSort has a linear characteristic.

In order to be able to compare RadixSort with other algorithms it is better to describe the characteristic by

T = B * n * lg256(N)

where B is a constant reflecting the implementation and the term lg256(N) gives the number of repetitions where N is the maximum number of different key-values.