Inside the Redis distribution you can find "dict.c", a very easy to understand hash table implementation that uses incremental rehashing: when it needs to switch to a bigger or smaller table, this happens incrementally, in three ways:
1) At every operation you perform there is a rehashing step.
2) You have a function that you can call to perform a rehashing step.
3) There is a higher-level function where you can say "rehash for N milliseconds" in case you have some idle time to spend for a good cause inside your program.
Another uncommon feature of dict.c is that it supports a "pick a random element" operation with guaranteed O(1) behaviour.
EDIT: Now that I look at the source, while the code is easy to follow, a design section should clearly be present in the dict.c top comment. I'll make sure to add one, but long story short:
* We use both chaining and incremental rehashing. This means there is no need to rehash as soon as possible when the hash table reaches the maximum percentage of elements allowed. At the same time, thanks to incremental rehashing, there is no need to block.
* We support safe and unsafe iterators: safe iterators block rehashing while they are active, so you have the feeling you are accessing the hash table in a read-only way. Unsafe iterators are ok for looping even while an incremental rehashing is in progress, but it is not ok to touch the hash table while iterating with an unsafe iterator (otherwise the iterator no longer guarantees to return all the elements, or to avoid duplicated elements).
* The hash table is rehashed in both directions: if it's too full, or if it's too empty. This way there is the guarantee that the table's fill percentage always stays between a minimum and a maximum, which in turn guarantees that our random element function is O(1) in the average case.
* When we rehash, a second table is created, and entries are moved incrementally from one to the other. While we have two tables, all lookup/delete operations are run against both tables, as the element can be in either the old or the new table. Insertions, instead, only happen in the new table.
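The two-table routing in the last point can be sketched as follows. This is a hypothetical, stripped-down model, not code from dict.c: the names only loosely follow the original, and chaining is omitted (each slot holds at most one key), so it only illustrates where lookups and insertions go, not a complete hash table:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char **table[2];  /* table[0] = old table, table[1] = new */
    size_t size[2];         /* sizes are powers of two */
    int rehashing;          /* nonzero while a rehash is in progress */
} toydict;

static size_t toy_hash(const char *key, size_t size) {
    size_t h = 5381;        /* djb2-style string hash */
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h & (size - 1);
}

toydict *toy_new(size_t oldsize, size_t newsize) {
    toydict *d = calloc(1, sizeof(*d));
    d->table[0] = calloc(oldsize, sizeof(const char *));
    d->table[1] = calloc(newsize, sizeof(const char *));
    d->size[0] = oldsize;
    d->size[1] = newsize;
    return d;
}

/* Lookups must scan both tables while rehashing: the key may not
 * have been migrated to the new table yet. */
const char *toy_find(toydict *d, const char *key) {
    int tables = d->rehashing ? 2 : 1;
    for (int t = 0; t < tables; t++) {
        size_t i = toy_hash(key, d->size[t]);
        if (d->table[t][i] && strcmp(d->table[t][i], key) == 0)
            return d->table[t][i];
    }
    return NULL;
}

/* Insertions go only into the new table during a rehash, so the old
 * table can only shrink as its buckets are migrated over. */
void toy_add(toydict *d, const char *key) {
    int t = d->rehashing ? 1 : 0;
    d->table[t][toy_hash(key, d->size[t])] = key;
}
```

The design choice here is that insertions never feed the old table, which guarantees the migration terminates: the old table monotonically empties while lookups stay correct throughout.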
Just a bit of clarification, when you say "pick a random element", do you mean choosing an arbitrary element, or do you actually mean choosing one at random such that each element has a 1/N probability of being chosen?
I'd yield to someone who understands the code, but a quick glance at dictGetRandomKey() shows it is randomized, and I believe it is very close to 1/N. If there are a large number of elements in a single bucket this may not be true. At an extreme, if there is one element in one bucket and N-1 in another, there is a 50% chance the first will get picked, and a 50/(N-1)% chance for each of the others.
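The arithmetic behind that claim can be made explicit. The function below is a model of the two-step sampling being discussed (a uniformly random non-empty bucket, then a uniformly random element of its chain), not code taken from dict.c:

```c
/* Exact probability that two-step sampling returns one particular
 * element of the given bucket: 1/nonempty to hit the bucket, then
 * 1/chain_len inside it. (Model of the idea, not dict.c code.) */
double pick_probability(const int *chain_len, int nbuckets, int bucket) {
    int nonempty = 0;
    for (int i = 0; i < nbuckets; i++)
        if (chain_len[i] > 0) nonempty++;
    if (nonempty == 0 || chain_len[bucket] == 0) return 0.0;
    return 1.0 / nonempty / (double)chain_len[bucket];
}
```

With chain lengths {1, 9} (so N=10), the lone element comes out at probability 0.5 instead of the fair 0.1, while each of the other nine gets 0.5/9 ≈ 0.056: exactly the 50% vs. 50/(N-1)% skew described above.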
Exactly as you said: it's a decent approximation, and with large tables it works well enough, but if there are clusters (long chains) in a bucket, those elements have a smaller chance of being picked.
However there is a trick to improve this that I don't use: instead of searching for a single non-empty bucket and selecting a random element from its chain, it is possible to do the following, choosing a small M (like M=3):
* Find M non-empty buckets.
* Pick one bucket among the M buckets, with a chance proportional to the chain length of each bucket.
* Finally pick a random element from the bucket.
The bigger M, the more accurate the algorithm becomes.
For instance, in the pathological case you showed, where one bucket has one element and another has all the rest, this would find the two buckets and then pick the one-element bucket with a much smaller probability, adjusting for the difference in chain length.
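The M-bucket trick might look something like the sketch below. This is illustrative only (it is not in dict.c), and for simplicity the buckets are sampled with replacement:

```c
#include <stdlib.h>

/* Sketch of the proposed trick: sample M non-empty buckets, then pick
 * one with probability proportional to its chain length. The weighted
 * second step makes a long chain proportionally more likely to be
 * chosen, compensating for the bias toward short chains.
 * Returns the chosen bucket index, or -1 if the table is empty. */
int pick_bucket_weighted(const int *chain_len, int nbuckets, int m) {
    int any = 0;
    for (int i = 0; i < nbuckets; i++)
        if (chain_len[i] > 0) any = 1;
    if (!any || m > 16) return -1;

    int chosen[16];                      /* the M sampled buckets */
    int total = 0, found = 0;
    while (found < m) {
        int b = rand() % nbuckets;       /* random probe */
        if (chain_len[b] == 0) continue; /* skip empty buckets */
        chosen[found++] = b;
        total += chain_len[b];
    }
    /* Weighted pick among the M candidates. */
    int r = rand() % total;
    for (int i = 0; i < m; i++) {
        r -= chain_len[chosen[i]];
        if (r < 0) return chosen[i];
    }
    return chosen[m - 1];
}
```

The final element would then be drawn uniformly from the chosen bucket's chain; as M grows, the combined distribution approaches uniform over all elements.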
Well, these are the main ideas, I think.