Linux – Can you explain the entropy estimate used in random.c

/dev/random uses the timings of kernel interrupts to add entropy to the pool. The amount of entropy in the pool is tracked in a variable named entropy_count.

Here is the relevant snippet of code from random.c. It stores the time (in jiffies, I think) between the last two interrupts in the variable delta, and the difference between successive deltas in delta2.

delta = time - state->last_time;       /* time since the previous interrupt */
state->last_time = time;

delta2 = delta - state->last_delta;    /* change between successive deltas */
state->last_delta = delta;

if (delta < 0) delta = -delta;
if (delta2 < 0) delta2 = -delta2;
delta = MIN(delta, delta2) >> 1;
for (nbits = 0; delta; nbits++)        /* count the remaining significant bits */
  delta >>= 1;

r->entropy_count += nbits;

/* Prevent overflow */
if (r->entropy_count > POOLBITS)
  r->entropy_count = POOLBITS;

It looks like the estimate of the entropy added is essentially the floor (not the ceiling, because of the initial bit shift before the loop) of the base-2 logarithm of delta. This makes some intuitive sense, though I'm not sure what assumptions would be needed to make it formally correct.
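
For concreteness, here is a small user-space sketch of just that credit computation (the function name credited_bits is mine, purely for illustration), together with a few sample values showing that nbits comes out as floor(log2(min(delta, delta2))) whenever that minimum is at least 2:

#include <stdio.h>

/* Stand-alone sketch of the credit computation quoted above: nbits ends up
 * as the bit length of MIN(delta, delta2) >> 1, i.e. floor(log2(...)) of the
 * minimum when it is at least 2, and 0 when it is 0 or 1. */
static int credited_bits(long delta, long delta2)
{
    long d;
    int nbits;

    if (delta < 0)  delta  = -delta;
    if (delta2 < 0) delta2 = -delta2;

    d = (delta < delta2 ? delta : delta2) >> 1;
    for (nbits = 0; d; nbits++)
        d >>= 1;
    return nbits;
}

int main(void)
{
    printf("%d\n", credited_bits(1, 1000));    /* 0                      */
    printf("%d\n", credited_bits(7, 1000));    /* 2 == floor(log2(7))    */
    printf("%d\n", credited_bits(1000, 2000)); /* 9 == floor(log2(1000)) */
    return 0;
}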

So, my first question is "what is the reasoning behind this estimate?"

My second question is about the line delta = MIN(delta, delta2) >> 1. What does this do? Why take the minimum of this delta and the last one? I don't know what this is supposed to achieve; perhaps it makes the estimate better, or perhaps it is just more conservative.

Edit: I've found a paper that specifies the estimate, but it doesn't really give a reasoned argument for it (though it does outline some informal conditions that the estimator should meet).

Best Answer

delta2 is not the previous delta, but the difference between two successive values of delta. It is a kind of derivative: if delta measures the speed, delta2 is the acceleration.

The intuitive idea behind that estimate is that interrupts occur at more or less random intervals, dictated by unpredictable events from the physical world (e.g. keystrokes or the arrival of network packets). The longer the delay, the more unpredictable events are involved. However, there are physical systems which fire interrupts at a fixed rate; the delta2 measure is a protection mechanism which detects such occurrences (if interrupts occur at fixed intervals, and are therefore decidedly predictable, all deltas will have the same value, hence delta2 will be zero).
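
To make that concrete, here is a hypothetical user-space simulation of the same computation fed with interrupts arriving at a strictly fixed interval. Only the very first interrupt earns any credit (because last_delta still starts at zero); from the second interrupt onward delta2 is zero, so MIN(delta, delta2) is zero and zero bits are credited:

#include <stdio.h>

/* Hypothetical simulation (not kernel code): feed the estimator timestamps
 * that arrive at a strictly fixed interval and watch the credited bits
 * collapse to zero once the rate has settled. */
int main(void)
{
    long last_time = 0, last_delta = 0;
    long time, delta, delta2, d;
    int i, nbits;

    for (i = 1; i <= 5; i++) {
        time = i * 100;                 /* an interrupt every 100 jiffies */

        delta = time - last_time;
        last_time = time;
        delta2 = delta - last_delta;    /* zero once the rate has settled */
        last_delta = delta;

        if (delta < 0)  delta  = -delta;
        if (delta2 < 0) delta2 = -delta2;

        d = (delta < delta2 ? delta : delta2) >> 1;
        for (nbits = 0; d; nbits++)
            d >>= 1;

        printf("interrupt %d: delta=%ld delta2=%ld -> %d bits credited\n",
               i, delta, delta2, nbits);
    }
    return 0;
}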

I said "intuitive" and there is not much more to say. In fact, in the "random physical events" model, counting the bits is wrong; if an hardware event occurs with probability p for each time unit, and you get a delay expressed over n bits, then the entropy contribution should be accounted as n/2 bits, not n bits. But we know that in reality the physical events don't occur at exactly random moments; the delta2 mechanism admits as much.

So in practice, the "entropy estimate" is exactly that: an estimate. Its security value does not come from a well-reasoned, mathematically exact justification, but from the usual source of security: nobody seems to have found a way to abuse it (yet).


This page was written by someone who got fed up with the myths about /dev/random and its entropy estimator, and I think it explains things well, with enough detail. It is important to get some basic ideas right when dealing with RNGs.