David Honig (honig@sprynet.com)
Mon, 12 Apr 1999 09:11:57 -0700
I dug into the /dev/random code this weekend to try to answer WHGIII's
query about analyzing that subsystem. The entropy estimation
strikes me as more generous than one would like.
An Analysis of the FreeBSD /dev/random:
Entropy_count Calculation is Probably Too Generous
David Honig <honig@sprynet.com> rev A
Intro
This document discusses BSD's implementation of /dev/random.
The entropy_count estimation appears to be insufficiently conservative.
This is impossible to observe from its output, which is whitened with MD5.
Random Pool
The file /usr/src/sys/i386/isa/random_machdep.c implements BSD's /dev/random
and corresponding system calls. This subsystem can use system interrupts
designated by the "rndcontrol" utility or during kernel configuration.
These interrupts yield raw entropy which is mixed, along with timing
entropy, into a global pool of 128 32-bit integers. (All integers will be
32 bits hereon.)
Random Output
To obtain random values, this pool is fed to MD5. 9 hashes are required
for every 16 bytes out of /dev/random. The number of bytes requested and
the time of the request is also used in one final stir just before using
the pool.
Interrupt Processing
When a designated interrupt occurs, the following happens. A number
representing the interrupt's origin (e.g., the keyboard character) is
passed as an integer to add_timer_randomness. This function also mixes in
two timers: one derived from "timercounter", the other from "ticks". These
are kernel timers with resolutions XXX
Entropy Count
The add_timer_randomness() call also increases the global "entropy_count",
which is used to limit the number of random bytes emitted by the blocking
random reads. (We ignore nonblocking access, where estimated entropy is
ignored. In this case, /dev/random is just a
PRNG and not cryptographically strong.) Add_timer_randomness() makes two
calls to add_entropy_word(), which quickly stirs the entropy pool with a
LFSR variant.
Entropy Estimation
It is this "entropy_count" estimate that I examine. The entropy count
controls the "compression" or "distillation-factor": for every 32 bits
(plus two timers) passed to add_timer_randomness(), we emit N bits. The
minimal entropy_count increment is 2.
More bits may be allowed depending on the size of the smallest of the last
two time deltas. A crude log (base 2) of the smallest delta (actually,
half the smallest delta, don't know why) is added to entropy_count. For
instance, suppose the smallest delta is 1. Then no extra output bits are
allowed. But if the smallest delta is tens of thousands of clocks, then
dozens of *output* bits will be permitted by this single interrupt.
This looks generous. The entropy estimation also fails to take into
account periodicity, ie, if delta is close to last_delta.
Suggestions
The code in add_timer_randomness() could easily be changed, however, this
code is executed during kernel interrupt handling, so what you can do there
is limited. The entropy-estimating portion of add_timer_randomness() could
be removed; better to err conservatively here.
If necessary, the size of the random pool could be enlarged. (The maximum
entropy_count is the size of the pool, in bits.) This is essentially an
entropy caching strategy. However, the extract_entropy() runtime depends
on the pool size: the MD5 function is run POOLSIZE/16 times for each 16
bytes output. Also, you would have to choose a different polynomial (the
taps) in add_entropy_word().
Other strategies include acquring more data, see Gutmann on Stronger PRNGs
where system data structures (heaps, performance counters, etc.) are mined.
Or using a higher bandwidth cheap physical sources, e.g., a media stream.
You could also xor N /dev/random bits for every bit of "true" entropy that
you want, where N depends on how much entropy you think is *really* there.
With this technique you could even assume fractional-bits per interrupt.
This way you can override the assumptions built
into the kernel without kernel mods.
Although this seems paranoid, the raw entropy input has not been
characterized.
Therefore, one is subject to the hazard of overestimating input uncertainty.
NB: to measure the entropy of the actual raw input to /dev/random, you'd
modify add_entropy_word(r,num) to store "num" somewhere; eventually dump
the stored values to a file for analysis. Or you could implement MUST in
that function; but be careful, this is interrupt processing.
Distribution: Unlimited
Security Implications: Well, duh. :-) MD5 hides a lot.
The following archive was created by hippie-mail 7.98617-22 on Thu May 27 1999 - 23:44:22