Random thoughts
The next release will include a completely overhauled version of the random
number facility, the RAND API. The default RAND method is now based
on a Deterministic Random Bit Generator (DRBG) implemented according to
the NIST recommendation 800-90A.
We have also edited the documentation, allowed
finer-grained configuration of how to seed the generator, and updated
the default seeding mechanisms.
There will probably be more changes before the release is made, but they should be comparatively minor.
Read on for more details.
Background; NIST
The National Institute of Standards and Technology (NIST) is part of the US Federal Government. Part of their mission is to advance standards. NIST has a long history of involvement with cryptography, including DES, the entire family of SHA digests, AES, various elliptic curves, and so on. They produce a series of Special Publications. One set, SP 800, is the main way they publish guidelines, recommendations, and reference materials in the computer security area.
NIST SP 800-90A rev1 is titled
Recommendation for Random Number Generation Using Deterministic Random Bit Generators.
Yes, that’s a mouthful! But note that if you generate enough random bits,
you get a random byte, and if you generate enough bytes you can treat it
as a random number, often a BN
in OpenSSL terminology.
So when you see “RBG” think “RNG.” :)
A non-deterministic RBG (NRBG) has to be based on hardware where every bit output is based on an unpredictable physical process. A fun example of this is the old Silicon Graphics LavaRand, now recreated and running at CloudFlare, which takes pictures of a lava lamp and digitizes them.
A deterministic RBG (DRBG) uses an algorithm to generate a sequence of bits from an initial seed, and that seed must be based on a true randomness source. This is a divide and conquer approach: if the algorithm has the right properties, the application only needs a small input of randomness (16 bytes for our algorithm) to generate many random bits. It will also need to be periodically reseeded, but when that needs to happen can also be calculated.
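To make the seed, generate, reseed cycle concrete, here is a toy sketch in C. It is only an illustration of the lifecycle: the mixing step is splitmix64, which is not cryptographically secure, and every name in it (toy_drbg, TOY_RESEED_INTERVAL, and so on) is hypothetical rather than part of OpenSSL or of 800-90A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: splitmix64 is NOT a cryptographic generator. */
#define TOY_RESEED_INTERVAL 4 /* generate calls allowed per seed (made-up value) */

typedef struct {
    uint64_t state;          /* internal state derived from the seed */
    unsigned reseed_counter; /* generate calls since the last (re)seed */
} toy_drbg;

/* Instantiate or reseed from a small amount of seed material. */
void toy_drbg_seed(toy_drbg *d, uint64_t seed)
{
    assert(d != NULL);
    d->state = seed;
    d->reseed_counter = 0;
}

/* One splitmix64 step: a deterministic function of the current state. */
uint64_t toy_next(toy_drbg *d)
{
    uint64_t z = (d->state += 0x9E3779B97F4A7C15ULL);

    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Fill out[0..len). Returns 1 on success, 0 if a reseed is required first. */
int toy_drbg_generate(toy_drbg *d, unsigned char *out, size_t len)
{
    assert(d != NULL && out != NULL);
    if (d->reseed_counter >= TOY_RESEED_INTERVAL)
        return 0;
    for (size_t i = 0; i < len; i += 8) {
        uint64_t w = toy_next(d);

        for (size_t j = 0; j < 8 && i + j < len; j++)
            out[i + j] = (unsigned char)(w >> (8 * j));
    }
    d->reseed_counter++;
    return 1;
}
```

Two generators given the same seed produce the same bytes, which is exactly why the seed has to come from a true randomness source. A real DRBG replaces the mixing step with a vetted construction such as AES in counter mode.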
The NIST document provides a framework for defining a DRBG, including
requirements on the operating environment and a lifecycle
model that is slightly different from OpenSSL’s XXX_new()/XXX_free()
model. NIST treats creation as relatively heavyweight, and allows
a single DRBG to be instantiated or not during its lifetime.
NIST SP 800-90 defined four DRBG algorithms. One of these was “Dual Elliptic Curve” which was later shown to be deliberately vulnerable. For a really good explanation of this, see Steve Checkoway’s talk at the recent IETF meeting. An update to the document was made, the above-linked 90A revision 1, and Dual-EC DRBG was removed.
OpenSSL currently implements the AES-counter version, which is also what Google’s BoringSSL and Amazon’s s2n use. Our tests include the NIST known answer tests (KATs), so we are confident that the algorithm is implemented correctly. It also uses AES in a common way, which increases our confidence in its correctness.
What we did
First, we cleaned up our terminology in the documentation and code comments.
The term entropy is both highly technical and confusing. It is used
in 800-90A in very precise ways, but in our docs it was usually misleading,
so we modified the documents to use the more vague but accurate term
randomness. We also tightened up some of the implementation and made
some semantics more precise; e.g., RAND_load_file now only reads regular
files.
Next, we imported the AES-based DRBG from the OpenSSL FIPS project, and
made it the default RAND method. The old method, which tried an ad
hoc set of methods to get seed data, has been removed. We added a new
configuration parameter, --with-rand-seed, which takes a comma-separated
list of values for seed sources. Each method is tried in turn, stopping
when enough bits of randomness have been collected. The types supported are:
os: the default, and the only one supported on Windows, VMS and some older legacy platforms. Most of the time this is the right value to use, and it can therefore be omitted.
getrandom: uses the getrandom() system call.
devrandom: tries special devices like /dev/urandom.
egd: uses the Entropy Gathering Daemon protocol.
none: no seeding (don’t use this).
rdcpu: on x86, tries to use the RDSEED and RDRAND instructions.
librandom: currently not implemented, but could use things like arc4random() when available.
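The "try each source in turn until enough has been collected" rule can be sketched in a few lines of C. This is a simplified model, not the code in crypto/rand: the function names, the function-pointer type and the 4-byte "weak" source are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A seed source writes up to `want` bytes into buf and returns how many it supplied. */
typedef size_t (*seed_source_fn)(unsigned char *buf, size_t want);

/* Hypothetical source modelled on "devrandom": read /dev/urandom if it exists. */
size_t seed_from_devrandom(unsigned char *buf, size_t want)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return 0; /* source unavailable: contribute nothing */
    got = fread(buf, 1, want, f);
    fclose(f);
    return got;
}

/* Hypothetical weak source: supplies at most 4 placeholder bytes per call. */
size_t seed_from_partial(unsigned char *buf, size_t want)
{
    size_t n = want < 4 ? want : 4;

    for (size_t i = 0; i < n; i++)
        buf[i] = 0xA5; /* placeholder value, not random */
    return n;
}

/*
 * Try each source in order, stopping once `need` bytes have been collected.
 * Returns the number of bytes gathered, which is less than `need` if every
 * source was exhausted first.
 */
size_t collect_seed(const seed_source_fn *sources, size_t nsources,
                    unsigned char *buf, size_t need)
{
    size_t have = 0;

    assert(sources != NULL && buf != NULL);
    for (size_t i = 0; i < nsources && have < need; i++)
        have += sources[i](buf + have, need - have);
    return have;
}
```

A caller would pass the sources in the order given on the configuration line and treat a short return value as a seeding failure.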
If running on a Linux kernel, the default of os will turn on devrandom.
If you know you have an old kernel and cannot upgrade, you should think
about using rdcpu. Implementation details can be seen by looking at the
files in the crypto/rand directory, if you’re curious. They’re relatively
well-commented, but the implementation could change in the future. Also
note that seeding happens automatically, so unless you have a very special
environment, you should not ever need to call RAND_add(), RAND_seed()
or other seeding routines. See the manpages for details.
It’s possible to “chain” DRBGs, so that the output of one becomes the
seed input for another. Each SSL object now has its own DRBG, chained
from the global one. All random bytes, like the pre-master secret and
session IDs, are generated from that one. This can reduce lock contention,
and might result in needing to seed less often.
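The chaining idea can be modelled with a toy generator in C. As before, this is a sketch under loud assumptions: the names are hypothetical, the mixing step (splitmix64) is not cryptographically secure, and the real implementation derives child seeds through the DRBG machinery rather than from a single 64-bit output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: the mixing step below is NOT cryptographically secure. */
typedef struct chain_drbg {
    struct chain_drbg *parent; /* NULL for the root ("global") generator */
    uint64_t state;
} chain_drbg;

/* Deterministic output step (splitmix64). */
uint64_t chain_next(chain_drbg *d)
{
    uint64_t z;

    assert(d != NULL);
    z = (d->state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Root generator: seeded from outside (stand-in for an OS randomness source). */
void chain_init_root(chain_drbg *d, uint64_t os_seed)
{
    assert(d != NULL);
    d->parent = NULL;
    d->state = os_seed;
}

/* Child generator: its seed is one output of the parent. */
void chain_init_child(chain_drbg *d, chain_drbg *parent)
{
    assert(d != NULL && parent != NULL);
    d->parent = parent;
    d->state = chain_next(parent);
}
```

Once a child has been seeded it never touches the parent again until it reseeds, which is where the reduced lock contention would come from: only chain_init_child() needs the parent’s lock.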
We also added a separate global DRBG for private key generation and added
APIs to use it. This object isn’t reachable directly, but it is used
by the new BN_priv_rand and BN_priv_rand_range APIs. Those APIs,
in turn, are used by all private-key generating functions. We like the
idea of keeping the private-key DRBG “away from” the general public one;
this is common practice. We’re not sure how this idea and the per-SSL
idea interact, and we’ll be evaluating that for a future release. One
possibility is to have two DRBGs per thread, and remove the per-SSL one.
We’d appreciate any suggestions on how to evaluate and decide what to do.
What we didn’t do
The RAND_DRBG API isn’t public, but it probably will be in a future
release. We want more time to play around with the API and see what’s
most useful and needed.
It’s not currently possible to change either global DRBG; for example,
to use AES-256 CTR, which is also implemented. This is also something
under consideration for the future. The proper way to do both of these
might be to make the existing RAND_METHOD datatype opaque; we missed
doing so in the 1.1.0 release.
Acknowledgements
This recent round of work started with some posts on the cryptography mailing list. There was also discussion on the openssl-dev mailing list and in a handful of PRs. Many people participated, but in particular we’d like to thank (in no particular order): Colm MacCarthaigh, Gilles Van Assche, John Denker, Jon Callas, Mark Steward, Chris Wood, Matthias St. Pierre, Peter Waltenberg, and Ted Ts’o.