Random thoughts

The next release will include a completely overhauled version of the random number facility, the RAND API. The default RAND method is now based on a Deterministic Random Bit Generator (DRBG) implemented according to the NIST recommendation 800-90A. We have also edited the documentation, allowed finer-grained configuration of how to seed the generator, and updated the default seeding mechanisms.

There will probably be more changes before the release is made, but they should be comparatively minor.

Read on for more details.

Background; NIST

The National Institute of Standards and Technology (NIST) is part of the US Federal Government. Part of their mission is to advance standards. NIST has a long history of involvement with cryptography, including DES, the entire family of SHA digests, AES, various elliptic curves, and so on. They produce a series of Special Publications. One set, SP 800, is the main way they publish guidelines, recommendations, and reference materials in the computer security area.

NIST SP 800-90A rev1 is titled Recommendation for Random Number Generation Using Deterministic Random Bit Generators. Yes, that’s a mouthful! But note that if you generate enough random bits, you get a random byte, and if you generate enough bytes you can treat it as a random number, often a BN in OpenSSL terminology. So when you see “RBG” think “RNG.” :)

A non-deterministic RBG (NRBG) has to be based on hardware where every bit output is based on an unpredictable physical process. A fun example of this is the old Silicon Graphics LavaRand, now recreated and running at CloudFlare, which takes pictures of a lava lamp and digitizes them.

A deterministic RBG (DRBG) uses an algorithm to generate a sequence of bits from an initial seed, and that seed must be based on a true randomness source. This is a divide and conquer approach: if the algorithm has the right properties, the application only needs a small input of randomness (16 bytes for our algorithm) to generate many random bits. The generator also needs to be reseeded periodically, but the point at which that must happen can be calculated as well.
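To make the seed-then-generate idea concrete, here is a minimal sketch of a toy generator. It is not the NIST CTR_DRBG and not what OpenSSL ships: the mixing function (a splitmix64-style finalizer standing in for a block cipher), the names toy_drbg_seed and toy_drbg_generate, and the reseed interval of 4 calls are all assumptions made purely for illustration. It only shows that the same 16-byte seed always yields the same output, and that the generator refuses to continue once its reseed budget is used up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy generator for illustration only: NOT CTR_DRBG and NOT secure. */
#define TOY_SEED_LEN 16         /* seed size in bytes, as in the text */
#define TOY_RESEED_INTERVAL 4   /* hypothetical: generate calls allowed per seed */

typedef struct {
    uint64_t key;        /* first half of the seed */
    uint64_t counter;    /* second half of the seed, bumped per output word */
    unsigned generated;  /* generate calls since the last (re)seed */
} toy_drbg;

/* splitmix64-style finalizer, a stand-in for a real block cipher */
uint64_t toy_mix(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* (Re)seed from TOY_SEED_LEN bytes that must come from a real randomness source. */
void toy_drbg_seed(toy_drbg *d, const unsigned char seed[TOY_SEED_LEN])
{
    assert(d != NULL && seed != NULL);
    memcpy(&d->key, seed, 8);
    memcpy(&d->counter, seed + 8, 8);
    d->generated = 0;
}

/* Fill out[0..len). Returns 1 on success, 0 if a reseed is required first. */
int toy_drbg_generate(toy_drbg *d, unsigned char *out, size_t len)
{
    assert(d != NULL && out != NULL);
    if (d->generated >= TOY_RESEED_INTERVAL)
        return 0;
    while (len > 0) {
        uint64_t block = toy_mix(d->key ^ toy_mix(d->counter++));
        size_t n = len < sizeof(block) ? len : sizeof(block);
        memcpy(out, &block, n);
        out += n;
        len -= n;
    }
    d->generated++;
    return 1;
}
```

Two instances seeded with the same 16 bytes produce identical output, which is exactly why the quality of the seed matters so much: everything the generator emits is determined by it.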

The NIST document provides a framework for defining a DRBG, including requirements on the operating environment and a lifecycle model that is slightly different from OpenSSL’s XXX_new()/XXX_free() model. NIST treats creation as relatively heavyweight, and allows a single DRBG to be either instantiated or uninstantiated at different points during its lifetime.

NIST SP 800-90 defined four DRBG algorithms. One of these was “Dual Elliptic Curve” which was later shown to be deliberately vulnerable. For a really good explanation of this, see Steve Checkoway’s talk at the recent IETF meeting. The document was later updated to the 90A revision 1 mentioned above, and Dual-EC DRBG was removed.

OpenSSL currently implements the AES-counter version, which is also what Google’s BoringSSL and Amazon’s s2n use. Our tests include the NIST known answer tests (KATs), so we are fairly confident that the implementation is correct. It also uses AES in a common way, which increases our confidence in its correctness.

What we did

First, we cleaned up our terminology in the documentation and code comments. The term entropy is both highly technical and confusing. It is used in 800-90A in very precise ways, but in our docs it was usually misleading, so we modified the documents to use the more vague but accurate term randomness. We also tightened up some of the implementation and made some semantics more precise; e.g., RAND_load_file now only reads regular files.

Next, we imported the AES-based DRBG from the OpenSSL FIPS project, and made it the default RAND method. The old method, which tried an ad hoc set of methods to get seed data, has been removed. We added a new configuration parameter, --with-rand-seed, which takes a comma-separated list of values for seed sources. Each method is tried in turn, stopping when enough bits of randomness have been collected. The types supported are:

  • os which is the default and only one supported on Windows, VMS and some older legacy platforms. Most of the time this is the right value to use, and it can therefore be omitted.
  • getrandom which uses the getrandom() system call
  • devrandom which tries special devices like /dev/urandom
  • egd which uses the Entropy Gathering Daemon protocol
  • none for no seeding (don’t use this)
  • rdcpu which, on x86, tries to use the RDSEED and RDRAND instructions
  • librandom which is currently not implemented, but could use things like arc4random() when available.
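The “try each source in turn until enough randomness has been collected” behaviour described above can be sketched roughly as follows. This is not the code in crypto/rand: the seed_source_fn callback type, collect_seed, and the three demo sources are hypothetical names invented for this illustration, and the demo sources return fixed bytes so the control flow is easy to follow. Real sources would call getrandom(), read /dev/urandom, and so on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: writes up to max bytes into buf and returns how
 * many bytes it actually supplied (0 if the source is unavailable). */
typedef size_t (*seed_source_fn)(unsigned char *buf, size_t max);

/* Try each source in order and stop as soon as `needed` bytes are collected.
 * Returns the number of bytes gathered; less than `needed` means failure. */
size_t collect_seed(const seed_source_fn *sources, size_t nsources,
                    unsigned char *buf, size_t needed)
{
    size_t have = 0;
    size_t i;

    assert(sources != NULL && buf != NULL);
    for (i = 0; i < nsources && have < needed; i++) {
        size_t got = sources[i](buf + have, needed - have);
        if (got > needed - have)    /* never trust a source to respect max */
            got = needed - have;
        have += got;
    }
    return have;
}

/* Demo stand-ins only; the fixed byte values are NOT random. */
size_t source_unavailable(unsigned char *buf, size_t max)
{
    (void)buf; (void)max;
    return 0;                       /* e.g. system call not supported */
}

size_t source_partial_demo(unsigned char *buf, size_t max)
{
    size_t n = max < 4 ? max : 4;   /* can only supply 4 bytes per call */
    size_t i;
    for (i = 0; i < n; i++)
        buf[i] = 0x11;
    return n;
}

size_t source_full_demo(unsigned char *buf, size_t max)
{
    size_t i;
    for (i = 0; i < max; i++)
        buf[i] = 0xA5;
    return max;
}
```

With the sources ordered as unavailable, partial, full, a request for 16 bytes takes nothing from the first, 4 bytes from the second, and the remaining 12 from the third. A caller should treat a short return value as a seeding failure.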

If running on a Linux kernel, the default of os will turn on devrandom. If you know you have an old kernel and cannot upgrade, you should think about using rdcpu. Implementation details can be seen by looking at the files in the crypto/rand directory, if you’re curious. They’re relatively well-commented, but the implementation could change in the future. Also note that seeding happens automatically, so unless you have a very special environment, you should never need to call RAND_add(), RAND_seed() or other seeding routines. See the manpages for details.

It’s possible to “chain” DRBGs, so that the output of one becomes the seed input for another. Each SSL object now has its own DRBG, chained from the global one. All random bytes for that SSL object, like the pre-master secret and session IDs, are generated from its own DRBG. This can reduce lock contention, and might result in needing to seed less often.
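Chaining can be pictured with a small sketch. Again, this is a toy and not OpenSSL’s RAND_DRBG code: the chain_drbg struct, chain_next, chain_seed_root and chain_init_child are hypothetical names, and the generator is a plain splitmix64 step that is not cryptographically secure. The point is only the shape: a child pulls its seed from its parent’s output once, and afterwards generates from its own state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in generator (NOT cryptographic, NOT OpenSSL's DRBG). */
typedef struct chain_drbg {
    uint64_t state;
    struct chain_drbg *parent;   /* NULL for the root generator */
} chain_drbg;

/* One splitmix64 step: advance the state, then scramble it. */
uint64_t chain_next(chain_drbg *d)
{
    uint64_t x;

    assert(d != NULL);
    d->state += 0x9e3779b97f4a7c15ULL;
    x = d->state;
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* The root is seeded from outside (in real life, from a randomness source). */
void chain_seed_root(chain_drbg *root, uint64_t seed)
{
    assert(root != NULL);
    root->state = seed;
    root->parent = NULL;
}

/* A child takes its seed from the parent's output. This is the only moment
 * the parent is touched, so it is the only place a lock would be needed. */
void chain_init_child(chain_drbg *child, chain_drbg *parent)
{
    assert(child != NULL && parent != NULL);
    child->parent = parent;
    child->state = chain_next(parent);
}
```

Two children created from the same root receive different seeds and therefore produce different streams, and once created they never contend for the root while generating. A real implementation would also have to handle reseeding a child from its parent, which this sketch leaves out.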

We also added a separate global DRBG for private key generation and added APIs to use it. This object isn’t reachable directly, but it is used by the new BN_priv_rand and BN_priv_rand_range APIs. Those APIs, in turn, are used by all private-key generating functions. We like the idea of keeping the private-key DRBG “away from” the general public one; this is common practice. We’re not sure how this idea and the per-SSL idea interact, and we’ll be evaluating that for a future release. One possibility is to have two DRBGs per thread, and remove the per-SSL one. We’d appreciate any suggestions on how to evaluate and decide what to do.

What we didn’t do

The RAND_DRBG API isn’t public, but it probably will be in a future release. We want more time to play around with the API and see what’s most useful and needed.

It’s not currently possible to change either global DRBG; for example, to use AES-256 CTR, which is also implemented. This is also something under consideration for the future. The proper way to do both of these might be to make the existing RAND_METHOD datatype opaque; we missed doing so in the 1.1.0 release.

Acknowledgements

This recent round of work started with some posts on the cryptography mailing list. There was also discussion on the openssl-dev mailing list and in a handful of PRs. Many people participated, but in particular we’d like to thank (in no particular order): Colm MacCarthaigh, Gilles Van Assche, John Denker, Jon Callas, Mark Steward, Chris Wood, Matthias St. Pierre, Peter Waltenberg, and Ted Ts’o.