Engine Building Lesson 2: An Example MD5 Engine

Coming back after a month and two weeks, it’s time to resume with the next engine lesson, this time building an engine implementing a digest.

It doesn’t matter much which digest algorithm we choose. Being lazy, I’ve chosen one with a well-defined reference implementation, MD5 (the reference implementation is found in RFC 1321).

In this example, I’ve extracted the three files global.h, md5.h and md5c.c into rfc1321/. According to the license, I need to identify them as “RSA Data Security, Inc. MD5 Message-Digest Algorithm”, hereby done. [^1]

From an engine point of view, there are three things that need to be done to implement a digest:

  1. Create an OpenSSL digest method structure with pointers to the functions that will be OpenSSL’s interface to the reference implementation.
  2. Create the interface functions.
  3. Create a digest selector function.

Let’s begin with the structure by example:

#include <openssl/evp.h>
#include "rfc1321/global.h"
#include "rfc1321/md5.h"

static const EVP_MD digest_md5 = {
  NID_md5,                      /* The name ID for MD5 */
  0,                            /* IGNORED: MD5 with private key encryption NID */
  16,                           /* Size of MD5 result, in bytes */
  0,                            /* Flags */
  md5_init,                     /* digest init */
  md5_update,                   /* digest update */
  md5_final,                    /* digest final */
  NULL,                         /* digest copy */
  NULL,                         /* digest cleanup */
  EVP_PKEY_NULL_method,         /* IGNORED: pkey methods */
  64,                           /* Internal blocksize, see rfc1321/md5.h */
  sizeof(MD5_CTX),              /* Size of the context, allocated as md_data */
  NULL                          /* IGNORED: control function */
};

NOTE: the EVP_MD will become opaque in future OpenSSL versions, starting with version 1.1, and the structure init above will have to be replaced with a number of function calls to initialise the different parts of the structure. More on that later.

A slightly complicating factor with this structure is that it also involves private/public key ciphers. I’m ignoring the associated fields in this lesson, along with the control function (all marked “IGNORED:”), and will get back to them in a future lesson where I’ll get into public/private key algorithm implementations.

The numerical ID is an OpenSSL number that identifies the digest algorithm, and is an index to the OID database.
The flags are not really used for pure digests and can be left zero.
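The copy and cleanup functions are hooks to be used if there’s more to copying and cleaning up the EVP_MD_CTX for our implementation than OpenSSL can handle on its own. The MD5_CTX structure in the reference MD5 implementation has nothing magic about it (it holds no pointers or other external resources), so there is no need to use the copy and cleanup hooks.
The internal block size is the number of bytes the algorithm processes per block, 64 for MD5. It is present for optimal use of the algorithm, and OpenSSL makes it available to callers that need it, HMAC for example.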

On to implementing the interface functions md5_init, md5_update and md5_final:

static int md5_init(EVP_MD_CTX *ctx)
{
  MD5Init(ctx->md_data);
  return 1;
}

static int md5_update(EVP_MD_CTX *ctx, const void *data, size_t count)
{
  MD5Update(ctx->md_data, data, count);
  return 1;
}

static int md5_final(EVP_MD_CTX *ctx, unsigned char *md)
{
  MD5Final(md, ctx->md_data);
  return 1;
}

With the reference implementation, things are very straightforward: no errors can happen, and all we do is pass data back and forth. A note: ctx->md_data is preallocated by OpenSSL using the size given in the EVP_MD structure. In a more complex implementation, these functions are expected to return 0 on error, 1 on success.
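As a sketch of what an interface function with real error handling might look like: the engine’s update function receives its length as a size_t, while the reference MD5Update in RFC 1321 takes an unsigned int. The helper below shows one way to split a large input into pieces a backend can accept while following the "1 on success, 0 on error" convention. Everything in it (update_fn, chunked_update, max_chunk, counting_backend) is a hypothetical name made up for this illustration, not part of OpenSSL or the RFC code.

```c
#include <stddef.h>

/*
 * Hypothetical backend signature: consumes len bytes and
 * returns 1 on success, 0 on error.
 */
typedef int (*update_fn)(void *state, const unsigned char *data,
                         unsigned int len);

/*
 * Feed count bytes to fn in pieces of at most max_chunk bytes.
 * Returns 1 on success, 0 as soon as anything goes wrong.
 */
static int chunked_update(update_fn fn, void *state, const void *data,
                          size_t count, unsigned int max_chunk)
{
  const unsigned char *p = data;

  if (fn == NULL || max_chunk == 0 || (data == NULL && count > 0))
    return 0;

  while (count > 0) {
    unsigned int n = count > max_chunk ? max_chunk : (unsigned int)count;

    if (!fn(state, p, n))
      return 0;
    p += n;
    count -= n;
  }
  return 1;
}

/* Example backend for demonstration: counts calls and bytes seen. */
struct counter {
  unsigned int calls;
  size_t bytes;
};

static int counting_backend(void *state, const unsigned char *data,
                            unsigned int len)
{
  struct counter *c = state;

  (void)data;                   /* the demo backend ignores the contents */
  c->calls++;
  c->bytes += len;
  return 1;
}
```

In an engine, an update function could call such a helper with UINT_MAX as max_chunk and a thin wrapper around the real backend, and simply return the helper’s result to OpenSSL.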

The selector function deserves some special attention, as it really performs two separate functions. The prototype is as follows:

int digest_selector(ENGINE *e, const EVP_MD **digest,
                    const int **nids, int nid);

OpenSSL calls it in the following ways:

  1. with digest being NULL. In this case, *nids is expected to be assigned a zero-terminated array of NIDs and the call returns with the number of available NIDs. OpenSSL uses this to determine what digests are supported by this engine.
  2. with digest being non-NULL. In this case, *digest is expected to be assigned the pointer to the EVP_MD structure corresponding to the NID given by nid. The call returns with 1 if the requested NID was one supported by this engine, otherwise 0.

The implementation would be this for our little engine:

static int digest_nids[] = { NID_md5, 0 };
static int digests(ENGINE *e, const EVP_MD **digest,
                   const int **nids, int nid)
{
  int ok = 1;
  if (!digest) {
    /* We are returning a list of supported nids */
    *nids = digest_nids;
    /* Number of entries, not counting the terminating 0 */
    return (int)(sizeof(digest_nids) / sizeof(digest_nids[0])) - 1;
  }

  /* We are being asked for a specific digest */
  switch (nid) {
  case NID_md5:
    *digest = &digest_md5;
    break;
  default:
    ok = 0;
    *digest = NULL;
    break;
  }
  return ok;
}
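To make the two calling modes concrete, here is a small self-contained sketch of how a caller might drive a selector. It doesn’t use OpenSSL at all: MOCK_MD, MOCK_NID_MD5, mock_selector and count_resolvable are stand-ins invented for illustration, and the ENGINE argument is left out. Only the calling convention mirrors the description above.

```c
#include <stddef.h>

/* Stand-in for EVP_MD; only the NID matters for this sketch. */
typedef struct {
  int nid;
} MOCK_MD;

#define MOCK_NID_MD5 4          /* placeholder value for illustration */

static const MOCK_MD mock_md5 = { MOCK_NID_MD5 };
static const int mock_nids[] = { MOCK_NID_MD5, 0 };

/* Same two-mode contract as the selector, minus the ENGINE argument. */
static int mock_selector(const MOCK_MD **digest, const int **nids, int nid)
{
  if (digest == NULL) {
    /* Mode 1: report the supported NIDs and how many there are. */
    *nids = mock_nids;
    return (int)(sizeof(mock_nids) / sizeof(mock_nids[0])) - 1;
  }

  /* Mode 2: resolve one NID to its method structure. */
  if (nid == MOCK_NID_MD5) {
    *digest = &mock_md5;
    return 1;
  }
  *digest = NULL;
  return 0;
}

/* Caller side: list the NIDs, then resolve each one in turn.
 * Returns how many of the listed NIDs could be resolved. */
static int count_resolvable(void)
{
  const int *nids = NULL;
  int n = mock_selector(NULL, &nids, 0);
  int i, found = 0;

  for (i = 0; i < n; i++) {
    const MOCK_MD *md = NULL;

    if (mock_selector(&md, NULL, nids[i]) && md != NULL
        && md->nid == nids[i])
      found++;
  }
  return found;
}
```

For a consistent selector, count_resolvable() returns the same number as the first call did, which makes it a handy sanity check when more digests are added to the list.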

What remains to do is the following call as part of setting up this engine:

  if (!ENGINE_set_digests(e, digests)) {
    printf("ENGINE_set_digests failed\n");
    return 0;
  }

Combining this with the code from the last lesson results in this:

{% include_code engine-lesson-2/md5-engine.c %}

Building is just as easy as last time:

$ gcc -fPIC -o rfc1321/md5c.o -c rfc1321/md5c.c
$ gcc -fPIC -o md5-engine.o -c md5-engine.c
$ gcc -shared -o md5-engine.so md5-engine.o rfc1321/md5c.o -lcrypto

Let’s start with checking that OpenSSL loads it correctly. Remember what Lesson 1 taught us, that we can pass an absolute path for an engine? Let’s use that again.

$ openssl engine -t -c `pwd`/md5-engine.so
(/home/levitte/tmp/md5-engine/md5-engine.so) A simple md5 engine for demonstration purposes
Loaded: (MD5) A simple md5 engine for demonstration purposes
 [MD5]
     [ available ]

Finally, we can compare the result of a run with OpenSSL’s own implementation.

$ echo whatever | openssl dgst -engine `pwd`/md5-engine.so -md5
engine "MD5" set.
(stdin)= d8d77109f4a24efc3bd53d7cabb7ee35
$ echo whatever | openssl dgst -md5
(stdin)= d8d77109f4a24efc3bd53d7cabb7ee35

And there we have it, a functioning implementation of an MD5 engine.

Just as in the previous lesson, I made a slightly more elaborate variant of this engine, again using the GNU auto* tools. This variant also has a different way of initialising digest_md5 by taking a copy of the OpenSSL builtin structure and just replacing the appropriate fields. This allows it to be used for operations involving public/private keys as well.

Next lesson will be about adding a symmetric cipher implementation.


[^1] I needed to make a small change in rfc1321/global.h

 /* UINT2 defines a two byte word */
-typedef unsigned short int UINT2;
+typedef uint16_t UINT2;
 
 /* UINT4 defines a four byte word */
-typedef unsigned long int UINT4;
+typedef uint32_t UINT4;
 

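The reason is that I’m running on a 64-bit machine, where unsigned long int is 64 bits wide, so UINT4 ended up not being 32 bits, and the reference MD5 gave me faulty results. Note that uint16_t and uint32_t are defined in <stdint.h>, so that header needs to be included (directly or indirectly) before these typedefs.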
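The effect is easy to reproduce in isolation. The reference code rotates words with an expression that assumes a 32-bit word, so a wider type silently keeps bits that should have wrapped around. The sketch below compares a rotate-left on an exact 32-bit type with the same expression on unsigned long. The function names are made up for this note and are not taken from the RFC code.

```c
#include <stdint.h>

/* Rotate left on an exact 32-bit word, n in 1..31. */
static uint32_t rotl32(uint32_t x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* The same expression on unsigned long: this is only a true
 * rotation if unsigned long happens to be 32 bits wide. */
static unsigned long rotl_long(unsigned long x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* Returns 1 if unsigned long behaves like a 32-bit word here, else 0. */
static int long_is_32_bits(void)
{
  return rotl_long(0x80000000UL, 1)
         == (unsigned long)rotl32(0x80000000U, 1);
}
```

Where unsigned long is 64 bits wide, the top bit shifted out by rotl_long stays around as bit 32 instead of disappearing, and long_is_32_bits() returns 0.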