Linux – gnupg 2.1.16 blocks waiting for entropy

Tags: encryption, gpg, linux-kernel

Releases of gnupg from 2.1.16 onward (currently 2.1.17) block waiting for entropy, but only on the first invocation.

Note: this isn't an attempt to generate a key, just to decrypt a file and start the agent.

The first time gpg-agent is started, either directly with gpg2 file.gpg or through an application like pass, pinentry appears. Once I enter my passphrase and hit Enter, it hangs for around 15s.

All subsequent calls, within the window of the default-cache-ttl, are executed immediately.

With --debug-all, the following is printed for the duration of the hang [1]:

gpg: DBG: chan_6 <- S PROGRESS need_entropy X 30 120
gpg: DBG: chan_6 <- S PROGRESS need_entropy X 120 120
gpg: DBG: chan_6 <- S PROGRESS need_entropy X 30 120
gpg: DBG: chan_6 <- S PROGRESS need_entropy X 120 120
gpg: DBG: chan_6 <- S PROGRESS need_entropy X 30 120
...

I installed rng-tools to supplement the entropy pool:

cat /proc/sys/kernel/random/entropy_avail 
4094

and compared with a machine with the same version of gnupg that did not have rng-tools or haveged installed, that exhibits no delay:

cat /proc/sys/kernel/random/entropy_avail
3783

So there appears to be sufficient entropy in the pool. This was tested on kernels 4.8.13 and 4.9.

Does gpg use a different pool? How can I provide sufficient entropy, or otherwise eliminate the 15s delay when starting the agent?


[1] The full debug log.

Best Answer

I think I know what's going on. In gnupg's agent/gpg-agent.c, this function processes progress messages from libgcrypt:

/* This is our callback function for gcrypt progress messages.  It is
   set once at startup and dispatches progress messages to the
   corresponding threads of the agent.  */
static void 
agent_libgcrypt_progress_cb (void *data, const char *what, int printchar,
                             int current, int total)
{
  struct progress_dispatch_s *dispatch;
  npth_t mytid = npth_self ();

  (void)data;

  for (dispatch = progress_dispatch_list; dispatch; dispatch = dispatch->next)
    if (dispatch->ctrl && dispatch->tid == mytid)
      break;
  if (dispatch && dispatch->cb)
    dispatch->cb (dispatch->ctrl, what, printchar, current, total);

  /* Libgcrypt < 1.8 does not know about nPth and thus when it reads
   * from /dev/random this will block the process.  To mitigate this
   * problem we take a short nap when Libgcrypt tells us that it needs
   * more entropy.  This way other threads have chance to run.  */
#if GCRYPT_VERSION_NUMBER < 0x010800 /* 1.8.0 */
  if (what && !strcmp (what, "need_entropy"))
    npth_usleep (100000); /* 100ms */
#endif
}

That last part with npth_usleep was added between 2.1.15 and 2.1.17. Since it is only compiled in when libgcrypt is older than 1.8.0, the straightforward fix would be to recompile gnupg against libgcrypt 1.8.0 or later. Unfortunately, that version doesn't seem to exist yet.
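To make the version guard concrete, here is a minimal sketch of how the comparison against 0x010800 works. It assumes GCRYPT_VERSION_NUMBER packs major, minor and patch into one byte each, which is what the `/* 1.8.0 */` comment in the source suggests. The helper names `pack_version` and `nap_compiled_in` are hypothetical and not part of gnupg or libgcrypt.

```c
#include <assert.h>

/* Pack a version as 0xMMmmpp: one byte each for major, minor, patch.
   Assumed to match the layout of GCRYPT_VERSION_NUMBER. */
unsigned int pack_version(unsigned int major, unsigned int minor,
                          unsigned int patch)
{
    assert(major < 256 && minor < 256 && patch < 256);
    return (major << 16) | (minor << 8) | patch;
}

/* Mirrors the preprocessor test in agent_libgcrypt_progress_cb:
   returns 1 if the npth_usleep nap would be compiled in. */
int nap_compiled_in(unsigned int gcrypt_version_number)
{
    return gcrypt_version_number < 0x010800;
}
```

With this layout, `nap_compiled_in(pack_version(1, 7, 0))` is 1, while `nap_compiled_in(pack_version(1, 8, 0))` is 0, so any 1.7.x build keeps the nap.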

The weird thing is that the comment about libgcrypt reading /dev/random is not true. Stracing the agent reveals that it reads from /dev/urandom and uses the new getrandom(2) syscall, without blocking. It does, however, send many need_entropy messages, and each one makes the callback sleep for 100ms in npth_usleep. Those sleeps add up to the delay. Deleting those lines fixes the issue.
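If you want to check the "without blocking" part independently of gnupg, a small probe like the following can help. This is only a sketch: it assumes a Linux system whose libc ships the getrandom() wrapper in <sys/random.h> (glibc 2.25 or later; on older systems you would need syscall(SYS_getrandom, ...) instead), and `urandom_ready` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(2) wrapper; assumes glibc >= 2.25 */
#include <sys/types.h>

/* Returns 1 if the kernel can hand out `len` random bytes right now
   without blocking, 0 if the call would block (urandom source not yet
   initialised), and -1 on any other error or short read. */
int urandom_ready(size_t len)
{
    unsigned char buf[256];
    assert(len > 0 && len <= sizeof buf);

    /* No GRND_RANDOM flag: use the urandom source.
       GRND_NONBLOCK: fail with EAGAIN instead of waiting. */
    ssize_t n = getrandom(buf, len, GRND_NONBLOCK);
    if (n == (ssize_t)len)
        return 1;
    if (n < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```

On a machine that has finished booting, `urandom_ready(16)` should return 1, which is consistent with the strace output showing no blocking reads.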

I should mention that npth seems to be some kind of cooperative multitasking library, and npth_usleep is probably its way to yield. So rather than deleting the call, it might be better to just reduce the delay significantly, in case libgcrypt decides to block some day. (A delay of 1ms is not noticeable.)
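A quick back-of-the-envelope sketch of why the nap length matters: the total stall is simply the number of need_entropy messages times the per-message nap. The message count of 150 used below is an assumption chosen to match the roughly 15s hang reported in the question, not a measured value, and `total_nap_ms` is a hypothetical helper.

```c
#include <assert.h>

/* Total time, in milliseconds, spent sleeping when the callback receives
   `messages` need_entropy notifications and naps `nap_us` microseconds
   on each of them. */
long total_nap_ms(long messages, long nap_us)
{
    assert(messages >= 0 && nap_us >= 0);
    return messages * nap_us / 1000;
}
```

Under that assumption, `total_nap_ms(150, 100000)` gives 15000 ms with the shipped 100ms nap, while `total_nap_ms(150, 1000)` gives 150 ms with a 1ms nap.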
