technology from back to front

Optimising compilers as adversaries

Suppose that you want to handle some secret data in C and, in the wake of some high-profile vulnerability or other, want to take precautions against your secret being leaked. Perhaps you’d write something along these lines:

#include <string.h>

typedef struct {
  char password[16];
} secret_t;

void get_secret(secret_t* secret);
void use_secret(secret_t* secret);

void wipe_secret(secret_t* secret) {
  memset(secret, 0, sizeof(secret_t));
}

int main() {
  secret_t secret;
  get_secret(&secret);
  use_secret(&secret);
  wipe_secret(&secret);
  return 0;
}

I think you could be forgiven for assuming that this does what it says. However, if you have what John Regehr calls ‘a proper sense of paranoia’, you might actually check. Here’s an excerpt of what I got when I ran clang -S -O2 -emit-llvm on this example:

define i32 @main() #0 {
  %secret = alloca %struct.secret_t, align 1
  call void @get_secret(%struct.secret_t* %secret) #4
  call void @use_secret(%struct.secret_t* %secret) #4
  ret i32 0
}

As if by magic, the call to wipe_secret has completely disappeared, and with it the memset that was supposed to clear the password.

If you want to know where it went, you can pass -mllvm -print-after-all to clang, and it will print the LLVM IR after each optimisation pass. If your system is anything like mine, you’ll see that clang starts off by calling wipe_secret normally, then at some point inlines the memset, before eventually removing it during a pass called “Dead Store Elimination”.

In other words, what has happened is that the combination of the following perfectly reasonable optimisations has produced something quite surprising:

  • If a function is small enough, then inline it rather than calling it
  • If a memory location gets written to but then never read from again, don’t bother actually performing the write
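To see the second optimisation on its own, here is a minimal sketch. The function sum_then_wipe and its scratch buffer are hypothetical names invented for this illustration; they are not part of the example above. Whether the final memset survives depends on the compiler and flags, but nothing reads scratch after it, so an optimiser is entitled to treat it as a dead store.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums up to 16 input bytes via a local scratch copy. */
unsigned sum_then_wipe(const unsigned char *input, size_t len) {
  unsigned char scratch[16];
  unsigned total = 0;
  size_t n = len < sizeof scratch ? len : sizeof scratch;

  memcpy(scratch, input, n);
  for (size_t i = 0; i < n; i++)
    total += scratch[i];

  /* Nothing reads scratch after this point, so the store below is "dead"
     and an optimising compiler may remove it without changing the result. */
  memset(scratch, 0, sizeof scratch);
  return total;
}
```

The return value is the same with or without the memset, which is exactly why the compiler feels free to drop it.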

How can we stop this happening? Perhaps the most obvious solution is to turn down the optimisation level: if you try this example with -O1 you should see that the call to wipe_secret is preserved. There are several problems with this approach:

  • It’s not remotely portable
  • There’s no guarantee that the authors of clang won’t tweak the passes for -O1 in the future in such a way that wipe_secret will be removed again
  • Users can’t be expected to predict that setting their default CFLAGS to include -O2 will have security implications. The clang docs describe -O2 as ‘moderate’, for example, rather than ‘dangerous’.

In an ideal world, everyone would have a modern compiler and libraries, and we could just use memset_s from C11 (part of the optional Annex K), whose entire purpose in life is to avoid this problem. I strongly suspect that in reality, someone would ‘port’ the code to older systems by adding -Dmemset_s=memset to their CFLAGS, thereby breaking the code again. (This is exactly what has happened with explicit_bzero, the OpenBSD equivalent of memset_s.)
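For illustration, here is a sketch of what a guarded wipe might look like. The name wipe_bytes is made up for this example. It uses memset_s only when the implementation advertises Annex K via __STDC_LIB_EXT1__, and otherwise falls back to writing through a volatile-qualified pointer. The fallback is a widely used idiom that compilers tend to leave alone rather than a firm guarantee, so treat it as an assumption and check the generated code on your own toolchain.

```c
/* Ask for the Annex K interfaces; this must precede the standard headers. */
#define __STDC_WANT_LIB_EXT1__ 1
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite len bytes at buf with zeros. */
void wipe_bytes(void *buf, size_t len) {
#ifdef __STDC_LIB_EXT1__
  /* Annex K is available: memset_s is specified so that the write
     cannot be optimised away. */
  memset_s(buf, len, 0, len);
#else
  /* Fallback (assumption): stores made through a volatile lvalue are
     kept by the compiler even if the buffer is never read again. */
  volatile unsigned char *p = buf;
  while (len--)
    *p++ = 0;
#endif
}
```

In the example above, wipe_secret would then call wipe_bytes(secret, sizeof(secret_t)) instead of memset. Note that this does nothing to stop someone defining the helper away in their CFLAGS.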

I’ve been unable to find a satisfying solution to this problem in the wild. I was disturbed to discover that OpenSSH tries to do it by putting its equivalent of wipe_secret in a separate file, presumably in the hope that the compiler isn’t doing whole-program optimisation. I hope that I’m missing something, but I’m quite worried that the problem would come back if I happened to pass -flto to clang.

By far the most horrifying attempt I’ve seen, though, and the one that inspired this post, is from a certain infamous SSL library. Their implementation keeps a kind of running checksum of all memory ever wiped in a global variable, performing useless work in an attempt to trick the compiler into not optimising it away. What happens when compilers become smart enough to identify that it’s useless work? My guess is that no-one will notice at first, and the library will silently gain (another) vulnerability. When the vulnerability is exploited, perhaps the arms race against the optimiser will escalate.

Perhaps the simplest solution to this problem is just to minimise the use of C. Higher-level languages have their own side-channel attacks, for sure, but they do rule out whole classes of bug. RabbitMQ, for example, was not vulnerable to Heartbleed because its SSL handshake logic is written in Erlang, where buffer over-reads are simply not possible.

After more than 40 years, why do we still use C at all? I’ll be considering that in my next post.

by ash on 30/06/14
  1. Ismail Donmez on 01/07/14 at 8:45 am

    If you define wipe_secret as

    __attribute__((optnone)) void wipe_secret(secret_t* secret) {
      memset(secret, 0, sizeof(secret_t));
    }

    it won’t get optimized. See http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-selectively-disabling-optimization

 
 



2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK. +44 (0)20 7729 7060