XFI

Required reading: XFI: software guards for system address spaces.

Introduction

Problem: how to use untrusted code (an "extension") in a trusted program?

What bad things can the extension do?

What is it probably OK for an extension to do?

Possible solution approaches:

Software-based sandboxing

Sandboxer. A compiler or binary-rewriter sandboxes all unsafe instructions in an extension by inserting additional instructions. For example, every indirect store is preceded by a few instructions that compute and check the target of the store at runtime.

Verifier. When the extension is loaded into the trusted program, the verifier checks that the extension is appropriately sandboxed (e.g., all direct stores/calls refer to the extension's memory, all indirect stores/calls are sandboxed, and there are no privileged instructions). If not, the extension is rejected. If so, the extension is loaded and can run. While the extension runs, the inserted sandboxing instructions check at runtime that each unsafe instruction is used in a safe way.

The verifier must be trusted, but the sandboxer need not be. We can do without the verifier if the trusted program can establish that the extension was sandboxed by a trusted sandboxer.

You can think of sandboxing as a software version of the memory protection you get w/ page-tables or segments.

Software fault isolation

SFI by Wahbe et al. explored how to use sandboxing to fault-isolate extensions; that is, use sandboxing to ensure that stores and jumps stay within a specified memory range (i.e., they can't overwrite or jump into addresses in the trusted program unchecked). They implemented SFI for the MIPS RISC processor. The MIPS simplified SFI: every instruction is 32 bits wide, you can only jump/call to 32-bit aligned targets, and there are 32 registers, so a few could be dedicated to sandboxing.

The extension is loaded into a specific range (called a segment) within the trusted application's address space. The segment is identified by the upper bits of the addresses in the segment. Separate code and data segments are necessary to prevent an extension overwriting its code.

It's easy for the verifier to check that direct calls/jumps and stores refer to addresses inside the segment (since such instructions have the address embedded within them). PC-relative branches are also easy to check. The verifier probably has a table of legal call targets that lie in trusted code. Similarly the verifier can detect privileged instructions.
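The static checks described above can be sketched as a toy model. This is not the SFI verifier itself: the instruction tuples, the segment layout, the `LEGAL_TRUSTED_TARGETS` table, and the opcode names are all hypothetical, chosen only to show the shape of the check.

```python
# Toy model of the verifier's static checks. The instruction format and
# every constant below are assumptions made for illustration.
SEG_SHIFT = 24                          # upper 8 bits of a 32-bit address = segment ID
CODE_SEG = 0x12                         # extension's code segment ID (assumed)
DATA_SEG = 0x13                         # extension's data segment ID (assumed)
LEGAL_TRUSTED_TARGETS = {0x00400100}    # trusted entry points the extension may call
PRIVILEGED = {"MTC0"}                   # stand-in for privileged opcodes

def segment_of(addr):
    """Extract the segment ID from the upper bits of a 32-bit address."""
    return (addr & 0xFFFFFFFF) >> SEG_SHIFT

def verify(instructions):
    """Return True if every direct store/jump/call is acceptable."""
    for op, target in instructions:
        if op in PRIVILEGED:
            return False
        if op == "STORE_DIRECT" and segment_of(target) != DATA_SEG:
            return False
        if op in ("JUMP_DIRECT", "CALL_DIRECT"):
            in_code = segment_of(target) == CODE_SEG
            if not in_code and target not in LEGAL_TRUSTED_TARGETS:
                return False
    return True
```

Because the target address is part of the instruction, each check is a single comparison made once at load time, with no runtime cost.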

Indirect jumps/calls and indirect stores are harder; the address is in a register and the verifier may not be able to predict the register's value. We'll call these "unsafe" instructions.

Suppose the original unsafe instruction is:

  STORE R0, R1 (i.e. write R1 to Mem[R0])

Here's how we could sandbox the STORE:

  Ra <- R0
  Rb <- Ra >> Rc // Rb = segment ID of target
  CMP Rb, Rd     // Rd holds extension's data segment ID
  BNE fault      // if Rb != Rd, branch to error-handling code
  STORE Ra, R1

The verifier must check that every unsafe instruction is preceded by the right check code.
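To make the check sequence concrete, here is a small simulation of the sandboxed STORE. It is a sketch only: memory is modeled as a Python dict, and `SEG_SHIFT` and `DATA_SEG` are assumed values standing in for the contents of Rc and Rd.

```python
SEG_SHIFT = 24        # plays the role of Rc (assumed segment layout)
DATA_SEG = 0x13       # plays the role of Rd (assumed data segment ID)

class SandboxFault(Exception):
    """Raised where the real code would branch to the fault handler."""

def checked_store(mem, r0, r1):
    ra = r0 & 0xFFFFFFFF          # Ra <- R0
    rb = ra >> SEG_SHIFT          # Rb <- Ra >> Rc
    if rb != DATA_SEG:            # CMP Rb, Rd ; BNE fault
        raise SandboxFault(hex(ra))
    mem[ra] = r1                  # STORE Ra, R1
```

A store to an address in the extension's data segment goes through; a store to any other segment raises `SandboxFault` before memory is touched.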

But what if the extension jumps directly to the STORE, bypassing the check instructions?

Solution: Ra, Rc, and Rd are dedicated: they cannot be used by the extension code. The verifier must check that the extension doesn't use the dedicated registers. Rb is a scratch register, and doesn't have to be dedicated. Now the extension can jump to the store, but 1) it can't set Ra and 2) the sandbox code always leaves a legal segment address in Ra. So the extension can only store to its own memory.

This implementation costs 4 registers, and 4 check instructions for each unsafe instruction. One could do better:

  Ra <- R0 & Re // zero out segment ID in Ra
  Ra <- Ra | Rf // replace with the valid segment ID
  STORE Ra, R1

This code forces the segment part of the address bits to be correct. It doesn't catch illegal addresses; it just ensures that illegal addresses are within the segment, harming the extension but no other code.
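The address-forcing variant can be simulated the same way. Again a sketch under assumptions: `OFFSET_MASK` and `DATA_SEG_BITS` are made-up values standing in for Re and Rf, with an 8-bit segment ID in the top of a 32-bit address.

```python
OFFSET_MASK = 0x00FFFFFF      # plays the role of Re: clears the segment ID bits
DATA_SEG_BITS = 0x13000000    # plays the role of Rf: valid segment ID, in position

def forced_store(mem, r0, r1):
    ra = r0 & OFFSET_MASK     # Ra <- R0 & Re
    ra = ra | DATA_SEG_BITS   # Ra <- Ra | Rf
    mem[ra] = r1              # STORE Ra, R1
    return ra                 # returned only so the caller can see where it landed
```

With these values, a store aimed at `0x00400010` in trusted memory silently lands at `0x13400010` inside the extension's own segment, while a store to `0x13000040` is unchanged. No fault is raised in either case.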

Optimizations include:

Summary of SFI properties:

CFI

Problem: how to do SFI on x86? x86 doesn't have spare registers to dedicate to SFI. And, if the program can jump anywhere in its code, it is hard to reason about what instructions it might execute. That is, instructions are variable length, the verifier can't easily see the boundaries, and the program could jump into the middle of an instruction.

For example if the binary contains:

  25 CD 80 00 00   # AND eax, 0x80CD

and an adversary can arrange to jump to the second byte, then the processor decodes CD 80 as INT 0x80, which is the system call instruction on Linux. The adversary has made a system call that appears nowhere in the intended instruction stream.

Could the sandboxer/verifier check every "instruction" at every byte boundary? No: would reject many legal programs, and inserted check instructions would make no sense.
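The example bytes show why a byte-by-byte scan is too strict. The sketch below (the helper name `hidden_offsets` is made up for illustration) searches for a byte pattern at every offset and reports matches that do not start at an intended instruction boundary.

```python
code = bytes([0x25, 0xCD, 0x80, 0x00, 0x00])   # AND eax, 0x000080CD
INT_80 = bytes([0xCD, 0x80])                   # INT 0x80

def hidden_offsets(code, pattern, boundaries):
    """Offsets where `pattern` occurs but no intended instruction begins."""
    return [i for i in range(len(code) - len(pattern) + 1)
            if code[i:i + len(pattern)] == pattern and i not in boundaries]
```

For the five bytes above, with offset 0 as the only intended boundary, the only hidden offset is 1. A scanner that rejected every such match would reject this perfectly legal AND instruction.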

Solution: Control Flow Integrity (CFI). CFI figures out (or enforces) all the instruction boundaries that can actually be executed. It does this by scanning the instructions starting at the entry point and following each branch/jump/call/return. It identifies every source and target instruction. It can then scan forward from each target to find instruction boundaries. It must check that nearby targets have consistent instruction boundaries.
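The boundary-finding scan can be sketched on a toy variable-length instruction set. Everything here is hypothetical (the opcodes, their lengths, absolute 8-bit branch targets); real x86 decoding is far more involved. The point is the traversal and the consistency check.

```python
# Hypothetical ISA: opcode -> instruction length in bytes.
# 0x01 NOP, 0x02 JMP abs8, 0x03 BRANCH abs8, 0x04 MOV imm32, 0x05 RET
LENGTH = {0x01: 1, 0x02: 2, 0x03: 2, 0x04: 5, 0x05: 1}

def find_boundaries(code, entry=0):
    """Return the set of reachable instruction starts, or None if inconsistent."""
    starts, owner, work = set(), {}, [entry]
    while work:
        pc = work.pop()
        if pc in starts:
            continue
        if pc >= len(code) or code[pc] not in LENGTH:
            return None                    # runs off the end, or unknown opcode
        n = LENGTH[code[pc]]
        if pc + n > len(code):
            return None                    # instruction is truncated
        for b in range(pc, pc + n):        # each byte may belong to one instruction only
            if owner.setdefault(b, pc) != pc:
                return None
        starts.add(pc)
        op = code[pc]
        if op in (0x02, 0x03):
            work.append(code[pc + 1])      # follow the branch target
        if op not in (0x02, 0x05):
            work.append(pc + n)            # follow the fall-through path
    return starts
```

In this model a branch into the middle of a `MOV imm32` is rejected even when the immediate bytes happen to decode as valid instructions, which is the situation in the `25 CD 80 00 00` example.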

What about indirect jumps (via registers)? Calls to C function pointers, or C++ virtual methods, or any RTN. Verifier can't always predict run-time destination.

Solution: CFI knows every function whose address is taken, and thus could be the target of an indirect call. Puts a special 32-bit number before entry point of each such function. Inserts guard instructions before every indirect call to check that register points to a place which is immediately preceded by that number. Look at Figure 2.

So, suppose a buffer overrun modifies a C function pointer that's stored in a stack variable. When CFI-protected code issues the indirect CALL, the guard instruction will (probably) not see the magic number before the target address, and will fault.
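A minimal model of the guard, assuming the layout described above (a 32-bit ID in the word immediately before each legal target). The ID value, the dict-based code memory, and the function names are illustrative assumptions, not the paper's code.

```python
CFI_ID = 0x12345678   # hypothetical ID value

class CFIFault(Exception):
    """Raised where the real guard would jump to its error handler."""

def guarded_indirect_call(code_words, target):
    """Model of the guard before CALL reg: check the word just before target."""
    if code_words.get(target - 4) != CFI_ID:
        raise CFIFault(hex(target))
    return target      # stand-in for actually transferring control
```

If `code_words` holds `CFI_ID` at `0x1000` and a function entry at `0x1004`, a call through a pointer equal to `0x1004` succeeds. A pointer overwritten to `0x2000`, where no ID precedes the target, raises `CFIFault`.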

Can an extension bypass the guard by jumping directly to the CALL EBX in Figure 2?

An extension can only do indirect calls to functions whose addresses the program has taken. CFI is actually more precise than this, partitioning functions by where they can be called indirectly from and using a different number for each partition.

What if magic number occurs for some other reason in the instruction stream: can malicious code jump to those places too?

CFI originally used the magic number scheme for RTN, but XFI protects the stack from writes, so RTN doesn't need to be checked (and is also precise).

CFI works in a way that's similar to the binary rewriting we saw for virtual machines.

CFI enforces these properties that are useful to XFI's memory protection:

XFI

XFI's main goal is memory address space protection for untrusted extensions, much like SFI. XFI uses CFI to help it understand x86 instructions. XFI is more powerful and useful than SFI in a number of ways, as we'll see.

XFI's "address space" model allows an extension to read, write, and/or execute in any of a number of regions (not just one as in SFI). Typically: code, heap, scoped stack, allocation stack, specific areas in the trusted code. XFI's verifier can check direct memory references statically, using instruction boundary information from CFI.

How does XFI handle loads and stores? It inserts guards as in Figure 3.

Note: A+L and B-H probably embedded in instructions (no registers required).

Note: one check can protect multiple uses of EAX if CFI reports no intervening branch target.

Note: XFI can put complex code into the guards because CFI ensures the guards will be atomic, and because XFI's own memory protection allows it to keep state on the scoped stack. XFI needn't be limited by available dedicatable registers.
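One way to read the guard in Figure 3 is as a two-compare fast path with a fallback to a slower general check. The sketch below follows that reading; the bounds `A` and `B`, the region list, and the function names are assumptions for illustration, and the access being guarded is taken to cover `[addr - low, addr + high)`.

```python
A, B = 0x10000, 0x20000   # hypothetical bounds of the fast-path region [A, B)

def slowpath_guard(addr, low, high, regions):
    """General check against a list of other (lo, hi) regions the extension may use."""
    return any(lo <= addr - low and addr + high <= hi for lo, hi in regions)

def mrguard(addr, low, high, regions=()):
    """True if [addr - low, addr + high) may be accessed."""
    # Fast path: A + low and B - high are constants for a given guard,
    # so this is two compares against immediates.
    if A + low <= addr <= B - high:
        return True
    return slowpath_guard(addr, low, high, regions)
```

Since `low` and `high` are fixed for each guard site, the sums `A + low` and `B - high` can be computed when the guard is inserted, which is why no extra registers are needed for them.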

How does XFI handle subroutine returns?

Suppose we use XFI for address space protection for a loadable kernel module. How will the module make legitimate calls to the kernel? What if a kernel routine returns a pointer that the extension needs to dereference?

Can XFI protect against the attack discussed in last lecture?

    id = malloc(32);
    read(msg);
    n = msg[0];
    memcpy(id, msg + 1, n);

Will XFI prevent the actual overrun in memcpy()? At what point is XFI likely to signal a fault?

Are there attacks that XFI does not protect against?

How much does XFI slow down applications? How many more instructions are executed? (see Tables 1-4)