6.858 Fall 2010 Lab 2: Binary instrumentation

Handed out: Wednesday, September 15, 2010

Part 1 due: Friday, September 24, 2010 (11:59pm)

All parts due: Friday, October 1, 2010 (11:59pm)

Introduction

This lab will introduce you to binary instrumentation, in the context of a tool called zookie. You will develop zookie on top of the DynamoRIO platform, to prevent buffer overflow exploits of the zookws web server from the previous lab. The zookie tool enforces control-flow integrity: the runtime execution of zookws must follow a control-flow graph determined ahead of time. You will also implement a shadow stack in zookie to make zookws resist buffer overflow exploits.

In this and future labs, you will progressively build on your web server. We will also provide you with some additional source code for each lab. To fetch the new source code, use Git to commit your Lab 1 solutions, fetch the latest version of the course repository, and then create a local branch called lab2 based on our lab2 branch, origin/lab2:

httpd@vm-6858:~$ cd lab
httpd@vm-6858:~/lab$ git commit -am 'my solution to lab1'
[lab1 c54dd4d] my solution to lab1
 1 files changed, 1 insertions(+), 0 deletions(-)
httpd@vm-6858:~/lab$ git pull
Already up-to-date.
httpd@vm-6858:~/lab$ git checkout -b lab2 origin/lab2
Branch lab2 set up to track remote branch lab2 from origin.
Switched to a new branch 'lab2'
httpd@vm-6858:~/lab$

The package contains a skeleton of zookie, the binary rewriting tool you will be developing, along with a copy of DynamoRIO and the original zookws web server. In this lab we only consider binaries that have non-executable stacks. Before you proceed with this lab assignment, make sure you can compile the source code, as follows:

httpd@vm-6858:~/lab$ make
cc -m32 -g -std=c99 -fno-stack-protector -Wall -Werror -D_GNU_SOURCE   -c -o zookld.o zookld.c
cc -m32 -g -std=c99 -fno-stack-protector -Wall -Werror -D_GNU_SOURCE   -c -o http.o http.c
cc -m32 -g  zookld.o http.o  -lcrypto -o zookld
cc -m32 -g -std=c99 -fno-stack-protector -Wall -Werror -D_GNU_SOURCE   -c -o zookfs.o zookfs.c
cc -m32 -g  zookfs.o http.o  -lcrypto -o zookfs
cp zookfs zookfs-exstack
execstack -s zookfs-exstack
cc -m32 -g -std=c99 -fno-stack-protector -Wall -Werror -D_GNU_SOURCE   -c -o zookd.o zookd.c
cc -m32 -g  zookd.o http.o  -lcrypto -o zookd
cp zookd zookd-exstack
execstack -s zookd-exstack
cc -m32   -c -o shellcode.o shellcode.S
shellcode.S: Assembler messages:
shellcode.S:18: Warning: using `%al' instead of `%eax' due to `b' suffix
objcopy -S -O binary -j .text shellcode.o shellcode.bin
cc -D_GNU_SOURCE -fPIC -Idr/include -DX86_32 -DLINUX \
          -fno-stack-protector -o zookie.o -c zookie.c -m32
cc -fPIC -fno-stack-protector -shared -nostartfiles -nodefaultlibs \
          -lgcc -shared -Wl,-soname,libzookie.so -o libzookie.so zookie.o \
          dr/lib32/release/libdynamorio.so.2.0 -m32
rm shellcode.o
httpd@vm-6858:~/lab$

Part 1: Control-flow graph

In the first part, you will implement a version of zookie that enforces control-flow integrity. The goal will be to prevent an adversary from taking control of program execution through buffer overflows that corrupt a return address or function pointer. You can assume that the attacker cannot modify the program's code by exploiting buffer overflows. Thus, we only need to consider indirect branch instructions for this part of the lab, since the adversary has no way of changing the target addresses of direct branch instructions.

To determine the set of legitimate control flow transitions in a program (like our web server), we will profile a normal execution of that program, when it is processing "expected" requests, and no attack is taking place. Once the set of legal control flow transitions has been recorded, we will switch zookie into enforcement mode, where it will check that every indirect control flow transfer is a legal one (i.e., it matches one of the previously-recorded jumps). If not, we will assume that an attack is taking place, and abort execution to ensure that the program is not compromised.
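The profile-then-enforce idea can be modeled in a few lines of Python. This is only a sketch of the logic, not zookie code: the EdgePolicy class and its method names are hypothetical, and the real tool performs these steps in C inside profile_at_mbr and enforce_at_mbr. The addresses are taken from the sample profiling output in this handout, plus one made-up attacker-controlled target.

```python
class EdgePolicy:
    """Toy model of a control-flow policy: a set of (source, target) pairs."""

    def __init__(self):
        self.edges = set()

    def record(self, src, dst):
        # Profiling mode: remember every indirect transfer that was observed.
        self.edges.add((src, dst))

    def is_allowed(self, src, dst):
        # Enforcement mode: a transfer is legal only if it was seen in profiling.
        return (src, dst) in self.edges


policy = EdgePolicy()
policy.record(0x55563730, 0x555637F9)
policy.record(0x8048F38, 0x8048F3E)

# Same source, but a target never seen during profiling (hypothetical value).
print(policy.is_allowed(0x55563730, 0x555637F9))
print(policy.is_allowed(0x55563730, 0x41414141))
```

Storing whole (source, target) pairs, rather than just the set of valid targets, means a corrupted return address is rejected even when it points at some other legitimate branch target.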

To help you get started, the skeleton zookie code already contains hooks to intercept indirect branches taken by the program that you run. For example, after building the lab 2 source code, you should be able to run ./run-profile.sh, and see the list of indirect branches being taken by our web server:

httpd@vm-6858:~/lab$ ./run-profile.sh
+ ulimit -s unlimited
+ dirname -- .
+ cd -P -- .
+ pwd -P
+ DIR=/home/httpd/lab
+ exec env - PWD=/home/httpd/lab PATH=/usr/sbin:/usr/bin:/sbin:/bin SHLVL=0 HOME=/tmp ./dr/bin32/drrun -debug -client libzookie.so 0 -p ./zookld zook.conf
zookie: zookld.528
zookie: Log zookld.528.log
profile_at_mbr() called: 0x5568c4f6 -> 0x555636dc
profile_at_mbr() called: 0x55563730 -> 0x555637f9
profile_at_mbr() called: 0x5556c82e -> 0x55563678
profile_at_mbr() called: 0x55563730 -> 0x555637f9
profile_at_mbr() called: 0x5556c82e -> 0x55563678
profile_at_mbr() called: 0x55563730 -> 0x555637f9
profile_at_mbr() called: 0x55563806 -> 0x5555588f
profile_at_mbr() called: 0x55555898 -> 0x80491d0
profile_at_mbr() called: 0x8048f38 -> 0x8048f3e
...
^C
httpd@vm-6858:~/lab$

To understand how this profiling works, look at the source code in zookie.c. The dr_init function is invoked by the DynamoRIO runtime when it first starts. This function looks at the opt variable to determine the options passed to it by the shell script; if the -p option is passed, it goes into profiling mode, and otherwise, it goes into enforcement mode. Finally, dr_init calls dr_register_bb_event, a function provided by DynamoRIO, to register our own function that should be called to instrument every basic block in the program.

Whenever DynamoRIO invokes the function we registered with dr_register_bb_event, that function traverses all of the instructions in the basic block, and, if an instruction is a multi-way branch (instr_is_mbr), calls dr_insert_mbr_instrumentation to tell DynamoRIO to invoke our function (either profile_at_mbr or enforce_at_mbr) at that instruction. These functions, in turn, will either record the branch taken, or enforce our control flow graph, respectively. The output you see when running ./run-profile.sh on the initial lab 2 code is printed by profile_at_mbr.

You can learn more about DynamoRIO by looking at the DynamoRIO documentation. For example, look at the documentation for the functions we mentioned above, such as dr_register_bb_event, instr_is_mbr, and dr_insert_mbr_instrumentation.

Exercise 1. Implement the profiling mode for zookie, by recording all indirect control flow transfers into a log file. You get to design your own log file format for this exercise. Check that your web server works in profiling mode, and that your logging code appears to be recording all indirect control flow transfers.

Note that zookie follows the child processes spawned by zookld, and instruments each of them in turn. In the code we have provided, zookie creates a new log file for the execution of each process, named progname.pid.log. In profile_at_mbr, you can use cfg_log to access the log file for the currently running process. After you have run ./run-profile.sh, you should end up with several log files, including one for zookd and one for zookfs.

Now that you have recorded all indirect control flow transfers, you will need to compute a control-flow graph based on these log entries, and to implement the enforcement part of zookie, namely, checking that all indirect control flow transfers at runtime match the recorded control flow graph.

Exercise 2. Implement the merge script, merge.py, which combines the log files for each of the three web server programs (zookld, zookd, and zookfs) into control flow graph files for each of those processes (named zookld.cfg, zookd.cfg, and zookfs.cfg). You may devise any representation for your control flow graph. We have provided a skeleton structure of merge.py for you, but you are free to change it as you see fit. (Please keep the original file names for .log and .cfg files, though.)
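As one possible starting point, the merging step might look like the sketch below. It rests on assumptions that are not part of the lab: the log format (one "source target" pair of hex addresses per line) is invented for illustration, all function names are hypothetical, and file I/O is left out so that the logic stays visible. The only detail taken from the handout is the progname.pid.log naming scheme; the pids and log contents are made up.

```python
from collections import defaultdict


def program_name(log_filename):
    # "zookld.528.log" -> "zookld": strip the trailing ".pid.log".
    return log_filename.rsplit(".", 2)[0]


def parse_log(text):
    # Assumed log format: "<src> <dst>" in hex, one transfer per line.
    edges = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        src, dst = (int(field, 16) for field in fields)
        edges.add((src, dst))
    return edges


def merge_logs(logs):
    # logs maps a log file name to its contents; the result maps each
    # program name to the union of edges seen across all of its runs.
    merged = defaultdict(set)
    for filename, text in logs.items():
        merged[program_name(filename)] |= parse_log(text)
    return dict(merged)


def render_cfg(edges):
    # One edge per line, sorted so the output is deterministic.
    return "".join(f"{src:#x} {dst:#x}\n" for src, dst in sorted(edges))


logs = {
    "zookd.530.log": "0x8048f38 0x8048f3e\n0x8048f38 0x8048f3e\n",
    "zookd.541.log": "0x55563730 0x555637f9\n",
    "zookfs.531.log": "0x55563806 0x5555588f\n",
}
cfgs = merge_logs(logs)
print(render_cfg(cfgs["zookd"]))
```

Taking the union across all runs of the same program is what lets several profiling sessions accumulate into one graph; duplicate transfers collapse because the edges are kept in a set.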

Second, implement the enforcement mode of zookie, by loading your control flow graph when zookie starts (see function enforce_init), and checking the control flow graph at every indirect control flow transfer (in enforce_at_mbr). Be sure to print a string starting with zookie: Invalid branch, using dr_fprintf(STDERR, ...) when you catch an invalid branch in enforce_at_mbr; this is how our grading script will check your code.

To check that your code works, first run it in profiling mode to generate the expected control flow graph, then run ./merge.py, and then run ./run-enforce.sh to run zookie in enforcement mode. You will want to try to cover all code paths in the web server when browsing the web site in profiling mode; otherwise, enforcement mode may report false alarms due to an incomplete control-flow graph.

To verify that your code meets our grading standards, you can use the grading script ./grade-lab2-part1.sh.

Submit your answers to this part of the lab assignment by running make handin, and upload the resulting lab2-handin.tar.gz file at http://pdos.csail.mit.edu/cgi-bin/858handin.

Part 2: Stack memory protection

One downside of the current zookie implementation is that it requires a precise and complete control flow graph to be computed ahead of time. If you forget to exercise some part of your web server during the profiling phase (such as requesting a non-existent file), the new control flow associated with that code can lead to a false positive at enforcement time, and kill your server when no attack is taking place.

In this exercise, we will consider an alternative approach to protecting the web server from buffer overflow exploits, which requires no training or profiling execution ahead of time, but is a little less precise. In particular, we will try to prevent overwrites of return addresses on the stack, by remembering the locations of return addresses on the stack, and preventing other memory writes to those locations. Of course, this approach will not be able to prevent function pointer overwrites, but it will be easier to use (i.e., no profiling required).

To enforce this return address protection scheme, we will need to perform three steps. First, we will need to remember the location of all return addresses placed on the stack, by intercepting call instructions, and remembering where they wrote the return address. Second, we will need to intercept all other memory writes, and ensure that they do not write to a memory location previously used by a call instruction to save the return address. Third, we will need to clean up these protected memory locations once the function returns, by intercepting instructions that might remove values from the stack, such as pop and ret.
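The three steps above can be modeled as follows. This is a sketch under stated assumptions, not a zookie implementation: the ReturnAddressGuard class and its method names are hypothetical, the stack is assumed to grow downward with 4-byte return addresses (the lab builds with -m32), and the addresses are made up. The real tool would do the equivalent bookkeeping in C from its instrumentation hooks.

```python
WORD = 4  # size of a return address on 32-bit x86


class ReturnAddressGuard:
    """Toy model of return-address protection on a downward-growing stack."""

    def __init__(self):
        self.protected = set()  # addresses of slots holding return addresses

    def on_call(self, slot):
        # Step 1: a call stored its return address at this stack slot.
        self.protected.add(slot)

    def write_hits_return_address(self, addr, size):
        # Step 2: does the write [addr, addr + size) overlap a protected slot?
        return any(
            slot < addr + size and addr < slot + WORD
            for slot in self.protected
        )

    def on_stack_release(self, new_esp):
        # Step 3: after ret or pop, slots below the new stack pointer are dead.
        self.protected = {slot for slot in self.protected if slot >= new_esp}


guard = ReturnAddressGuard()
guard.on_call(0xBFFFF000)  # return address saved at this slot

in_bounds = guard.write_hits_return_address(0xBFFFEFF0, 16)  # stops at the slot
overflow = guard.write_hits_return_address(0xBFFFEFF0, 20)   # runs into the slot

guard.on_stack_release(0xBFFFF004)  # the function returned
after_return = guard.write_hits_return_address(0xBFFFF000, 4)
print(in_bounds, overflow, after_return)
```

A plain set keeps the model short, but it makes every write check linear in the number of live return addresses; a structure keyed by address (or ordered like the stack itself) would be a natural refinement.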

Exercise 3. Implement the stack return address protection scheme described above, in zookie.c. We have provided hooks that intercept all memory writes (look at enforce_mem_write). This function gets called both for call instructions (which themselves generate a memory write, by storing the return address on the stack) and for other instructions that write to memory. You will need to devise your own data structure to store the list of protected locations on the stack. You will also need to intercept other instructions to clean up return address locations in your data structures after they have been popped off the stack.

To test whether your code works, you can invoke zookie in stack-checking mode by running ./run-stackcheck.sh. You should first ensure that you can access the web server normally, and then check that buffer overflows that corrupt the return address are caught.

You can run our grading script, ./grade-lab2-part2.sh, as well. Be sure to print a string starting with zookie: Return overwrite, using dr_fprintf(STDERR, ...) when you catch a memory write to a return address in enforce_mem_write; this is how our grading script will check your code.

Challenge! For extra credit, optimize the memory write checks from Exercise 3, by doing binary rewriting instead of using dr_insert_clean_call. Look at dr/samples/memtrace.c for an example of how you might do it.

You are done! Run make handin and follow instructions to upload the resulting file.
