6.824 Lab 2: A TCP Proxy Server

Due date: Thursday September 20, 1 p.m.

Introduction

In this lab you'll write a TCP Proxy using the same C++ asynchronous library as in the multifinger lab. You'll learn how to implement both a TCP client and server in this lab.

A TCP proxy server is a server which acts as an intermediary between a client and another server, called the destination server. Clients establish connections to the TCP proxy server, which then establishes a connection to the destination server. These paired connections are connected by the dotted lines in the figure below. The proxy server sends data received from the client to the destination server and forwards data received from the destination server to the client. Interestingly, the TCP proxy server is actually both a server and a client.

A TCP proxy server can be useful for getting around services that restrict connections based on network addresses, or for forwarding services through firewall servers.

The assignment

The proxy server you will build for this lab will be invoked at the command line as follows:

% ./tcpproxy destination-host destination-port listen-port

For example, to redirect all connections to port 3000 on your local machine to Yahoo's web server, run:

% ./tcpproxy www.yahoo.com 80 3000 &
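
As a rough illustration of handling this command line, the following sketch validates the three arguments. The names ProxyArgs, parse_port and parse_args are hypothetical and not part of the provided skeleton; resolving the host name is left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical container for the three command-line arguments.
struct ProxyArgs {
    std::string dest_host;
    int dest_port = 0;
    int listen_port = 0;
};

// Parse a decimal port in [1, 65535]; returns 0 on failure.
int parse_port(const char* s) {
    char* end = nullptr;
    long v = std::strtol(s, &end, 10);
    if (end == s || *end != '\0' || v < 1 || v > 65535) return 0;
    return static_cast<int>(v);
}

// argv layout follows the usage line:
//   tcpproxy destination-host destination-port listen-port
bool parse_args(int argc, const char* const* argv, ProxyArgs* out) {
    assert(out != nullptr);
    if (argc != 4) return false;
    out->dest_host = argv[1];
    out->dest_port = parse_port(argv[2]);
    out->listen_port = parse_port(argv[3]);
    return out->dest_port != 0 && out->listen_port != 0;
}
```

A main function would call parse_args(argc, argv, &args) and print a usage message when it returns false.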

The proxy server will accept connections from multiple clients and forward them using multiple connections to the server. No client or server should be able to hang the proxy server by refusing to read or write data on its connection. For instance, if one client suddenly stops reading from the socket to the proxy, other clients should not notice interruptions of service through the proxy. You will need asynchronous behavior, described in Using TCP Through Sockets.

The proxy must also handle hung clients and servers. In particular, if one end keeps transmitting data but the other stops reading, the proxy must not buffer an unlimited amount of data. If the proxy has buffered data in one direction and is unable to write any of it for 10 seconds, it should abort both connections in the pair.
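
One possible way to track the 10-second rule is to remember, for each direction, when the proxy last made write progress while data was waiting. The sketch below is only an illustration: StallTimer and its members are hypothetical names, and the current time is passed in as plain seconds so the logic can be checked without a real clock.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of libasync): tracks, for ONE direction of a
// connection pair, whether buffered data has been stuck for too long.
struct StallTimer {
    static constexpr long kTimeoutSec = 10;  // limit stated in the assignment
    std::size_t buffered = 0;   // bytes read but not yet written
    long stuck_since = 0;       // time at which the current stall began

    // Call after reading n bytes into the buffer at time `now` (seconds).
    void on_read(std::size_t n, long now) {
        if (buffered == 0 && n > 0) stuck_since = now;  // data starts waiting
        buffered += n;
    }

    // Call after successfully writing n bytes out at time `now`.
    void on_write(std::size_t n, long now) {
        assert(n <= buffered);
        buffered -= n;
        if (n > 0) stuck_since = now;  // any progress resets the clock
    }

    // True if data is buffered and none of it was written for 10 seconds.
    bool timed_out(long now) const {
        return buffered > 0 && now - stuck_since >= kTimeoutSec;
    }
};
```

Note that a direction with an empty buffer never times out in this sketch, so a client that is merely idle keeps its connection.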

Connection termination

The proxy must handle end-of-file conditions as transparently as possible. If it reads end-of-file from one socket, it should pass the condition along to the other socket (using shutdown) after writing any remaining buffered data. However, the proxy should continue to forward data in the other direction. The proxy should terminate a connection pair and close the file descriptors under the following two circumstances:

- The proxy has read an end-of-file (or experienced a read error other than EAGAIN) in both directions and has written all remaining buffered data.
- The proxy experiences a write error (other than EAGAIN) in either direction.

The reason for giving up more easily on write errors is that they signify some failure of the higher-level protocol. A read end-of-file can be a legitimate part of a protocol, whereas when a program writes data to the network, it indicates a serious problem if no one is there to read it.
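
To see what passing end-of-file along with shutdown looks like at the system-call level, here is a small sketch. It uses a local socket pair in place of real TCP connections, and half_close_demo is a hypothetical name. After shutdown(fd, SHUT_WR) the peer reads end-of-file, yet data can still flow in the other direction.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Returns true if, after a one-way shutdown, the peer sees end-of-file but
// can still send a byte back in the other direction.
bool half_close_demo() {
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    shutdown(sv[0], SHUT_WR);                  // sv[0] will send no more data

    char c = 0;
    bool saw_eof = read(sv[1], &c, 1) == 0;    // peer reads end-of-file
    bool wrote = write(sv[1], "x", 1) == 1;    // reverse direction still open
    bool got = read(sv[0], &c, 1) == 1 && c == 'x';

    close(sv[0]);
    close(sv[1]);
    return saw_eof && wrote && got;
}
```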

Setting up your project directory

We have provided a skeletal project directory to get you started. Untar it into your home directory from /home/to2/labs/tcpproxy.tar.gz.

benb@blood [~] > tar xzf /home/to2/labs/tcpproxy.tar.gz
benb@blood [~] > cd tcpproxy

Place your source code in the (empty) file proxy.C. Finally, when you are ready to compile, you can proceed as you did with multifinger:

benb@blood [~] > ./setup
+ chmod +x setup
+ aclocal
...
+ set +x
benb@blood [~] > ./configure --with-sfs=/home/to2/sfs-debug
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
...
creating Makefile
creating config.h
benb@blood [~] > gmake
...

Testing

You should test your proxy to make sure that it continues to forward data even when some connections aren't responding. Here's one test you should be able to pass.

First, run the proxy and point it at pdos.lcs.mit.edu's HTTP port.

% ./tcpproxy pdos.lcs.mit.edu 80 1234

Note that you may have to pick a local port other than 1234 if someone is already using 1234. Now, in another window, use telnet to fetch /cgi-bin/big through the proxy:

% telnet 127.0.0.1 1234
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
GET /big

Watch the data go by for a while, then interrupt the output by typing control-], after which telnet should stop and print telnet>. Now check that the proxy hasn't been hung because telnet isn't reading data; open another window and fetch something else:

% telnet 127.1 1234
Trying 127.0.0.1...
Connected to localhost (127.0.0.1).
Escape character is '^]'.
GET /ok
You were able to fetch the data.
Connection closed by foreign host.
%

If you see "You were able to fetch the data," your program passes the test.

Once your proxy passes some basic tests, you can test it with the automated program tprox. You can find this program on the class machines in /home/to2/labs/tprox. This is a different tester than was originally distributed. The new tester fixes several important bugs. Assuming your proxy is in ./tcpproxy, you can test it as follows:

% /home/to2/labs/tprox ./tcpproxy
Single echo connection: passed
Two echo connections: passed
20 echo connections: passed
Bulk data, 20 connections: passed
Mix of blocked and normal: passed
One-way shutdown: passed
Early close: passed
Non-timeout of active client: passed
Timeout of lazy client: passed
%

Your program should pass all phases of the tests. If all goes well, tprox should completely finish in a minute or two. If it spends more than about 30 seconds on any one test, there is probably something wrong with your proxy.

The source for tprox is located in /home/to2/labs/tprox.C if you wish to examine it or build it on another platform.

How/What to hand in

September 20 - TCP proxy

You should submit a complete software distribution of your tcpproxy program. As in the previous lab, you should build a software distribution as follows:

% gmake dist
rm -rf tcpproxy-0.0
mkdir tcpproxy-0.0
chmod 777 tcpproxy-0.0
...
================================================
tcpproxy-0.0.tar.gz is ready for distribution
================================================
%

To turn in your distribution, copy the file tcpproxy-0.0.tar.gz to the directory ~/handin/lab2/ under your home directory:

% mkdir -p ~/handin/lab2/
% cp tcpproxy-0.0.tar.gz ~/handin/lab2/
%

If you have any problems with submission, please contact fdabek@mit.edu. The lab is due before class on Thursday, September 20th.

FAQ

Q) Why does my program fail with the error "Out of memory while malloc-ing 8176 bytes from '../../sfs/async/suio++.h:84'"?

A) The short answer to this question is: the program crashes because, as it says, it is out of memory. Many of you are buffering as much data as the client will send, as fast as it is sending it. Since the client can send quite a lot of data quite fast during the test, your program quickly runs out of memory. The solution here is to bound the amount of data you are willing to buffer while waiting for the write to complete. Those of you using suios are almost certainly falling victim to this unless you are explicitly dealing with it. There are a number of good reasons not to buffer an infinite amount of data, the fact that machines have only a finite amount of memory being the first.
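
A minimal sketch of bounding the buffer follows. BoundedBuffer is a hypothetical class built on std::string rather than on suio, and the capacity is whatever you choose. The idea is to stop asking for readability on the source once the buffer is full, and to resume once some data has been written out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fixed-capacity buffer for one direction of a connection pair.
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t cap) : cap_(cap) {}

    std::size_t room() const { return cap_ - data_.size(); }
    bool empty() const { return data_.empty(); }
    bool want_read() const { return room() > 0; }   // full: stop reading
    bool want_write() const { return !empty(); }    // something to send

    // Append bytes just read from the source; the caller must respect room().
    void append(const char* p, std::size_t n) {
        assert(n <= room());
        data_.append(p, n);
    }

    // Drop n bytes that were successfully written to the destination.
    void consume(std::size_t n) {
        assert(n <= data_.size());
        data_.erase(0, n);
    }

    const std::string& pending() const { return data_; }

private:
    std::size_t cap_;
    std::string data_;
};
```

With this shape, a read callback would read at most room() bytes, and memory use per connection stays below the chosen capacity no matter how fast the sender is.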

Q) I can't accept connections / my accept callback doesn't get called / where do I put amain in my "accept loop" / I'm having trouble with fork ()?

A) Many people are having problems understanding how to accept connections asynchronously. This is documented in "Using TCP Through Sockets", but I'll run through a quick example here since it is a popular question. First, there is _no_ accept "loop" since we are using asynchronous events. Also, it should _not_ be necessary to call fork () to produce a working proxy; we are encouraging you to use asynchronous events instead. Here are the steps to accept connections:

- create an async socket bound to the port you wish to accept connections on
- call listen on that socket (man listen for more info)
- register a callback for readability on the listening socket
- in that callback, call accept on the listening socket

Accept will return a file descriptor associated with the new connection. Note: If I use "file descriptor" and "socket" and "connection" interchangeably, that's because they (mostly) are. A file descriptor names a socket, which is the endpoint of a network connection.
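
The steps above can be sketched with plain POSIX calls. This is not the libasync interface: poll() stands in for the event loop here, and make_listener, on_listener_readable and wait_and_accept are hypothetical names. In the lab you would register the callback with the async library instead of calling poll() yourself.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Steps 1 and 2: create a non-blocking TCP socket bound to 127.0.0.1:port
// and call listen on it. Port 0 asks the kernel to pick a free port.
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    assert(fd >= 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    int rc = bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
    assert(rc == 0);
    rc = listen(fd, 16);
    assert(rc == 0);
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
    return fd;
}

// Step 4: what the readability callback does. Returns the new connection's
// file descriptor, or -1 if nothing is pending (errno is then EAGAIN).
int on_listener_readable(int lfd) {
    return accept(lfd, nullptr, nullptr);
}

// Step 3, simulated: wait until the listener is readable, then run the
// callback once. A real event loop would keep the callback registered.
int wait_and_accept(int lfd, int timeout_ms) {
    pollfd p{lfd, POLLIN, 0};
    if (poll(&p, 1, timeout_ms) <= 0) return -1;
    return on_listener_readable(lfd);
}
```

Because the listening socket is non-blocking, calling accept when no client is waiting returns immediately instead of hanging the whole proxy.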

Q) What do these test cases test exactly? Will this tester be used to grade the problem set? Does the tester leak file descriptors?

A) Each test case is designed to exercise some part of your proxy, but in fact exercises many parts. As a result, failing a particular test can lead to confusion. For example, many proxies crash in the timeout test not because the timeout handling is wrong but because buffers are managed poorly. With that advice in mind, here is a short description of each test case.

For all of the tests, tprox creates a TCP socket on a randomly chosen port, and accepts connections on that port. By default, it just echoes back data it reads from those connections, but it can be told to do other things instead. tprox then starts your proxy program, telling it to forward connections to the above-mentioned port. This means that tprox is connecting through your proxy to itself.

- Single echo connection: tprox creates one connection through your proxy, and sends 10 4-byte messages on the connection. tprox waits for the echoed reply before sending each new message. tprox verifies that the correct data was echoed.

- Two/20 echo connections: tprox creates N connections through your proxy. At the server end, tprox just writes data it reads back to the connection. At the client end, tprox copies data read from connection i to connection i+1. tprox sends some 4-byte messages through this chain and verifies that they arrive at the other end.

- Bulk data, 20 connections: As above, tprox sets up a chain of 20 connections. It then sends 2 megabytes of data through the chain, and verifies that the same data arrives at the other end of the chain.

- Mix of blocked and normal: tprox runs the above bulk test. In addition, it sets up a connection through the proxy but doesn't read data from the server end of the connection. tprox writes as much data as it can to that connection. tprox expects that your proxy will forward all of the bulk data; it doesn't explicitly check how you

- One-way shutdown: tprox sets up a connection, and then calls shutdown(s, SHUT_WR). On the server side, tprox waits for an end of file (read() == 0), and then writes one byte to the connection. tprox verifies that your proxy forwards that byte.

- Early close: In this test, the server side closes the connection immediately after accepting it. Then the client writes a few bytes to the connection, separated by one-second pauses. tprox expects that your proxy will eventually close the connection.

- Non-timeout of active client: tprox sets up a connection and writes to it periodically, with multi-second pauses between writes. tprox expects that your proxy will leave the connection open.

- Timeout of lazy forward client: The server end of the connection does not read any data, but the client sends data as fast as it can. tprox expects that your proxy will close the connection after 10 seconds.

- Timeout of lazy reverse client: As above, but in the reverse direction (the server generates data, but the client does not read it).

tprox will be used to grade your lab assignment. And yes, tprox does use up a lot of file descriptors. This means that if your proxy arbitrarily bounds the number of file descriptors it allocates, it may fail the test even though it is otherwise correct. We will not enforce a limit when we test the proxy, so you should not limit your use of valid file descriptors (this does not mean that you should leak file descriptors).

Q) Why did you kill my proxy?

A) Many of you have written proxies that "spin" (use all available CPU time). If I see these proxies running for long periods of time on the class machines I will kill them (they make the machine less responsive to other users). This behavior is indicative of an error in your proxy (most likely a callback is registered for an event that is always true) and may cause you to lose points.
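
A common example of an event that is always true is writability: an idle socket is almost always writable, so a writability callback left registered with nothing to send fires continuously. The sketch below illustrates this with poll() rather than libasync; write_interest, ready_now and idle_socket_is_writable are hypothetical names.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: which poll() events to request on a destination
// socket. Writability is requested only while there is buffered data.
short write_interest(bool have_buffered_data) {
    return have_buffered_data ? POLLOUT : 0;
}

// True if poll() reports fd ready immediately (zero timeout) for `events`.
bool ready_now(int fd, short events) {
    pollfd p{fd, events, 0};
    return poll(&p, 1, 0) > 0;
}

// Demonstration on a fresh socket pair: an idle socket is writable at once,
// so an unconditional writability callback would fire over and over.
bool idle_socket_is_writable() {
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    bool writable = ready_now(sv[0], POLLOUT);
    close(sv[0]);
    close(sv[1]);
    return writable;
}
```

The equivalent fix in an event-driven proxy is to unregister the writability callback as soon as the buffer drains, and to register it again only when new data is buffered.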

Q) What does EAGAIN mean? What does the system call XXX return?

A) A couple of you asked about return values from functions, specifically EAGAIN. EAGAIN means "try again later, this operation would block now" and on some systems is named EWOULDBLOCK. The ability to return this error is what distinguishes non-blocking operations from blocking ones. All of the system calls are documented in the man pages (e.g., man accept). Note that most of them return -1 for a variety of errors. The real error can be discovered by reading the "errno" global variable. The perror () function will turn errno into a human-readable string; man perror for more info.
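
As an illustration, the sketch below classifies the result of one non-blocking read() into data, end-of-file, "would block", or a real error. ReadResult, try_read and set_nonblocking are hypothetical names; the system calls themselves are standard POSIX.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum class ReadResult { Data, Eof, WouldBlock, Error };

// Put a file descriptor into non-blocking mode.
void set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    assert(flags >= 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

// Perform one read() and classify the outcome. *n receives read()'s result.
ReadResult try_read(int fd, char* buf, size_t len, ssize_t* n) {
    *n = read(fd, buf, len);
    if (*n > 0) return ReadResult::Data;
    if (*n == 0) return ReadResult::Eof;           // peer finished sending
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return ReadResult::WouldBlock;             // not an error: retry later
    return ReadResult::Error;                      // inspect errno or perror()
}
```

Only the last case counts as a read error for the purposes of the connection termination rules; WouldBlock simply means the callback should return and wait for the next readability event.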

Q) What about weird close case X?

A) When to close the connections from the proxy can be confusing (what if one side closes and the other one never writes, what if a machine crashes, what about a network partition etc.). If you follow the two simple rules outlined in the assignment your proxy will be correct.

Close a connection if

- your proxy sees an EOF in both directions
- your proxy sees a write error
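
Expressed as code, the two rules might look like the sketch below, which also includes the condition from the assignment text that all buffered data has been written. PairState, note_written and should_close are hypothetical names for per-pair bookkeeping you would maintain yourself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for one client/server connection pair.
struct PairState {
    bool client_read_done = false;  // EOF or read error (not EAGAIN) from client
    bool server_read_done = false;  // EOF or read error (not EAGAIN) from server
    std::size_t to_server = 0;      // bytes buffered, waiting to go to the server
    std::size_t to_client = 0;      // bytes buffered, waiting to go to the client
    bool write_failed = false;      // write error other than EAGAIN, either side
};

// Record that n buffered bytes were written out in one direction.
void note_written(std::size_t& buffered, std::size_t n) {
    assert(n <= buffered);
    buffered -= n;
}

// Close on any write error, or once both directions have finished reading
// and every buffered byte has been delivered.
bool should_close(const PairState& s) {
    if (s.write_failed) return true;
    return s.client_read_done && s.server_read_done &&
           s.to_server == 0 && s.to_client == 0;
}
```

Calling should_close after every read, write and error event keeps the decision in one place, which makes the unusual cases easier to reason about.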

Q) I'm getting non-deterministic results/Broken Pipe messages when I test. Is your tester broken?

A) It is certainly possible to write a tcpproxy that fails our tests sometimes and not others; these proxies most likely contain errors. Some of you also saw "broken pipe" errors. These errors result from writing to a connection that the other end has already closed. This may happen in practice; treat it as you would any other write error. You should never receive the SIGPIPE signal that accompanies these errors since libasync will catch and ignore it.
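
The following sketch reproduces a "broken pipe" error outside of libasync. It ignores SIGPIPE by hand (the step the answer above says libasync takes care of), closes one end of a local socket pair, and then writes to the other end. write_after_peer_close is a hypothetical name.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Write to a socket whose peer has closed. Returns the errno reported by
// write(), or 0 if the write unexpectedly succeeded.
int write_after_peer_close() {
    signal(SIGPIPE, SIG_IGN);            // otherwise the signal kills the process
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    close(sv[1]);                        // the other end goes away
    ssize_t n = write(sv[0], "x", 1);    // fails with -1 instead of a signal
    int err = (n < 0) ? errno : 0;
    close(sv[0]);
    return err;
}
```

The returned value is EPIPE, which perror () prints as "Broken pipe". In the proxy this is handled like any other write error: close both connections of the pair.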

Q) Is it acceptable to have more than one source file?

A) Yes, list them on the "tcpproxy_SOURCES = " line in Makefile.am. You may have to re-run ./setup and/or configure to update your makefiles.

Q) Can I use cb_check () and those other routines in the notes?

A) Those functions have been superseded in the current libasync. Please use the C++ version.