6.824 - Fall 2007

6.824 Lab 7: Replicated State Machine

Last change: $Date: 2007/11/07 14:29:09 $

Due: Wednesday, November 7th, 11:59pm.


Introduction

In this lab you will replicate your lock server using the replicated state machine approach (see Schneider's RSM paper for a good, but non-required, reference). In the replicated state machine approach, one machine is the master; the master receives requests from clients and executes them on all replicas. To ensure that all replicas have identical state, the replicas must execute all requests in the same order and all requests must produce the same result on all replicas (i.e., the handlers must be deterministic). When the master fails, any of the replicas can take over its job, because they all should have the same state as the failed master. (We'll be discussing an example replicated state machine in Lecture 13.)

The primary task in the lab is building a replicated state machine library (RSM) on top of the existing RPC library for YFS. RSM allows you to plug in any RPC program you have written so far and replicate it as a replicated state machine. To ensure the appropriate behavior, however, there are certain constraints on the RPC handlers. Most importantly, the RPC handlers must run to completion without blocking, and be deterministic. These constraints will ensure that all replicas execute all requests in the same order and with the same result.
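
To make the constraints concrete, here is a minimal sketch of a handler that satisfies them; the class and names are hypothetical, not the actual Lab 5 interface. The handler runs to completion without blocking (it returns RETRY instead of waiting inside the handler) and is deterministic (its result depends only on its arguments and the current state):

#include <map>
#include <string>

enum rstatus { OK, RETRY };

class toy_lock_server {
  // lock id -> current holder; this map is the replicated state
  std::map<unsigned long long, std::string> owner;
public:
  // Non-blocking: if the lock is held, return RETRY immediately rather
  // than waiting on a condition variable inside the handler.
  // Deterministic: no clocks, randomness, or thread-dependent results.
  rstatus acquire(unsigned long long lid, std::string id, int &r) {
    r = 0;
    if (owner.count(lid)) return RETRY;
    owner[lid] = id;
    return OK;
  }
  rstatus release(unsigned long long lid, std::string id, int &r) {
    r = 0;
    if (owner.count(lid) && owner[lid] == id) owner.erase(lid);
    return OK;
  }
};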

Once you have built the replicated state machine library, we will plug in the caching lock server you built in Lab 5. In this lab it should become clear why we asked you to write the caching lock server in such a way that the server has no blocking handlers.

To test your implementation you will eventually need to run the tests from all the previous labs, but you may want to start by testing your code with lock_tester alone. Test your replicated lock server with lock_tester first without any failing lock servers. Once that works, test by killing some of the lock servers (using the kill command or Control-C) while the test is running. We provide a testing program, rsm_tester.pl, to make this easier. Even though servers fail, lock_tester should complete successfully. Note that there are no requirements for running with RPC_LOSSY in this lab, though Lab 8 may require it. You also need to ensure that the file system tests for Labs 2, 3, and 4 still complete successfully when the lock_server runs as a replicated state machine.

In this lab you will not start new lock servers after a failure or after the system has been running for a while. To handle these cases we need more sophisticated protocols---for example, we need to ensure that all slaves agree on a single master, despite the presence of network failures, and transfer state to bring new members up to speed. These more sophisticated protocols will be the topic of the final lab.

RSM: Replicated state machine library

The RSM library has a client object and a server object.

The RSM client takes in its constructor the address of the master, and immediately contacts the master to learn the addresses of all the other replicas. A program can invoke a call method on the client object (just as if it were an RPC client), which will marshall a request. The RSM client sends this request as an argument in another RPC (invoke) to the master. If this RPC fails or times out, the client contacts the replicas one by one until it finds a master that accepts the request. If a replica receives an invoke RPC and it isn't the master, it refuses to execute the client request, and the client automatically fails over to the next replica until it finds the master.
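
Here is a minimal sketch of that client-side loop, assuming the hypothetical names shown (replicas, send_invoke); the rsm_client in the handout code may be organized differently:

#include <string>
#include <vector>

class rsm_client_sketch {
  // Replica addresses, master first; learned from the master at construction.
  std::vector<std::string> replicas;
public:
  // Send the marshalled request to the presumed master; on a timeout or
  // an "I am not the master" refusal, fail over to the next replica.
  bool invoke(unsigned int proc, const std::string &req, std::string &rep) {
    for (size_t i = 0; i < replicas.size(); i++) {
      if (send_invoke(replicas[i], proc, req, rep) == 0)
        return true;         // found a master that executed the request
      // otherwise keep trying the remaining replicas
    }
    return false;            // no replica was willing to act as master
  }
private:
  int send_invoke(const std::string &dst, unsigned int proc,
                  const std::string &req, std::string &rep) {
    // Stand-in for the invoke RPC to one replica, with a timeout.
    (void)dst; (void)proc; (void)req; (void)rep;
    return -1;
  }
};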

A server can create an instance of the library by creating an RSM object. The constructor takes as arguments the address of the current master and the address of this server. The first RSM object created typically is the master (i.e., the address of the master is the same as the address of this server). Each RSM object has a view object that tracks the current set of replicas, and the current sequence number for the group. For this lab, the view constructor simply reads in a static list of replicas from a file called "config"; we will improve this in the final lab. Once the RSM object is created, a server will typically register some RPC handlers with the RSM (instead of registering them with an rpcs object).
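
In outline, bringing up one replica might look like the sketch below. It assumes the handout's rsm class has an rpcs-style reg() method and that lock_protocol names the RPC numbers; the exact signatures may differ:

// Sketch only: assumes the handout's rsm, lock_server_cache, and
// lock_protocol declarations.
int start_replica(std::string master, std::string me)
{
  // The first server started is typically the master (master == me).
  // The rsm constructor's view object reads the replica list from "config".
  rsm r(master, me);
  lock_server_cache ls;
  // Register handlers with the RSM instead of with an rpcs object, so the
  // RSM can run every request on every replica in the same order.
  r.reg(lock_protocol::acquire, &ls, &lock_server_cache::acquire);
  r.reg(lock_protocol::release, &ls, &lock_server_cache::release);
  return 0;
}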

When the master gets a new request from an RSM client, it assigns the request a sequence number, and issues an invoke RPC on each replica. The replicas unmarshall the request, and invoke the handler registered for the RPC number, supplying an argument and results structure.
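
A sketch of the master's side of this path, again with hypothetical member names (seqno, replicas, invoke_on), might look as follows; the important point is that the master alone assigns sequence numbers, which fixes the execution order for every replica:

#include <string>
#include <vector>

enum invoke_status { I_OK, I_NOTMASTER };

class rsm_master_sketch {
  unsigned int seqno;                  // next sequence number for the group
  std::vector<std::string> replicas;   // the current view
  bool master;                         // am I the current master?
public:
  rsm_master_sketch() : seqno(0), master(false) {}
  // Called when an RSM client sends this server a marshalled request.
  invoke_status client_invoke(unsigned int proc, const std::string &req,
                              std::string &rep) {
    if (!master)
      return I_NOTMASTER;              // refuse; the client will fail over
    unsigned int seq = seqno++;        // order this request for the group
    for (size_t i = 0; i < replicas.size(); i++)
      // Each replica unmarshalls req and runs the handler registered
      // for proc, as request number seq.
      invoke_on(replicas[i], proc, seq, req, rep);
    return I_OK;
  }
private:
  int invoke_on(const std::string &dst, unsigned int proc, unsigned int seq,
                const std::string &req, std::string &rep) {
    // Stand-in for the invoke RPC to one replica; stubbed out here.
    (void)dst; (void)proc; (void)seq; (void)req; (void)rep;
    return 0;
  }
};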

Getting Started

Since you are building on the past labs, ensure the code in your Lab 6 directory passes all tests for Labs 1, 2, 3, 4, 5 and 6 before starting in on this lab.

Begin by initializing your Lab 7 directory with your implementation from Lab 6 and then updating it with some new infrastructure code in http://pdos.csail.mit.edu/6.824/labs/yfs-lab7.tgz for Lab 7.

% wget -nc http://pdos.csail.mit.edu/6.824/labs/yfs-lab7.tgz
% rsync -av l6/* l7
% tar xzvf yfs-lab7.tgz

This will add the new RSM infrastructure files to your l7/ directory, including rsm.cc, the rsm_tester.pl testing script, and the config file that lists the replicas.

The new code will overwrite the Makefile and lock_demo.cc files from your previous labs, so you may have to redo any changes in those files.

Testing

Our measure of success is surviving failed replicas, including failures of the master. For this lab, you'll need to pass tests 1, 2, and 3 of rsm_tester.pl (as well as making sure all the file system tests from previous labs work). rsm_tester.pl runs four different tests:

  1. Replicated lock server, no failures: this test just starts up three replicas of the lock server, and runs the standard lock_tester tests.
  2. Replicated lock server, failed replica: this test does the same as the first test, but it kills one of the non-master replicas during the test.
  3. Replicated lock server, failed master: this test does the same as the first test, but it kills the master replica during the test.
  4. Replicated lock server, join replica, kill master: this test starts two lock server replicas, then later joins a third, and then finally kills the master during the test.

The tester picks random ports for the lock server replicas, and starts them up. It redirects output to log files, named lock_server-[master_port]-[my_port].out. The log for the tester is lock_tester-[master_port].out. For this lab, you need to pass the first three tests. You should execute the command as follows, and see similar output:
% ./rsm_tester.pl 1 2 3
test1: start 3-process replicated state machine, run lock_tester
spawned: 2530 2543 2544
./lock_tester: passed all tests successfully
test2: start 3-process rsm, kill slave while tester is running
spawned: 3244 3257 3261
kill: 3257
./lock_tester: passed all tests successfully
test3: start 3-process rsm, kill master while tester is running
spawned: 3899 3912 3916
kill: 3899
./lock_tester: passed all tests successfully
tests done
%
which runs the first three tests above (you can also run the tests individually by just specifying a single test number). We will get to the fourth test (and possibly others) in the final lab.

Step One: replicated state machine without failures

As a first step, just get the basic replicated state machine working. Assume for now that none of the replicas will fail. The basic protocol is the one described above: the RSM client marshalls each request and sends it to the master in an invoke RPC; the master assigns the request a sequence number and issues an invoke RPC on each replica; and each replica unmarshalls the request and runs the registered handler. This will involve filling in the methods in rsm.cc, as well as editing your existing caching lock server/client to use the RSM objects rather than the RPC objects (a client-side sketch follows below).
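
On the client side the change is mostly mechanical. Here is a sketch, assuming an rsm_client whose call() is shaped like rpcc::call() and the Lab 5 member names (id, rsmc); the handout's real interface may differ:

// Sketch only: assumes the handout's lock_client_cache and an rsm_client
// with an rpcc-style call(); names and signatures are illustrative.
lock_client_cache::lock_client_cache(std::string xdst)
{
  // Before: cl = new rpcc(xdst); cl->bind();
  // Now: an RSM client pointed at the current master, which learns the
  // addresses of the other replicas by itself.
  rsmc = new rsm_client(xdst);
}

lock_protocol::status
lock_client_cache::acquire(lock_protocol::lockid_t lid)
{
  int r;
  // id is this client's identifier string, as in Lab 5. The rsm_client
  // marshalls the call like an ordinary RPC, forwards it to the master
  // via invoke, and fails over if the master is unreachable.
  return rsmc->call(lock_protocol::acquire, lid, id, r);
}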

Remember that only the master lock_server process should be communicating directly with the client. Therefore, you should protect any communication from lock_server_cache to lock_client_cache with a call to rsm::amimaster().
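
For example, a revoke might be guarded like this (a sketch; send_revoke and myrsm are illustrative names, not part of the handout code):

void
lock_server_cache::send_revoke(std::string client, lock_protocol::lockid_t lid)
{
  // Every replica executes the same handlers, but only the master may
  // talk to lock clients; slaves silently skip the outgoing RPC.
  if (!myrsm->amimaster())
    return;
  // Otherwise issue the revoke RPC to the client exactly as in Lab 5.
}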

Once this works, you should be able to pass './rsm_tester.pl 1'.

Step Two: slave failures

Next, handle the case where the master tries to invoke an RPC on a slave, but the slave has failed. This should cause the master to remove the replica from its list of replicas.
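
Inside the master's invoke loop this might look like the following fragment (a sketch, assuming the view object exposes a remove() operation; the names are illustrative):

// Iterate over a snapshot so removing a replica does not upset the loop.
std::vector<std::string> view = replicas;
for (size_t i = 0; i < view.size(); i++) {
  if (invoke_on(view[i], proc, seq, req, rep) != 0) {
    // The invoke RPC failed or timed out: assume the slave has died and
    // drop it from the view rather than failing the client's request.
    myview->remove(view[i]);
  }
}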

Once this works, you should be able to pass './rsm_tester.pl 1 2'.

Step Three: RSM client handles master failures

We've supplied you with the heartbeat code that allows the slaves to pick a new master if the current master dies. All that's left to do in rsm_client is to fail over to a new master if an RPC invocation times out.

Important: Unfortunately, because the outgoing RPC calls of the slaves and the master are not completely synchronized, it could be the case that at some moment in time, the slaves believe they have successfully sent a revoke or a grant RPC, but the master has not yet sent the RPC. If the master dies before it sends the RPC, and one of the slaves becomes the master, that message will never be sent to the client. The easiest way to fix this problem is to ensure that your lock_client_cache objects retry their acquire RPCs after some timeout whenever they get a RETRY status code returned from an acquire call.

More specifically, in Lab 5, when the lock_client_cache received a RETRY, it was enough to wait on a condition variable forever until it received a grant for that lock. Now, in Lab 7, when the master fails, that grant may never come. So, your lock_client_cache should take the RETRY message literally: instead of calling pthread_cond_wait, it should call pthread_cond_timedwait with some small timeout value (say 2 seconds). This call returns either when the condition variable is signalled (as usual) or when the timeout expires, whichever happens first. The return value is ETIMEDOUT if the timeout occurred, and zero if the condition variable was signalled. If a timeout did occur, your code should retry the acquire. Check rpc.cc for an example use of pthread_cond_timedwait.
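
A minimal sketch of the timed wait follows; only the pthread calls are meant literally, and the surrounding names are hypothetical:

#include <pthread.h>
#include <errno.h>
#include <time.h>

// Wait up to two seconds for the grant. Returns true if the wait timed
// out and the caller should retry the acquire; false if the grant came.
bool wait_for_grant(pthread_cond_t *cv, pthread_mutex_t *m)
{
  struct timespec deadline;
  clock_gettime(CLOCK_REALTIME, &deadline);  // absolute time: now + 2s
  deadline.tv_sec += 2;
  int r = pthread_cond_timedwait(cv, m, &deadline);
  return r == ETIMEDOUT;   // 0 means the condition variable was signalled
}

The acquire path then becomes a loop: send the acquire, and on RETRY do a timed wait and send it again, until the lock is granted.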

Once this works, you should be able to pass './rsm_tester.pl 1 2 3' many times in a row.

Hints

Challenges

Here are a few things you can do if you finish the lab early and feel like improving your code. These are not required, and there are no associated bonus points, but some of you may find them interesting.

Collaboration policy

You must write all the code you hand in for the programming assignments, except for code that we give you as part of the assignment. You are not allowed to look at anyone else's solution (and you're not allowed to look at solutions from previous years). You may discuss the assignments with other students, but you may not look at or copy each other's code.

Handin procedure

You will need to email your completed code as a gzipped tar file to 6.824-submit@pdos.csail.mit.edu by the deadline at the top of the page. To do this, execute these commands:
% cd ~/yfs/l7
% make clean
% rm core*
% rm *log
% cd ..
% tar czvf `whoami`-lab7.tgz l7/
That should produce a file called [your_user_name]-lab7.tgz in your yfs/ directory. Attach that file to an email and send it to the 6.824 submit address.
For questions or comments, email 6.824-staff@pdos.csail.mit.edu.