6.824 - Spring 2004

6.824 Lab 4: File Server Part One

Due: Thursday March 4th, 1:00pm.

Introduction

This is the first in a sequence of labs in which you'll build a multi-server file system in the spirit of Frangipani. In this lab you'll implement file deletion in a partially-implemented server we supply, as a warm-up exercise. In subsequent labs you'll add the caching and locking required for multiple file servers to achieve good performance while maintaining correctness.

You'll be extending a loop-back NFS file server, labeled CCFS in the diagram below. CCFS mounts itself on a sub-directory of /classfs, and arranges for the NFS client code in the FreeBSD kernel to send it NFS v3 RPCs. Thus CCFS acts as a server only for processes on the local host. CCFS serves a file system stored in a network block server called blockdb. blockdb can run on any host, and can serve blocks to multiple instances of CCFS running on multiple client hosts.

-----------------    ------------
|               |    |          |
|    App  CCFS--|----|--blockdb |
|     |    |    |    |          |
|---------------|    ------------
|     |    |    |
|     Kernel    |
|   NFS Client  |
|               |
-----------------

This architecture is appealing because (in principle) it shouldn't slow down as you add client hosts. Most of the complexity is in the per-client CCFS server, so new clients make use of their own CPUs rather than competing with existing clients for the server's CPU. The blockdb is shared, but hopefully it's simple and fast enough to handle a large number of clients. In contrast, a conventional NFS server is pretty complex (it has a complete file system implementation) so it's more likely to be a bottleneck when shared by many NFS clients. The only fly in the ointment is that multiple CCFS servers sharing a single file system stored in a blockdb would need a locking protocol to avoid inconsistent updates -- this will be your job in the next few labs.

The CCFS Server

Download the lab starter files from http://pdos.lcs.mit.edu/6.824/labs/fs-lab-1.tgz to get going.

% wget http://pdos.lcs.mit.edu/6.824/labs/fs-lab-1.tgz
% tar xzvf fs-lab-1.tgz
% cd fs-lab-1
% ./setup
% ./configure --with-dmalloc --with-classfs=/u/6.824/classfs-0.0 i386-freebsd
% gmake

Now you should start the block server on one of the class machines. You'll need to choose a UDP port number that other students aren't using. If, for example, you choose to run the block server on host blood on port 3772, you should type this on blood:

blood% ./blockdbd 3772 &

At this point you can start up the file server. You need to tell it a directory under /classfs on which to mount itself, and tell it the host name and port number of the block server. By default, the file server initializes a new file system in the block server, chooses a new random file handle for the root directory, and prints out the root handle. You can optionally specify an existing root handle. (The block server keeps its state in memory, so a root handle is no longer usable after you restart the block server.) Here's how to start the file server and tell it to mount on /classfs/dir:

anguish% ./ccfs dir blood 3772 &
root file handle: 2d1b68f779135270
anguish% echo hi > /classfs/dir/foo
anguish% ls -lt /classfs/dir/.
total 0
-rw-rw-r--  1 root  wheel  3 Feb 20 14:32 foo
anguish%

Though you don't need to do it for this assignment, you can now run a second copy of your file server with the same root file handle, either under a different directory name on anguish, or on a different host:

suffering% ./ccfs dir blood 3772 2d1b68f779135270 &
suffering% cat /classfs/dir/foo
hi
suffering%

The interesting part of the CCFS source is in fs.C. It's a partial NFS file server, and is missing the following features:

The SYMLINK, MKNOD, REMOVE, RMDIR, RENAME, LINK, READDIRPLUS, PATHCONF, and COMMIT RPCs.
File owners and permissions.
Parent ".." directory entries.
File link counts.
Internal locking for atomic operations such as CREATE and REMOVE.

If you want to run CCFS on your own machine, you'll need the source to SFS and libasync, and the source to classfs.

File System Block Format

The block server stores key/value pairs. Both keys and values are byte arrays; the block server does not interpret them. The block server supports put(key,value), get(key), and remove(key) RPCs. CCFS stores three kinds of blocks: i-node, file data, and directory entry.
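To make the put/get/remove interface concrete, here is a minimal in-memory sketch of a store with the same three operations. The class name BlockStore and its method signatures are assumptions made for illustration. They are not the blockdb RPC interface, and the real server is reached over the network rather than called directly.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the block server (not the blockdb API).
// Keys and values are opaque byte strings; the store never interprets them.
class BlockStore {
public:
    // Store value under key, replacing any previous value.
    void put(const std::string& key, const std::string& value) {
        blocks_[key] = value;
    }

    // Copy the value stored under key into *out; return false if key is absent.
    bool get(const std::string& key, std::string* out) const {
        std::map<std::string, std::string>::const_iterator it = blocks_.find(key);
        if (it == blocks_.end()) return false;
        *out = it->second;
        return true;
    }

    // Delete key; return true if it was present.
    bool remove(const std::string& key) {
        return blocks_.erase(key) > 0;
    }

private:
    std::map<std::string, std::string> blocks_;
};

// Usage: values may contain arbitrary bytes, including NUL.
void block_store_example() {
    BlockStore store;
    store.put("k", std::string("a\0b", 3));
    std::string v;
    assert(store.get("k", &v) && v.size() == 3);
    assert(store.remove("k"));
    assert(!store.get("k", &v));
}
```

Because the store treats keys and values as plain bytes, all of the file system structure described below lives in how CCFS chooses keys and encodes values.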

An i-node block's key is the file handle, and its content is an fattr3 structure (the NFS v3 file attributes). Use get_fh() and put_fh() to read and write i-node blocks, and remove_fh() to delete them from the block server.

A regular file has zero or more 8192-byte data blocks. The size field of the i-node fattr3 structure determines the real length of a file. The key of a file's i'th data block is the file handle with i appended. Use get_data() and put_data() to read and write file data blocks.
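The relationship between the size field and the data blocks can be sketched as follows. The 8192-byte block size comes from the text above. The key encoding (handle, then "/", then the decimal index) and the function names are assumptions for illustration only, since the actual byte layout CCFS uses when it appends i is not specified here.

```cpp
#include <cassert>
#include <stdint.h>
#include <string>

const uint64_t kBlockSize = 8192;  // data block size given in the text

// Number of data blocks needed to hold a file of `size` bytes.
uint64_t num_data_blocks(uint64_t size) {
    return (size + kBlockSize - 1) / kBlockSize;
}

// Hypothetical key encoding: file handle, a separator, then the block index.
std::string data_block_key(const std::string& fh, uint64_t i) {
    return fh + "/" + std::to_string(i);
}

// Number of bytes in block i that belong to the file; only the last block
// can be partially used.
uint64_t bytes_in_block(uint64_t size, uint64_t i) {
    assert(i < num_data_blocks(size));
    uint64_t remaining = size - i * kBlockSize;
    return remaining < kBlockSize ? remaining : kBlockSize;
}
```

For example, the 3-byte file foo created earlier would occupy a single data block with index 0, and a file of 8193 bytes would need two blocks, the second holding one valid byte.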

A directory has zero or more directory entry blocks. Each directory entry block contains information about one file name in the directory: an "allocated" flag, the file's handle, and the file's name. The key of a directory's i'th entry is the directory's file handle with i appended. Use pack_dirent() and unpack_dirent() to format and parse directory entry blocks, and get_data() and put_data() to read and write them. The number of entries in a directory is the number of consecutive directory entries that exist, counting from the first. For this reason you'll want to clear an entry's allocated flag rather than deleting the entry from the block server: deleting an entry in the middle would hide every entry after it.
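The following sketch illustrates why an entry should be marked unallocated rather than deleted. It models one directory's entry blocks as a map from index to entry. The Dirent struct, the DirBlocks type, and the function names are hypothetical stand-ins, not the types that pack_dirent() and unpack_dirent() actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory form of one directory entry block.
struct Dirent {
    bool allocated;
    std::string fh;    // handle of the file this entry names
    std::string name;
};

typedef std::map<size_t, Dirent> DirBlocks;  // entry index -> entry block

// Entries are counted by probing indices 0, 1, 2, ... until one is missing.
size_t count_entries(const DirBlocks& dir) {
    size_t n = 0;
    while (dir.count(n)) ++n;
    return n;
}

// Index of the allocated entry called `name`, or -1 if there is none.
long find_entry(const DirBlocks& dir, const std::string& name) {
    size_t n = count_entries(dir);
    for (size_t i = 0; i < n; ++i) {
        const Dirent& e = dir.at(i);
        if (e.allocated && e.name == name) return static_cast<long>(i);
    }
    return -1;
}

// Remove a name by clearing its flag; the block itself stays in place,
// so the entries after it remain reachable.
void deallocate_entry(DirBlocks& dir, size_t i) {
    assert(dir.count(i));
    dir[i].allocated = false;
}
```

In this model, calling deallocate_entry() on the middle entry of a three-entry directory leaves count_entries() at 3, whereas erasing that map element would drop the count to 1 and make the last entry unreachable.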

Your Job

Your job is to implement the REMOVE NFS RPC. REMOVE takes a directory file handle and a file name as arguments. Your implementation should check that the file handle refers to a directory and that the file name really exists. It should overwrite the file's directory entry with a deallocated entry (using pack_dirent() and put_data()), de-allocate the file's i-node (using remove_fh()), de-allocate the file's data blocks (using free_blocks() or remove_data()), and update the directory's mtime (using put_fh()).
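The sequence of steps can be sketched against a simplified in-memory model. This is not the fs.C solution: the real handler is asynchronous, talks to the block server through callbacks, and replies with NFS v3 status codes. Every name below (Fs, Inode, Dirent, RemoveStatus, remove_file, and the "/" key encoding) is a hypothetical stand-in chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <map>
#include <string>
#include <vector>

// Hypothetical status codes; the real server replies with nfsstat3 values.
enum RemoveStatus { REMOVE_OK, REMOVE_NOTDIR, REMOVE_NOENT };

struct Inode { bool is_dir; time_t mtime; size_t nblocks; };
struct Dirent { bool allocated; std::string fh; std::string name; };

// Hypothetical in-memory model of the three kinds of blocks.
struct Fs {
    std::map<std::string, Inode> inodes;                  // key: file handle
    std::map<std::string, std::vector<Dirent> > dirents;  // key: directory handle
    std::map<std::string, std::string> data;              // key: handle + "/" + index
};

RemoveStatus remove_file(Fs& fs, const std::string& dir_fh,
                         const std::string& name, time_t now) {
    // 1. The handle must refer to an existing directory (a missing handle
    //    is folded into the same error here for brevity).
    std::map<std::string, Inode>::iterator d = fs.inodes.find(dir_fh);
    if (d == fs.inodes.end() || !d->second.is_dir) return REMOVE_NOTDIR;

    // 2. The name must exist as an allocated entry.
    std::vector<Dirent>& ents = fs.dirents[dir_fh];
    for (size_t i = 0; i < ents.size(); ++i) {
        if (!ents[i].allocated || ents[i].name != name) continue;
        std::string fh = ents[i].fh;

        // 3. Overwrite the entry with a deallocated one; keep the block.
        ents[i].allocated = false;

        // 4. Free the data blocks, then the i-node. The block count is
        //    read from the i-node before the i-node is dropped.
        std::map<std::string, Inode>::iterator f = fs.inodes.find(fh);
        if (f != fs.inodes.end()) {
            for (size_t b = 0; b < f->second.nblocks; ++b)
                fs.data.erase(fh + "/" + std::to_string(b));
            fs.inodes.erase(f);
        }

        // 5. Record the change in the directory's mtime.
        d->second.mtime = now;
        return REMOVE_OK;
    }
    return REMOVE_NOENT;
}
```

One detail the sketch makes visible: the file's i-node is needed to know how many data blocks to free, so it has to be read before it is removed.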

You don't need to implement sophisticated error handling. If something unexpected goes wrong, send an error reply to the RPC.

Please modify only fs.C and fs.h. Please don't modify the block server, since we will test your file server against our own copy of the block server.

Testing

You can test your file server using the cdtest.pl script, supplying your directory under /classfs as the argument. Here's what a successful run of cdtest.pl looks like:

anguish% ./cdtest.pl /classfs/dir
...
Passed all tests!

If cdtest.pl exits without printing "Passed all tests!", then it thinks something is wrong with your file server. For example, if you run cdtest.pl on the fs.C we give you, you'll probably see an error message like this:

unlink xcd-45429-0.226790197675466 failed: Protocol not supported

You are not required to fix any of the other missing features of CCFS listed above; you need only add a REMOVE RPC implementation.

Hints

You'll probably want to start by understanding a few of the RPC handler procedures in fs.C, particularly nfs3_lookup() and nfs3_create(). This code will show you how to get hold of RPC arguments and how to send replies. You can look at nfs3_setattr_cb2() for an example of how to send a reply from an RPC with the same reply type as NFS3_REMOVE.

You can find descriptions of the NFS v3 protocol in RFC1813 and the book NFS Illustrated by Brent Callaghan.

CCFS is written using libasync, whose source you can find in /u/6.824/sfs1 or at www.fs.net. CCFS uses the NFS RPC definitions in nfs3_prot.x, a copy of which is also in /usr/local/include/sfs-0.7.2/nfs3_prot.x. The output of the RPC compiler is in nfs3_prot.h.

The most powerful debugging tool in the world is printf().

You may be able to find memory allocation errors with dmalloc by typing this before running ccfs:

% setenv DMALLOC_OPTIONS debug=0x24f41d03,inter=100

You can see a trace of all RPC requests that your server receives, and its responses, by setting the ASRV_TRACE environment variable, like this:

% env ASRV_TRACE=10 ./ccfs ...

Collaboration policy

You must write all the code you hand in for the programming assignments, except for code that we give you as part of the assignment. You are not allowed to look at anyone else's solution (and you're not allowed to look at solutions from previous years). You may discuss the assignments with other students, but you may not look at or copy each other's code.

Handin procedure

You should hand in a gzipped tarball fs-lab-1-handin.tgz produced by gmake dist. Copy this file to ~/handin/fs-lab-1-handin.tgz. We will use the first copy of the file that we can find after the deadline.