10/26 --- File Systems: network file systems

- Goal: allow users on different computers to share files, just as
  multiple users on a single computer can share files.

- Approach: make file access location transparent:
  /afs/u/kaashoek/README.txt means the same thing on every machine.

- Making file access location transparent involves:
  - Adding names of remote files to the client's name space:
      Import (Plan 9)
      Attach (AFS)
      Mounting (NFS)
    NFS:
    - two files:
      1. on the server: an exports file (lists the exported files and
         their names)
      2. on the local machine: a file recording which file systems are
         stored where
  - Remote file access (e.g., NFS3 and 9P):
    - Retrieve directories from remote location
    - Retrieve inodes from remote location
    - Retrieve file blocks from remote location

- Organization:
  - client/server
    - multiple clients, one server (e.g., NFS, AFS)
        cache consistency problem
          a client reads a remote block x from server s and modifies it;
          another client also reads x from server s---what does it see?
        crash consistency problem
          a client reads a remote block x from server s, modifies it, and
          crashes; another client also reads x from server s---what does
          it see?
    - multiple clients, multiple servers
        load balancing problem
          which client talks to which server?
  - peer-to-peer (e.g., xFS and Bayou)
    - every client is a server

- Cache consistency
  - The network file system should behave the same as when multiple
    users share a local file system:
    - read returns the result of the last write
  - What does that mean in a file system?

    Case 1:
      C1             C2
      ls -l x
      chmod +x x
      ls -l x
                     ls -l x
      What does "ls -l x" on C2 return?
      Ideal: same as the second ls on C1.

    Case 2:
      C1             C2
      open f
      read f
      write f
      close f
      open f         open f
      read f         read f
      What does "read f" on C2 return?
      Ideal: same as the second read on C1.

    Case 3:
      C1             C2
      open f
      read f
      close f
                     open f
                     read f
      open f
      write f
                     read f
      What does the last "read f" on C2 return?
      Ideal: result of the last write to f on C1.
    Case 4:
      C1             C2
      open f
      read f1,f2
                     open f
      write f1
                     read f1,f2
      write f2
      What does "read f" on C2 return?
      Ideal: the value before the write to f on C1 or the value after
      the write to f on C1 (but not mixed).
      (This follows directly from the rule.)

  - Consistency is achieved through a cache consistency protocol

  - Example 1: NFSv2
    data structures:
    - attribute cache
    - in-memory buffer cache
    - name cache (in vnode layer)
    when opening file f, fetch inode from server (with getattr RPC)
    - if mod time on server is more recent, delete f's cached blocks
      locally
    - insert attributes in attribute cache (timeout of 30 seconds)
    when reading a block of f, check local buffer cache:
    - if present, use it
    - otherwise, fetch block and store in cache (replace some other
      block)
    after modifying a block of f (or creating a new block), write it
    through to server
    after modifying attributes of f (or creating a new file), write it
    through to server

  - Example 2: AFS
    data structures:
    - in-memory buffer cache
    - on-disk file cache
    when opening file f, fetch inode from server
    server:
    - if opening for reading, add client to read list
    - if opening for reading and there is a writer, invalidate writer
      (writer flushes modified buffers)
    - if opening for writing, invalidate the caches of the clients on
      the read list
    - server responds with file
    on response:
    - stick file into on-disk cache
    when reading a block of f, check local buffer and disk cache:
    - if present, use it
    - otherwise, fetch it
    when writing a block of f, write it to local buffer and disk cache
    when closing file f, write f (inodes and blocks) to server (and keep
    a local copy) asynchronously

  - Cases:
      Case   NFS          AFS
      1      might fail   ok
      2      ok           ok
      3      fails        ok
      4      ??           ??

- Crash consistency
  - Ideal: if a system call returns, the written value is stable
  - Practice: too expensive to write everything to disk, so relax:
    - some ops are atomic, so that the application writer can ensure
      that data is stable by using the appropriate sequence of ops.
  - What is the local behavior after reboot?

                              FFS              Linux Ext2   LFS
      1. write                no               no           no
      2. close                no               no           no
      3. fsync                yes              yes          yes
      4. rename               yes              yes          no
      5. create               yes              no           no
      6. unlink               yes              no           no
      7. creat f, creat g     f and g stable   anything     if g stable, f stable

    "no" means there is a window: if the client stays up long enough,
    the data will be stable.

  - What is ideal in a network file system:
    - if a system call returns on the client, the written data is stable
      at the server
  - Practice: too expensive
    NFS:
    - Goal: server reboots are not visible to client
    - when the write system call returns and the client stays up, the
      data will be stable.
    - when the write system call returns and the client fails, the data
      might become stable
    - order of writes is not guaranteed
    - when any of the system calls from close through unlink (2-6 in the
      table above) returns, its result is stable
    AFS:
    - server reboots are visible to client
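
The point above about using "the appropriate sequence of ops" can be
illustrated for the local case. The following is a minimal sketch, not
something taken from these notes: it uses the widely used pattern of
writing to a temporary file, calling fsync, and then renaming over the
target. The helper name stable_replace is hypothetical. Per the table,
write alone leaves a window on all three file systems and fsync closes it;
whether the rename itself is stable on return depends on the file system
(yes for FFS and Ext2, no for LFS), so the sketch also fsyncs the
directory, which is an extra precaution assumed here for POSIX systems.

```python
import os
import tempfile


def stable_replace(path: str, data: bytes) -> None:
    """Replace the contents of `path` with `data` (hypothetical helper).

    After a crash, a reader should find either the old contents or the
    new contents at `path`, never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory as the target so
    # that the rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)          # write: "no" in the table (a window)
            tmp.flush()              # move Python's buffer into the kernel
            os.fsync(tmp.fileno())   # fsync: "yes" in all three columns
        os.replace(tmp_path, path)   # rename(2) on POSIX: atomic switch
    except BaseException:
        # On any failure, remove the temporary file if it still exists.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Assumption: also flush the directory entry, for file systems where
    # rename is not stable by the time it returns (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling stable_replace("notes.txt", b"new contents") twice in a row leaves
only the final contents at that path and no temporary file behind. Over
NFS the same sequence still depends on the guarantees listed above (for
example, the order of writes is not guaranteed), so this should be read as
a local-file-system illustration only.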