6.824 2004 Lecture 8: Tutorial on Cache Consistency and Locking

lecture overview
  a tutorial to help you with lab 5
  lab 5 intertwines cache consistency and atomicity
  i'll try to explain them separately first

overall goal: ccfs-based distributed file system
  try to increase number of clients supported by single block server
  assume that (usually) clients work w/ different files
  so let's make this case efficient using caching
  but let's also preserve correctness

start with your lab 4 ccfs
  [draw picture: two ccfs servers, one block server]
  hard to support many ccfs hosts: every put/get goes to block server
  "client" and "server" confusing; ccfs is an NFS server, but a block client

add block caching to ccfs to improve performance
  (we've done this for you in lab 5)
  get() checks local cache first, adds to cache after miss
  put() *just* adds to cache, marks block as dirty

now what do we have?
  high performance when different users use different files/directories
    once cache populated, never need to talk to block server
  incorrect when users modify the same directory!
    I create a file, you don't see it.

we want "cache consistency"
  informally, a read sees the most recent write

can we fix the cache by making it write-through?
  i.e. put updates local cache *and* sends to the server?
  no! W1(x), R2(x), W2(x), R1(x) : second read gets stale data
  we need a way to notify #1 that its cache isn't valid
  #1 could check w/ block server before each get(): too slow!

A simple cache consistency protocol.
  (which you are *not* going to implement!)
  Invariant: at most one client has a cached copy of a block.
  Implementation:
    One token per block, which a client can hold.
    If you hold the token, put() and get() in local cache.
    If you don't hold the token, put() and get() go to the server, no cache.
    Server remembers who has the token.
    When server gets put(), sends invalidate to token holder.
    Client gives up the token when not using the file.
  Works very well if one client uses a block at a time.
  Less well if blocks are shared.
    You can imagine shared read-tokens and exclusive write-tokens.

Does this protocol enforce our read-sees-latest-write rule?
  Not really!
  #2 holds token, W1(x), R2(x) : W1 message in flight when R2 issues
  Even a local file system has this problem; I may call write() before
    you call read(), but your read() may finish first.

cache consistency is not enough to make ccfs work "correctly"
  think about NFS3_CREATE
    1. reads dir entry blocks to see if the file already exists
    2. finds an empty dir entry block
    3. writes new file info
  at least two potential problems
    concurrent creates for the same name both decide it doesn't exist
    concurrent creates for different names both write same empty dir ent

what we want is atomicity w.r.t. actions by other ccfs's.
  (this is not related to atomicity w.r.t. crashes)
  suppose two ccfs's execute NFS RPCs at about the same time
  result should be the same as if one at a time, in some order
  if you enforce atomicity, and each operation is correct when run alone,
    then the whole system is correct
  don't need to reason specifically about every concurrent interleaving

how to achieve atomicity for concurrent operations?
  wrap operations with acquire/release of a lock
  build lock server that gives lock to at most one client at a time
  what should each lock protect?
    whole file system? no: prevents concurrency that would have been OK.
    just one block? maybe, but then need one lock per dirent for NFS3_CREATE.
    i-node + contents: perhaps this will match atomic operation granularity.
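To make the acquire/release wrapping concrete, here is a minimal C++ sketch of a
create-like handler made atomic by holding the directory's lock around its
get/check/put sequence. Everything here is a purely local stand-in invented for
illustration: LockClient, BlockClient, and nfs_create are *not* the lab's fs.C
or RPC interfaces, and the "directory" is just a string of names.

  #include <condition_variable>
  #include <map>
  #include <mutex>
  #include <set>
  #include <string>

  // Local stand-in for the lab's lock client: one logical lock per name,
  // acquire() blocks until no one else holds that name.
  struct LockClient {
    std::mutex mu;
    std::condition_variable cv;
    std::set<std::string> held;
    void acquire(const std::string &name) {
      std::unique_lock<std::mutex> l(mu);
      while (held.count(name)) cv.wait(l);
      held.insert(name);
    }
    void release(const std::string &name) {
      std::lock_guard<std::mutex> l(mu);
      held.erase(name);
      cv.notify_all();
    }
  };

  // Local stand-in for the block server / block cache.
  struct BlockClient {
    std::mutex mu;
    std::map<std::string, std::string> blocks;
    std::string get(const std::string &key) {
      std::lock_guard<std::mutex> l(mu);
      return blocks[key];
    }
    void put(const std::string &key, const std::string &value) {
      std::lock_guard<std::mutex> l(mu);
      blocks[key] = value;
    }
  };

  // Create "name" in the directory whose contents are keyed by dirfh.
  // Holding the lock named by the directory's file handle from before the
  // existence check until after the put() is what makes this atomic
  // w.r.t. another ccfs creating in the same directory.
  bool nfs_create(LockClient &lc, BlockClient &bc,
                  const std::string &dirfh, const std::string &name) {
    lc.acquire(dirfh);                      // lock name == file handle
    std::string dir = bc.get(dirfh);        // read directory contents
    if (dir.find(" " + name + " ") != std::string::npos) {
      lc.release(dirfh);                    // entry already exists
      return false;
    }
    bc.put(dirfh, dir + " " + name + " ");  // crude "write new dir entry"
    lc.release(dirfh);
    return true;
  }

The only point is the bracketing: nothing between acquire() and release() can
interleave with another client's create on the same directory.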
so let's have locks with name == file handle

what operations need to be atomic in ccfs?
  certainly any NFS RPC that has multiple put()s
    e.g. NFS3_WRITE: append to file + update size
    what if another NFS3_WRITE sneaks in between?
  any RPC that wants to prevent changes between get() and put()
    check dir ent is empty + write dir ent
    what if another NFS3_CREATE sneaks in between?
  what about NFS3_READ? or NFS3_LOOKUP?
    perhaps there's no atomicity issue here. but wait...

how to implement locks?
  lock server keeps a table, indexed by lock name,
    w/ owner and a queue of clients waiting for each named lock
  ACQUIRE(name)
    if not owned, add client to table, send GRANT
    if owned, add client to queue
  RELEASE(name)
    delete owner from table
    look in queue for waiting client, send GRANT
  this means GRANT is *not* simply the RPC reply to ACQUIRE
    ACQUIRE RPC reply has no meaning, other than "I hear you"
  we supply you with this lock server and corresponding client

this approach is still a bit slow: lots of chatter to server
  one of your jobs will be to cache locks locally
  so that release() just marks lock locally as released
  if you acquire() it again, no need to talk to lock server
  lock server sends you REVOKE if some other client is waiting
  then you need to send a RELEASE RPC to the server

note that lock server is very similar to the token server
  local caching, server tracks owner, server sends revocation msgs
  locks and tokens (almost) refer to the same data (i-node + data blocks)
  and client would typically acquire+get and release+put at same time!
  can't cache i-node &c if someone else holds lock on that i-node
  doesn't make sense to hold lock if someone else owns i-node token
  if you're caching the lock, might as well cache the data blocks too

the lab 5 plan:
  we supply simple block cache in client (no consistency)
    and a simple lock server (no lock caching)
  you add locks to fs.C to make NFS RPCs atomic
  you add lock caching to our lock client
  you drive block cache consistency from lock protocol
    so no separate block cache consistency protocol

Details

block/lock relationship:
  invariant: only cache blocks that you've cached the lock for
  maintaining the invariant:
    1. when you get a lock REVOKE, look at the i-node+data blocks the lock
       protects, put() dirty blocks to server, delete all blocks from your
       cache, then send RELEASE to the lock server
    2. only call get() or put() after acquire() of the lock that protects
       the corresponding i-node

exact sequencing when interacting with locks
  client #1 is caching the lock and dirty blocks
  client #2 calls acquire()
    #2 -> LS : ACQUIRE
    LS -> #2 : reply
    LS -> #1 : REVOKE
    #1 -> LS : reply
    #1 -> BS : put(k, v)
    BS -> #1 : reply
    #1 -> LS : RELEASE
    LS -> #1 : reply
    LS -> #2 : GRANT
    #2 -> LS : reply
  #1 must ensure the block server has the dirty data before releasing!

lab 5 quirks
  NFS3_READ must take the lock, not for atomicity, but to get latest data
  NFS3_REMOVE may need the file i-node lock to force stale file handles
    if you only lock the directory, you leave i-node in other caches
  NFS3_CREATE may need to grab lock on new i-node
    to force others to read from our cache
  NFS3_RENAME would require at least two locks: deadlock?
  If we had hard links, NFS3_REMOVE would require two locks.
  single-machine ccfs needs internal locks
    two local processes call creat() concurrently...

What you're *not* responsible for:
  atomicity w.r.t. crashes
  recovering lock state after lock server reboot
  replicating the block server
  client crash while holding locks: un-do partial operations?
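As a concrete (and hedged) illustration of the block/lock invariant above, here
is a C++ sketch of the client-side REVOKE handling: flush the dirty blocks the
lock protects, drop them from the cache, and only then send RELEASE. The class,
member, and helper names are invented, and the "RPCs" are local stand-ins so
the sketch runs by itself; the real lab 5 lock client and block cache differ.

  #include <iostream>
  #include <map>
  #include <string>
  #include <vector>

  struct Block { std::string data; bool dirty = false; };

  struct CachingClient {
    std::map<std::string, Block> cache;               // local block cache
    std::map<std::string, std::vector<std::string>> lock_covers;
                                                      // lock name -> block keys it protects
    std::map<std::string, std::string> block_server;  // stand-in for the remote block server

    void block_put(const std::string &key, const std::string &data) {
      block_server[key] = data;                       // would be a put() RPC in ccfs
    }
    void lock_release_rpc(const std::string &lockname) {
      std::cout << "RELEASE " << lockname << "\n";    // would be a RELEASE RPC in ccfs
    }

    // Invariant: only cache blocks whose lock you cache.  So on REVOKE:
    // (1) flush dirty blocks the lock protects, (2) drop them from the
    // cache, (3) only then tell the lock server RELEASE.
    void handle_revoke(const std::string &lockname) {
      for (const std::string &key : lock_covers[lockname]) {
        auto it = cache.find(key);
        if (it == cache.end()) continue;
        if (it->second.dirty)
          block_put(key, it->second.data);            // block server must have dirty data first
        cache.erase(it);                              // invalidate local copy
      }
      lock_release_rpc(lockname);                     // now the server may GRANT to the waiter
    }
  };

This ordering is exactly the #1 side of the sequencing trace above: put() the
dirty blocks to BS, then RELEASE to LS, so the next lock holder's get()s see
the latest data.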
look at Frangipani and Petal for a complete system
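For reference, a minimal sketch of the simple lock server's table logic
described earlier: one owner plus a FIFO queue of waiters per lock name, with
GRANT delivered as its own message rather than as the ACQUIRE reply. The class
shape and the send_grant stand-in are assumptions; the handout's lock server
differs in its RPC plumbing and in sending REVOKE to caching clients.

  #include <deque>
  #include <iostream>
  #include <map>
  #include <string>

  struct LockServer {
    struct Entry {
      std::string owner;                  // empty string means not owned
      std::deque<std::string> waiters;    // clients queued for this lock
    };
    std::map<std::string, Entry> table;   // indexed by lock name

    // Stand-in for the server -> client GRANT message.
    void send_grant(const std::string &client, const std::string &name) {
      std::cout << "GRANT " << name << " -> " << client << "\n";
    }

    // ACQUIRE handler: the RPC reply (not shown) only means "I hear you";
    // the lock itself arrives later via GRANT.
    void acquire(const std::string &client, const std::string &name) {
      Entry &e = table[name];
      if (e.owner.empty()) {
        e.owner = client;                 // not owned: grant immediately
        send_grant(client, name);
      } else {
        e.waiters.push_back(client);      // owned: queue the client
        // (with lock caching, the server would send REVOKE to e.owner here)
      }
    }

    // RELEASE handler: hand the lock to the next waiter, if any.
    void release(const std::string &name) {
      Entry &e = table[name];
      e.owner.clear();
      if (!e.waiters.empty()) {
        e.owner = e.waiters.front();
        e.waiters.pop_front();
        send_grant(e.owner, name);
      }
    }
  };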