A Coherent Distributed File Cache with Directory Write-Behind
Mann, Birrell, Hisgen, Jerian, Swart
1993, SRC TR and ToCS

Only read Sections 1 to 4.
Maybe should ask for 6.1 and 6.5/6.6?

context
  lots of workstations sharing a file server
  each workstation has a cache
  for performance, want to perform writes into local cache,
    send to server disk later
  this is "write-behind"
  trying to support sophisticated distributed apps (e.g. Vesta version control)

what semantics? [write on board]
  A. C1 writes, C2 sees results, write must be stable.
     (doesn't apply to apps on same client...)
  B. writes to a given object become stable in the order they are issued.
     (applies across clients...)
  C. writes are stable when forced by fsync().

what are objects and "writes"?
  whole file, by name (block writes, create, rename, &c)
    not e.g. individual file blocks
  whole directory (create, rename, &c on contents)

why are these semantics useful for applications?
  how can an application even tell if a write is stable?

Example, running on client C1:
  text editor wants to replace a file safely
  1. fd=create("f.new");
  2. write(fd, ...);
  3. rename("f.new", "f");
  how do the ordering semantics apply to this example?
    all the operations mention "f", so they are ordered by #B
    that is, if #3 is on server disk, so are #2 and #1
  what does echo guarantee in this example?
    f has either complete old contents, or complete new contents
    even if C1 crashes or loses its network
  so what might C2 see if C1 crashes? (or C1 after restart or reconnection)
    empty f.new, old f
    f.new with contents, old f
    new f

what about
  1. vi f.c
  2. cc -o f.o f.c
  if I crash just after #2, could another client see my new .o file
    but not see my changes to f.c?
  yes!
  fix with forder(f.c, f.o) before step 2
  compiler could do this automatically

if we just didn't have any cache at all, would we get the semantics?
  send writes to server disk when issued
  does this guarantee #A? #B? #C?
  so you could argue that these are natural semantics
  actually weaker than sync writes

does a UNIX local disk file system provide Echo's semantics?
  directory operations are ordered because of sync writes
  but create/write/rename doesn't work
    write may be delayed behind rename
    that is, there is write-behind for just data
  so create/write/fsync/rename!

does FSD provide Echo's semantics?
  yes for directory operations: append to log in order
  but file writes occur synchronously!
  FSD is opposite of UNIX/FFS!
    write-behind for metadata, sync data.
  Example:
    write(f2, "a");
    rename(f2, f3);
    write(f3, "b");
    might see file "f2" with contents "b"

why Echo's particular -> rules?
  some writes are ordered, but many aren't
  so forder() is often required,
    e.g. write files then rename directory
  why not order all writes by time?
  why not order none by default, require forder()?

why do apps need to know when semantics fail?

when do they notify process P?
  P issued a write but it can't be sent to the server.
  P has read a write that can't be sent to the server.
    by rule #A, must have been from this client.

can the crash+reboot of client C1 cause notifications on client C2?
  probably not: C2 can't have read any unstable writes.

what is the application supposed to do when notified?
  what does Echo tell the application?

how do we think Echo implements the semantics?
  clients have write-back caches
  server gives just one client a token saying it can modify a file
  if client C2 wants to read:
    asks server for token
    server asks C1 to write back and release token
    C1 keeps an ordered list of writes for each object
    C1 sends writes to server in order
      also writes to other files on which they depend
      show up as rename() or forder() in list
    C1 gives up token to server
    server sends token to C2
    C2 reads from the server

What are some driving applications?
  I.e. who cares about their ordering guarantees?
Does NFS have write-behind? Does FSD?
Why isn't NFS enough?
  Can you work around NFS problems?
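
Sketch of the create/write/fsync/rename workaround mentioned above for a UNIX local file system. This is a minimal illustration, not anything from the paper: the function name replace_file_safely is made up here, and the ".new" suffix just mirrors "f.new" in the editor example. It assumes the local file system makes rename atomic, as POSIX requires.

```python
import os

def replace_file_safely(path, data):
    """Replace `path` with `data` (bytes) via a temp file (hypothetical helper)."""
    tmp = path + ".new"                # same naming as "f.new" in the notes
    with open(tmp, "wb") as f:         # step 1: create
        f.write(data)                  # step 2: write
        f.flush()                      # push Python's buffer to the kernel
        os.fsync(f.fileno())           # force data to disk BEFORE the rename
    os.replace(tmp, path)              # step 3: atomic rename over the old name
```

Without the fsync, the data write could be delayed behind the rename, so a crash might leave "f" pointing at an empty or partial file. Making the rename itself durable may additionally need an fsync on the containing directory; that step is left out of this sketch.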
Why isn't FSD logging enough?
Why is it all so complex? Why can't it be localized?
What's the hard part?
  Order on a single machine is easy?
  Tracking causality through multiple workstations.
  Actually easy if no cache-cache transfers?
    Enough to just issue operations from each client in order?
    And to have coherent client caches?
    I.e. are there any system-wide issues?
  Hmm, they *don't* have to order all writes sent to server.
    If client C1 needs block B1, dirty in C2's cache,
    C2 needs only flush B1 and blocks it depends on.
    Not its whole cache/log prior to last B1 write.
Do you really need this kind of ordering?
Why the weird over-write rules?
  Orders appends, but not overwrites.
  Why not order all writes?
What if I read dirty file F1 from local cache, then write file F2?
  Because double-arrow only holds if single-arrow holds.
  and single-arrow only applies to writes to same file/directory.
When would programs encounter lost write-behind errors?
  Just workstation disconnected from server?
  Can't be caused by some other workstation crashing, right?
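
Sketch of the "flush only what it depends on" idea from the notes above. This is a toy model under loud assumptions, not Echo's actual data structures: the class ClientCache, its methods, and server_log are all invented for illustration, and dependencies are tracked per object rather than per write (coarser than the paper's rules, so it may flush more than strictly needed, never less).

```python
from collections import defaultdict

class ClientCache:
    """Toy model of one client's write-behind state (hypothetical)."""

    def __init__(self):
        self.pending = defaultdict(list)  # object -> unsent writes, in issue order
        self.deps = defaultdict(set)      # object -> objects to flush first
        self.server_log = []              # stands in for the server's disk

    def write(self, obj, data):
        # Write-behind: buffer locally instead of sending to the server.
        self.pending[obj].append(data)

    def forder(self, before, after):
        # Assumed meaning: writes to `before` must be stable before `after`'s.
        self.deps[after].add(before)

    def flush(self, obj, _seen=None):
        # Send obj's writes to the server, dependencies first.
        seen = set() if _seen is None else _seen
        if obj in seen:                   # guard against dependency cycles
            return
        seen.add(obj)
        for dep in sorted(self.deps.pop(obj, ())):
            self.flush(dep, seen)
        for data in self.pending.pop(obj, []):
            self.server_log.append((obj, data))  # per-object issue order kept


c = ClientCache()
c.write("f.c", "edit")
c.forder("f.c", "f.o")       # as in the vi/cc example
c.write("f.o", "object code")
c.write("unrelated", "x")
c.flush("f.o")               # e.g. another client asked for f.o's token
```

After the flush, server_log holds the f.c write followed by the f.o write, while the write to "unrelated" stays in the local cache. That is the point of the "Hmm" note: the client does not have to push its whole log to satisfy one token request.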