File systems

Required reading: soft updates.


A key problem in designing file systems is how to obtain good performance on file system operations while still providing consistency and stability of the file system.

The performance bottleneck in a file system is the disk. The bandwidth to a disk is reasonably high for large transfers, but latency is high, because of the cost of moving the disk arm(s). Thus, in general, we want to:

These techniques improve throughput. To improve latency, we count on caching, read-ahead, and writing asynchronously. Today we will focus on write operations (i.e., operations that modify data that needs to be written back at some point).

The naive solution is for the file system to delay writes as long as possible, because that results in big writes, high absorption, and many dirty blocks that can be written at the same time, leading to good disk scheduling. For example, the file system waits until all blocks in memory are dirty and it needs a new clean block. Then, it selects the oldest one and writes it to the disk.
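
The naive strategy can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any real file system's buffer cache: the class WriteBackCache, its capacity parameter, and the dictionary standing in for the disk are all hypothetical, and every cached block is treated as dirty.

```python
from collections import OrderedDict

class WriteBackCache:
    """Toy cache that delays each write until it needs room for a new block."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk               # dict standing in for the disk
        self.dirty = OrderedDict()     # block number -> contents, oldest first
        self.disk_writes = 0

    def write(self, blockno, data):
        if blockno not in self.dirty and len(self.dirty) >= self.capacity:
            self._flush_oldest()
        self.dirty[blockno] = data     # absorbed in memory, no disk I/O yet

    def _flush_oldest(self):
        # All cached blocks are dirty: write out the oldest to make room.
        blockno, data = self.dirty.popitem(last=False)
        self.disk[blockno] = data
        self.disk_writes += 1

disk = {}
cache = WriteBackCache(capacity=2, disk=disk)
for n in range(100):
    cache.write(7, f"v{n}")            # 100 updates to the same block
cache.write(8, "a")
cache.write(9, "b")                    # no room left: block 7 goes to disk
```

The 100 updates to block 7 cost a single disk write (absorption), but until that write happens none of them would survive a power failure.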

This strategy has two potential problems: (1) stability guarantees are weak; if the power fails before the system flushes a block, the modifications are lost; (2) consistency can be lost; if the oldest block, for example, contains an inode that points to a dirty data block, then after the write the disk will contain an inode pointing to old data blocks. When the file system recovers, the inode may be inconsistent with what is on disk (wrong data blocks, wrong size, etc.).

The solution to the first problem is pushed to the applications in UNIX. If an application desires that a file is stable, it must call fsync, or sync (to flush all dirty blocks). To keep the window of vulnerability small, the file system also flushes dirty blocks every 30 seconds.
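
From the application's side this looks roughly as follows. A minimal sketch in Python, whose os.fsync wraps the UNIX fsync call; the helper name write_stable and the file name are made up for illustration.

```python
import os
import tempfile

def write_stable(path, data):
    """Write data to path and return only after fsync has completed."""
    with open(path, "wb") as f:
        f.write(data)           # data may sit in user-space and kernel buffers
        f.flush()               # push Python's own buffer down to the kernel
        os.fsync(f.fileno())    # ask the kernel to flush this file's dirty blocks

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "journal.txt")
    write_stable(path, b"committed\n")
    with open(path, "rb") as f:
        readback = f.read()
```

Without the fsync call the write may return while the data is still only in memory, which is exactly the window of vulnerability described above.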

The second problem can be solved by ensuring that the file system respects dependencies between operations:

The paper calls these dependencies update dependencies.

This paper

The paper surveys some of the existing techniques and introduces a new one to achieve the goal of performance and integrity for write operations in the context of a UNIX file system.

Techniques possible:

Soft updates is the solution explored in this paper. It doesn't require NVRAM, is simpler than complete logging (e.g., no changes to on-disk data structures), and performs as well as the naive strategy of keeping all dirty blocks in main memory.

Soft updates is a sophisticated variant of flusher-enforced ordering. Instead of maintaining dependencies at the block level, it maintains dependencies at the file structure level (per inode, per directory, etc.), reducing circular dependencies. Furthermore, it breaks any remaining circular dependencies by undoing changes before writing the block and then redoing them to the block after writing. (See figure 2 for an example.)

Pseudocode for some of the file operations:

create (f) {
   allocate inode in block i  (assuming inode is available)
   add i to directory data block d  (assuming d has space)
   mark d as dependent on i, and create undo/redo record
   update directory inode in block di
   mark di as dependent on d
}

rename (from, to) {
   i = namei(from);
   increase refcnt in inode in block i
   add "to" directory data block td a reference to inode i
   mark td dependent on block i
   update directory inode "to" tdi
   mark tdi as dependent on td
   remove "from" directory data block fd a reference to inode i
   mark fd as dependent on tdi
   decrease refcnt in inode in block i
   mark block i as dependent on td 
   update directory inode in block fdi
   mark fdi as dependent on fd
}

Pseudocode for the flusher:

flushblock (b) {
  lock b;
  for all dependencies that b is relying on
    "remove" that dependency by undoing the change to b
    mark the dependency as "unrolled"
  write b
}

write_completed (b) {
  remove dependencies that depend on b
  reapply "unrolled" dependencies that b depended on
  unlock b
}
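
The flusher pseudocode can be turned into a small runnable sketch in Python. It assumes a toy model in which a block is a dictionary of fields and each dependency records one field's old and new value. The names Dep, Block, and needs are hypothetical, the write is synchronous (so locking is not modeled), and the two-block example is an assumed circular dependency, not a reproduction of figure 2.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Dep:
    key: str            # field of the owning block that was changed
    old: object         # value restored before an early write (undo)
    new: object         # value reapplied after the write (redo)
    needs: "Block"      # block that must reach the disk first
    unrolled: bool = False

@dataclass(eq=False)
class Block:
    name: str
    mem: dict = field(default_factory=dict)    # in-memory contents
    deps: list = field(default_factory=list)   # Deps this block relies on

def flushblock(b, disk, blocks):
    for dep in b.deps:               # undo changes that are not yet safe
        b.mem[dep.key] = dep.old
        dep.unrolled = True
    disk[b.name] = dict(b.mem)       # "write b" (synchronous in this toy)
    write_completed(b, blocks)

def write_completed(b, blocks):
    for other in blocks:             # b is on disk: drop dependencies on it
        other.deps = [dep for dep in other.deps if dep.needs is not b]
    for dep in b.deps:               # redo what was unrolled for the write
        if dep.unrolled:
            b.mem[dep.key] = dep.new
            dep.unrolled = False

i = Block("i", {"inode4": "in use", "inode5": "free"})   # inode block
d = Block("d", {"g": 4, "f": None})                      # directory block
disk = {"i": dict(i.mem), "d": dict(d.mem)}
blocks = [i, d]

# create f: the entry must not reach disk before the initialized inode
i.mem["inode5"] = "in use"
d.mem["f"] = 5
d.deps.append(Dep("f", None, 5, needs=i))

# remove g: the inode must not be freed on disk before the entry is gone
d.mem["g"] = None
i.mem["inode4"] = "free"
i.deps.append(Dep("inode4", "in use", "free", needs=d))

flushblock(d, disk, blocks)          # writes d without the new entry for f
after_first_write = dict(disk["d"])
flushblock(i, disk, blocks)          # i no longer waits on d
flushblock(d, disk, blocks)          # second write of d carries the entry
```

After the first write the disk holds a directory with neither entry and an inode 4 that is still marked in use: a leaked inode, but nothing that points at uninitialized data. Note that d is written twice, which is the cost of breaking the cycle by rollback.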

Apply flush algorithm to example in figure 2:

What needs to be done on recovery? (Inspect every statement in rename and see what inconsistencies could exist on the disk; e.g., the refcnt in an inode could be too high.) None of these inconsistencies requires fixing before the file system can operate; they can be fixed by a background file system repairer.

Paper discussion

Do soft updates perform any useless writes? (A useless write is a write that will be immediately overwritten.) (Answer: yes.) Fix the syncer to be careful about which block to start with. Fix cache replacement to select the LRU block with no pending dependencies.
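
The cache replacement fix might look like the following Python sketch. The function pick_victim and its two arguments are hypothetical: lru is assumed to list cached blocks from least to most recently used, and pending maps a block to the set of blocks it still waits on.

```python
from collections import OrderedDict

def pick_victim(lru, pending):
    """Return the least recently used block with no pending dependencies.

    Returns None when every cached block still waits on another block.
    """
    for name in lru:                  # iterates least recently used first
        if not pending.get(name):     # missing or empty set: safe to write
            return name
    return None

lru = OrderedDict([("d", "dir block"), ("i", "inode block"), ("x", "data")])
pending = {"d": {"i"}}                # d must not reach disk before i
victim = pick_victim(lru, pending)
```

Here plain LRU would pick d, which would have to be rolled back, written, and later written again; skipping to i avoids that useless write.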

Does FFS provide better stability than soft updates? (Answer: yes. FFS writes metadata updates synchronously. Before create() returns, the file has been created on disk!)

Can a log-structured file system implement rename better? (Answer: yes, since it can get the refcnts right).

Discuss all graphs.