6.824 Fall 2007 Paper Questions

Please remember to print your answer for each day's paper question on a sheet of paper and hand it in at the beginning of class.

An introduction to Programming with C# threads
Will your lock server for lab 1 use mutexes? Condition variables? Why or why not? Give a specific example in your answer.
Implementing remote procedure calls The RPC package for the 6.824 labs doesn't provide at-most-once semantics, unlike the one described in the paper (see page 49, last paragraph). How does this difference effect the implementation of your lock server for lab 1?
Dryad: Distributed data-parallel programs from sequential building blocks Suppose you want to implement a distributed program that counts word frequencies, taking as input a collection of files with words in them. Sketch how this program might be implemented using Dryad.
A distributed object model for the Java System Suppose that you re-implemented lab1 using using Java RMI. In what ways might your code become simpler compared to your C++ implementation? Give one or two specific examples.
Chord: a scalable peer-to-peer lookup service for Internet applications In chord a query takes log(n) hops to get from a source to the destination. Are these log(n) hops along the shortest path in the Internet from the source to the destination? Explain briefly why or why not.
Democratizing content publication with Coral The Coral paper mentions in Sections 2.3 and 4.2 that Coral's ``sloppy'' DHT can store multiple different values for each key. What would break, or not work as well, if Coral's DHT only stored one value for each key?
Incentives Build Robustness in BitTorrent and The BitTorrent P2P File-Sharing System: Measurements and Analysis Would the BitTorrent file distribution protocol achieve faster file download times for clients without the tit-for-tat protocol? Briefly explain your answer.
Memory Coherence in Shared Virtual Systems The write fault handler in Section 3.1 ends by sending a confirmation to the manager, and that the "Write server" code on the manager waits for this confirmation. Suppose you eliminated this confirmation (both the send and the wait) from the system. Describe a scenario in which lack of the confirmation would cause the system to behave incorrectly. You can assume that the network delivers all messages, and that none of the computers fail.
Distributed Shared Memory on Standard Workstations and Operating Systems Suppose that a simplified version of Treadmarks, called Dreadmarks, simply sent all modifications of variables between an acquire and a release to the next processor to acquire the same lock. Deadmarks doesn't otherwise communicate any changes to memory. Outline a specific simple situation in which Treadmarks would provide more useful or intuitive memory behavior than Dreadmarks.
File Synchronization with Vector Time Pairs Why does Tra have two vectors times (modification time and synchronization time)?
Reimplementing the Cedar File System Using Logging and Group Commit At the end of Section 4, the paper says that during a one-byte file create FSD writes the leader+data page synchronously to the disk, but records the update to the file name table in memory and only writes it back to disk later. Why do you suppose the FSD designers decided to write the data page synchronously? What (if anything) might go wrong if FSD instead wrote the file's data in the in-memory disk cache, and only wrote it to disk later?
Hypervisor-based Fault-tolerance Suppose the primary fails due to a power failure, the backup takes over, and then power to the primary is restored. You would like the primary to resume replicated execution so that the system could tolerate a subsequent failure of the backup. Is this likely to be easy with hypervisor-based fault-tolerance? Why or why not?
Guardians and Actions: Linguistic Support for Robust, Distributed Programs In Figure 4, read_mail deletes email from the mailbox. If new mail arrives just before the delete, will the email be deleted but not returned? Please briefly explain your answer.
Paxos Made Simple After running Paxos, do all participants agree on the same value? If so, briefly explain how this happens. If not, what do the participants agree on?
Frangipani: A Scalable Distributed File System Suppose a server modifies an i-node, appends the modification to its log, then another server modifies the same i-node, and then the first server crashes. The recovery system will see the i-node modification in the crashed server's log, but should not apply that log entry to the i-node, because that would un-do the second server's change. How does Frangipani avoid or cope with this situation?
Shark: Scaling File Servers via Cooperative Caching In Shark, how can a client be assured that when it downloads chunks of a file from different proxies, none of the proxies tampered with the contents of the file?
Secure Untrusted Data Repository (SUNDR) In the simple straw-man, both fetch and modify operations are placed in the log and signed. Suppose an alternate design that only signs and logs modify operations. Does this allow a malicious server to break fetch-modify consistency or fork consistency? Why or why not?
Practical Byzantine Fault Tolerance Compare and constrast SUNDR and BFT NFS. They both provide a secure network file systems. In what ways do their security properties differ?
Tor: The Second-Generation Onion Router Consider users that employ TOR to read their email via an unencrypted IMAP connection. Does TOR guarantee complete anonymity for these users? If so, please explain why. If not, what information could the attacker learn about the user, and how might it do so?
Dynamo: Amazon's Highly Available Key-value Store Suppose Ben Bitdiddle wants to use Dynamo for a service that requires extremely high levels of consistency and durability. To achieve this, he sets N to be large, and then sets W = R = N. Does this guarantee that he will never see divergent copies of an object? Why or why not?
Experiences with a Distributed, Scalable, Methodological File System: AnalogicFS In many ways, this experiences paper raises more questions than it answers. Please answer one of the following questions, taking into consideration the rich history of AnalogicFS and the spirit in which the paper was written:

a) The analysis of A* search shown in Figure 1 claims to be an introspective visualization of the AnalogicFS methodology; however, not all decisions are depicted in the figure. In particular, if I <= P, what should be the next node explored such that all assumptions in Section 2 still hold? Show your work.

b) Despite the authors' claims in the introduction that AnalogicFS was developed to study SCSI disks (and their interaction with lambda calculus), the experimental setup detailed in Section 4.1 involves decommissioned Gameboys instead, which use cartridge-based, Flash-like memory. If the authors had used actual SCSI disks during the experiments, how exactly might have their results changed quantitatively? (Note that Figure 2 actually shows the effective throughput, not the 10th-percentile.)

c) AnalogicFS shows rather unstable multicast algorithm popularity (Figure 5), especially compared with some of the previous systems we've read about in 6.824. Give an example of another system that would have a more steady measurement of popularity pages, especially in the range of 0.1-0.4 decibels of bandwidth. Explain in detail why this is the case.

d) For his 6.824 Lab 9 project, Ben Bitdiddle chose to extend YFS such that it faithfully emulates the constant expected seek time across LISP machines, as AnalogicFS does. Upon implementation, however, he immediately ran into the need to cap the extent size to 400 nm, rather than 676 nm. Explain what assumptions made for the AnalogicFS implementation do not hold true for YFS, and why that changes the maximum extent size.

Questions or comments regarding 6.824? Send e-mail to 6.824-staff@pdos.csail.mit.edu.

Top // 6.824 home //