2014

VerSum: Verifiable computations over large public logs Jelle van den Hooff, M. Frans Kaashoek, and Nickolai Zeldovich. CCS 2014.

VerSum allows lightweight clients to outsource expensive computations over large and frequently changing data structures, such as the Bitcoin or Namecoin blockchains, or a Certificate Transparency log. VerSum clients ensure that the output is correct by comparing the outputs from multiple servers. VerSum assumes that at least one server is honest, and crucially, when servers disagree, VerSum uses an efficient conflict resolution protocol to determine which server(s) made a mistake and thus obtain the correct output.

VerSum's contribution lies in achieving low server-side overhead for both incremental re-computation and conflict resolution, using three key ideas: (1) representing the computation as a functional program, which allows memoization of previous results; (2) recording the evaluation trace of the functional program in a carefully designed computation history to help clients determine which server made a mistake; and (3) introducing a new authenticated data structure for sequences, called SeqHash, that makes it efficient for servers to construct summaries of computation histories in the presence of incremental re-computation. Experimental results with an implementation of VerSum show that VerSum can be used for a variety of computations, that it can support many clients, and that it can easily keep up with Bitcoin's rate of new blocks with transactions.
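
To give a feel for why a tree-shaped summary helps conflict resolution, here is a minimal sketch of a Merkle-style sequence summary. This is not SeqHash's actual construction (SeqHash additionally handles arbitrary lengths and efficient incremental updates); the `combine` function is a toy stand-in for a cryptographic hash, and all names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a cryptographic hash: combines two child summaries.
 * Behaves injectively for the small inputs in this sketch. */
static uint64_t combine(uint64_t left, uint64_t right) {
    return left * 1000003u + right;
}

/* Summarize a power-of-two-length sequence of leaves, Merkle-style.
 * Equal sequences yield equal roots; any differing element changes
 * the root, and a client can descend the tree to locate it. */
static uint64_t seq_summary(const uint64_t *leaves, size_t n) {
    uint64_t level[8];
    for (size_t i = 0; i < n; i++)
        level[i] = leaves[i];
    while (n > 1) {
        for (size_t i = 0; i < n / 2; i++)
            level[i] = combine(level[2 * i], level[2 * i + 1]);
        n /= 2;
    }
    return level[0];
}
```

With such a summary, two servers publishing the same computation history produce the same root, so a client that sees matching roots need not fetch the histories; on a mismatch, it can compare subtree summaries to find the first point of divergence in a logarithmic number of steps.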

@inproceedings{versum:ccs14,
  title        = {{VerSum}: Verifiable computations over large public
    logs},
  author       = {Jelle van den Hooff and M. Frans Kaashoek and
    Nickolai Zeldovich},
  booktitle    = {Proceedings of the 21st {ACM} Conference on Computer
    and Communications Security ({CCS 2014})},
  year         = 2014,
  month        = nov,
  address      = {Scottsdale, AZ},
}
Identifying information disclosure in web applications with retroactive auditing Haogang Chen, Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. OSDI 2014.

Rail is a framework for building web applications that can precisely identify inappropriately disclosed data after a vulnerability is discovered. To do so, Rail introduces retroactive disclosure auditing: re-running the application with previous inputs once the vulnerability is fixed to determine what data should have been disclosed. A key challenge for Rail is to reconcile state divergence between the original and replay executions, so that the differences between executions precisely correspond to inappropriately disclosed data. Rail provides application developers with APIs to address this challenge, by identifying sensitive data, assigning semantic names to non-deterministic inputs, and tracking dependencies.

Results from a prototype of Rail built on top of the Meteor framework show that Rail can quickly and precisely identify data disclosure from complex attacks, including programming bugs, administrative mistakes, and stolen passwords. Rail incurs up to 22% throughput overhead and 0.5 KB storage overhead per request. Porting three existing web applications required fewer than 25 lines of code changes per application.

@inproceedings{rail:osdi14,
  title        = {Identifying information disclosure in web
    applications with retroactive auditing},
  author       = {Haogang Chen and Taesoo Kim and Xi Wang and Nickolai
    Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 11th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '14)},
  year         = 2014,
  month        = oct,
  address      = {Broomfield, Colorado},
}
Jitk: A trustworthy in-kernel interpreter infrastructure Xi Wang, David Lazar, Nickolai Zeldovich, Adam Chlipala, and Zachary Tatlock. OSDI 2014.

Modern operating systems run multiple interpreters in the kernel, which enable user-space applications to add new functionality or specialize system policies. The correctness of such interpreters is critical to the overall system security: bugs in interpreters could allow adversaries to compromise user-space applications and even the kernel.

Jitk is a new infrastructure for building in-kernel interpreters that guarantee functional correctness as they compile user-space policies down to native instructions for execution in the kernel. To demonstrate Jitk, we implement two interpreters in the Linux kernel, BPF and INET-DIAG, which are used for network and system call filtering and socket monitoring, respectively. To help application developers write correct filters, we introduce a high-level rule language, along with a proof that Jitk correctly translates high-level rules all the way to native machine code, and demonstrate that this language can be integrated into OpenSSH with tens of lines of code. We built a prototype of Jitk on top of the CompCert verified compiler and integrated it into the Linux kernel. Experimental results show that Jitk is practical, fast, and trustworthy.
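
As a rough illustration of the kind of code Jitk verifies, here is a minimal accumulator-based packet-filter interpreter. The opcodes and names are hypothetical (this is not real BPF, which Jitk compiles rather than interprets); the point is that details like the bounds checks below are exactly what a correctness proof must pin down, since a flawed check lets a malicious filter read outside the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter bytecode: a single-accumulator machine. */
enum { OP_LOAD, OP_EQ, OP_RET };

struct insn { uint8_t op; uint32_t k; };

/* Returns 1 to accept the packet, 0 to drop (or on any malformed program). */
static int run_filter(const struct insn *prog, size_t prog_len,
                      const uint8_t *pkt, size_t pkt_len) {
    uint32_t acc = 0;
    for (size_t pc = 0; pc < prog_len; pc++) {
        switch (prog[pc].op) {
        case OP_LOAD:                  /* acc = pkt[k], bounds-checked */
            if (prog[pc].k >= pkt_len)
                return 0;
            acc = pkt[prog[pc].k];
            break;
        case OP_EQ:                    /* acc = (acc == k) */
            acc = (acc == prog[pc].k);
            break;
        case OP_RET:                   /* accept iff acc != 0 */
            return acc != 0;
        }
    }
    return 0;                          /* fell off the end: drop */
}
```

A filter such as `{OP_LOAD, 0}, {OP_EQ, 0x45}, {OP_RET, 0}` accepts packets whose first byte is 0x45; Jitk's guarantee is that the native code emitted for such a filter behaves identically to the filter's specification.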

@inproceedings{jitk:osdi14,
  title        = {Jitk: A Trustworthy In-Kernel Interpreter
    Infrastructure},
  author       = {Xi Wang and David Lazar and Nickolai Zeldovich and
    Adam Chlipala and Zachary Tatlock},
  booktitle    = {Proceedings of the 11th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '14)},
  year         = 2014,
  month        = oct,
  address      = {Broomfield, Colorado},
}
Phase reconciliation for contended in-memory transactions Neha Narula, Cody Cutler, Eddie Kohler, and Robert Morris. OSDI 2014.

Multicore main-memory database performance can collapse when many transactions contend on the same data. Contending transactions are executed serially—either by locks or by optimistic concurrency control aborts—in order to ensure that they have serializable effects. This leaves many cores idle and performance poor. We introduce a new concurrency control technique, phase reconciliation, that solves this problem for many important workloads. Doppel, our phase reconciliation database, repeatedly cycles through joined, split, and reconciliation phases.

Joined phases use traditional concurrency control and allow any transaction to execute. When workload contention causes unnecessary serial execution, Doppel switches to a split phase. There, updates to contended items modify per-core state, and thus proceed in parallel on different cores. Not all transactions can execute in a split phase; for example, all modifications to a contended item must commute. A reconciliation phase merges these per-core states into the global store, producing a complete database ready for joined-phase transactions. A key aspect of this design is determining which items to split, and which operations to allow on split items.

Phase reconciliation helps most when there are many updates to a few popular database records. Its throughput is up to 38x higher than conventional concurrency control protocols on microbenchmarks, and up to 3x higher on a larger application, at the cost of increased latency for some transactions.
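
The split/reconcile cycle can be sketched as follows. This is a single-threaded simplification with made-up names, covering only a contended counter; it assumes, as the paper requires, that split-phase operations on the item (here, increments) commute.

```c
#include <assert.h>

#define NCORES 4

static long global_counter;            /* the database's copy of the record */
static long per_core_delta[NCORES];    /* split-phase state, one slot per core */

/* Split phase: an increment touches only the invoking core's slot, so
 * contending transactions proceed in parallel on different cores. */
static void split_add(int core, long n) {
    per_core_delta[core] += n;
}

/* Reconciliation phase: merge per-core state back into the global store,
 * producing a complete value ready for joined-phase transactions. */
static void reconcile(void) {
    for (int c = 0; c < NCORES; c++) {
        global_counter += per_core_delta[c];
        per_core_delta[c] = 0;
    }
}
```

Because addition commutes, the reconciled value is independent of how increments were distributed across cores, which is what makes it safe to defer the merge.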

@inproceedings{doppel:osdi14,
  title        = {Phase Reconciliation for Contended In-Memory
    Transactions},
  author       = {Neha Narula and Cody Cutler and Eddie Kohler and
    Robert Morris},
  booktitle    = {Proceedings of the 11th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '14)},
  year         = 2014,
  month        = oct,
  address      = {Broomfield, Colorado},
}
A differential approach to undefined behavior detection Xi Wang. Ph.D. thesis, Massachusetts Institute of Technology, September 2014.

This thesis studies undefined behavior arising in systems programming languages such as C/C++. Undefined behavior bugs lead to unpredictable and subtle systems behavior, and their effects can be further amplified by compiler optimizations. Undefined behavior bugs are present in many systems, including the Linux kernel and the Postgres database. The consequences range from incorrect functionality to missing security checks.

This thesis proposes a formal and practical approach, which finds undefined behavior bugs by finding “unstable code” in terms of optimizations that leverage undefined behavior. Using this approach, we introduce a new static checker called Stack that precisely identifies undefined behavior bugs. Applying Stack to widely used systems has uncovered 161 new bugs that have been confirmed and fixed by developers.

@phdthesis{xi-phd,
  title        = {A Differential Approach to Undefined Behavior
    Detection},
  author       = {Xi Wang},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = sep,
}
ScaleFS: A multicore-scalable file system Rasha Eqbal. Master's thesis, Massachusetts Institute of Technology, August 2014.

It is difficult to achieve durability and crash consistency in file systems along with multicore scalability. Commutative file system operations, which should scale according to the Scalable Commutativity Property, conflict on shared resources like coarse-grained locks and pages present in the page cache or buffer cache. Furthermore, data structures that are on separate cache lines in memory (e.g., directory entries) are grouped together when the file system writes them to disk for durability. This grouping results in additional conflicts.

This thesis introduces a new design approach that decouples the in-memory file system from the on-disk file system using per-core operation logs. This facilitates the use of highly concurrent data structures for the in-memory representation, which is essential for commutative operations to proceed conflict-free and hence scale perfectly. The in-memory representation does not propagate updates to the disk representation immediately; instead, it simply logs the operation in a per-core logical log. A sync or an fsync call processes these operations and applies them to the disk. Techniques based on timestamping the linearization points of file system operations ensure crash consistency, and dependency tracking ensures good disk performance.

A prototype file system, ScaleFS, implements this new approach and techniques. Experiments using Commuter and ScaleFS show that the implementation is conflict free for 99% of test cases involving commutative operations.

@mastersthesis{rashae-sm-thesis,
  title        = {{ScaleFS}: A Multicore-Scalable File System},
  author       = {Rasha Eqbal},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = aug,
}
Why does cryptographic software fail? A case study and open problems David Lazar, Haogang Chen, Xi Wang, and Nickolai Zeldovich. APSys 2014.

Mistakes in cryptographic software implementations often undermine the strong security guarantees offered by cryptography. This paper presents a systematic study of cryptographic vulnerabilities in practice, an examination of state-of-the-art techniques to prevent such vulnerabilities, and a discussion of open problems and possible future research directions. Our study covers 269 cryptographic vulnerabilities reported in the CVE database from January 2011 to May 2014. The results show that just 17% of the bugs are in cryptographic libraries (which often have devastating consequences), and the remaining 83% are misuses of cryptographic libraries by individual applications. We observe that preventing bugs in different parts of a system requires different techniques, and that no effective techniques exist to deal with certain classes of mistakes, such as weak key generation.

@inproceedings{cryptobugs:apsys14,
  title        = {Why does cryptographic software fail? {A} case study
    and open problems},
  author       = {David Lazar and Haogang Chen and Xi Wang and
    Nickolai Zeldovich},
  booktitle    = {Proceedings of 5th ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2014)},
  year         = 2014,
  month        = jun,
  address      = {Beijing, China},
}
Providing a shared file system in the Hare POSIX multikernel Charles Gruenwald III. Ph.D. thesis, Massachusetts Institute of Technology, June 2014.

Hare is a new multikernel operating system that provides a single system image for multicore processors without cache coherence. Hare allows applications on different cores to share files, directories, file descriptors, sockets, and processes. The main challenge in designing Hare is to support shared abstractions faithfully enough to run applications that run on traditional shared-memory operating systems with few modifications, and to do so while scaling with an increasing number of cores.

To achieve this goal, Hare must support shared abstractions (e.g., file descriptors shared between processes) that appear consistent to processes running on any core, but without relying on hardware cache coherence between cores. Moreover, Hare must implement these abstractions in a way that scales (e.g., directories sharded across servers to allow concurrent operations in that directory). Hare achieves this goal through a combination of new protocols (e.g., a three-phase commit protocol to implement directory operations correctly and scalably) and by leveraging properties of non-cache-coherent multiprocessors (e.g., atomic low-latency message delivery and shared DRAM).

An evaluation on a 40-core machine demonstrates that Hare can run many challenging Linux applications (including a mail server and a Linux kernel build) with minimal or no modifications. The results also show that these applications achieve good scalability on Hare, and that Hare's techniques are important to achieving scalability.

@phdthesis{charlesg3-phd,
  title        = {Providing a Shared File System in the {Hare} {POSIX}
    Multikernel},
  author       = {Charles Gruenwald III},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = jun,
}
The scalable commutativity rule: Designing scalable software for multicore processors Austin T. Clements. Ph.D. thesis, Massachusetts Institute of Technology, June 2014.

What fundamental opportunities for multicore scalability are latent in software interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions, this dissertation introduces the scalable commutativity rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building scalable multicore software starting with interface design and carrying on through implementation, testing, and evaluation.

This dissertation formalizes the scalable commutativity rule and defines a novel form of commutativity named SIM commutativity that makes it possible to fruitfully apply the rule to complex and highly stateful software interfaces.

To help developers apply the rule, this dissertation introduces an automated method embodied in a new tool named Commuter, which accepts high-level interface models, generates tests of operations that commute and hence could scale, and uses these tests to systematically evaluate the scalability of implementations. We apply Commuter to a model of 18 POSIX file and virtual memory system operations. Using the resulting 26,238 scalability tests, Commuter systematically pinpoints many problems in the Linux kernel that past work has observed to limit application scalability and identifies previously unknown bottlenecks that may be triggered by future hardware or workloads.

Finally, this dissertation applies the scalable commutativity rule and Commuter to the design and implementation of a new POSIX-like operating system named sv6. sv6's novel file and virtual memory system designs enable it to scale for 99% of the tests generated by Commuter. These results translate to linear scalability on an 80-core x86 machine for applications built on sv6's commutative operations.

@phdthesis{aclements-phd,
  title        = {The scalable commutativity rule: Designing scalable
    software for multicore processors},
  author       = {Austin T. Clements},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = jun,
}
Automatic intrusion recovery with system-wide history Taesoo Kim. Ph.D. thesis, Massachusetts Institute of Technology, June 2014.

Compromises of our computer systems are inevitable. New software vulnerabilities are discovered and exploited daily, but even if the software is bug-free, administrators may inadvertently make mistakes in configuring permissions, or unaware users may click on buttons in application installers with little understanding of the consequences. Unfortunately, recovering from those inevitable compromises leads to days and weeks of wasted effort by users or system administrators, with no conclusive guarantee that all traces of the attack have been cleaned up.

This dissertation presents an automatic recovery system that repairs a computer after an adversary compromises it, by undoing the adversary's changes while preserving legitimate user actions, with minimal user involvement. During normal operation, the system records an action history graph to describe the system's execution, enabling it to trace the adversary's changes and their effects. During repair, the system uses the action history graph to undo an unwanted action and its indirect effects by first rolling back its direct effects, and then re-executing legitimate actions that were influenced by that change. To minimize re-execution and user involvement, the system uses predicates to selectively re-execute only actions that were semantically affected by the adversary's changes, uses refinement to represent high-level semantics in the action history graph, and uses compensating actions to handle external effects.

An evaluation of a prototype of the system for Linux with 2 real-world attacks, 2 synthesized challenge attacks, and 6 attacks from previous work shows that it can handle a wide range of real attacks with minimal user involvement, and preserve users' changes by efficiently re-executing parts of an action history graph. These benefits come at the cost of 35–127% execution time overhead and 4–150 GB of log space per day, depending on the workload. For example, a HotCRP paper submission web site incurs a 35% slowdown and generates 4 GB of logs per day under the workload from 30 minutes prior to the SOSP 2007 deadline. We believe these overheads are acceptable in systems whose integrity is critical to their operations.

@phdthesis{taesoo-phd,
  title        = {Automatic intrusion recovery with system-wide
    history},
  author       = {Taesoo Kim},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = jun,
}
Scaling address-space operations on Linux with TSX Christopher R. Johnson. Master's thesis, Massachusetts Institute of Technology, June 2014.

Concurrent programming is important due to increasing core counts, but scalable concurrency control is difficult and error-prone to implement. Hardware Transactional Memory (HTM) addresses this problem by providing hardware support for concurrently executing arbitrary read-modify-write memory transactions. Intel released Transactional Synchronization eXtensions (TSX), an HTM implementation, in select processors to support scalable concurrency control.

This thesis contributes a case study in applying TSX to the Linux virtual memory system, which currently serializes address-space operations with a lock. TSX should provide scalability by supporting concurrent address-space operations. Achieving scalability with TSX, however, turned out to be difficult due to transactional aborts. This thesis details how to identify and resolve abort problems, and it describes the necessary modifications to make address-space operations scale in Linux.

This thesis also describes a new TLB shootdown algorithm, TxShootDown, which removes TLB shootdown from a transactional critical section while avoiding races due to concurrent address-space operations.

@mastersthesis{txvm:crjohns-thesis,
  title        = {Scaling Address-Space Operations on {Linux} with
    {TSX}},
  author       = {Christopher R. Johnson},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = jun,
}
Fast bug finding in lock-free data structures with CB-DPOR Jelle van den Hooff. Master's thesis, Massachusetts Institute of Technology, May 2014.

This thesis describes CB-DPOR, an algorithm for quickly finding bugs in lock-free data structures. CB-DPOR is a combination of the CHESS and DPOR model checking algorithms. CB-DPOR performs similarly to the concurrently developed preemption-bounded BPOR algorithm.

Codex is a tool for finding bugs in lock-free data structures. Codex implements CB-DPOR, and this thesis demonstrates how to use Codex to find bugs. This thesis describes new bugs in open-source lock-free data structures, and compares the performance of CB-DPOR with the earlier model checking algorithms CHESS, DPOR, and PCT. CB-DPOR finds bugs one to two orders of magnitude faster than the earlier algorithms.

@mastersthesis{codex:jelle-meng,
  title        = {Fast Bug Finding in Lock-Free Data Structures with
    {CB-DPOR}},
  author       = {Jelle van den Hooff},
  school       = {Massachusetts Institute of Technology},
  year         = 2014,
  month        = may,
}
Building web applications on top of encrypted data using Mylar Raluca Ada Popa, Emily Stark, Jonas Helfer, Steven Valdez, Nickolai Zeldovich, M. Frans Kaashoek, and Hari Balakrishnan. NSDI 2014.

Web applications rely on servers to store and process confidential information. However, anyone who gains access to the server (e.g., an attacker, a curious administrator, or a government) can obtain all of the data stored there. This paper presents Mylar, a platform for building web applications, which protects data confidentiality against attackers with full access to servers. Mylar stores sensitive data encrypted on the server, and decrypts that data only in users' browsers. Mylar addresses three challenges in making this approach work. First, Mylar allows the server to perform keyword search over encrypted documents, even if the documents are encrypted with different keys. Second, Mylar allows users to share keys and encrypted data securely in the presence of an active adversary. Finally, Mylar ensures that client-side application code is authentic, even if the server is malicious. Results with a prototype of Mylar built on top of the Meteor framework are promising: porting 6 applications required changing just 36 lines of code on average, and the performance overheads are modest, amounting to a 17% throughput loss and a 50 ms latency increase for sending a message in a chat application.

@inproceedings{mylar:nsdi14,
  title        = {Building web applications on top of encrypted data
    using {Mylar}},
  author       = {Raluca Ada Popa and Emily Stark and Jonas Helfer and
    Steven Valdez and Nickolai Zeldovich and M. Frans Kaashoek and
    Hari Balakrishnan},
  booktitle    = {Proceedings of the 11th {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '14)},
  year         = 2014,
  month        = apr,
  address      = {Seattle, WA},
}
Easy freshness with Pequod cache joins Bryan Kate, Eddie Kohler, Michael Kester, Neha Narula, Yandong Mao, and Robert Morris. NSDI 2014.

Pequod is a distributed application-level key-value cache that supports declaratively defined, incrementally maintained, dynamic, partially materialized views. These views, which we call cache joins, can simplify application development by shifting the burden of view maintenance onto the cache. Cache joins define relationships among key ranges; using cache joins, Pequod calculates views on demand, incrementally updates them as required, and in many cases improves performance by reducing client communication. To build Pequod, we had to design a view abstraction for volatile, relationless key-value caches and make it work across servers in a distributed system. Pequod performs as well as other in-memory key-value caches and, like those caches, outperforms databases with view support.

@inproceedings{pequod:nsdi2014,
  title        = {Easy Freshness with {Pequod} Cache Joins},
  author       = {Bryan Kate and Eddie Kohler and Michael Kester and
    Neha Narula and Yandong Mao and Robert Morris},
  booktitle    = {Proceedings of the 11th {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '14)},
  year         = 2014,
  month        = apr,
  address      = {Seattle, WA},
}

2013

Towards optimization-safe systems: Analyzing the impact of undefined behavior Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama. SOSP 2013.

This paper studies an emerging class of software bugs called optimization-unstable code: code that is unexpectedly discarded by compiler optimizations due to undefined behavior in the program. Unstable code is present in many systems, including the Linux kernel and the Postgres database. The consequences of unstable code range from incorrect functionality to missing security checks.

To reason about unstable code, this paper proposes a novel model, which views unstable code in terms of optimizations that leverage undefined behavior. Using this model, we introduce a new static checker called Stack that precisely identifies unstable code. Applying Stack to widely used systems has uncovered 160 new bugs that have been confirmed and fixed by developers.
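
A canonical instance of unstable code (illustrative, not drawn from a specific Stack-reported bug) is an overflow check that itself executes undefined behavior, shown here with a well-defined rewrite:

```c
#include <assert.h>
#include <limits.h>

/* Unstable: if x + len overflows, the signed addition is undefined
 * behavior, so an optimizer may assume it cannot happen and silently
 * delete the check:
 *
 *     if (x + len < x)    // may be discarded by the compiler
 *         return -1;
 *
 * Well-defined rewrite: compare against the limit before adding. */
static int would_overflow(int x, int len) {
    return len > 0 && x > INT_MAX - len;
}
```

The rewritten check never performs the overflowing addition, so no optimization premised on the absence of undefined behavior can remove it.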

@inproceedings{stack:sosp13,
  title        = {Towards Optimization-Safe Systems: Analyzing the
    Impact of Undefined Behavior},
  author       = {Xi Wang and Nickolai Zeldovich and M. Frans Kaashoek
    and Armando Solar-Lezama},
  booktitle    = {Proceedings of the 24th ACM Symposium on Operating
    Systems Principles (SOSP 2013)},
  year         = 2013,
  month        = nov,
  address      = {Farmington, Pennsylvania},
}
Asynchronous intrusion recovery for interconnected web services Ramesh Chandra, Taesoo Kim, and Nickolai Zeldovich. SOSP 2013.

Recovering from attacks in an interconnected system is difficult, because an adversary that gains access to one part of the system may propagate to many others, and tracking down and recovering from such an attack requires significant manual effort. Web services are an important example of an interconnected system, as they are increasingly using protocols such as OAuth and REST APIs to integrate with one another. This paper presents Aire, an intrusion recovery system for such web services. Aire addresses several challenges, such as propagating repair across services when some servers may be unavailable, and providing appropriate consistency guarantees when not all servers have been repaired yet. Experimental results show that Aire can recover from four realistic attacks, including one modeled after a recent Facebook OAuth vulnerability; that porting existing applications to Aire requires little effort; and that Aire imposes a 19–30% CPU overhead and 6–9 KB/request storage cost for Askbot, an existing web application.

@inproceedings{aire:sosp13,
  title        = {Asynchronous Intrusion Recovery for Interconnected
    Web Services},
  author       = {Ramesh Chandra and Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 24th ACM Symposium on Operating
    Systems Principles (SOSP 2013)},
  year         = 2013,
  month        = nov,
  address      = {Farmington, Pennsylvania},
}
The scalable commutativity rule: Designing scalable software for multicore processors Austin T. Clements, M. Frans Kaashoek, Nickolai Zeldovich, Robert T. Morris, and Eddie Kohler. SOSP 2013.

What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions, this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation.

To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.
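
The rule suggests a mechanical check, which Commuter automates over symbolic interface models: run a pair of operations in both orders from the same start state and compare the outcomes. A toy version of that check, with a made-up two-file model in place of Commuter's POSIX models:

```c
#include <assert.h>
#include <string.h>

/* Toy model state: two "files", each holding one value plus a length. */
struct state { int data[2]; int len[2]; };

/* Hypothetical model operation: overwrite a file with one value. */
static void op_write(struct state *s, int file, int b) {
    s->data[file] = b;
    s->len[file] = 1;
}

/* Commuter's core check, in miniature: apply the pair in both orders
 * from identical start states; identical outcomes mean the pair
 * commutes, and hence admits a conflict-free, scalable implementation. */
static int writes_commute(int f1, int b1, int f2, int b2) {
    struct state s1 = {{0, 0}, {0, 0}}, s2 = {{0, 0}, {0, 0}};
    op_write(&s1, f1, b1); op_write(&s1, f2, b2);   /* order A, then B */
    op_write(&s2, f2, b2); op_write(&s2, f1, b1);   /* order B, then A */
    return memcmp(&s1, &s2, sizeof s1) == 0;
}
```

Writes to different files commute, so by the rule they can scale; two writes to the same file do not, so any serializable implementation must communicate for that pair.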

@inproceedings{commutativity:sosp13,
  title        = {The Scalable Commutativity Rule: Designing Scalable
    Software for Multicore Processors},
  author       = {Austin T. Clements and M. Frans Kaashoek and
    Nickolai Zeldovich and Robert T. Morris and Eddie Kohler},
  booktitle    = {Proceedings of the 24th ACM Symposium on Operating
    Systems Principles (SOSP 2013)},
  year         = 2013,
  month        = nov,
  address      = {Farmington, Pennsylvania},
}
Optimizing communication bottlenecks in multiprocessor operating system kernels Silas Boyd-Wickizer. Ph.D. thesis, Massachusetts Institute of Technology, September 2013.

One difficulty of programming multicore processors is achieving performance that scales with the number of cores in the system. A common performance optimization is to increase inter-core parallelism. If the application is sufficiently parallelized, developers might hope that performance would scale as core count increases. Unfortunately, for some applications, such as operating system kernels, parallelization reveals inter-core communication as a performance bottleneck. When data is updated on one core and read or written on other cores, the cache coherence protocol serializes accesses to the data. The result is that each access to the shared data can take hundreds to thousands of cycles, depending on how many cores are contending for the data.

This dissertation focuses on optimizing communication bottlenecks caused by update-heavy workloads, where a data structure is frequently updated but rarely read. Such data structures are commonly used for operating system kernel bookkeeping, such as LRU lists, reverse maps in virtual memory, and file system notification queues. This dissertation identifies bottlenecks in the Linux kernel caused by update-heavy data structures, presents a general approach for optimizing communication in update-heavy data structures, and presents a library called OpLog that embodies this approach and helps developers achieve good scalability for update-heavy data structures. OpLog achieves scalability by logging update operations in per-core logs, and combining the logs only when required by a read to the data structure. Measurements on a 48-core AMD server show that applying OpLog to update-heavy data structures in the Linux kernel significantly improves application performance under certain workloads.
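
The OpLog pattern can be sketched as follows: a single-threaded simplification with made-up names, in which updates append to a core-local log with a timestamp and a (rare) read merges the logs. A non-commutative "set" operation is used here because it shows why the timestamps matter.

```c
#include <assert.h>

#define NCORES 4
#define LOGCAP 16

struct op { unsigned long ts; long val; };

static struct op logs[NCORES][LOGCAP];   /* one log per core */
static int log_len[NCORES];
static unsigned long clock_now;          /* global timestamp source */
static long object;                      /* the shared object's value */

/* Update path: append to the invoking core's log only. No shared cache
 * line is written, so update-heavy workloads scale across cores. */
static void oplog_set(int core, long val) {
    struct op e = { ++clock_now, val };
    logs[core][log_len[core]++] = e;
}

/* Read path: merge the per-core logs by timestamp and apply them.
 * "set" does not commute, so order matters: the latest set wins. */
static long oplog_read(void) {
    unsigned long best_ts = 0;
    for (int c = 0; c < NCORES; c++) {
        for (int i = 0; i < log_len[c]; i++)
            if (logs[c][i].ts > best_ts) {
                best_ts = logs[c][i].ts;
                object = logs[c][i].val;
            }
        log_len[c] = 0;                  /* logs are consumed by the read */
    }
    return object;
}
```

The cost of the merge is paid only on reads, which is exactly the right trade for the update-heavy bookkeeping structures (LRU lists, reverse maps, notification queues) the dissertation targets.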

@phdthesis{sbw-phd,
  title        = {Optimizing Communication Bottlenecks in
    Multiprocessor Operating System Kernels},
  author       = {Silas Boyd-Wickizer},
  school       = {Massachusetts Institute of Technology},
  year         = 2013,
  month        = sep,
}
Processing analytical queries over encrypted data Stephen Tu, M. Frans Kaashoek, Sam Madden, and Nickolai Zeldovich. VLDB 2013.

Monomi is a system for securely executing analytical workloads over sensitive data on an untrusted database server. Monomi works by encrypting the entire database and running queries over the encrypted data. Monomi introduces split client/server query execution, which can execute arbitrarily complex queries over encrypted data, as well as several techniques that improve performance for such workloads, including per-row precomputation, space-efficient encryption, grouped homomorphic addition, and pre-filtering. Since these optimizations are good for some queries but not others, Monomi introduces a designer for choosing an efficient physical design at the server for a given workload, and a planner to choose an efficient execution plan for a given query at runtime. A prototype of Monomi running on top of Postgres can execute most of the queries from the TPC-H benchmark with a median overhead of only 1.24× (ranging from 1.03× to 2.33×) compared to an unencrypted Postgres database where a compromised server would reveal all data.

@inproceedings{monomi:vldb13,
  title        = {Processing Analytical Queries over Encrypted Data},
  author       = {Stephen Tu and M. Frans Kaashoek and Sam Madden and
    Nickolai Zeldovich},
  booktitle    = {Proceedings of the 39th International Conference on
    Very Large Data Bases (VLDB 2013)},
  year         = 2013,
  month        = aug,
}
Security bugs in embedded interpreters Haogang Chen, Cody Cutler, Taesoo Kim, Yandong Mao, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. APSys 2013.

Because embedded interpreters offer flexibility and performance, they are becoming more prevalent, and can be found at nearly every level of the software stack. As one example, the Linux kernel defines languages to describe packet filtering rules and uses embedded interpreters to filter packets at run time. As another example, the RAR archive format allows embedding bytecode in compressed files to describe reversible transformations for decompression. This paper presents an analysis of common pitfalls in embedded interpreter implementations, which can lead to security vulnerabilities, and their impact. We hope that these results are useful both in augmenting existing embedded interpreters and in aiding developers in building new, more secure embedded interpreters.

@inproceedings{vm:apsys13,
  title        = {Security Bugs in Embedded Interpreters},
  author       = {Haogang Chen and Cody Cutler and Taesoo Kim and
    Yandong Mao and Xi Wang and Nickolai Zeldovich and M. Frans
    Kaashoek},
  booktitle    = {Proceedings of 4th ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2013)},
  year         = 2013,
  month        = jul,
  address      = {Singapore},
}
Optimizing RAM-latency dominated applications Yandong Mao, Cody Cutler, and Robert Morris. APSys 2013.

Many apparently CPU-limited programs are actually bottlenecked by RAM fetch latency, often because they follow pointer chains in working sets that are much bigger than the CPU’s on-chip cache. For example, garbage collectors that identify live objects by tracing inter-object pointers can spend much of their time stalling due to RAM fetches.

We observe that for such workloads, programmers should view RAM much as they view disk. The two situations share not just high access latency, but also a common set of approaches to coping with that latency. Relatively general-purpose techniques such as batching, sorting, and "I/O" concurrency work to hide RAM latency much as they do for disk.

This paper studies several RAM-latency dominated programs and shows how we apply general-purpose approaches to hide RAM latency. The evaluation shows that these optimizations improve performance by up to 1.4×. Counter-intuitively, even though these programs are not limited by CPU cycles, we found that adding more cores can yield better performance.

@inproceedings{ram-latency:apsys13,
  title        = {Optimizing {RAM}-latency Dominated Applications},
  author       = {Yandong Mao and Cody Cutler and Robert Morris},
  booktitle    = {Proceedings of 4th ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2013)},
  year         = 2013,
  month        = jul,
  address      = {Singapore},
}
Optimizing unit test execution in large software programs using dependency analysis Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. APSys 2013.

Tao is a system that optimizes the execution of unit tests in large software programs and reduces the programmer wait time from minutes to seconds. Tao is based on two key ideas: First, Tao focuses on efficiency, unlike past work that focused on avoiding false negatives. Tao implements simple and fast function-level dependency tracking that identifies tests to run on a code change; any false negatives missed by this dependency tracking are caught by running the entire test suite on a test server once the code change is committed. Second, to make it easy for programmers to adopt Tao, it incorporates the dependency information into the source code repository. This paper describes an early prototype of Tao and demonstrates that Tao can reduce unit test execution time in two large Python software projects by over 96% while incurring few false negatives.
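The selection step described above reduces to a set intersection: run a test only if the functions it executed last time overlap the changed functions. A hypothetical sketch in the spirit of Tao (the names and data layout are invented, not Tao's actual API):

```python
# Illustrative function-level test selection: each test is mapped to the
# set of functions it executed on its last run; a code change triggers
# only tests whose dependency set intersects the changed functions.
def select_tests(test_deps, changed_functions):
    """test_deps: test name -> set of functions it executed last run."""
    changed = set(changed_functions)
    return sorted(t for t, deps in test_deps.items() if deps & changed)

deps = {
    "test_login":  {"auth.check", "db.get_user"},
    "test_render": {"tmpl.render"},
    "test_signup": {"auth.create", "db.get_user"},
}
# A change to db.get_user triggers only the two tests that executed it.
print(select_tests(deps, ["db.get_user"]))  # ['test_login', 'test_signup']
```

Any test this intersection misses (a false negative) is caught later by the full suite run on the test server, which is what lets the tracking stay this simple and fast.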

@inproceedings{tao:apsys13,
  title        = {Optimizing Unit Test Execution in Large Software
    Programs using Dependency Analysis},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of 4th ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2013)},
  year         = 2013,
  month        = jul,
  address      = {Singapore},
}
Practical and effective sandboxing for non-root users Taesoo Kim and Nickolai Zeldovich. USENIX 2013.

Mbox is a lightweight sandboxing mechanism for non-root users in commodity OSes. Mbox's sandbox usage model executes a program in the sandbox and prevents the program from modifying the host filesystem by layering the sandbox filesystem on top of the host filesystem. At the end of program execution, the user can examine changes in the sandbox filesystem and selectively commit them back to the host filesystem. Mbox implements this by interposing on system calls and provides a variety of useful applications: installing system packages as a non-root user, running unknown binaries safely without network accesses, checkpointing the host filesystem instantly, and setting up a virtual development environment without special tools. Our performance evaluation shows that Mbox imposes CPU overheads of 0.1–45.2% for various workloads. In this paper, we present Mbox's design, efficient techniques for interposing on system calls, our experience avoiding common system call interposition pitfalls, and Mbox's performance evaluation.
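The layered-filesystem usage model can be modeled with two dictionaries; this is a toy illustration of the semantics only (Mbox itself implements this by interposing on system calls, and the class below is invented for exposition):

```python
# Toy model of Mbox's sandbox layering: writes land in a sandbox layer,
# reads prefer the layer over the host, and the user can later commit
# selected changes back to the host filesystem.
class Sandbox:
    def __init__(self, host):
        self.host = host      # host filesystem: path -> contents
        self.layer = {}       # sandbox layer holding all modifications

    def write(self, path, data):
        self.layer[path] = data            # the host is never touched

    def read(self, path):
        return self.layer.get(path, self.host.get(path))

    def changes(self):
        # What the user examines at the end of program execution.
        return {p: d for p, d in self.layer.items() if self.host.get(p) != d}

    def commit(self, paths):
        for p in paths:                    # selective commit back to the host
            self.host[p] = self.layer[p]
```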

@inproceedings{mbox:usenix13,
  title        = {Practical and Effective Sandboxing for Non-root
    Users},
  author       = {Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 2013 USENIX Annual Technical
    Conference (USENIX '13)},
  year         = 2013,
  month        = jun,
  address      = {San Jose, CA},
}
Graduate admissions at MIT & comparison-based rank aggregation: A case study Katherine Szeto. Master's thesis, Massachusetts Institute of Technology, June 2013.
@mastersthesis{szeto-meng,
  title        = {Graduate Admissions at {MIT} \& Comparison-based
    Rank Aggregation: A Case Study},
  author       = {Katherine Szeto},
  school       = {Massachusetts Institute of Technology},
  year         = 2013,
  month        = jun,
}
Finding linearization violations in lock-free concurrent data structures Sebastien Dabdoub. Master's thesis, Massachusetts Institute of Technology, May 2013.

Finding bugs in lock-free concurrent programs is hard. This is due in part to the difficulty of reasoning about the correctness of concurrent algorithms and the timing-sensitive nature of concurrent programs. One of the most widely used tools for reasoning about the correctness of concurrent algorithms is the linearization property. This thesis presents a tool for automatic dynamic checking of concurrent programs under the Total-Store-Order (TSO) memory model and a methodology for finding linearization violations automatically with the tool.

@mastersthesis{codex:sdabdoub-meng,
  title        = {Finding Linearization Violations in Lock-Free
    Concurrent Data Structures},
  author       = {Sebastien Dabdoub},
  school       = {Massachusetts Institute of Technology},
  year         = 2013,
  month        = may,
}
RadixVM: Scalable address spaces for multithreaded applications (revised 2014-08-05) Austin T. Clements, M. Frans Kaashoek, and Nickolai Zeldovich. EuroSys 2013.

RadixVM is a new virtual memory system design that enables fully concurrent operations on shared address spaces for multithreaded processes on cache-coherent multicore computers. Today, most operating systems serialize operations such as mmap and munmap, which forces application developers to split their multithreaded applications into multiprocess applications, hoard memory to avoid the overhead of returning it, and so on. RadixVM removes this burden from application developers by ensuring that address space operations on non-overlapping memory regions scale perfectly. It does so by combining three techniques: 1) it organizes metadata in a radix tree instead of a balanced tree to avoid unnecessary cache line movement; 2) it uses a novel memory-efficient distributed reference counting scheme; and 3) it uses a new scheme to target remote TLB shootdowns and to often avoid them altogether. Experiments on an 80 core machine show that RadixVM achieves perfect scalability for non-overlapping regions: if several threads mmap or munmap pages in parallel, they can run completely independently and induce no cache coherence traffic.
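The first of the three techniques, the radix tree, has a property worth spelling out: a mapping's metadata lives at a position fixed by its page number, so operations on disjoint regions walk disjoint nodes and share no cache lines. A hedged sketch under assumed parameters (the fanout, level count, and helper names below are invented for illustration):

```python
# Illustrative fixed-position radix tree over page numbers: each 6-bit
# chunk of the page number selects a slot, so lookups on non-overlapping
# regions touch disjoint nodes.
FANOUT = 64  # slots per node (6 bits); the real tree sizes this from the VA width

def insert(root, pageno, levels, mapping):
    node = root
    for level in reversed(range(1, levels)):
        slot = (pageno >> (6 * level)) % FANOUT
        node = node.setdefault(slot, {})   # allocate the path on demand
    node[pageno % FANOUT] = mapping

def lookup(root, pageno, levels):
    node = root
    for level in reversed(range(1, levels)):
        node = node.get((pageno >> (6 * level)) % FANOUT)
        if node is None:
            return None
    return node.get(pageno % FANOUT)
```

Unlike a balanced tree, inserting a new mapping never rebalances, so it never moves other mappings' cache lines, which is what makes the perfect scalability result possible.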

@inproceedings{radixvm:eurosys13,
  title        = {{RadixVM}: Scalable address spaces for multithreaded
    applications (revised 2014-08-05)},
  author       = {Austin T. Clements and M. Frans Kaashoek and
    Nickolai Zeldovich},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2013)},
  year         = 2013,
  month        = apr,
  address      = {Prague, Czech Republic},
}

2012

Improving integer security for systems with KINT Xi Wang, Haogang Chen, Zhihao Jia, Nickolai Zeldovich, and M. Frans Kaashoek. OSDI 2012.
@inproceedings{kint:osdi12,
  title        = {Improving Integer Security for Systems with {KINT}},
  author       = {Xi Wang and Haogang Chen and Zhihao Jia and Nickolai
    Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 10th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '12)},
  year         = 2012,
  month        = oct,
  address      = {Hollywood, California},
}
Efficient patch-based auditing for web application vulnerabilities Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. OSDI 2012.

Poirot is a system that, given a patch for a newly discovered security vulnerability in a web application, helps administrators detect past intrusions that exploited the vulnerability. Poirot records all requests to the server during normal operation, and given a patch, re-executes requests using both patched and unpatched software, and reports to the administrator any request that executes differently in the two cases. A key challenge with this approach is the cost of re-executing all requests, and Poirot introduces several techniques to reduce the time required to audit past requests, including filtering requests based on their control flow and memoization of intermediate results across different requests.

A prototype of Poirot for PHP accurately detects attacks on older versions of MediaWiki and HotCRP, given subsequently released patches. Poirot's techniques allow it to audit past requests 12–51× faster than the time it took to originally execute the same requests, for patches to code executed by every request, under a realistic MediaWiki workload.
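The audit loop itself is simple; the papers' contribution is making it fast. A minimal sketch of the replay-and-compare idea with memoization across identical requests (the handler functions and names here are toys, not Poirot's PHP machinery):

```python
# Illustrative patch-based auditing: replay each recorded request through
# the unpatched and patched handlers and flag any request that diverges.
def audit(requests, unpatched, patched):
    suspicious = []
    memo = {}  # memoize results for identical requests across the log
    for req in requests:
        if req not in memo:
            memo[req] = (unpatched(req), patched(req))
        old, new = memo[req]
        if old != new:
            suspicious.append(req)
    return suspicious

# Toy "vulnerability": the unpatched handler echoes script tags verbatim.
unpatched = lambda q: q
patched = lambda q: q.replace("<script>", "")
log = ["hello", "<script>steal()", "hello"]
print(audit(log, unpatched, patched))  # ['<script>steal()']
```

A request that executes identically under both versions cannot have exploited the patched vulnerability, which is why divergence is the detection signal.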

@inproceedings{poirot:osdi12,
  title        = {Efficient Patch-based Auditing for Web Application
    Vulnerabilities},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 10th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '12)},
  year         = 2012,
  month        = oct,
  address      = {Hollywood, California},
}
Undefined behavior: What happened to my code? Xi Wang, Haogang Chen, Alvin Cheung, Zhihao Jia, Nickolai Zeldovich, and M. Frans Kaashoek. APSys 2012.
@inproceedings{ub:apsys12,
  title        = {Undefined Behavior: What Happened to My Code?},
  author       = {Xi Wang and Haogang Chen and Alvin Cheung and Zhihao
    Jia and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of 3rd ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2012)},
  year         = 2012,
  month        = jul,
  address      = {Seoul, South Korea},
}
Recovering from intrusions in distributed systems with Dare Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. APSys 2012.

Dare is a system that recovers system integrity after intrusions that spread between machines in a distributed system. Dare extends the rollback-and-reexecute recovery model of Retro to distributed system recovery by solving the following challenges: tracking dependencies across machines, repairing network connections, minimizing distributed repair, and dealing with long-running daemon processes. This paper describes an early prototype of Dare, presents some preliminary results, and discusses open problems.

@inproceedings{dare:apsys12,
  title        = {Recovering from Intrusions in Distributed Systems
    with {Dare}},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of 3rd ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2012)},
  year         = 2012,
  month        = jul,
  address      = {Seoul, South Korea},
}
Non-scalable locks are dangerous Silas Boyd-Wickizer, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. Linux Symposium 2012.
@inproceedings{locks:linuxsymp,
  title        = {Non-scalable locks are dangerous},
  author       = {Silas Boyd-Wickizer and M. Frans Kaashoek and Robert
    Morris and Nickolai Zeldovich},
  booktitle    = {Proceedings of the Linux Symposium},
  year         = 2012,
  month        = jul,
  address      = {Ottawa, Canada},
}
Executing web application queries on a partitioned database Neha Narula and Robert Morris. USENIX Webapps 2012.

Partitioning data over multiple storage servers is an attractive way to increase throughput for web-like workloads. However, there is often no one partitioning that yields good performance for all queries, and it can be challenging for the web developer to determine how best to execute queries over partitioned data.

This paper presents Dixie, a SQL query planner, optimizer, and executor for databases horizontally partitioned over multiple servers. Dixie focuses on increasing inter-query parallel speedup by involving as few servers as possible in each query. One way it does this is by supporting tables with multiple copies partitioned on different columns, in order to expand the set of queries that can be satisfied from a single server. Dixie automatically transforms SQL queries to execute over a partitioned database, using a cost model and plan generator that exploit multiple table copies.
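The role of multiple table copies can be sketched in miniature: if some copy is partitioned on a column the query constrains with equality, one server suffices; otherwise the query must be broadcast. This is a hypothetical illustration, not Dixie's planner (all names are invented, and the real cost model weighs far more than this):

```python
# Illustrative single-server routing over multiple table copies.
def plan(copies, predicates, nservers):
    """copies: partitioning column -> name of that table copy."""
    for col, value in predicates.items():
        if col in copies:
            # A copy is partitioned on this predicate's column, so the one
            # partition holding `value` can answer the query alone.
            return copies[col], [hash(value) % nservers]
    # No copy matches any equality predicate: broadcast to every server.
    return next(iter(copies.values())), list(range(nservers))

copies = {"id": "users_by_id", "name": "users_by_name"}
table, servers = plan(copies, {"name": "alice"}, 10)   # one server
table2, servers2 = plan(copies, {"age": 30}, 10)       # must broadcast
```

Keeping each query on one server is what drives the inter-query parallel speedup: ten single-server queries can run on ten servers at once, while ten broadcast queries all contend everywhere.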

We evaluate Dixie on a database and query stream taken from Wikipedia, partitioned across ten MySQL servers. By adding one copy of a 13 MB table and using Dixie's query optimizer, we achieve a throughput improvement of 3.2× over a single optimized partitioning of each table and 8.5× over the same data on a single server. On specific queries Dixie with table copies increases throughput linearly with the number of servers, while the best single-table-copy partitioning achieves little scaling. For a large class of joins, which traditional wisdom suggests requires tables partitioned on the join keys, Dixie can find higher-performance plans using other partitionings.

@inproceedings{dixie:usenixwebapps2012,
  title        = {Executing Web Application Queries on a Partitioned
    Database},
  author       = {Neha Narula and Robert Morris},
  booktitle    = {Proceedings of the 3rd USENIX Conference on Web
    Application Development (USENIX Webapps '12)},
  year         = 2012,
  month        = jun,
  address      = {Boston, Massachusetts},
}
Improving network connection locality on multicore systems Aleksey Pesterev, Jacob Strauss, Nickolai Zeldovich, and Robert T. Morris. EuroSys 2012.

Incoming and outgoing processing for a given TCP connection often execute on different cores: an incoming packet is typically processed on the core that receives the interrupt, while outgoing data processing occurs on the core running the relevant user code. As a result, accesses to read/write connection state (such as TCP control blocks) often involve cache invalidations and data movement between cores' caches. These can take hundreds of processor cycles, enough to significantly reduce performance.

We present a new design, called Affinity-Accept, that causes all processing for a given TCP connection to occur on the same core. Affinity-Accept arranges for the network interface to determine the core on which application processing for each new connection occurs, in a lightweight way; it adjusts the card's choices only in response to imbalances in CPU scheduling. Measurements show that for the Apache web server serving static files on a 48-core AMD system, Affinity-Accept reduces time spent in the TCP stack by 30% and improves overall throughput by 24%.

@inproceedings{affinity-accept:eurosys12,
  title        = {Improving Network Connection Locality on Multicore
    Systems},
  author       = {Aleksey Pesterev and Jacob Strauss and Nickolai
    Zeldovich and Robert T. Morris},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2012)},
  year         = 2012,
  month        = apr,
  address      = {Bern, Switzerland},
}
Cache craftiness for fast multicore key-value storage Yandong Mao, Eddie Kohler, and Robert Morris. EuroSys 2012.
@inproceedings{masstree:eurosys12,
  title        = {Cache Craftiness for Fast Multicore Key-Value
    Storage},
  author       = {Yandong Mao and Eddie Kohler and Robert Morris},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2012)},
  year         = 2012,
  month        = apr,
  address      = {Bern, Switzerland},
}
UFlood: High-throughput flooding over wireless mesh networks Jayashree Subramanian, Robert Morris, and Hari Balakrishnan. Infocom 2012.

This paper proposes UFlood, a flooding protocol for wireless mesh networks. UFlood targets situations such as software updates where all nodes need to receive the same large file of data, and where limited radio range requires forwarding. UFlood's goals are high throughput and low airtime, defined respectively as rate of completion of a flood to the slowest receiving node and total time spent transmitting. The key to achieving these goals is good choice of sender for each transmission opportunity. The best choice evolves as a flood proceeds in ways that are difficult to predict.

UFlood's core new idea is a distributed heuristic to dynamically choose the senders likely to lead to all nodes receiving the flooded data in the least time. The mechanism takes into account which data nearby receivers already have as well as inter-node channel quality. The mechanism includes a novel bit-rate selection algorithm that trades off the speed of high bit-rates against the larger number of nodes likely to receive low bit-rates. Unusually, UFlood uses both random network coding to increase the usefulness of each transmission and detailed feedback about what data each receiver already has; the feedback is critical in deciding which node's coded transmission will have the most benefit to receivers. The required feedback is potentially voluminous, but UFlood includes novel techniques to reduce its cost.

The paper presents an evaluation on a 25-node 802.11 test-bed. UFlood achieves 150% higher throughput than MORE, a high-throughput flooding protocol, using 65% less airtime. UFlood uses 54% less airtime than MNP, an existing efficient protocol, and achieves 300% higher throughput.

@inproceedings{uflood:infocom12,
  title        = {{UFlood}: High-Throughput Flooding over Wireless
    Mesh Networks},
  author       = {Jayashree Subramanian and Robert Morris and Hari
    Balakrishnan},
  booktitle    = {Proceedings of the 31st Infocom},
  year         = 2012,
  month        = mar,
  address      = {Orlando, FL},
}
Scalable address spaces using RCU balanced trees Austin T. Clements, Frans Kaashoek, and Nickolai Zeldovich. ASPLOS 2012.

Software developers commonly exploit multicore processors by building multithreaded software in which all threads of an application share a single address space. This shared address space has a cost: kernel virtual memory operations such as handling soft page faults, growing the address space, mapping files, etc. can limit the scalability of these applications. In widely-used operating systems, all of these operations are synchronized by a single per-process lock. This paper contributes a new design for increasing the concurrency of kernel operations on a shared address space by exploiting read-copy-update (RCU) so that soft page faults can both run in parallel with operations that mutate the same address space and avoid contending with other page faults on shared cache lines. To enable such parallelism, this paper also introduces an RCU-based binary balanced tree for storing memory mappings. An experimental evaluation using three multithreaded applications shows performance improvements on 80 cores ranging from 1.7× to 3.4× for an implementation of this design in the Linux 2.6.37 kernel. The RCU-based binary tree enables soft page faults to run at a constant cost with an increasing number of cores, suggesting that the design will scale well beyond 80 cores.

@inproceedings{rcuvm:asplos12,
  title        = {Scalable Address Spaces Using {RCU} Balanced Trees},
  author       = {Austin T. Clements and Frans Kaashoek and Nickolai
    Zeldovich},
  booktitle    = {Proceedings of the 17th International Conference on
    Architectural Support for Programming Languages and Operating
    Systems ({ASPLOS})},
  year         = 2012,
  month        = mar,
  address      = {London, UK},
}

2011

CPHash: a cache-partitioned hash table Zviad Metreveli, Nickolai Zeldovich, and Frans Kaashoek. MIT CSAIL technical report, November 2011.
@techreport{cphash:tr,
  title        = {{CPHash}: a cache-partitioned hash table},
  author       = {Zviad Metreveli and Nickolai Zeldovich and Frans
    Kaashoek},
  number       = {MIT-CSAIL-TR-2011-051},
  year         = 2011,
  month        = nov,
  institution  = {{MIT} Computer Science and Artificial Intelligence
    Laboratory},
}
Software fault isolation with API integrity and multi-principal modules Yandong Mao, Haogang Chen, Dong Zhou, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. SOSP 2011.
@inproceedings{lxfi:sosp11,
  title        = {Software fault isolation with {API} integrity and
    multi-principal modules},
  author       = {Yandong Mao and Haogang Chen and Dong Zhou and Xi
    Wang and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 23rd ACM Symposium on Operating
    Systems Principles (SOSP 2011)},
  year         = 2011,
  month        = oct,
  address      = {Cascais, Portugal},
}
Intrusion recovery for database-backed web applications Ramesh Chandra, Taesoo Kim, Meelap Shah, Neha Narula, and Nickolai Zeldovich. SOSP 2011.

Warp is a system that helps users and administrators of web applications recover from intrusions such as SQL injection, cross-site scripting, and clickjacking attacks, while preserving legitimate user changes. Warp repairs from an intrusion by rolling back parts of the database to a version before the attack, and replaying subsequent legitimate actions. Warp allows administrators to retroactively patch security vulnerabilities – i.e., apply new security patches to past executions – to recover from intrusions without requiring the administrator to track down or even detect attacks. Warp's time-travel database allows fine-grained rollback of database rows, and enables repair to proceed concurrently with normal operation of a web application. Finally, Warp captures and replays user input at the level of a browser's DOM, to recover from attacks that involve a user's browser. For a web server running MediaWiki, Warp requires no application source code changes to recover from a range of common web application vulnerabilities with minimal user input at a cost of 24–27% in throughput and 2–3.2 GB/day in storage.
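The time-travel database's fine-grained rollback can be modeled with per-row version lists; this toy class is invented for illustration and omits everything that makes Warp's version practical (concurrent repair, dependency tracking, persistence):

```python
# Toy time-travel table: every write appends a version, so repair can
# revert an individual row to its pre-attack state while other rows
# keep their current values.
class TimeTravelTable:
    def __init__(self):
        self.versions = {}   # row id -> list of (timestamp, value)

    def write(self, row, ts, value):
        self.versions.setdefault(row, []).append((ts, value))

    def read(self, row, ts=None):
        hist = self.versions.get(row, [])
        if ts is not None:
            hist = [(t, v) for t, v in hist if t <= ts]
        return hist[-1][1] if hist else None

    def rollback(self, row, ts):
        # Fine-grained repair: discard this row's versions written after
        # `ts`, leaving every other row untouched.
        hist = self.versions.get(row, [])
        self.versions[row] = [(t, v) for t, v in hist if t <= ts]
```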

@inproceedings{warp:sosp11,
  title        = {Intrusion Recovery for Database-backed Web
    Applications},
  author       = {Ramesh Chandra and Taesoo Kim and Meelap Shah and
    Neha Narula and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 23rd ACM Symposium on Operating
    Systems Principles (SOSP 2011)},
  year         = 2011,
  month        = oct,
  address      = {Cascais, Portugal},
}
Linux kernel vulnerabilities: State-of-the-art defenses and open problems Haogang Chen, Yandong Mao, Xi Wang, Dong Zhou, Nickolai Zeldovich, and M. Frans Kaashoek. APSys 2011.
@inproceedings{vulnerabilities:chen,
  title        = {Linux kernel vulnerabilities: State-of-the-art
    defenses and open problems},
  author       = {Haogang Chen and Yandong Mao and Xi Wang and Dong
    Zhou and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of 2nd ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2011)},
  year         = 2011,
  month        = jul,
  address      = {Shanghai, China},
}
Retroactive auditing Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. APSys 2011.
@inproceedings{rad:apsys11,
  title        = {Retroactive Auditing},
  author       = {Xi Wang and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of 2nd ACM SIGOPS Asia-Pacific Workshop
    on Systems (APSys 2011)},
  year         = 2011,
  month        = jul,
  address      = {Shanghai, China},
}
Eyo: Device-transparent personal storage Jacob Strauss, Justin Mazzola Paluska, Chris Lesniewski-Laas, Bryan Ford, Robert Morris, and Frans Kaashoek. USENIX 2011.

Users increasingly store data collections such as digital photographs on multiple personal devices, each of which typically offers a storage management interface oblivious to the contents of the user's other devices. As a result, collections become disorganized and drift out of sync. This paper presents Eyo, a novel personal storage system that provides device transparency: a user can think in terms of "file X", rather than "file X on device Y", and will see the same set of files on all personal devices. Eyo allows a user to view and manage the entire collection of objects from any of their devices, even from disconnected devices and devices with too little storage to hold all the object content. Eyo synchronizes these collections across any network topology, including direct peer-to-peer links. Eyo provides applications with a storage API with first-class access to object version history in order to resolve update conflicts automatically. Experiments with several applications using Eyo (media players, a photo editor, a podcast manager, and an email interface) show that device transparency requires only minor application changes, and matches the storage and bandwidth capabilities of typical portable devices.

@inproceedings{eyo:usenix11,
  title        = {Eyo: Device-Transparent Personal Storage},
  author       = {Jacob Strauss and Justin Mazzola Paluska and Chris
    Lesniewski-Laas and Bryan Ford and Robert Morris and Frans
    Kaashoek},
  booktitle    = {Proceedings of the 2011 USENIX Annual Technical
    Conference (USENIX '11)},
  year         = 2011,
  month        = jun,
  address      = {Portland, Oregon},
}
A software approach to unifying multicore caches Silas Boyd-Wickizer, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. MIT CSAIL technical report, June 2011.
@techreport{mfc:tr11,
  title        = {A Software Approach to Unifying Multicore Caches},
  author       = {Silas Boyd-Wickizer and M. Frans Kaashoek and Robert
    Morris and Nickolai Zeldovich},
  institution  = {{MIT} Computer Science and Artificial Intelligence
    Laboratory},
  number       = {MIT-CSAIL-TR-2011-032},
  year         = 2011,
  month        = jun,
}
CPHash: A cache-partitioned hash table with LRU eviction Zviad Metreveli. Master's thesis, Massachusetts Institute of Technology, June 2011.
In this thesis we introduce CPHash – a scalable fixed-size hash table that supports eviction using an LRU list, and CPServer – a scalable in-memory key/value cache server that uses CPHash to implement its hash table. CPHash uses computation migration to avoid transferring data between cores. Experiments on a 48-core machine show that CPHash has 2 to 3 times higher throughput than a hash table implemented using scalable fine-grained locks. CPServer achieves 1.2 to 1.7 times higher throughput than a key/value cache server that uses a hash table with scalable fine-grained locks and 1.5 to 2.6 times higher throughput than Memcached.
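The partitioning behind computation migration can be sketched briefly. In CPHash an operation is shipped as a message to the partition's designated server core so the data never leaves that core's cache; the toy below keeps only the key-to-partition assignment (the class is invented, and the message queues and LRU lists are omitted):

```python
# Toy sketch of CPHash's partitioning: the table is split into per-core
# partitions, and every operation on a key goes to that key's home
# partition rather than taking a lock on shared state.
class PartitionedTable:
    def __init__(self, ncores):
        self.parts = [dict() for _ in range(ncores)]  # one partition per core

    def home(self, key):
        # The core whose cache holds this key's partition.
        return hash(key) % len(self.parts)

    def put(self, key, value):
        # In CPHash this would be a message to the home core's queue;
        # here we index the right partition directly.
        self.parts[self.home(key)][key] = value

    def get(self, key):
        return self.parts[self.home(key)].get(key)
```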
@mastersthesis{cphash:zviad-meng-thesis,
  title        = {{CPHash}: A Cache-Partitioned Hash Table with {LRU}
    Eviction},
  author       = {Zviad Metreveli},
  school       = {Massachusetts Institute of Technology},
  year         = 2011,
  month        = jun,
}

2010

Intrusion recovery using selective re-execution Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. OSDI 2010.

Retro repairs a desktop or server after an adversary compromises it, by undoing the adversary's changes while preserving legitimate user actions, with minimal user involvement. During normal operation, Retro records an action history graph, which is a detailed dependency graph describing the system's execution. Retro uses refinement to describe graph objects and actions at multiple levels of abstraction, which allows for precise dependencies. During repair, Retro uses the action history graph to undo an unwanted action and its indirect effects by first rolling back its direct effects, and then re-executing legitimate actions that were influenced by that change. To minimize user involvement and re-execution, Retro uses predicates to selectively re-execute only actions that were semantically affected by the adversary's changes, and uses compensating actions to handle external effects.

An evaluation of a prototype of Retro for Linux with 2 real-world attacks, 2 synthesized challenge attacks, and 6 attacks from previous work, shows that Retro can often repair the system without user involvement, and avoids false positives and negatives from previous solutions. These benefits come at the cost of 35–127% in execution time overhead and of 4–150 GB of log space per day, depending on the workload. For example, a HotCRP paper submission web site incurs 35% slowdown and generates 4 GB of logs per day under the workload from 30 minutes prior to the SOSP 2007 deadline.
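The rollback-and-reexecute loop can be illustrated at a very high level. This sketch is not Retro's design (it ignores refinement, compensating actions, and the graph representation, and all names are invented); it shows only how taint flows through reads and writes so that unaffected legitimate actions are skipped:

```python
# Highly simplified selective re-execution: undo the attack's direct
# writes, then re-run, in order, only later actions whose reads were
# tainted by those writes.
def repair(state, checkpoint, history, tainted_name, tainted_writes):
    for obj in tainted_writes:
        state[obj] = checkpoint.get(obj)   # roll back direct effects
    affected = set(tainted_writes)
    for name, reads, writes, redo in history:
        if name == tainted_name:
            continue                       # the unwanted action is dropped
        if reads & affected:               # predicate: semantically affected?
            affected |= writes
            redo(state)                    # selectively re-execute
    return state

checkpoint = {"passwd": "root:x", "log": ""}
history = [
    ("attack", set(), {"passwd"}, None),
    ("sync", {"passwd"}, {"shadow"},
     lambda s: s.update(shadow=s["passwd"].upper())),
    ("note", {"log"}, {"log"}, lambda s: s.update(log=s["log"] + "ok")),
]
# State after the attack and both legitimate actions ran:
state = {"passwd": "root::", "shadow": "ROOT::", "log": "ok"}
repair(state, checkpoint, history, "attack", {"passwd"})
# "sync" read tainted data and was re-run; the unrelated "note" was not.
```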

@inproceedings{retro:osdi10,
  title        = {Intrusion recovery using selective re-execution},
  author       = {Taesoo Kim and Xi Wang and Nickolai Zeldovich and M.
    Frans Kaashoek},
  booktitle    = {Proceedings of the 9th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '10)},
  year         = 2010,
  month        = oct,
  address      = {Vancouver, Canada},
}
An analysis of Linux scalability to many cores Silas Boyd-Wickizer, Austin T. Clements, Yandong Mao, Aleksey Pesterev, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. OSDI 2010.

This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques—this paper introduces one new technique, sloppy counters—these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required a total of 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.
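The sloppy-counter technique introduced in the paper can be caricatured in a few lines. This sketch keeps only the per-core split (the kernel version also lets cores borrow and return spare references against a central counter, which is omitted here):

```python
# Minimal per-core counter in the spirit of sloppy counters: increments
# touch only a core-local value, and the exact total is computed only on
# the rare occasions it is needed.
class SloppyCounter:
    def __init__(self, ncores):
        self.local = [0] * ncores

    def inc(self, core, n=1):
        # Common case: touch only this core's cache line.
        self.local[core] += n

    def value(self):
        # Rare case: reconcile by summing every core's count.
        return sum(self.local)
```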

@inproceedings{linux:osdi10,
  title        = {An Analysis of {Linux} Scalability to Many Cores},
  author       = {Silas Boyd-Wickizer and Austin T. Clements and
    Yandong Mao and Aleksey Pesterev and M. Frans Kaashoek and Robert
    Morris and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 9th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '10)},
  year         = 2010,
  month        = oct,
  address      = {Vancouver, Canada},
}
Design and applications of a secure and decentralized distributed hash table Christopher T. Lesniewski-Laas. Ph.D. thesis, Massachusetts Institute of Technology, September 2010.
@phdthesis{ctl-phd,
  title        = {Design and Applications of a Secure and
    Decentralized Distributed Hash Table},
  author       = {Christopher T. Lesniewski-Laas},
  school       = {Massachusetts Institute of Technology},
  year         = 2010,
  month        = sep,
}
Device-transparent personal storage Jacob Strauss. Ph.D. thesis, Massachusetts Institute of Technology, September 2010.
@phdthesis{jastr-phd,
  title        = {Device-Transparent Personal Storage},
  author       = {Jacob Strauss},
  school       = {Massachusetts Institute of Technology},
  year         = 2010,
  month        = sep,
}
Making Linux protection mechanisms egalitarian with UserFS Taesoo Kim, and Nickolai Zeldovich. USENIX Security 2010.

UserFS provides egalitarian OS protection mechanisms in Linux. UserFS allows any user—not just the system administrator—to allocate Unix user IDs, to use chroot, and to set up firewall rules in order to confine untrusted code. One key idea in UserFS is representing user IDs as files in a /proc-like file system, thus allowing applications to manage user IDs like any other files, by setting permissions and passing file descriptors over Unix domain sockets. UserFS addresses several challenges in making user IDs egalitarian, including accountability, resource allocation, persistence, and UID reuse. We have ported several applications to take advantage of UserFS; by changing just tens to hundreds of lines of code, we prevented attackers from exploiting application-level vulnerabilities, such as code injection or missing ACL checks in a PHP-based wiki application. Implementing UserFS requires minimal changes to the Linux kernel (a single 3,000-line kernel module) and incurs no performance overhead for most operations, making it practical to deploy on real systems.
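A toy model of the "user IDs as files" idea: allocating a UID creates a file, and holding a descriptor for that file stands in for the right to use the UID, so UIDs can be delegated like any other file descriptor. All names below are invented; UserFS itself is a kernel module exposing a /proc-like file system, not this user-space simulation:

```python
import os
import tempfile

class UidFs:
    """Simulated UID allocator over a plain directory: one file per UID.
    Access to the file (permission bits, or an fd passed over a Unix
    domain socket) models the right to use the UID."""
    def __init__(self, root, first_uid=100000):
        self.root = root
        self.next_uid = first_uid

    def alloc(self, mode=0o600):
        uid = self.next_uid
        self.next_uid += 1
        path = os.path.join(self.root, str(uid))
        fd = os.open(path, os.O_CREAT | os.O_RDWR, mode)
        return uid, fd

root = tempfile.mkdtemp()
fs = UidFs(root)
uid1, fd1 = fs.alloc()   # e.g. hand fd1 to a sandboxed child process
uid2, fd2 = fs.alloc()
```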

@inproceedings{userfs:sec19,
  title        = {Making {Linux} Protection Mechanisms Egalitarian
    with {UserFS}},
  author       = {Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 19th {USENIX} {S}ecurity
    {S}ymposium},
  year         = 2010,
  month        = aug,
  address      = {Washington, D.C.},
}
Tolerating malicious device drivers in Linux Silas Boyd-Wickizer, and Nickolai Zeldovich. USENIX 2010.
@inproceedings{sud:usenix10,
  title        = {Tolerating Malicious Device Drivers in {Linux}},
  author       = {Silas Boyd-Wickizer and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 2010 USENIX Annual Technical
    Conference (USENIX '10)},
  year         = 2010,
  month        = jun,
  address      = {Boston, Massachusetts},
}
WhanauSIP: A secure peer-to-peer communication platform Raymond Chen. Master's thesis, Massachusetts Institute of Technology, June 2010.
This thesis presents a novel mechanism for achieving secure and reliable peer-to-peer communications on the Internet. WhanauSIP merges a Sybil-proof distributed hash table with mature SIP technology to enable instant messaging, audio chat, and video conferencing that is resilient to censoring, eavesdropping, and forgery. Performance and security evaluations performed on the PlanetLab network demonstrate that the majority of resource lookups return within 5 seconds. These results indicate that WhanauSIP delivers practical performance with respect to call session initialization latency for VoIP telephony. Furthermore, the tests demonstrated that lookup performance was minimally affected during a Sybil cluster ID attack, illustrating the network's resilience to malicious adversaries. The thesis delivers three software packages for public use: a general Whanau distributed hash table implementation, a WhanauSIP gateway, and a desktop IM/VoIP client.
@mastersthesis{whanuasip:raymond-meng-thesis,
  title        = {{WhanauSIP}: A Secure Peer-to-Peer Communication
    Platform},
  author       = {Raymond Chen},
  school       = {Massachusetts Institute of Technology},
  year         = 2010,
  month        = jun,
}
Optimizing MapReduce for multicore architectures Yandong Mao, Robert Morris, and Frans Kaashoek. MIT CSAIL technical report, May 2010.
@techreport{metis:tr,
  title        = {Optimizing {MapReduce} for Multicore Architectures},
  author       = {Yandong Mao and Robert Morris and Frans Kaashoek},
  number       = {MIT-CSAIL-TR-2010-020},
  year         = 2010,
  month        = may,
  institution  = {{MIT} Computer Science and Artificial Intelligence
    Laboratory},
}
Whānau: A Sybil-proof distributed hash table Chris Lesniewski-Laas, and M. Frans Kaashoek. NSDI 2010.

Whānau is a novel routing protocol for distributed hash tables (DHTs) that is efficient and strongly resistant to the Sybil attack. Whānau uses the social connections between users to build routing tables that enable Sybil-resistant lookups. The number of Sybils in the social network does not affect the protocol's performance, but links between honest users and Sybils do. When there are n well-connected honest nodes, Whānau can tolerate up to O(n/log n) such "attack edges". This means that an adversary must convince a large fraction of the honest users to make a social connection with the adversary's Sybils before any lookups will fail.

Whānau uses ideas from structured DHTs to build routing tables that contain O(√n log n) entries per node. It introduces the idea of layered identifiers to counter clustering attacks, a class of Sybil attacks challenging for previous DHTs to handle. Using the constructed tables, lookups provably take constant time. Simulation results, using social network graphs from LiveJournal, Flickr, YouTube, and DBLP, confirm the analytic results. Experimental results on PlanetLab confirm that the protocol can handle modest churn.

@inproceedings{whanau:nsdi10,
  title        = {Wh\=anau: A {Sybil}-Proof Distributed Hash Table},
  author       = {Chris Lesniewski-Laas and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 7th {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '10)},
  year         = 2010,
  month        = apr,
  address      = {San Jose, CA},
}
Locating cache performance bottlenecks using data profiling Aleksey Pesterev, Nickolai Zeldovich, and Robert T. Morris. EuroSys 2010.

Effective use of CPU data caches is critical to good performance, but poor cache use patterns are often hard to spot using existing execution profiling tools. Typical profilers attribute costs to specific code locations. The costs due to frequent cache misses on a given piece of data, however, may be spread over instructions throughout the application. The resulting individually small costs at a large number of instructions can easily appear insignificant in a code profiler's output.

DProf helps programmers understand cache miss costs by attributing misses to data types instead of code. Associating cache misses with data helps programmers locate data structures that experience misses in many places in the application's code. DProf introduces a number of new views of cache miss data, including a data profile, which reports the data types with the most cache misses, and a data flow graph, which summarizes how objects of a given type are accessed throughout their lifetime, and which accesses incur expensive cross-CPU cache loads. We present two case studies of using DProf to find and fix cache performance bottlenecks in Linux. The fixes provide a 16–57% throughput improvement on a range of memcached and Apache workloads.
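The data-profile view can be approximated in a few lines, assuming some mapping from miss addresses to owning types (DProf builds such a mapping from allocation records; the mapping and type names below are invented for the sketch):

```python
from collections import Counter

def data_profile(miss_samples, type_of_addr):
    """Attribute each cache-miss sample to the data type owning the
    missed address, then rank types by total misses."""
    counts = Counter(type_of_addr(addr) for addr in miss_samples)
    return counts.most_common()

# Hypothetical address layout for the sketch: one region per type.
def type_of_addr(addr):
    return "sk_buff" if addr < 0x2000 else "socket"
```

Ranking by type surfaces a structure whose misses are scattered across many instructions, exactly the case a code-location profiler hides.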

@inproceedings{dprof:eurosys10,
  title        = {Locating Cache Performance Bottlenecks Using Data
    Profiling},
  author       = {Aleksey Pesterev and Nickolai Zeldovich and Robert
    T. Morris},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2010)},
  year         = 2010,
  month        = apr,
  address      = {Paris, France},
}

2009

Device transparency: a new model for mobile storage Jacob Strauss, Chris Lesniewski-Laas, Justin Mazzola Paluska, Bryan Ford, Robert Morris, and Frans Kaashoek. HotStorage 2009.

This paper proposes a new storage model, device transparency, in which users view and manage their entire data collection from any of their devices, even from disconnected storage-limited devices holding only a subset of the entire collection.

@inproceedings{eyo:hotstorage09,
  title        = {Device Transparency: a New Model for Mobile Storage},
  author       = {Jacob Strauss and Chris Lesniewski-Laas and Justin
    Mazzola Paluska and Bryan Ford and Robert Morris and Frans
    Kaashoek},
  booktitle    = {Proceedings of the {W}orkshop on {H}ot {T}opics in
    {S}torage and {F}ile {S}ystems ({HotStorage} '09)},
  year         = 2009,
  month        = oct,
  address      = {Big Sky, Montana},
}
Improving application security with data flow assertions Alexander Yip, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. SOSP 2009.

Resin is a new language runtime that helps prevent security vulnerabilities, by allowing programmers to specify application-level data flow assertions. Resin provides policy objects, which programmers use to specify assertion code and metadata; data tracking, which allows programmers to associate assertions with application data, and to keep track of assertions as the data flow through the application; and filter objects, which programmers use to define data flow boundaries at which assertions are checked. Resin's runtime checks data flow assertions by propagating policy objects along with data, as that data moves through the application, and then invoking filter objects when data crosses a data flow boundary, such as when writing data to the network or a file.

Using Resin, Web application programmers can prevent a range of problems, from SQL injection and cross-site scripting, to inadvertent password disclosure and missing access control checks. Adding a Resin assertion to an application requires few changes to the existing application code, and an assertion can reuse existing code and data structures. For instance, 23 lines of code detect and prevent three previously-unknown missing access control vulnerabilities in phpBB, a popular Web forum application. Other assertions comprising tens of lines of code prevent a range of vulnerabilities in Python and PHP applications. A prototype of Resin incurs a 33% CPU overhead running the HotCRP conference management application.
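The three mechanisms described above (policy objects, data tracking, filter objects) can be modeled in miniature. This is an illustrative sketch, not Resin's API; all class and function names are invented:

```python
class PasswordPolicy:
    """Policy object: assertion code plus metadata, checked when the
    data it guards reaches a data flow boundary."""
    def __init__(self, owner):
        self.owner = owner
    def check_export(self, context):
        if context.get("user") != self.owner:
            raise PermissionError("secret may not flow to this user")

class Tracked(str):
    """Data tracking: a string that carries policy objects and
    propagates them when combined with other data."""
    def __new__(cls, s, policies=()):
        obj = super().__new__(cls, s)
        obj.policies = list(policies)
        return obj
    def __add__(self, other):
        merged = self.policies + list(getattr(other, "policies", []))
        return Tracked(str(self) + str(other), merged)

def output_filter(data, context):
    """Filter object at a boundary (e.g. a network or file write):
    invoke every policy carried by the data before letting it out."""
    for policy in getattr(data, "policies", []):
        policy.check_export(context)
    return str(data)
```

The assertion travels with the data, so the check fires at the boundary no matter which code path carried the secret there.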

@inproceedings{resin:sosp09,
  title        = {Improving Application Security with Data Flow
    Assertions},
  author       = {Alexander Yip and Xi Wang and Nickolai Zeldovich and
    M. Frans Kaashoek},
  booktitle    = {Proceedings of the 22nd {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '09)},
  year         = 2009,
  month        = oct,
  address      = {Big Sky, Montana},
}
Whanaungatanga: Sybil-proof routing with social networks Chris Lesniewski-Laas, and M. Frans Kaashoek. MIT CSAIL technical report, September 2009.
@techreport{whanau:tr09,
  title        = {Whanaungatanga: {Sybil}-proof routing with social
    networks},
  author       = {Chris Lesniewski-Laas and M. Frans Kaashoek},
  institution  = {{MIT} Computer Science and Artificial Intelligence
    Laboratory},
  number       = {MIT-CSAIL-TR-2009-045},
  year         = 2009,
  month        = sep,
}
Flexible, wide-area storage for distributed systems using semantic cues Jeremy Stribling. Ph.D. thesis, Massachusetts Institute of Technology, September 2009.
@phdthesis{strib-phd,
  title        = {Flexible, Wide-Area Storage for Distributed Systems
    Using Semantic Cues},
  author       = {Jeremy Stribling},
  school       = {Massachusetts Institute of Technology},
  year         = 2009,
  month        = sep,
}
Improving web site security with data flow management Alexander Yip. Ph.D. thesis, Massachusetts Institute of Technology, September 2009.
@phdthesis{yipal-phd,
  title        = {Improving Web Site Security with Data Flow
    Management},
  author       = {Alexander Yip},
  school       = {Massachusetts Institute of Technology},
  year         = 2009,
  month        = sep,
}
Reinventing scheduling for multicore systems Silas Boyd-Wickizer, Robert Morris, and M. Frans Kaashoek. HotOS 2009.
@inproceedings{o2:hotos09,
  title        = {Reinventing Scheduling for Multicore Systems},
  author       = {Silas Boyd-Wickizer and Robert Morris and M. Frans
    Kaashoek},
  booktitle    = {Proceedings of the 12th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-XII})},
  year         = 2009,
  month        = may,
  address      = {Monte Verit{\`a}, Switzerland},
}
Efficient file distribution in a flexible, wide-area file system Irene Zhang. Master's thesis, Massachusetts Institute of Technology, May 2009.

WheelFS is a wide-area distributed file system designed to help applications cope with the challenges of sharing data over the wide-area network. A wide range of applications can use WheelFS as a storage layer because applications can control various trade-offs in WheelFS, such as consistency versus availability, using semantic cues. One key feature that many applications require from any storage system is efficient file distribution. The storage system needs to be able to serve files quickly, even large or popular ones, and allow users and applications to quickly browse files. Wide-area links with high latency and low throughput make achieving these goals difficult for most distributed storage systems.

This thesis explores using prefetching, a traditional file system optimization technique, in wide-area file systems for more efficient file distribution. This thesis focuses on Tread, a prefetcher for WheelFS. Tread includes several types of prefetching to improve the performance of reading files and directories in WheelFS: read-ahead prefetching, whole file prefetching, directory prefetching, and a prefetching optimization for WheelFS's built-in cooperative caching. To make the best use of scarce wide-area resources, Tread adaptively rate-limits prefetching and gives applications control over what and how prefetching is done using WheelFS's semantic cues.

Experiments show that Tread can reduce the time to read a 10MB file in WheelFS by 40% and the time to list a directory with 100 entries by more than 80%. In addition, experiments on PlanetLab show that using prefetching with cooperative caching to distribute a 10MB file to 270 clients reduces the average latency for each client to read the file by almost 45%.
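A toy model of adaptive read-ahead in the spirit of Tread: widen the prefetch window while the read pattern stays sequential, reset it otherwise. Class and parameter names are invented; the real prefetcher also rate-limits against wide-area capacity and honors semantic cues:

```python
class ReadAheadPrefetcher:
    """Fetch blocks ahead of a sequential reader, doubling the window
    while the pattern stays sequential (a crude stand-in for adaptive
    rate limiting)."""
    def __init__(self, fetch, max_window=8):
        self.fetch = fetch      # fetch(block) -> data, e.g. a wide-area read
        self.max_window = max_window
        self.window = 1
        self.last = None
        self.cache = {}

    def _ensure(self, block):
        if block not in self.cache:
            self.cache[block] = self.fetch(block)

    def read(self, block):
        self._ensure(block)
        sequential = self.last is not None and block == self.last + 1
        self.window = min(self.window * 2, self.max_window) if sequential else 1
        for b in range(block + 1, block + 1 + self.window):
            self._ensure(b)     # prefetch ahead of the application
        self.last = block
        return self.cache[block]
```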

@mastersthesis{wheelfs:irene-meng,
  title        = {Efficient File Distribution in a Flexible, Wide-area
    File System},
  author       = {Irene Zhang},
  school       = {Massachusetts Institute of Technology},
  year         = 2009,
  month        = may,
}
Securing wide-area storage in WheelFS Xavid Pretzer. Master's thesis, Massachusetts Institute of Technology, May 2009.
@mastersthesis{wheelfs:xavid-thesis,
  title        = {Securing Wide-area Storage in {WheelFS}},
  author       = {Xavid Pretzer},
  school       = {Massachusetts Institute of Technology},
  year         = 2009,
  month        = may,
}
Flexible, wide-area storage for distributed systems with WheelFS Jeremy Stribling, Yair Sovran, Irene Zhang, Xavid Pretzer, Jinyang Li, M. Frans Kaashoek, and Robert Morris. NSDI 2009.

WheelFS is a wide-area distributed storage system intended to help multi-site applications share data and gain fault tolerance. WheelFS takes the form of a distributed file system with a familiar POSIX interface. Its design allows applications to adjust the tradeoff between prompt visibility of updates from other sites and the ability for sites to operate independently despite failures and long delays. WheelFS allows these adjustments via semantic cues, which provide application control over consistency, failure handling, and file and replica placement.

WheelFS is implemented as a user-level file system and is deployed on PlanetLab and Emulab. Three applications (a distributed Web cache, an email service and large file distribution) demonstrate that WheelFS's file system interface simplifies construction of distributed applications by allowing reuse of existing software. These applications would perform poorly with the strict semantics implied by a traditional file system interface, but by providing cues to WheelFS they are able to achieve good performance. Measurements show that applications built on WheelFS deliver comparable performance to services such as CoralCDN and BitTorrent that use specialized wide-area storage systems.
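As a rough illustration of the cue mechanism, cue components embedded in a pathname can be separated from the ordinary path. The cue syntax and names below are illustrative, not WheelFS's exact set:

```python
def parse_cues(path, known_cues=("EventualConsistency", "MaxTime",
                                 "Site", "RepLevel")):
    """Split a path into (cues, plain_path). Components like
    '.EventualConsistency' are boolean cues; '.MaxTime=250' carries a
    parameter. Everything else is an ordinary path component."""
    cues, parts = {}, []
    for comp in path.split("/"):
        if comp.startswith(".") and "=" in comp:
            key, value = comp[1:].split("=", 1)
            cues[key] = value
        elif comp.startswith(".") and comp[1:] in known_cues:
            cues[comp[1:]] = True
        else:
            parts.append(comp)
    return cues, "/".join(parts)
```

Encoding cues in the path lets unmodified POSIX applications select consistency and placement behavior simply by the names they open.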

@inproceedings{wheelfs:nsdi09,
  title        = {Flexible, Wide-Area Storage for Distributed Systems
    with {WheelFS}},
  author       = {Jeremy Stribling and Yair Sovran and Irene Zhang and
    Xavid Pretzer and Jinyang Li and M. Frans Kaashoek and Robert
    Morris},
  booktitle    = {Proceedings of the 6th {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '09)},
  year         = 2009,
  month        = apr,
  address      = {Boston, MA},
}
Privacy-preserving browser-side scripting with BFlow Alexander Yip, Neha Narula, Maxwell Krohn, and Robert Morris. EuroSys 2009.

Some web sites provide interactive extensions using browser scripts, often without inspecting the scripts to verify that they are benign and bug-free. Others handle users' confidential data and display it via the browser. Such new features contribute to the power of online services, but their combination would allow attackers to steal confidential data. This paper presents BFlow, a security system that uses information flow control to allow the combination while preventing attacks on data confidentiality.

BFlow allows untrusted JavaScript to compute with, render, and store confidential data, while preventing leaks of that data. BFlow tracks confidential data as it flows within the browser, between scripts on a page and between scripts and web servers. Using these observations and assistance from participating web servers, BFlow prevents scripts that have seen confidential data from leaking it, all without disrupting the JavaScript communication techniques used in complex web pages. To achieve these ends, BFlow augments browsers with a new "protection zone" abstraction.

We have implemented a BFlow browser reference monitor and server support. To evaluate BFlow's confidentiality protection and flexibility, we have built a BFlow-protected blog that supports Blogger's third party JavaScript extensions. BFlow is compatible with every legitimate Blogger extension that we have found, yet it prevents malicious extensions from leaking confidential data.

@inproceedings{bflow:eurosys09,
  title        = {Privacy-Preserving Browser-Side Scripting With
    {BFlow}},
  author       = {Alexander Yip and Neha Narula and Maxwell Krohn and
    Robert Morris},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2009)},
  year         = 2009,
  month        = mar,
  address      = {Nuremberg, Germany},
}
Ksplice: Automatic rebootless kernel updates Jeff Arnold, and M. Frans Kaashoek. EuroSys 2009.
@inproceedings{ksplice:eurosys,
  title        = {Ksplice: Automatic rebootless kernel updates},
  author       = {Jeff Arnold and M. Frans Kaashoek},
  booktitle    = {Proceedings of the ACM EuroSys Conference (EuroSys
    2009)},
  year         = 2009,
  month        = mar,
  address      = {Nuremberg, Germany},
}

2008

Corey: An operating system for many cores Silas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao, Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, and Zheng Zhang. OSDI 2008.
@inproceedings{corey:osdi08,
  title        = {Corey: An Operating System for Many Cores},
  author       = {Silas Boyd-Wickizer and Haibo Chen and Rong Chen and
    Yandong Mao and Frans Kaashoek and Robert Morris and Aleksey
    Pesterev and Lex Stein and Ming Wu and Yuehua Dai and Yang Zhang
    and Zheng Zhang},
  pages        = {43--57},
  booktitle    = {Proceedings of the 8th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '08)},
  year         = 2008,
  month        = dec,
  address      = {San Diego, California},
}
An extension-oriented compiler Russ Cox. Ph.D. thesis, Massachusetts Institute of Technology, September 2008.
@phdthesis{rsc-phd,
  title        = {An Extension-Oriented Compiler},
  author       = {Russ Cox},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = sep,
}
Information flow control for secure web sites Maxwell Krohn. Ph.D. thesis, Massachusetts Institute of Technology, September 2008.
@phdthesis{krohn-phd,
  title        = {Information Flow Control for Secure Web Sites},
  author       = {Maxwell Krohn},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = sep,
}
UIA: A global connectivity architecture for mobile personal devices Bryan Alexander Ford. Ph.D. thesis, Massachusetts Institute of Technology, September 2008.
@phdthesis{ford-phd,
  title        = {{UIA}: A Global Connectivity Architecture for Mobile
    Personal Devices},
  author       = {Bryan Alexander Ford},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = sep,
}
Vx32: Lightweight user-level sandboxing on the x86 Bryan Ford, and Russ Cox. USENIX 2008.

Code sandboxing is useful for many purposes, but most sandboxing techniques require kernel modifications, do not completely isolate guest code, or incur substantial performance costs. Vx32 is a multipurpose user-level sandbox that enables any application to load and safely execute one or more guest plug-ins, confining each guest to a system call API controlled by the host application and to a restricted memory region within the host’s address space. Vx32 runs guest code efficiently on several widespread operating systems without kernel extensions or special privileges; it protects the host program from both reads and writes by its guests; and it allows the host to restrict the instruction set available to guests. The key to vx32’s combination of portability, flexibility, and efficiency is its use of x86 segmentation hardware to sandbox the guest’s data accesses, along with a lightweight instruction translator to sandbox guest instructions.

We evaluate vx32 using microbenchmarks and whole system benchmarks, and we examine four applications based on vx32: an archival storage system, an extensible public-key infrastructure, an experimental user-level operating system running atop another host OS, and a Linux system call jail. The first three applications export custom APIs independent of the host OS to their guests, making their plug-ins binary-portable across host systems. Compute-intensive workloads for the first two applications exhibit between a 30% slowdown and a 30% speedup on vx32 relative to native execution; speedups result from vx32’s instruction translator improving the cache locality of guest code. The experimental user-level operating system allows the use of the guest OS’s applications alongside the host’s native applications and runs faster than whole-system virtual machine monitors such as VMware and QEMU. The Linux system call jail incurs up to 80% overhead but requires no kernel modifications and is delegation-based, avoiding concurrency vulnerabilities present in other interposition mechanisms.

@inproceedings{vx32:usenix08,
  title        = {Vx32: Lightweight User-level Sandboxing on the x86},
  author       = {Bryan Ford and Russ Cox},
  booktitle    = {Proceedings of the 2008 USENIX Annual Technical
    Conference (USENIX '08)},
  year         = 2008,
  month        = jun,
  address      = {Boston, Massachusetts},
}
Storing and managing data in a distributed hash table Emil Sit. Ph.D. thesis, Massachusetts Institute of Technology, June 2008.
@phdthesis{pt:sit,
  title        = {Storing and Managing Data in a Distributed Hash
    Table},
  author       = {Emil Sit},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = jun,
}
Ksplice: An automatic system for rebootless kernel security updates Jeffrey Brian Arnold. Master's thesis, Massachusetts Institute of Technology, May 2008.

Ksplice allows system administrators to apply security patches to their operating system kernels without having to reboot. Based on a source code patch and the kernel source code to be patched, Ksplice applies the patch to the corresponding running kernel, without requiring work from a programmer. To be fully automatic, Ksplice's design is limited to patches that do not introduce semantic changes to data structures, but a study of all significant x86-32 Linux security patches from May 2005 to December 2007 finds that only eight patches of 50 make semantic changes. An evaluation with Debian and kernel.org Linux kernels shows that Ksplice can automatically apply the remaining 42 patches, which means that 84% of the Linux kernel vulnerabilities from this interval can be corrected by Ksplice without the need for rebooting.
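The essence of a rebootless update, swapping a function's implementation while callers keep running, can be modeled with an indirection table. Ksplice itself rewrites the running kernel binary by inserting jumps into patched functions; everything below is an invented user-space sketch:

```python
_dispatch = {}

def live(name):
    """Register a function and return a stub that always calls the
    currently installed version."""
    def deco(fn):
        _dispatch[name] = fn
        def stub(*args, **kwargs):
            return _dispatch[name](*args, **kwargs)
        return stub
    return deco

def apply_patch(name, new_fn):
    """Install a replacement; existing callers pick it up on their
    next call, with no restart."""
    _dispatch[name] = new_fn

@live("check_perm")
def check_perm(uid):
    # Original behavior; a security patch might replace this function.
    return uid == 0
```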

@mastersthesis{ksplice:jbarnold-meng,
  title        = {{Ksplice}: An Automatic System for Rebootless Kernel
    Security Updates},
  author       = {Jeffrey Brian Arnold},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = may,
}
UsenetDHT: A low-overhead design for Usenet Emil Sit, Robert Morris, and M. Frans Kaashoek. NSDI 2008.

Usenet is a popular distributed messaging and file sharing service: servers in Usenet flood articles over an overlay network to fully replicate articles across all servers. However, replication of Usenet's full content requires that each server pay the cost of receiving (and storing) over 1 Tbyte/day. This paper presents the design and implementation of UsenetDHT, a Usenet system that allows a set of cooperating sites to keep a shared, distributed copy of Usenet articles. UsenetDHT consists of client-facing Usenet NNTP front-ends and a distributed hash table (DHT) that provides shared storage of articles across the wide area. This design allows participating sites to partition the storage burden, rather than replicating all Usenet articles at all sites.

UsenetDHT requires a DHT that maintains durability despite transient and permanent failures, and provides high storage performance. These goals can be difficult to provide simultaneously: even in the absence of failures, verifying adequate replication levels of large numbers of objects can be resource intensive, and interfere with normal operations. This paper introduces Passing Tone, a new replica maintenance algorithm for DHash that minimizes the impact of monitoring replication levels on memory and disk resources by operating with only pairwise communication. Passing Tone's implementation provides performance by using data structures that avoid disk accesses and enable batch operations.

Microbenchmarks over a local gigabit network demonstrate that the total system throughput scales linearly as servers are added, providing 5.7 Mbyte/s of write bandwidth and 7 Mbyte/s of read bandwidth per server. UsenetDHT is currently deployed on a 12-server network at 7 sites running Passing Tone over the wide-area: this network supports our research laboratory's live 2.5 Mbyte/s Usenet feed and 30.6 Mbyte/s of synthetic read traffic. These results suggest a DHT-based design may be a viable way to redesign Usenet and globally reduce costs.
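A toy model of pairwise replica maintenance (placement and names are simplified from DHash/Passing Tone): each node compares its store only with an immediate neighbor and pulls keys it should replicate but lacks, so no global census of replication levels is ever taken.

```python
import bisect

def replica_set(node_ids, key, reps):
    """Consistent-hashing style placement: the successor of the key and
    the next reps-1 nodes clockwise hold its replicas (simplified)."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key) % len(ring)
    return {ring[(i + j) % len(ring)] for j in range(reps)}

def local_sync(me, neighbor, store, node_ids, reps):
    """Pairwise maintenance: pull from one neighbor the keys this node
    is responsible for but does not yet hold."""
    for key in store[neighbor]:
        if me in replica_set(node_ids, key, reps) and key not in store[me]:
            store[me].add(key)
```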

@inproceedings{usenetdht:nsdi08,
  title        = {{UsenetDHT}: A Low-Overhead Design for {Usenet}},
  author       = {Emil Sit and Robert Morris and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 5th {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '08)},
  year         = 2008,
  month        = apr,
  address      = {San Francisco, CA},
}
An offline foundation for online accountable pseudonyms Bryan Ford, and Jacob Strauss. SocialNets 2008.

Online anonymity often appears to undermine accountability, offering little incentive for civil behavior, but accountability failures usually result not from anonymity itself but from the disposability of virtual identities. A user banned for misbehavior (e.g., spamming from a free E-mail account or stuffing an online ballot box) can simply open other accounts or connect from other IP addresses. Instead of curtailing freedom of expression by giving up anonymity, online services and communities should support accountable pseudonyms: virtual personas that can provide both anonymity and accountability. We propose Pseudonym parties, a scheme for creating accountable pseudonyms, which combine in-person social occasions (parties) with technical infrastructure (a pseudonymous sign-on service) to enforce the rule that one real person gets one virtual persona on any participating online service. Pseudonym parties enable the user to adopt different personas in different online spaces without revealing the connection between them, while ensuring that each user has only one accountable pseudonym in each space. Pseudonym parties can be started incrementally in a fully decentralized fashion, can run on volunteer labor with minimal funds, and may even be fun.

@inproceedings{accountable:socialnets08,
  title        = {An Offline Foundation for Online Accountable
    Pseudonyms},
  author       = {Bryan Ford and Jacob Strauss},
  booktitle    = {Proceedings of the First International Workshop on
    Social Network Systems (SocialNets 2008)},
  year         = 2008,
  month        = apr,
  address      = {Glasgow, Scotland},
}
A Sybil-proof one-hop DHT Chris Lesniewski-Laas. SocialNets 2008.

Decentralized systems, such as structured overlays, are subject to the Sybil attack, in which an adversary creates many false identities to increase its influence. This paper describes a one-hop distributed hash table which uses the social links between users to strongly resist the Sybil attack. The social network is assumed to be fast mixing, meaning that a random walk in the honest part of the network quickly approaches the uniform distribution. As in the related SybilLimit system, with a social network of n honest nodes and m honest edges, the protocol can tolerate up to o(n/log n) attack edges (social links from honest nodes to compromised nodes). The routing tables contain O(√m log m) entries per node and are constructed efficiently by a distributed protocol. This is the first sublinear solution to this problem. Preliminary simulation results are presented to demonstrate the approach's effectiveness.

@inproceedings{sybil:socialnets08,
  title        = {A {Sybil}-proof one-hop {DHT}},
  author       = {Chris Lesniewski-Laas},
  booktitle    = {Proceedings of the First International Workshop on
    Social Network Systems (SocialNets 2008)},
  year         = 2008,
  month        = apr,
  address      = {Glasgow, Scotland},
}
Xoc, an extension-oriented compiler for systems programming Russ Cox, Tom Bergan, Austin T. Clements, Frans Kaashoek, and Eddie Kohler. ASPLOS 2008.

Today's system programmers go to great lengths to extend the languages in which they program. For instance, system-specific compilers find errors in Linux and other systems, and add support for specialized control flow to Qt and event-based programs. These compilers are difficult to build and cannot always understand each other's language changes. However, they can greatly improve code understandability and correctness, advantages that should be accessible to all programmers.

We describe an extension-oriented compiler for C called xoc. An extension-oriented compiler, unlike a conventional extensible compiler, implements new features via many small extensions that are loaded together as needed. Xoc gives extension writers full control over program syntax and semantics while hiding many compiler internals. Xoc programmers concisely define powerful compiler extensions that, by construction, can be combined; even some parts of the base compiler, such as GNU C compatibility, are structured as extensions.

Xoc is based on two key interfaces. Syntax patterns allow extension writers to manipulate language fragments using concrete syntax. Lazy computation of attributes allows extension writers to use the results of analyses by other extensions or the core without needing to worry about pass scheduling.

Extensions built using xoc include xsparse, a 345-line extension that mimics Sparse, Linux's C front end, and xlambda, a 170-line extension that adds function expressions to C. An evaluation of xoc using these and 13 other extensions shows that xoc extensions are typically more concise than equivalent extensions written for conventional extensible compilers and that it is possible to compose extensions.
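
The lazy-attribute idea can be pictured with a small sketch; the class, decorator, and rule names here are invented stand-ins, not xoc's actual (C-based) interface:

```python
# Sketch of lazy, memoized attributes: an extension asks for an attribute by
# name, the rule that defines it runs on first use, and pulling a child's
# attribute implicitly "schedules" whatever analysis computes it.
class Node:
    def __init__(self, kind, children=(), value=None):
        self.kind, self.children, self.value = kind, list(children), value
        self._attrs = {}                       # per-node memo table

ATTR_RULES = {}                                # attribute name -> compute rule

def defattr(name):
    def register(fn):
        ATTR_RULES[name] = fn
        return fn
    return register

def attr(node, name):
    if name not in node._attrs:                # lazy: computed on first use
        node._attrs[name] = ATTR_RULES[name](node)
    return node._attrs[name]

@defattr("type")
def type_rule(node):
    # a core (or extension-supplied) rule; no pass ordering is written anywhere
    if node.kind == "int_lit":
        return "int"
    if node.kind == "add" and all(attr(c, "type") == "int" for c in node.children):
        return "int"
    return "error"

expr = Node("add", [Node("int_lit", value=1), Node("int_lit", value=2)])
```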

@inproceedings{xoc:asplos08,
  author       = {Russ Cox and Tom Bergan and Austin T. Clements and
    Frans Kaashoek and Eddie Kohler},
  title        = {Xoc, an Extension-Oriented Compiler for Systems
    Programming},
  pages        = {244--254},
  booktitle    = {Proceedings of the 13th International Conference on
    Architectural Support for Programming Languages and Operating
    Systems ({ASPLOS})},
  year         = 2008,
  month        = mar,
  address      = {Seattle, Washington},
}
A comparison of designs for extensible and extension-oriented compilers Austin T. Clements. Master's thesis, Massachusetts Institute of Technology, February 2008.

Today's system programmers go to great lengths to extend the languages in which they program. For instance, system-specific compilers find errors in Linux and other systems, and add support for specialized control flow to Qt and event-based programs. These compilers are difficult to build and cannot always understand each other's language changes. However, they can greatly improve code understandability and correctness, advantages that should be accessible to all programmers.

This thesis considers four extensible and extension-oriented compilers: CIL, Polyglot, xtc, and Xoc. These four compilers represent four distinctly different approaches to the problem of bridging the gap between language design and system implementation. Taking an extension author's point of view, this thesis compares the design of each compiler's extension interface in terms of extension structure, syntactic analysis, semantic analysis, and rewriting.

To perform the comparison, this thesis uses three extensions implemented variously in the four compilers: a bitwise rotation operator, function expressions, and lock checking. These extensions are designed to span a representative space of analysis and rewriting needs.

Based on this comparison, this thesis identifies the following implications of the design decisions of each extension interface: the expressiveness, understandability, and correctness of extension implementations can benefit from domain specific languages and language features tailored to the extension interface; compiler-managed scheduling trades loss of control for automatic extension composability; unifying internal and external program representation improves ease of use and extension composability, but gives up potentially useful control over the internal representation; concrete syntax patterns provide a natural interface to internal program representation, but must be more powerful than simple tree matching to be practical; grammars, types, and syntax interfaces have a natural correspondence; and accounting for semantic information in the final output enables hygienic rewriting, which can simplify extensions.

@mastersthesis{xoc:aclements-meng,
  title        = {A Comparison of Designs for Extensible and
    Extension-Oriented Compilers},
  author       = {Austin T. Clements},
  school       = {Massachusetts Institute of Technology},
  year         = 2008,
  month        = feb,
}

2007

A World Wide Web without walls Maxwell Krohn, Alex Yip, Micah Brodsky, Robert Morris, and Michael Walfish. HotNets 2007.
@inproceedings{w5:hotnets07,
  author       = {Maxwell Krohn and Alex Yip and Micah Brodsky and
    Robert Morris and Michael Walfish},
  title        = {A {World Wide Web} Without Walls},
  booktitle    = {Proceedings of the Sixth {W}orkshop on {H}ot
    {T}opics in {N}etworks ({HotNets-VI})},
  year         = 2007,
  month        = nov,
  organization = {{ACM SIGCOMM}},
  address      = {Atlanta, Georgia},
}
Alpaca: Extensible authorization for distributed services Chris Lesniewski-Laas, Bryan Ford, Jacob Strauss, Robert Morris, and M. Frans Kaashoek. CCS 2007.

Traditional Public Key Infrastructures (PKI) have not lived up to their promise because there are too many ways to define PKIs, too many cryptographic primitives to build them with, and too many administrative domains with incompatible roots of trust. Alpaca is an authentication and authorization framework that embraces PKI diversity by enabling one PKI to "plug in" another PKI's credentials and cryptographic algorithms, allowing users of the latter to authenticate themselves to services using the former using their existing, unmodified certificates. Alpaca builds on Proof-Carrying Authorization (PCA), expressing a credential as an explicit proof of a logical claim. Alpaca generalizes PCA to express not only delegation policies but also the cryptographic primitives, credential formats, and namespace structures needed to use foreign credentials directly. To achieve this goal, Alpaca introduces a method of creating and naming new principals which behave according to arbitrary rules, a modular approach to logical axioms, and a domain-specific language specialized for reasoning about authentication. We have implemented Alpaca as a Python module that assists applications in generating proofs (e.g., in a client requesting access to a resource), and in verifying those proofs via a compact 800-line TCB (e.g., in a server providing that resource). We present examples demonstrating Alpaca's extensibility in scenarios involving inter-organization PKI interoperability and secure remote PKI upgrade.

@inproceedings{alpaca:ccs07,
  title        = {Alpaca: Extensible Authorization for Distributed
    Services},
  author       = {Chris Lesniewski-Laas and Bryan Ford and Jacob
    Strauss and Robert Morris and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 14th {ACM} Conference on Computer
    and Communications Security ({CCS-2007})},
  year         = 2007,
  month        = oct,
  address      = {Alexandria, VA},
}
Information flow control for standard OS abstractions Maxwell Krohn, Alexander Yip, Micah Brodsky, Natan Cliffer, M. Frans Kaashoek, Eddie Kohler, and Robert Morris. SOSP 2007.
@inproceedings{KYBCKMM07,
  author       = {Maxwell Krohn and Alexander Yip and Micah Brodsky
    and Natan Cliffer and M. Frans Kaashoek and Eddie Kohler and
    Robert Morris},
  title        = {Information Flow Control for Standard {OS}
    Abstractions},
  booktitle    = {Proceedings of the 21st {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '07)},
  year         = 2007,
  month        = oct,
  address      = {Stevenson, WA},
}
Structured streams: a new transport abstraction Bryan Ford. SIGCOMM 2007.

Internet applications currently have a choice between stream and datagram transport abstractions. Datagrams efficiently support small transactions and streams are suited for long-running conversations, but neither abstraction adequately supports applications like HTTP that exhibit a mixture of transaction sizes, or applications like FTP and SIP that use multiple transport instances. Structured Stream Transport (SST) enhances the traditional stream abstraction with a hierarchical hereditary structure, allowing applications to create lightweight child streams from any existing stream. Unlike TCP streams, these lightweight streams incur neither 3-way handshaking delays on startup nor TIME-WAIT periods on close. Each stream offers independent data transfer and flow control, allowing different transactions to proceed in parallel without head-of-line blocking, but all streams share one congestion control context. SST supports both reliable and best-effort delivery in a way that semantically unifies datagrams with streams and solves the classic “large datagram” problem, where a datagram's loss probability increases exponentially with fragment count. Finally, an application can prioritize its streams relative to each other and adjust priorities dynamically through out-of-band signaling. A user-space prototype shows that SST is TCP-friendly to within 2%, and performs comparably to a user-space TCP and to within 10% of kernel TCP on a WiFi network.
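
A toy sketch of the hereditary stream structure (the API below is invented, not SST's real interface): any stream can spawn a lightweight child with no handshake, and every stream in one hierarchy shares a single congestion-control context.

```python
class CongestionContext:
    """One congestion-control state shared by a whole stream hierarchy."""
    def __init__(self):
        self.window = 10

class Stream:
    def __init__(self, ctx=None, parent=None):
        self.ctx = ctx or CongestionContext()   # only the root creates one
        self.parent = parent
        self.children = []
        self.buffer = []                        # independent per-stream data

    def create_child(self):
        """Cheap child creation: no 3-way handshake or TIME-WAIT, and the
        child inherits the parent's shared congestion context."""
        child = Stream(ctx=self.ctx, parent=self)
        self.children.append(child)
        return child

    def send(self, data):
        self.buffer.append(data)

# Separate transactions proceed on separate streams, avoiding
# head-of-line blocking, while congestion control stays unified.
root = Stream()
meta, payload = root.create_child(), root.create_child()
meta.send(b"GET /index.html")
payload.send(b"...body...")
```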

@inproceedings{sst:sigcomm07,
  title        = {Structured Streams: a New Transport Abstraction},
  author       = {Bryan Ford},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '07 Conference},
  year         = 2007,
  month        = aug,
  address      = {Kyoto, Japan},
}
Events can make sense Maxwell Krohn, Eddie Kohler, and M. Frans Kaashoek. USENIX 2007.
@inproceedings{krohn:tame,
  author       = {Maxwell Krohn and Eddie Kohler and M. Frans Kaashoek},
  title        = {Events Can Make Sense},
  booktitle    = {Proceedings of the 2007 USENIX Annual Technical
    Conference (USENIX '07)},
  year         = 2007,
  month        = jun,
  address      = {Santa Clara, CA},
}
Don't give up on distributed file systems Jeremy Stribling, Emil Sit, M. Frans Kaashoek, Jinyang Li, and Robert Morris. IPTPS 2007.

Wide-area distributed applications often reinvent the wheel for their storage needs, each incorporating its own special-purpose storage manager to cope with distribution, intermittent failures, limited bandwidth, and high latencies. This paper argues that a distributed file system could provide a reusable solution to these problems by coupling a standard interface with a design suited to wide-area distribution. For concreteness, this paper presents such a file system, called WheelFS, which allows applications to control consistency through the use of semantic cues, and minimizes communication costs by adhering to the slogan read globally, write locally. WheelFS could simplify distributed experiments, CDNs, and Grid applications.
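
One way to picture path-embedded semantic cues, sketched under the assumption that a cue is a dotted path component (the cue syntax and names below are hypothetical, not WheelFS's actual ones):

```python
def parse_path(path):
    """Split a path into (cues, clean_path): components starting with '.'
    are treated as consistency cues rather than directory names."""
    cues, parts = [], []
    for comp in path.strip("/").split("/"):
        (cues if comp.startswith(".") else parts).append(comp.lstrip("."))
    return cues, "/" + "/".join(parts)

# An application relaxes consistency for one subtree without any API change:
cues, clean = parse_path("/wheelfs/.eventual/logs/app.log")
```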

@inproceedings{wheelfs:iptps07,
  title        = {Don't Give Up on Distributed File Systems},
  author       = {Jeremy Stribling and Emil Sit and M. Frans Kaashoek
    and Jinyang Li and Robert Morris},
  booktitle    = {Proceedings of the 6th International Workshop on
    Peer-to-Peer Systems (IPTPS07)},
  year         = 2007,
  month        = feb,
  address      = {Bellevue, WA},
}

2006

Persistent personal names for globally connected mobile devices Bryan Ford, Jacob Strauss, Chris Lesniewski-Laas, Sean Rhea, Frans Kaashoek, and Robert Morris. OSDI 2006.

The Unmanaged Internet Architecture (UIA) provides zero-configuration connectivity among mobile devices through personal names. Users assign personal names through an ad hoc device introduction process requiring no central allocation. Once assigned, names bind securely to the global identities of their target devices independent of network location. Each user manages one namespace, shared among all the user's devices and always available on each device. Users can also name other users to share resources with trusted acquaintances. Devices with naming relationships automatically arrange connectivity when possible, both in ad hoc networks and using global infrastructure when available. A UIA prototype demonstrates these capabilities using optimistic replication for name resolution and group management and a routing algorithm exploiting the user's social network for connectivity.

Demo Video:

  • AVI: 20MB, H.264 codec; playable with VLC or MPlayer, for example.
  • DVD: 262MB, MPEG-2; burn this ISO image onto a DVD and pop it into a DVD player.
@inproceedings{uia:osdi06,
  title        = {Persistent Personal Names for Globally Connected
    Mobile Devices},
  author       = {Bryan Ford and Jacob Strauss and Chris
    Lesniewski-Laas and Sean Rhea and Frans Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 7th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '06)},
  year         = 2006,
  month        = nov,
  address      = {Seattle, Washington},
}
Efficient replica maintenance for distributed storage systems Byung-Gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon, Frans Kaashoek, John Kubiatowicz, and Robert Morris. NSDI 2006.

This paper considers replication strategies for storage systems that aggregate the disks of many nodes spread over the Internet. Maintaining replication in such systems can be prohibitively expensive, since every transient network or host failure could potentially lead to copying a server's worth of data over the Internet to maintain replication levels.

The following insights in designing an efficient replication algorithm emerge from the paper's analysis. First, durability can be provided separately from availability; the former is less expensive to ensure and a more useful goal for many wide-area applications. Second, the focus of a durability algorithm must be to create new copies of data objects faster than permanent disk failures destroy the objects; careful choice of policies for what nodes should hold what data can decrease repair time. Third, increasing the number of replicas of each data object does not help a system tolerate a higher disk failure probability, but does help tolerate bursts of failures. Finally, ensuring that the system makes use of replicas that recover after temporary failure is critical to efficiency.

Based on these insights, the paper proposes the Carbonite replication algorithm for keeping data durable at a low cost. A simulation of Carbonite storing 1 TB of data over a 365 day trace of PlanetLab activity shows that Carbonite is able to keep all data durable and uses 44% more network traffic than a hypothetical system that only responds to permanent failures. In comparison, Total Recall and DHash require almost a factor of two more network traffic than this hypothetical system.
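
The reintegration insight can be sketched as a maintenance round (target count, node names, and placement choice below are invented): repair only when reachable copies fall below the target, and never forget copies on temporarily unreachable nodes.

```python
TARGET = 3  # desired number of reachable replicas (illustrative value)

def repair(replicas, reachable):
    """One maintenance round. `replicas` is the set of nodes believed to hold
    a copy; `reachable` is the set of currently-up nodes."""
    live = set(replicas) & reachable
    while len(live) < TARGET:
        candidates = reachable - live
        if not candidates:
            break
        live.add(min(candidates))   # hypothetical placement choice
    # Copies on unreachable nodes are remembered, not discarded: if the node
    # returns from a transient failure, its copy counts again. This is what
    # keeps repair traffic close to the permanent-failure minimum.
    return set(replicas) | live

replicas = {"n1", "n2", "n3"}
after_transient = repair(replicas, {"n1", "n2", "n4", "n5"})        # n3 down
after_return = repair(after_transient, {"n1", "n2", "n3", "n4", "n5"})
```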

@inproceedings{carbonite:nsdi06,
  title        = {Efficient Replica Maintenance for Distributed
    Storage Systems},
  author       = {Byung-Gon Chun and Frank Dabek and Andreas Haeberlen
    and Emil Sit and Hakim Weatherspoon and Frans Kaashoek and John
    Kubiatowicz and Robert Morris},
  booktitle    = {Proceedings of the 3rd {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '06)},
  year         = 2006,
  month        = may,
  address      = {San Jose, CA},
}
OverCite: A distributed, cooperative CiteSeer Jeremy Stribling, Jinyang Li, Isaac G. Councill, M. Frans Kaashoek, and Robert Morris. NSDI 2006.

CiteSeer is a popular online resource for the computer science research community, allowing users to search and browse a large archive of research papers. CiteSeer is expensive: it generates 35 GB of network traffic per day, requires nearly one terabyte of disk storage, and needs significant human maintenance.

OverCite is a new digital research library system that aggregates donated resources at multiple sites to provide CiteSeer-like document search and retrieval. OverCite enables members of the community to share the costs of running CiteSeer. The challenge facing OverCite is how to provide scalable and load-balanced storage and query processing with automatic data management. OverCite uses a three-tier design: presentation servers provide an identical user interface to CiteSeer's; application servers partition and replicate a search index to spread the work of answering each query among several nodes; and a distributed hash table stores documents and meta-data, and coordinates the activities of the servers.

Evaluation of a prototype shows that OverCite increases its query throughput by a factor of seven with a nine-fold increase in the number of servers. OverCite requires more total storage and network bandwidth than centralized CiteSeer, but spreads these costs over all the sites. OverCite can exploit the resources of these sites to support new features such as document alerts and to scale to larger data sets.

@inproceedings{overcite:nsdi06,
  title        = {OverCite: A Distributed, Cooperative CiteSeer},
  author       = {Jeremy Stribling and Jinyang Li and Isaac G.
    Councill and M. Frans Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 3rd {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '06)},
  year         = 2006,
  month        = may,
  address      = {San Jose, CA},
}
Pastwatch: A distributed version control system Alexander Yip, Benjie Chen, and Robert Morris. NSDI 2006.
@inproceedings{pastwatch:nsdi06,
  title        = {Pastwatch: A Distributed Version Control System},
  author       = {Alexander Yip and Benjie Chen and Robert Morris},
  booktitle    = {Proceedings of the 3rd {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '06)},
  year         = 2006,
  month        = may,
  address      = {San Jose, CA},
}
Asbestos: Operating system security for mobile devices Martijn Stevenson. Master's thesis, Massachusetts Institute of Technology, May 2006.
@mastersthesis{armasb:martijn-meng,
  title        = {Asbestos: Operating System Security for Mobile
    Devices},
  author       = {Martijn Stevenson},
  school       = {Massachusetts Institute of Technology},
  year         = 2006,
  month        = may,
}
Proactive replication for data durability Emil Sit, Andreas Haeberlen, Frank Dabek, Byung-Gon Chun, Hakim Weatherspoon, Frans Kaashoek, Robert Morris, and John Kubiatowicz. IPTPS 2006.
@inproceedings{tempo:iptps06,
  title        = {Proactive replication for data durability},
  author       = {Emil Sit and Andreas Haeberlen and Frank Dabek and
    Byung-Gon Chun and Hakim Weatherspoon and Frans Kaashoek and
    Robert Morris and John Kubiatowicz},
  booktitle    = {Proceedings of the 5th International Workshop on
    Peer-to-Peer Systems (IPTPS06)},
  year         = 2006,
  month        = feb,
  address      = {Santa Barbara, CA},
}
User-relative names for globally connected personal devices Bryan Ford, Jacob Strauss, Chris Lesniewski-Laas, Sean Rhea, Frans Kaashoek, and Robert Morris. IPTPS 2006.

Nontechnical users who own increasingly ubiquitous network-enabled personal devices such as laptops, digital cameras, and smart phones need a simple, intuitive, and secure way to share information and services between their devices. The Unmanaged Internet Architecture (UIA) is a novel naming and peer-to-peer connectivity architecture addressing this need. Users assign UIA names by "introducing" devices to each other on a common local-area network, but these names remain securely bound to their target as devices migrate. Multiple devices owned by the same user, once introduced, automatically merge their namespaces to form a distributed personal cluster that the owner can access or modify from any of his devices. Instead of requiring users to allocate globally unique names from a central authority, UIA enables users to assign their own user-relative names both to their own devices and to other users. With UIA, for example, Alice can always access her iPod from any of her own personal devices at any location via the name ipod, and her friend Bob can access her iPod via a relative name like ipod.Alice.
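
Right-to-left resolution of a user-relative name like ipod.Alice can be sketched as follows; the namespace data and the `("ns", ...)` link encoding are invented for illustration:

```python
# Toy namespaces (invented data): each user's personal namespace maps a name
# either to a device identity or to another user's namespace.
namespaces = {
    "bob":   {"laptop": "EID:bob-laptop", "Alice": ("ns", "alice")},
    "alice": {"ipod": "EID:alice-ipod", "phone": "EID:alice-phone"},
}

def resolve(user, dotted_name):
    """Resolve right to left: from Bob, 'ipod.Alice' first follows 'Alice'
    into Alice's namespace, then looks up 'ipod' there."""
    ns = namespaces[user]
    target = None
    for part in reversed(dotted_name.split(".")):
        target = ns[part]
        if isinstance(target, tuple) and target[0] == "ns":
            ns = namespaces[target[1]]   # cross into the named user's namespace
    return target
```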

@inproceedings{uia:iptps06,
  title        = {User-Relative Names for Globally Connected Personal
    Devices},
  author       = {Bryan Ford and Jacob Strauss and Chris
    Lesniewski-Laas and Sean Rhea and Frans Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 5th International Workshop on
    Peer-to-Peer Systems (IPTPS06)},
  year         = 2006,
  month        = feb,
  address      = {Santa Barbara, CA},
}

2005

VXA: A virtual architecture for durable compressed archives Bryan Ford. FAST 2005.

Data compression algorithms change frequently, and obsolete decoders do not always run on new hardware and operating systems, threatening the long-term usability of content archived using those algorithms. Re-encoding content into new formats is cumbersome, and highly undesirable when lossy compression is involved. Processor architectures, in contrast, have remained comparatively stable over recent decades. VXA, an archival storage system designed around this observation, archives executable decoders along with the encoded content it stores. VXA decoders run in a specialized virtual machine that implements an OS-independent execution environment based on the standard x86 architecture. The VXA virtual machine strictly limits access to host system services, making decoders safe to run even if an archive contains malicious code. VXA's adoption of a "native" processor architecture instead of type-safe language technology allows reuse of existing "hand-optimized" decoders in C and assembly language, and permits decoders access to performance-enhancing architecture features such as vector processing instructions. The performance cost of VXA's virtualization is typically less than 15% compared with the same decoders running natively. The storage cost of archived decoders, typically 30-130KB each, can be amortized across many archived files sharing the same compression method.

@inproceedings{vxa:fast05,
  title        = {{VXA}: A Virtual Architecture for Durable Compressed
    Archives},
  author       = {Bryan Ford},
  booktitle    = {Proceedings of the 4th USENIX Conference on File and
    Storage Technologies (FAST '05)},
  year         = 2005,
  month        = dec,
  address      = {San Francisco, California},
}
Routing tradeoffs in dynamic peer-to-peer networks Jinyang Li. Ph.D. thesis, Massachusetts Institute of Technology, November 2005.
@phdthesis{accordion:jinyang,
  title        = {Routing Tradeoffs in Dynamic Peer-to-peer Networks},
  author       = {Jinyang Li},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = nov,
}
A distributed hash table Frank Dabek. Ph.D. thesis, Massachusetts Institute of Technology, November 2005.
@phdthesis{dht:dabek,
  title        = {A Distributed Hash Table},
  author       = {Frank Dabek},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = nov,
}
Labels and event processes in the Asbestos operating system Petros Efstathopoulos, Maxwell Krohn, Steve VanDeBogart, Cliff Frey, David Ziegler, Eddie Kohler, David Mazières, Frans Kaashoek, and Robert Morris. SOSP 2005.
@inproceedings{EKVFZKMKM05,
  author       = {Petros Efstathopoulos and Maxwell Krohn and Steve
    VanDeBogart and Cliff Frey and David Ziegler and Eddie Kohler and
    David Mazi\`eres and Frans Kaashoek and Robert Morris},
  title        = {Labels and Event Processes in the {Asbestos}
    Operating System},
  booktitle    = {Proceedings of the 20th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '05)},
  year         = 2005,
  month        = oct,
  address      = {Brighton, UK},
}
Sybil-resistant DHT routing George Danezis, Chris Lesniewski-Laas, M. Frans Kaashoek, and Ross Anderson. ESORICS 2005.

Distributed Hash Tables (DHTs) are very efficient distributed systems for routing, but at the same time vulnerable to disruptive nodes. Designers of such systems want them used in open networks, where an adversary can perform a Sybil attack by introducing a large number of corrupt nodes in the network, considerably degrading its performance. We introduce a routing strategy that alleviates some of the effects of such an attack by making sure that lookups are performed using a diverse set of nodes. This ensures that at least some of the nodes queried are good, and hence the search makes forward progress. This strategy makes use of latent social information present in the introduction graph of the network.

@inproceedings{sybil:esorics05,
  title        = {Sybil-resistant {DHT} routing},
  author       = {George Danezis and Chris Lesniewski-Laas and M.
    Frans Kaashoek and Ross Anderson},
  booktitle    = {Proceedings of the 10th European Symposium On
    Research In Computer Security},
  year         = 2005,
  month        = sep,
  address      = {Milan, Italy},
}
Integrity and access control in untrusted content distribution networks Kevin Fu. Ph.D. thesis, Massachusetts Institute of Technology, September 2005.
@phdthesis{fu-phd,
  title        = {Integrity and access control in untrusted content
    distribution networks},
  author       = {Kevin Fu},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = sep,
}
SoftECC: A system for software memory integrity checking Dave Dopson. Master's thesis, Massachusetts Institute of Technology, September 2005.
@mastersthesis{softecc:ddopson-meng,
  title        = {{SoftECC}: A System for Software Memory Integrity
    Checking},
  author       = {Dave Dopson},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = sep,
}
OverCite: A cooperative digital research library Jeremy Stribling. Master's thesis, Massachusetts Institute of Technology, September 2005.

CiteSeer is a well-known online resource for the computer science research community, allowing users to search and browse a large archive of research papers. Unfortunately, its current centralized incarnation is costly to run. Although members of the community would presumably be willing to donate hardware and bandwidth at their own sites to assist CiteSeer, the current architecture does not facilitate such distribution of resources.

OverCite is a design for a new architecture for a distributed and cooperative research library based on a distributed hash table (DHT). The new architecture harnesses donated resources at many sites to provide document search and retrieval service to researchers worldwide. A preliminary evaluation of an initial OverCite prototype shows that it can service more queries per second than a centralized system, and that it increases total storage capacity by a factor of n/4 in a system of n nodes. OverCite can exploit these additional resources by supporting new features such as document alerts, and by scaling to larger data sets.

@mastersthesis{overcite:strib-ms,
  title        = {OverCite: A Cooperative Digital Research Library},
  author       = {Jeremy Stribling},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = sep,
}
Building fast and secure web services with OKWS Maxwell Krohn. Master's thesis, Massachusetts Institute of Technology, September 2005.
@mastersthesis{okws:krohn-ms,
  author       = {Maxwell Krohn},
  title        = {Building Fast and Secure Web Services With {OKWS}},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = sep,
}
SSL splitting: securely serving data from untrusted caches Chris Lesniewski-Laas and M. Frans Kaashoek. Computer Networks 48(5), August 2005.

A popular technique for reducing the bandwidth load on Web servers is to serve the content from proxies. Typically these hosts are trusted by the clients and server not to modify the data that they proxy. SSL splitting is a new technique for guaranteeing the integrity of data served from proxies without requiring changes to Web clients. Instead of relaying an insecure HTTP connection, an SSL splitting proxy simulates a normal Secure Sockets Layer (SSL) connection with the client by merging authentication records from the server with data records from a cache. This technique reduces the bandwidth load on the server, while allowing an unmodified Web browser to verify that the data served from proxies is endorsed by the originating server. SSL splitting is implemented as a patch to the industry-standard OpenSSL library, with which the server is linked. In experiments replaying two-hour access.log traces taken from LCS Web sites over an ADSL link, SSL splitting reduces bandwidth consumption of the server by between 25% and 90% depending on the warmth of the cache and the redundancy of the trace. Uncached requests forwarded through the proxy exhibit latencies within approximately 5% of those of an unmodified SSL server.
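
The merge step can be sketched with HMACs standing in for SSL record authenticators (the record layout and key handling below are simplified inventions): the server ships only the authenticator, the proxy supplies the cached payload, and the client verifies the merged record with the session key the proxy never sees.

```python
import hashlib
import hmac

# In real SSL the MAC key comes from the handshake and is shared by client
# and server only; this constant stands in for it.
KEY = b"per-session-mac-key"

def server_auth_record(payload):
    """Server side: compute and send just the MAC, not the cached payload."""
    return hmac.new(KEY, payload, hashlib.sha256).digest()

def proxy_merge(auth_record, cache, name):
    """Proxy side: holds payloads but no key; it only pairs data with MACs."""
    return cache[name], auth_record

def client_verify(payload, auth_record):
    """Client side: an unmodified verifier accepts the merged record."""
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, auth_record)

cache = {"index.html": b"<html>hello</html>"}
mac = server_auth_record(cache["index.html"])         # tiny, uses server link
payload, mac = proxy_merge(mac, cache, "index.html")  # bulky, served locally
```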

@article{ssl-splitting:compnet05,
  title        = {{SSL} splitting: securely serving data from
    untrusted caches},
  author       = {Chris Lesniewski-Laas and M. Frans Kaashoek},
  journal      = {Computer Networks},
  publisher    = {Elsevier},
  volume       = 48,
  number       = 5,
  pages        = {763--779},
  year         = 2005,
  month        = aug,
}
Architecture and evaluation of an unplanned 802.11b mesh network John Bicket, Daniel Aguayo, Sanjit Biswas, and Robert Morris. MobiCom 2005.

This paper evaluates the ability of a wireless mesh architecture to provide high performance Internet access while demanding little deployment planning or operational management. The architecture considered in this paper has unplanned node placement (rather than planned topology), omni-directional antennas (rather than directional links), and multi-hop routing (rather than single-hop base stations). These design decisions contribute to ease of deployment, an important requirement for community wireless networks. However, this architecture carries the risk that lack of planning might render the network's performance unusably low.

For example, it might be necessary to place nodes carefully to ensure connectivity; the omni-directional antennas might provide uselessly short radio ranges; or the inefficiency of multi-hop forwarding might leave some users effectively disconnected.

The paper evaluates this unplanned mesh architecture with a case study of the Roofnet 802.11b mesh network. Roofnet consists of 37 nodes spread over four square kilometers of an urban area. The network provides users with usable performance despite lack of planning: the average inter-node throughput is 627 kbits/second, even though the average route has three hops.

The paper evaluates multiple aspects of the architecture: the effect of node density on connectivity and throughput; the characteristics of the links that the routing protocol elects to use; the usefulness of the highly connected mesh afforded by omni-directional antennas for robustness and throughput; and the potential performance of a single-hop network using the same nodes as Roofnet.

@inproceedings{roofnet:mobicom05,
  title        = {Architecture and Evaluation of an Unplanned 802.11b
    Mesh Network},
  author       = {John Bicket and Daniel Aguayo and Sanjit Biswas and
    Robert Morris},
  booktitle    = {Proceedings of the 11th {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '05)},
  year         = 2005,
  month        = aug,
  address      = {Cologne, Germany},
}
Opportunistic routing in multi-hop wireless networks Sanjit Biswas and Robert Morris. SIGCOMM 2005.

This paper describes ExOR, an integrated routing and MAC protocol that increases the throughput of large unicast transfers in multi-hop wireless networks. ExOR chooses each hop of a packet's route after the transmission for that hop, so that the choice can reflect which intermediate nodes actually received the transmission. This deferred choice gives each transmission multiple opportunities to make progress. As a result ExOR can use long radio links with high loss rates, which would be avoided by traditional routing. ExOR increases a connection's throughput while using no more network capacity than traditional routing.

ExOR's design faces the following challenges. The nodes that receive each packet must agree on their identities and choose one forwarder. The agreement protocol must have low overhead, but must also be robust enough that it rarely forwards a packet zero times or more than once. Finally, ExOR must choose the forwarder with the lowest remaining cost to the ultimate destination.

Measurements of an implementation on a 38-node 802.11b test-bed show that ExOR increases throughput for most node pairs when compared with traditional routing. For pairs between which traditional routing uses one or two hops, ExOR's robust acknowledgments prevent unnecessary retransmissions, increasing throughput by nearly 35%. For more distant pairs, ExOR takes advantage of the choice of forwarders to provide throughput gains of a factor of two to four.

@inproceedings{roofnet:exor-sigcomm05,
  title        = {Opportunistic Routing in Multi-Hop Wireless Networks},
  author       = {Sanjit Biswas and Robert Morris},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '05 Conference},
  year         = 2005,
  month        = aug,
  address      = {Philadelphia, Pennsylvania},
}
Make least privilege a right (not a privilege) Maxwell Krohn, Petros Efstathopoulos, Cliff Frey, Frans Kaashoek, Eddie Kohler, David Mazières, Robert Morris, Michelle Osborne, Steve VanDeBogart, and David Ziegler. HotOS 2005.
@inproceedings{KEFKKMMOVZ05,
  author       = {Maxwell Krohn and Petros Efstathopoulos and Cliff
    Frey and Frans Kaashoek and Eddie Kohler and David Mazi\`eres and
    Robert Morris and Michelle Osborne and Steve VanDeBogart and David
    Ziegler},
  title        = {Make Least Privilege a Right (Not a Privilege)},
  booktitle    = {Proceedings of the 10th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-X})},
  year         = 2005,
  month        = jun,
  address      = {Santa Fe, NM},
}
Bandwidth-efficient management of DHT routing tables Jinyang Li, Jeremy Stribling, Robert Morris, and M. Frans Kaashoek. NSDI 2005.

Today an application developer using a distributed hash table (DHT) with n nodes must choose a DHT protocol from the spectrum between O(1) lookup protocols and O(log n) protocols. O(1) protocols achieve low latency lookups on small or low-churn networks because lookups take only a few hops, but incur high maintenance traffic on large or high-churn networks. O(log n) protocols incur less maintenance traffic on large or high-churn networks but require more lookup hops in small networks. Accordion is a new routing protocol that does not force the developer to make this choice: Accordion adjusts itself to provide the best performance across a range of network sizes and churn rates while staying within a bounded bandwidth budget.

The key challenges in the design of Accordion are the algorithms that choose the routing table's size and content. Each Accordion node learns of new neighbors opportunistically, in a way that causes the density of its neighbors to be inversely proportional to their distance in ID space from the node. This distribution allows Accordion to vary the table size along a continuum while still guaranteeing at most O(log n) lookup hops. The user-specified bandwidth budget controls the rate at which a node learns about new neighbors. Each node limits its routing table size by evicting neighbors that it judges likely to have failed. High churn (i.e., short node lifetimes) leads to a high eviction rate. The equilibrium between the learning and eviction processes determines the table size.
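The inverse-distance neighbor distribution described above is the classic small-world density, and sampling from it is simple: inverting the CDF of p(d) ∝ 1/d over [1, N) gives d = N^u for uniform u. The sketch below illustrates that sampling step only; it is a hypothetical illustration, not Accordion's actual learning rule, and `id_space` is an assumed parameter.

```python
import random

def sample_neighbor_offset(id_space=2**16):
    """Draw an ID-space offset with density proportional to 1/distance.

    A hypothetical sketch of small-world neighbor sampling (not
    Accordion's exact mechanism). Inverting the CDF of p(d) ~ 1/d
    over [1, id_space) yields d = id_space**u for uniform u in [0, 1).
    """
    u = random.random()
    return int(id_space ** u)
```

Drawing many offsets this way yields a routing table whose neighbor density falls off inversely with ID-space distance, which is what lets the table shrink or grow along a continuum while preserving O(log n) lookup hops.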

Simulations show that Accordion maintains an efficient lookup latency versus bandwidth tradeoff over a wider range of operating conditions than existing DHTs.

@inproceedings{accordion:nsdi05,
  title        = {Bandwidth-efficient management of {DHT} routing
    tables},
  author       = {Jinyang Li and Jeremy Stribling and Robert Morris
    and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 2nd {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '05)},
  year         = 2005,
  month        = may,
  address      = {Boston, Massachusetts},
}
Improving web availability for clients with MONET David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Rohit Rao. NSDI 2005.
@inproceedings{monet:nsdi05,
  title        = {Improving Web Availability for Clients with {MONET}},
  author       = {David G. Andersen and Hari Balakrishnan and M. Frans
    Kaashoek and Rohit Rao},
  booktitle    = {Proceedings of the 2nd {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '05)},
  year         = 2005,
  month        = may,
  address      = {Boston, Massachusetts},
}
Mandatory security and performance of services in Asbestos David Patrick Ziegler. Master's thesis, Massachusetts Institute of Technology, May 2005.
@mastersthesis{asbestos:ziegler-meng,
  title        = {Mandatory Security and Performance of Services in
    {Asbestos}},
  author       = {David Patrick Ziegler},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = may,
}
Acetone: A system call interface for Asbestos labels Clifford A. Frey. Master's thesis, Massachusetts Institute of Technology, May 2005.
@mastersthesis{acetone:frey-meng,
  title        = {Acetone: A System Call Interface for {Asbestos}
    Labels},
  author       = {Clifford A. Frey},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = may,
}
Peer-to-peer communication across network address translators Bryan Ford, Pyda Srisuresh, and Dan Kegel. USENIX 2005.
@inproceedings{p2pnat:usenix05,
  title        = {Peer-to-Peer Communication Across Network Address
    Translators},
  author       = {Bryan Ford and Pyda Srisuresh and Dan Kegel},
  booktitle    = {Proceedings of the 2005 USENIX Annual Technical
    Conference (USENIX '05)},
  year         = 2005,
  month        = apr,
  address      = {Anaheim, California},
}
A performance vs. cost framework for evaluating DHT design tradeoffs under churn Jinyang Li, Jeremy Stribling, Robert Morris, M. Frans Kaashoek, and Thomer M. Gil. Infocom 2005.

Protocols for distributed hash tables (DHTs) incorporate features to achieve low latency for lookup requests in the face of churn, continuous changes in membership. These protocol features can include a directed identifier space, parallel lookups, pro-active flooding of membership changes, and stabilization protocols for maintaining accurate routing. In addition, DHT protocols have parameters that can be tuned to achieve different tradeoffs between lookup latency and communication cost due to maintenance traffic. The relative importance of the features and parameters is not well understood, because most previous work evaluates protocols on static networks.

This paper presents a performance versus cost framework (PVC) that allows designers to compare the effects of different protocol features and parameter values. PVC views a protocol as consuming a certain amount of network bandwidth in order to achieve a certain lookup latency, and helps reveal the efficiency with which protocols use additional network resources to improve latency. To demonstrate the value of PVC, this paper simulates Chord, Kademlia, Kelips, OneHop, and Tapestry under different workloads and uses PVC to understand which features are more important under churn. PVC analysis shows that the key to efficiently using additional bandwidth is for a protocol to adjust its routing table size. It also shows that routing table stabilization is wasteful and can be replaced with opportunistic learning through normal lookup traffic. These insights combined demonstrate that PVC is a valuable tool for DHT designers.

@inproceedings{dhtcomparison:infocom05,
  title        = {A performance vs. cost framework for evaluating
    {DHT} design tradeoffs under churn},
  author       = {Jinyang Li and Jeremy Stribling and Robert Morris
    and M. Frans Kaashoek and Thomer M. Gil},
  booktitle    = {Proceedings of the 24th Infocom},
  year         = 2005,
  month        = mar,
  address      = {Miami, FL},
}
Opportunistic routing in multi-hop wireless networks Sanjit Zubin Biswas. Master's thesis, Massachusetts Institute of Technology, March 2005.

This thesis describes ExOR, an integrated routing and MAC protocol for bulk transfers in multi-hop wireless networks. ExOR exploits the broadcast nature of radios by making forwarding decisions based on which nodes receive each transmission. The spatial diversity among receivers provides each transmission multiple opportunities to make progress in the face of packet losses. As a result ExOR can use long links with high loss rates, which would be avoided by unicast routing.

ExOR operates on batches of packets. The source node includes a list of candidate forwarders in each packet, prioritized by closeness to the destination. Receiving nodes buffer successfully received packets and await the end of the batch. The highest priority forwarder then broadcasts the packets in its buffer, including its copy of the "batch map" in each packet. The batch map contains the sender's best guess of the highest priority node to have received each packet. The remaining forwarders then transmit in order, sending only packets which were not acknowledged in the batch maps of higher priority nodes. The forwarders continue to cycle through the priority list until the destination has enough packets to recover the original data using forward error correction.
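The batch-map rule above can be sketched in a few lines: a forwarder sends only the packets it holds that no higher-priority node is known to have. This is a hypothetical simplification (the function name and rank encoding are assumptions), not the thesis's implementation.

```python
def packets_to_forward(my_rank, received, batch_map):
    """Which packets should a forwarder of priority `my_rank` send?

    A hypothetical sketch of ExOR's batch-map rule. `received` is the
    set of packet ids this node holds; `batch_map` maps packet id to
    the best (lowest) priority rank known to hold it, where rank 0 is
    the forwarder closest to the destination. A packet already claimed
    by a higher-priority (lower-rank) node is skipped.
    """
    return {p for p in received
            if batch_map.get(p, my_rank) >= my_rank}
```

Skipping packets claimed by higher-priority receivers is what suppresses the duplicate transmissions that batch maps exist to prevent.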

An evaluation of an implementation on a 38-node 802.11b test-bed shows that ExOR improves bulk data transfer throughput for most node pairs when compared with unicast routing. For pairs between which unicast uses one or two hops, ExOR's robust batch maps prevent unnecessary retransmissions, increasing throughput by nearly 50%. For longer unicast routes, ExOR takes advantage of spatial diversity, providing gains of a factor of two to four when using a batch size of 10 packets.

@mastersthesis{roofnet:biswas-ms,
  title        = {Opportunistic Routing in Multi-Hop Wireless Networks},
  author       = {Sanjit Zubin Biswas},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = mar,
}
Arpeggio: Metadata searching and content sharing with Chord Austin T. Clements, Dan R. K. Ports, and David R. Karger. IPTPS 2005.

Arpeggio is a peer-to-peer file-sharing network based on the Chord lookup primitive. Queries for data whose metadata matches a certain criterion are performed efficiently by using a distributed keyword-set index, augmented with index-side filtering. We introduce index gateways, a technique for minimizing index maintenance overhead. Because file data is large, Arpeggio employs subrings to track live source peers without the cost of inserting the data itself into the network. Finally, we introduce postfetching, a technique that uses information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization, and a content distribution system tuned to the requirements and capabilities of a peer-to-peer network.

@inproceedings{arpeggio:iptps05,
  title        = {Arpeggio: Metadata Searching and Content Sharing
    with {C}hord},
  author       = {Austin T. Clements and Dan R. K. Ports and David R.
    Karger},
  booktitle    = {Proceedings of the 4th International Workshop on
    Peer-to-Peer Systems (IPTPS05)},
  year         = 2005,
  month        = feb,
  address      = {Ithaca, NY},
}
Overcite: A cooperative digital research library Jeremy Stribling, Isaac G. Councill, Jinyang Li, M. Frans Kaashoek, David R. Karger, Robert Morris, and Scott Shenker. IPTPS 2005.

CiteSeer is a well-known online resource for the computer science research community, allowing users to search and browse a large archive of research papers. Unfortunately, its current centralized incarnation is costly to run. Although members of the community would presumably be willing to donate hardware and bandwidth at their own sites to assist CiteSeer, the current architecture does not facilitate such distribution of resources. OverCite is a proposal for a new architecture for a distributed and cooperative research library based on a distributed hash table (DHT). The new architecture will harness resources at many sites, and thereby be able to support new features such as document alerts and scale to larger data sets.

@inproceedings{overcite:iptps05,
  title        = {OverCite: A Cooperative Digital Research Library},
  author       = {Jeremy Stribling and Isaac G. Councill and Jinyang
    Li and M. Frans Kaashoek and David R. Karger and Robert Morris and
    Scott Shenker},
  booktitle    = {Proceedings of the 4th International Workshop on
    Peer-to-Peer Systems (IPTPS05)},
  year         = 2005,
  month        = feb,
  address      = {Ithaca, NY},
}
Bit-rate selection in wireless networks John Bicket. Master's thesis, Massachusetts Institute of Technology, February 2005.
@mastersthesis{roofnet:jbicket-ms,
  title        = {Bit-rate Selection in Wireless Networks},
  author       = {John Bicket},
  school       = {Massachusetts Institute of Technology},
  year         = 2005,
  month        = feb,
}

2004

Middleboxes no longer considered harmful Michael Walfish, Jeremy Stribling, Maxwell Krohn, Hari Balakrishnan, Robert Morris, and Scott Shenker. OSDI 2004.

Intermediate network elements, such as network address translators (NATs), firewalls, and transparent caches are now commonplace. The usual reaction in the network architecture community to these so-called middleboxes is a combination of scorn (because they violate important architectural principles) and dismay (because these violations make the Internet less flexible). While we acknowledge these concerns, we also recognize that middleboxes have become an Internet fact of life for important reasons. To retain their functions while eliminating their dangerous side-effects, we propose an extension to the Internet architecture, called the Delegation-Oriented Architecture (DOA), that not only allows, but also facilitates, the deployment of middleboxes. DOA involves two relatively modest changes to the current architecture: (a) a set of references that are carried in packets and serve as persistent host identifiers and (b) a way to resolve these references to delegates chosen by the referenced host.

@inproceedings{doa:osdi04,
  title        = {Middleboxes No Longer Considered Harmful},
  author       = {Michael Walfish and Jeremy Stribling and Maxwell
    Krohn and Hari Balakrishnan and Robert Morris and Scott Shenker},
  booktitle    = {Proceedings of the 6th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '04)},
  year         = 2004,
  month        = dec,
  address      = {San Francisco, California},
}
An empirical study of spam traffic and the use of DNS black lists Jaeyeon Jung and Emil Sit. SIGCOMM 2004.
@inproceedings{empircal:imc04,
  author       = {Jaeyeon Jung and Emil Sit},
  title        = {An Empirical Study of Spam Traffic and the Use of
    {DNS} Black Lists},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} Internet
    Measurement Conference '04},
  year         = 2004,
  month        = oct,
  address      = {Taormina, Sicily, Italy},
}
Multiq: Automated detection of multiple bottleneck capacities along a path Sachin Katti, Dina Katabi, Charles Blake, Eddie Kohler, and Jacob Strauss. SIGCOMM 2004.
@inproceedings{multiq:imc04,
  title        = {MultiQ: Automated Detection of Multiple Bottleneck
    Capacities Along a Path},
  author       = {Sachin Katti and Dina Katabi and Charles Blake and
    Eddie Kohler and Jacob Strauss},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} Internet
    Measurement Conference '04},
  year         = 2004,
  month        = oct,
  address      = {Taormina, Sicily, Italy},
}
User authentication and remote execution across administrative domains Michael Kaminsky. Ph.D. thesis, Massachusetts Institute of Technology, September 2004.
@phdthesis{kaminsky-phd,
  title        = {User Authentication and Remote Execution Across
    Administrative Domains},
  author       = {Michael Kaminsky},
  school       = {Massachusetts Institute of Technology},
  year         = 2004,
  month        = sep,
}
Link-level measurements from an 802.11b mesh network Daniel Aguayo, John Bicket, Sanjit Biswas, Glenn Judd, and Robert Morris. SIGCOMM 2004.

This paper analyzes the causes of packet loss in a 38-node urban multi-hop 802.11b network. The patterns and causes of loss are important in the design of routing and error-correction protocols, as well as in network planning.

This paper makes the following observations. The distribution of inter-node loss rates is relatively uniform over the whole range of loss rates; there is no clear threshold separating "in range" and "out of range." Most links have relatively stable loss rates from one second to the next, though a small minority have very bursty losses at that time scale. Signal-to-noise ratio and distance have little predictive value for loss rate. The large number of links with intermediate loss rates is probably due to multi-path fading rather than attenuation or interference.

The phenomena discussed here are all well-known. The contributions of this paper are an understanding of their relative importance, of how they interact, and of the implications for MAC and routing protocol design.

@inproceedings{roofnet:sigcomm04,
  title        = {Link-level Measurements from an 802.11b Mesh Network},
  author       = {Daniel Aguayo and John Bicket and Sanjit Biswas and
    Glenn Judd and Robert Morris},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '04 Conference},
  year         = 2004,
  month        = aug,
  address      = {Portland, Oregon},
}
Vivaldi: A decentralized network coordinate system Frank Dabek, Russ Cox, Frans Kaashoek, and Robert Morris. SIGCOMM 2004.

This paper is posted as corrected on August 28, 2004. The original version of the paper incorrectly described how often nodes measure the RTT to another node. Nodes keep one RTT probe RPC outstanding (instead of sending one probe per second).

Large-scale Internet applications can benefit from an ability to predict round-trip times to other hosts without having to contact them first. Explicit measurements are often unattractive because the cost of measurement can outweigh the benefits of exploiting proximity information. Vivaldi is a simple, light-weight algorithm that assigns synthetic coordinates to hosts such that the distance between the coordinates of two hosts accurately predicts the communication latency between the hosts.

Vivaldi is fully distributed, requiring no fixed network infrastructure and no distinguished hosts. It is also efficient: a new host can compute good coordinates for itself after collecting latency information from only a few other hosts. Because it requires little communication, Vivaldi can piggy-back on the communication patterns of the application using it and scale to a large number of hosts.

An evaluation of Vivaldi using a simulated network whose latencies are based on measurements among 1740 Internet hosts shows that a 2-dimensional Euclidean model with height vectors embeds these hosts with low error (the median relative error in round-trip time prediction is 11 percent).
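Vivaldi's core update is a single spring-relaxation step: each measured RTT pulls (or pushes) a node's coordinate along the line toward the peer until coordinate distance matches latency. Below is a minimal sketch of that step in its simplest form, omitting the paper's height vectors and adaptive timestep; the function name and fixed `delta` are assumptions for illustration.

```python
import math

def vivaldi_update(xi, xj, rtt, delta=0.25):
    """One simplified Vivaldi step: move xi along the spring to xj.

    xi, xj: coordinate tuples; rtt: measured latency to the peer.
    The spring's displacement is (rtt - |xi - xj|); the node moves a
    fraction `delta` of it along the unit vector from xj toward xi.
    """
    dist = math.dist(xi, xj)
    error = rtt - dist                      # spring displacement
    if dist > 0:
        u = tuple((a - b) / dist for a, b in zip(xi, xj))
    else:
        u = (1.0,) + (0.0,) * (len(xi) - 1)  # arbitrary direction
    return tuple(a + delta * error * ua for a, ua in zip(xi, u))
```

For example, a node at the origin measuring a 10 ms RTT to a peer whose coordinate is 5 units away moves away from the peer, so the coordinate distance grows toward the measured latency; repeated over many samples, coordinates settle into the low-error embedding the evaluation describes.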

@inproceedings{vivaldi:sigcomm,
  title        = {Vivaldi: A Decentralized Network Coordinate System},
  author       = {Frank Dabek and Russ Cox and Frans Kaashoek and
    Robert Morris},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '04 Conference},
  year         = 2004,
  month        = aug,
  address      = {Portland, Oregon},
}
Building secure high-performance web services with OKWS Maxwell Krohn. USENIX 2004.
@inproceedings{krohn:okws,
  title        = {Building Secure High-Performance Web Services with
    {OKWS}},
  author       = {Maxwell Krohn},
  address      = {Boston, MA},
  month        = jun,
  booktitle    = {Proceedings of the 2004 USENIX Annual Technical
    Conference (USENIX '04)},
  year         = 2004,
}
REX: Secure, Extensible Remote Execution Michael Kaminsky, Eric Peterson, Daniel B. Giffin, Kevin Fu, David Mazières, and M. Frans Kaashoek. USENIX 2004.

This version of the paper contains corrected values for the round-trip time reported in Section 6.2 and the latencies reported in Table 2. The previous values were only one-way instead of round-trip.

The ubiquitous SSH package has demonstrated the importance of secure remote login and execution. As remote execution tools grow in popularity, users require new features and extensions, which are difficult to add to existing systems. REX is a remote execution utility with a novel architecture specifically designed for extensibility as well as security and transparent connection persistence in the face of network complexities such as NAT and dynamic IP addresses. To achieve extensibility, REX bases much of its functionality on a single new abstraction--emulated file descriptor passing across machines. This abstraction is powerful enough for users to extend REX's functionality in many ways without changing the core software or protocol.

REX addresses security in two ways. First, the implementation internally leverages file descriptor passing to split the server into several smaller programs, reducing both privileged and remotely exploitable code. Second, REX selectively delegates authority to processes running on remote machines that need to access other resources. The delegation mechanism lets users incrementally construct trust policies for remote machines. Finally, REX provides mechanisms for accessing servers without globally routable IP addresses, and for resuming sessions when a TCP connection aborts or an endpoint's IP address changes. Measurements of the system demonstrate that REX's architecture does not come at the cost of performance.

@inproceedings{rex:usenix04,
  title        = {{REX}: {S}ecure, {E}xtensible {R}emote {E}xecution},
  author       = {Michael Kaminsky and Eric Peterson and Daniel B.
    Giffin and Kevin Fu and David Mazi{\`e}res and M. Frans Kaashoek},
  address      = {Boston, MA},
  month        = jun,
  pages        = {199--212},
  booktitle    = {Proceedings of the 2004 USENIX Annual Technical
    Conference (USENIX '04)},
  year         = 2004,
}
High-throughput routing for multi-hop wireless networks Douglas S. J. De Couto. Ph.D. thesis, Massachusetts Institute of Technology, June 2004.
@phdthesis{decouto-phd,
  title        = {High-Throughput Routing for Multi-Hop Wireless
    Networks},
  author       = {Douglas S. J. {De Couto}},
  school       = {Massachusetts Institute of Technology},
  year         = 2004,
  month        = jun,
}
On-the-fly verification of rateless erasure codes for efficient content distribution Maxwell N. Krohn, Michael J. Freedman, and David Mazières. IEEE SP 2004.

The quality of peer-to-peer content distribution can suffer when malicious participants intentionally corrupt content. Some systems using simple block-by-block downloading can verify blocks with traditional cryptographic signatures and hashes, but these techniques do not apply well to more elegant systems that use rateless erasure codes for efficient multicast transfers. This paper presents a practical scheme, based on homomorphic hashing, that enables a downloader to perform on-the-fly verification of erasure-encoded blocks.
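The key property a homomorphic hash provides is that the hash of a sum of blocks equals the product of their hashes, so a downloader can check an erasure-coded block against the published hashes of the original blocks. The toy below shows that property with a discrete-log-style construction h(b) = Π gⱼ^bⱼ mod p; the parameters are far too small to be secure and are chosen only so the arithmetic is visible, not the paper's actual scheme or sizes.

```python
# Toy homomorphic hash over a prime-order subgroup of Z_p*.
# q divides p - 1, so squares mod p have order q; exponents live in Z_q.
P, Q = 23, 11
GENS = [4, 9, 2]  # squares mod 23, each of multiplicative order 11

def hhash(block):
    """Hash a block (list of symbols in Z_Q) into Z_P*."""
    h = 1
    for g, b in zip(GENS, block):
        h = h * pow(g, b, P) % P
    return h

def add_blocks(b1, b2):
    """Symbol-wise sum mod Q, as when combining coded blocks."""
    return [(x + y) % Q for x, y in zip(b1, b2)]
```

Because exponents add when group elements multiply, hhash(add_blocks(a, b)) equals hhash(a) * hhash(b) mod P, which is exactly what lets a downloader verify an encoded block on the fly from the hashes of the blocks it combines.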

@inproceedings{otfvec,
  title        = {On-the-Fly Verification of Rateless Erasure Codes
    for Efficient Content Distribution},
  author       = {Maxwell N. Krohn and Michael J. Freedman and David
    Mazi{\`e}res},
  booktitle    = {Proceedings of the {IEEE} Symposium on Security and
    Privacy},
  address      = {Oakland, CA},
  month        = may,
  year         = 2004,
}
Providing asynchronous file I/O for the Plan 9 operating system Jason Hickey. Master's thesis, Massachusetts Institute of Technology, May 2004.
@mastersthesis{plan9:jmhickey-meng,
  title        = {Providing Asynchronous File {I/O} for the {Plan 9}
    Operating System},
  author       = {Jason Hickey},
  school       = {Massachusetts Institute of Technology},
  year         = 2004,
  month        = may,
}
M&M: A passive toolkit for measuring, correlating, and tracking path characteristics Sachin Katti, Dina Katabi, Eddie Kohler, and Jacob Strauss. MIT CSAIL technical report, April 2004.
@techreport{mnm:katti04-tr,
  author       = {Sachin Katti and Dina Katabi and Eddie Kohler and
    Jacob Strauss},
  title        = {{M\&M}: A Passive Toolkit for Measuring,
    Correlating, and Tracking Path Characteristics},
  institution  = {{MIT} Computer Science and Artificial Intelligence
    Laboratory},
  number       = {MIT-CSAIL-TR-945},
  year         = 2004,
  month        = apr,
}
Designing a DHT for low latency and high throughput Frank Dabek, Jinyang Li, Emil Sit, James Robertson, M. Frans Kaashoek, and Robert Morris. NSDI 2004.

Designing a wide-area distributed hash table (DHT) that provides high-throughput and low-latency network storage is a challenge. Existing systems have explored a range of solutions, including iterative routing, recursive routing, proximity routing and neighbor selection, erasure coding, replication, and server selection.

This paper explores the design of these techniques and their interaction in a complete system, drawing on the measured performance of a new DHT implementation and results from a simulator with an accurate Internet latency model. New techniques that resulted from this exploration include use of latency predictions based on synthetic coordinates, efficient integration of lookup routing and data fetching, and a congestion control mechanism suitable for fetching data striped over large numbers of servers.

Measurements with 425 server instances running on 150 PlanetLab and RON hosts show that the latency optimizations reduce the time required to locate and fetch data by a factor of two. The throughput optimizations result in a sustainable bulk read throughput related to the number of DHT hosts times the capacity of the slowest access link; with 150 selected PlanetLab hosts, the peak aggregate throughput over multiple clients is 12.8 megabytes per second.

@inproceedings{dhash:nsdi,
  title        = {Designing a {DHT} for low latency and high
    throughput},
  author       = {Frank Dabek and Jinyang Li and Emil Sit and James
    Robertson and M. Frans Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 1st {USENIX} {S}ymposium on
    {N}etworked {S}ystems {D}esign and {I}mplementation ({NSDI} '04)},
  year         = 2004,
  month        = mar,
  address      = {San Francisco, California},
}
Comparing the performance of distributed hash tables under churn Jinyang Li, Jeremy Stribling, Thomer M. Gil, Robert Morris, and M. Frans Kaashoek. IPTPS 2004.

A protocol for a distributed hash table (DHT) incurs communication costs to keep up with churn---changes in membership---in order to maintain its ability to route lookups efficiently. This paper formulates a unified framework for evaluating cost and performance. Communication costs are combined into a single cost measure (bytes), and performance benefits are reduced to a single latency measure. This approach correctly accounts for background maintenance traffic and timeouts during lookup due to stale routing data, and also correctly leaves open the possibility of different preferences in the tradeoff of lookup time versus communication cost. Using the unified framework, this paper analyzes the effects of DHT parameters on the performance of four protocols under churn.

@inproceedings{dhtcomparison:iptps04,
  title        = {Comparing the performance of distributed hash tables
    under churn},
  author       = {Jinyang Li and Jeremy Stribling and Thomer M. Gil
    and Robert Morris and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 3rd International Workshop on
    Peer-to-Peer Systems (IPTPS04)},
  year         = 2004,
  month        = feb,
  address      = {San Diego, CA},
}
UsenetDHT: A low overhead Usenet server Emil Sit, Frank Dabek, and James Robertson. IPTPS 2004.
@inproceedings{usenetdht:iptps04,
  title        = {{UsenetDHT}: A Low Overhead {Usenet} Server},
  author       = {Emil Sit and Frank Dabek and James Robertson},
  booktitle    = {Proceedings of the 3rd International Workshop on
    Peer-to-Peer Systems (IPTPS04)},
  year         = 2004,
  month        = feb,
  address      = {San Diego, CA},
}
Parsing expression grammars: A recognition-based syntactic foundation Bryan Ford. POPL 2004.
@inproceedings{parsing:popl04,
  author       = {Bryan Ford},
  title        = {Parsing Expression Grammars: A Recognition-Based
    Syntactic Foundation},
  booktitle    = {Proceedings of the Symposium on Principles of
    Programming Languages},
  year         = 2004,
  month        = jan,
}

2003

Unmanaged internet protocol: Taming the edge network management crisis Bryan Ford. HotNets 2003.
@inproceedings{uip:hotnets03,
  author       = {Bryan Ford},
  title        = {Unmanaged Internet Protocol: Taming the Edge Network
    Management Crisis},
  booktitle    = {Proceedings of the Second {W}orkshop on {H}ot
    {T}opics in {N}etworks ({HotNets-II})},
  year         = 2003,
  month        = nov,
  organization = {{ACM SIGCOMM}},
  address      = {Cambridge, Massachusetts},
}
Practical, distributed network coordinates Russ Cox, Frank Dabek, Frans Kaashoek, Jinyang Li, and Robert Morris. HotNets 2003.

Vivaldi is a distributed algorithm that assigns synthetic coordinates to Internet hosts, so that the Euclidean distance between two hosts' coordinates predicts the network latency between them. Each node in Vivaldi computes its coordinates by simulating its position in a network of physical springs. Vivaldi is both distributed and efficient: no fixed infrastructure need be deployed and a new host can compute useful coordinates after collecting latency information from only a few other hosts. Vivaldi can rely on piggy-backing latency information on application traffic instead of generating extra traffic by sending its own probe packets.

This paper evaluates Vivaldi through simulations of 750 hosts, with a matrix of inter-host latencies derived from measurements between 750 real Internet hosts. Vivaldi finds synthetic coordinates that predict the measured latencies with a median relative error of 14 percent. The simulations show that a new host joining an existing Vivaldi system requires fewer than 10 probes to achieve this accuracy. Vivaldi is currently used by the Chord distributed hash table to perform proximity routing, replica selection, and retransmission timer estimation.

@inproceedings{hotnets:vivaldi,
  title        = {Practical, Distributed Network Coordinates},
  author       = {Russ Cox and Frank Dabek and Frans Kaashoek and
    Jinyang Li and Robert Morris},
  booktitle    = {Proceedings of the Second {W}orkshop on {H}ot
    {T}opics in {N}etworks ({HotNets-II})},
  year         = 2003,
  month        = nov,
  organization = {{ACM SIGCOMM}},
  address      = {Cambridge, Massachusetts},
}
Opportunistic routing in multi-hop wireless networks Sanjit Biswas and Robert Morris. HotNets 2003.

This paper describes Extremely Opportunistic Routing (ExOR), a new unicast routing technique for multi-hop wireless networks. ExOR forwards each packet through a sequence of nodes, deferring the choice of each node in the sequence until after the previous node has transmitted the packet on its radio. ExOR then determines which node, of all the nodes that successfully received that transmission, is the node closest to the destination. That closest node transmits the packet. The result is that each hop moves the packet farther (on average) than the hops of the best possible pre-determined route.

The ExOR design addresses the challenge of choosing a forwarding node after transmission using a distributed algorithm. First, when a node transmits a packet, it includes in the packet a simple schedule describing the priority order in which the potential receivers should forward the packet. The node computes the schedule based on shared measurements of inter-node delivery rates. ExOR then uses a distributed slotted MAC protocol for acknowledgments to ensure that the receivers agree who the highest priority receiver was.

The efficacy of ExOR depends mainly on the rate at which the reception probability falls off with distance. Simulations based on measured radio characteristics suggest that ExOR reduces the total number of transmissions by nearly a factor of two over the best possible pre-determined route.
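Once each node knows which receivers got the transmission, the forwarder choice reduces to a scan of the sender's priority schedule (a minimal sketch; in the actual protocol the receivers reach this agreement distributedly via the slotted MAC, and the helper name is hypothetical):

```python
def choose_forwarder(schedule, receivers):
    """schedule: node ids in priority order, closest-to-destination first.
    The forwarder is the highest-priority node that actually received
    the packet; None means nobody heard it and the sender retransmits."""
    got = set(receivers)
    for node in schedule:
        if node in got:
            return node
    return None
```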

@inproceedings{roofnet:exor-hotnets03,
  title        = {Opportunistic Routing in Multi-Hop Wireless Networks},
  author       = {Sanjit Biswas and Robert Morris},
  booktitle    = {Proceedings of the Second {W}orkshop on {H}ot
    {T}opics in {N}etworks ({HotNets-II})},
  year         = 2003,
  month        = nov,
  organization = {{ACM SIGCOMM}},
  address      = {Cambridge, Massachusetts},
}
Measuring the effects of internet path faults on reactive routing Nick Feamster, David G. Andersen, Hari Balakrishnan, and M. Frans Kaashoek. SIGMETRICS 2003.
@inproceedings{sigmetrics:routing,
  title        = {Measuring the effects of internet path faults on
    reactive routing},
  author       = {Nick Feamster and David G. Andersen and Hari
    Balakrishnan and M. Frans Kaashoek},
  pages        = {126--137},
  booktitle    = {Proceedings of the 2003 Conference on Measurement
    and Modelling of Computer Systems},
  year         = 2003,
  month        = jun,
  organization = {{ACM SIGMETRICS}},
  address      = {San Diego, CA},
}
Scalable Internet routing on topology-independent node identities Bryan Ford. MIT LCS technical report, October 2003.
@techreport{uip:ford03-tr,
  author       = {Bryan Ford},
  title        = {Scalable {Internet} Routing on Topology-Independent
    Node Identities},
  institution  = {{MIT} Laboratory for Computer Science},
  number       = {MIT-LCS-TR-926},
  year         = 2003,
  month        = oct,
}
A Measurement Study of Available Bandwidth Estimation Tools Jacob Strauss, Dina Katabi, and Frans Kaashoek. IMC 2003.
@inproceedings{spruce:imc03,
  title        = {{A Measurement Study of Available Bandwidth
    Estimation Tools}},
  author       = {Jacob Strauss and Dina Katabi and Frans Kaashoek},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} Internet
    Measurement Conference '03},
  year         = 2003,
  month        = oct,
  address      = {Miami, Florida},
}
Decentralized user authentication in a global file system Michael Kaminsky, George Savvides, David Mazières, and M. Frans Kaashoek. SOSP 2003.

The challenge for user authentication in a global file system is allowing people to grant access to specific users and groups in remote administrative domains, without assuming any kind of pre-existing administrative relationship. The traditional approach to user authentication across administrative domains is for users to prove their identities through a chain of certificates. Certificates allow for general forms of delegation, but they often require more infrastructure than is necessary to support a network file system.

This paper introduces an approach without certificates. Local authentication servers pre-fetch and cache remote user and group definitions from remote authentication servers. During a file access, an authentication server can establish identities for users based just on local information. This approach is particularly well-suited to file systems, and it provides a simple and intuitive interface that is similar to those found in local access control mechanisms. An implementation of the authentication server and a file server supporting access control lists demonstrate the viability of this design in the context of the Self-certifying File System (SFS). Experiments demonstrate that the authentication server can scale to groups with tens of thousands of members.

@inproceedings{sfs:sosp03,
  title        = {{D}ecentralized User Authentication in a Global File
    System},
  author       = {Michael Kaminsky and George Savvides and David
    Mazi{\`e}res and M. Frans Kaashoek},
  pages        = {60--73},
  booktitle    = {Proceedings of the 19th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '03)},
  year         = 2003,
  month        = oct,
  address      = {Bolton Landing, New York},
}
A high-throughput path metric for multi-hop wireless routing Douglas S. J. De Couto, Daniel Aguayo, John Bicket, and Robert Morris. MobiCom 2003.

This paper presents the expected transmission count metric (ETX), which finds high-throughput paths on multi-hop wireless networks. ETX minimizes the expected total number of packet transmissions (including retransmissions) required to successfully deliver a packet to the ultimate destination. The ETX metric incorporates the effects of link loss ratios, asymmetry in the loss ratios between the two directions of each link, and interference among the successive links of a path. In contrast, the minimum hop-count metric chooses arbitrarily among the different paths of the same minimum length, regardless of the often large differences in throughput among those paths, and ignoring the possibility that a longer path might offer higher throughput.

This paper describes the design and implementation of ETX as a metric for the DSDV and DSR routing protocols, as well as modifications to DSDV and DSR which allow them to use ETX. Measurements taken from a 29-node 802.11b test-bed demonstrate the poor performance of minimum hop-count, illustrate the causes of that poor performance, and confirm that ETX improves performance. For long paths the throughput improvement is often a factor of two or more, suggesting that ETX will become more useful as networks grow larger and paths become longer.
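The metric itself is compact: a link's ETX is the reciprocal of the product of its forward and reverse delivery ratios (both the data frame and its ACK must arrive), and a path's ETX is the sum over its links. A small sketch (function names are illustrative):

```python
def link_etx(df, dr):
    """Expected transmissions on one link, given forward (df) and
    reverse (dr) delivery ratios in (0, 1]."""
    return 1.0 / (df * dr)

def path_etx(links):
    """Path metric: sum of per-link ETX values for a list of
    (df, dr) pairs along the route."""
    return sum(link_etx(df, dr) for df, dr in links)
```

Note how asymmetry matters: a link that delivers data perfectly but loses half the ACKs costs two expected transmissions, the same as one that loses half the data frames.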

@inproceedings{grid:mobicom03,
  title        = {A High-Throughput Path Metric for Multi-Hop Wireless
    Routing},
  author       = {Douglas S. J. {De Couto} and Daniel Aguayo and John
    Bicket and Robert Morris},
  booktitle    = {Proceedings of the 9th {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '03)},
  year         = 2003,
  month        = sep,
  address      = {San Diego, California},
}
SSL splitting: securely serving data from untrusted caches Chris Lesniewski-Laas and M. Frans Kaashoek. USENIX Security 2003.

A popular technique for reducing the bandwidth load on Web servers is to serve the content from proxies. Typically these hosts are trusted by the clients and server not to modify the data that they proxy. SSL splitting is a new technique for guaranteeing the integrity of data served from proxies without requiring changes to Web clients. Instead of relaying an insecure HTTP connection, an SSL splitting proxy simulates a normal Secure Sockets Layer (SSL) connection with the client by merging authentication records from the server with data records from a cache. This technique reduces the bandwidth load on the server, while allowing an unmodified Web browser to verify that the data served from proxies is endorsed by the originating server.

SSL splitting is implemented as a patch to the industry-standard OpenSSL library, with which the server is linked. In experiments replaying two-hour access.log traces taken from LCS Web sites over an ADSL link, SSL splitting reduces bandwidth consumption of the server by between 25% and 90% depending on the warmth of the cache and the redundancy of the trace. Uncached requests forwarded through the proxy exhibit latencies within approximately 5% of those of an unmodified SSL server.
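The split can be illustrated with a toy record stream (a hypothetical sketch, not the SSL record format: HMAC stands in for the SSL MACs, and `KEY` stands in for the negotiated session secret). The server ships only authenticators, the proxy supplies payloads from its cache, and the client accepts a payload only if it matches its authenticator:

```python
import hashlib
import hmac

KEY = b"session-key"  # hypothetical client/server shared secret

def server_auth_records(payloads):
    """Server side: one authenticator per data record."""
    return [hmac.new(KEY, p, hashlib.sha256).digest() for p in payloads]

def proxy_merge(auth_records, cached_payloads):
    """Untrusted proxy: pair authenticators with cached data records."""
    return list(zip(auth_records, cached_payloads))

def client_verify(stream):
    """Client side: every cached payload must match its authenticator."""
    return all(
        hmac.compare_digest(tag, hmac.new(KEY, p, hashlib.sha256).digest())
        for tag, p in stream
    )
```

A proxy that serves stale or modified data produces a MAC mismatch, so integrity holds even though the bulk bytes never traverse the server's link.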

@inproceedings{ssl-splitting:usenixsecurity03,
  title        = {{SSL} splitting: securely serving data from
    untrusted caches},
  author       = {Chris Lesniewski-Laas and M. Frans Kaashoek},
  pages        = {187--199},
  booktitle    = {Proceedings of the 12th {USENIX} {S}ecurity
    {S}ymposium},
  year         = 2003,
  month        = aug,
  address      = {Washington, D.C.},
}
Experience with an evolving overlay network testbed David Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert Morris. ACM SIGCOMM Computer Communication Review 33(3), July 2003.
@article{ron:ccr,
  title        = {Experience with an evolving overlay network testbed},
  author       = {David Andersen and Hari Balakrishnan and M. Frans
    Kaashoek and Robert Morris},
  month        = jul,
  volume       = 33,
  number       = 3,
  pages        = {13--19},
  journal      = {{ACM SIGCOMM} Computer Communication Review},
  year         = 2003,
}
Multiprocessor support for event-driven programs Nickolai Zeldovich, Alexander Yip, Frank Dabek, Robert Morris, David Mazières, and Frans Kaashoek. USENIX 2003.

This paper presents a new asynchronous programming library (libasync-smp) that allows event-driven applications to take advantage of multiprocessors by running code for event handlers in parallel. To control the concurrency between events, the programmer can specify a color for each event: events with the same color (the default case) are handled serially; events with different colors can be handled in parallel. The programmer can incrementally expose parallelism in existing event-driven applications by assigning different colors to computationally-intensive events that do not share mutable state.

An evaluation of libasync-smp demonstrates that applications achieve multiprocessor speedup with little programming effort. As an example, parallelizing the cryptography in the SFS file server required about 90 lines of changed code in two modules, out of a total of about 12,000 lines. Multiple clients were able to read large cached files from the libasync-smp SFS server running on a 4-CPU machine 2.5 times as fast as from an unmodified uniprocessor SFS server on one CPU. Applications without computationally intensive tasks also benefit: an event-driven Web server achieves a 1.5x speedup on four CPUs with multiple clients reading small cached files.
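The color rule can be sketched as a batch executor (a hypothetical illustration, not the libasync-smp API): events with the same color execute serially in submission order on one worker, while distinct colors may run concurrently.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def run_events(events):
    """events: list of (color, fn) pairs. Returns {color: [results]}.
    Same-color events run in order on one worker; different colors
    run on separate workers in parallel."""
    queues = defaultdict(list)
    for color, fn in events:
        queues[color].append(fn)

    def run_serially(fns):
        return [fn() for fn in fns]    # serial execution within a color

    with ThreadPoolExecutor() as pool:
        futures = {c: pool.submit(run_serially, fns) for c, fns in queues.items()}
        return {c: f.result() for c, f in futures.items()}
```

Assigning every event the same color recovers the serial event loop; giving CPU-heavy, state-independent events their own colors exposes parallelism incrementally, which is the migration path the paper describes.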

@inproceedings{asyncmp:usenix03,
  title        = {Multiprocessor Support for Event-Driven Programs},
  author       = {Nickolai Zeldovich and Alexander Yip and Frank Dabek
    and Robert Morris and David Mazi{\`e}res and Frans Kaashoek},
  booktitle    = {Proceedings of the 2003 USENIX Annual Technical
    Conference (USENIX '03)},
  year         = 2003,
  month        = jun,
  address      = {San Antonio, Texas},
}
Certifying program execution with secure processors Benjie Chen and Robert Morris. HotOS 2003.
@inproceedings{cerium:hotos03,
  title        = {{C}ertifying Program Execution with Secure
    Processors},
  author       = {Benjie Chen and Robert Morris},
  booktitle    = {Proceedings of the 9th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-IX})},
  year         = 2003,
  month        = may,
  address      = {Lihue, Hawaii},
}
Robust and efficient data management for a distributed hash table Josh Cates. Master's thesis, Massachusetts Institute of Technology, May 2003.
@mastersthesis{chord:cates-meng,
  title        = {Robust and Efficient Data Management for a
    Distributed Hash Table},
  author       = {Josh Cates},
  school       = {Massachusetts Institute of Technology},
  year         = 2003,
  month        = may,
}
On the feasibility of peer-to-peer web indexing and search Jinyang Li, Boon Thau Loo, Joseph M. Hellerstein, M. Frans Kaashoek, David Karger, and Robert Morris. IPTPS 2003.
@inproceedings{p2psearch:iptps03,
  title        = {On the feasibility of peer-to-peer web indexing and
    search},
  author       = {Jinyang Li and Boon Thau Loo and Joseph M.
    Hellerstein and M. Frans Kaashoek and David Karger and Robert
    Morris},
  booktitle    = {Proceedings of the 2nd International Workshop on
    Peer-to-Peer Systems (IPTPS03)},
  year         = 2003,
  month        = feb,
  address      = {Berkeley, CA},
}
Towards a common API for structured peer-to-peer overlays Frank Dabek, Ben Zhao, Peter Druschel, John Kubiatowicz, and Ion Stoica. IPTPS 2003.

In this paper, we describe an ongoing effort to define common APIs for structured peer-to-peer overlays and the key abstractions that can be built on them. In doing so, we hope to facilitate independent innovation in overlay protocols, services, and applications, to allow direct experimental comparisons, and to encourage application development by third parties. We provide a snapshot of our efforts and discuss open problems in an effort to solicit feedback from the research community.

@inproceedings{iptps:apis,
  title        = {Towards a Common {API} for Structured Peer-to-Peer
    Overlays},
  author       = {Frank Dabek and Ben Zhao and Peter Druschel and John
    Kubiatowicz and Ion Stoica},
  booktitle    = {Proceedings of the 2nd International Workshop on
    Peer-to-Peer Systems (IPTPS03)},
  year         = 2003,
  month        = feb,
  address      = {Berkeley, CA},
}
Chord: A scalable peer-to-peer lookup service for internet applications Ion Stoica, Robert Morris, David Liben-Nowell, David Karger, M. Frans Kaashoek, Frank Dabek, and Hari Balakrishnan. IEEE/ACM Transactions on Networking 11, February 2003.

A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
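The key-to-node mapping can be shown in a few lines (a toy successor lookup on a small identifier ring; the protocol itself finds the successor in a logarithmic number of hops via finger tables rather than scanning a node list, and the helper names here are illustrative):

```python
import hashlib

def chord_id(name, bits=8):
    """Hash a name onto a 2**bits identifier ring (tiny ring for clarity)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** bits)

def successor(node_ids, key_id):
    """A key lives at its successor: the first node at or after the
    key's identifier, wrapping around the ring."""
    ring = sorted(node_ids)
    for n in ring:
        if n >= key_id:
            return n
    return ring[0]   # wrapped past the highest node id
```

Storing the key/data pair at `successor(nodes, chord_id(key))` gives the data-location layer the abstract mentions; when nodes join or leave, only the keys between a node and its predecessor move.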

@article{chord:ton,
  title        = {Chord: A Scalable Peer-to-peer Lookup Service for
    Internet Applications},
  author       = {Ion Stoica and Robert Morris and David Liben-Nowell
    and David Karger and M. Frans Kaashoek and Frank Dabek and Hari
    Balakrishnan},
  journal      = {{IEEE} Transactions on Networking},
  volume       = 11,
  year         = 2003,
  month        = feb,
}
Looking up data in P2P systems Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Communications of the ACM 46, February 2003.
@article{acm:lookup,
  title        = {Looking up data in {P2P} systems},
  author       = {Hari Balakrishnan and M. Frans Kaashoek and David
    Karger and Robert Morris and Ion Stoica},
  volume       = 46,
  month        = feb,
  year         = 2003,
  pages        = {43--48},
  journal      = {Communications of the ACM},
}
Melody: A distributed music-sharing system James Robertson. Master's thesis, Massachusetts Institute of Technology, February 2003.
@mastersthesis{chord:jsr-meng,
  title        = {Melody: A Distributed Music-Sharing System},
  author       = {James Robertson},
  school       = {Massachusetts Institute of Technology},
  year         = 2003,
  month        = feb,
}
SSL splitting and barnraising: Cooperative caching with authenticity guarantees Chris Lesniewski-Laas. Master's thesis, Massachusetts Institute of Technology, February 2003.

SSL splitting is a cryptographic technique to guarantee that public data served by caching Web proxies is endorsed by the originating server. When a client makes a request, the trusted server generates a stream of authentication records and sends them to the untrusted proxy, which combines them with a stream of data records retrieved from its local cache. The combined stream is relayed to the client, a standard Web browser, which verifies the data's integrity. Since the combined stream simulates a normal Secure Sockets Layer (SSL) connection, SSL splitting works with unmodified browsers; however, since it does not provide confidentiality, it is appropriate for applications that require only authentication. The server must be linked to a patched version of the industry-standard OpenSSL library; no other server modifications are necessary. In experiments replaying two-hour access.log traces taken from LCS Web sites over a DSL link, SSL splitting reduces bandwidth consumption of the server by between 25% and 90% depending on the warmth of the cache and the redundancy of the trace. Uncached requests forwarded through the proxy exhibit latencies within approximately 5% of those of an unmodified SSL server.

@mastersthesis{ssl-splitting:ctl-meng,
  title        = {{SSL} splitting and Barnraising: Cooperative caching
    with authenticity guarantees},
  author       = {Chris Lesniewski-Laas},
  school       = {Massachusetts Institute of Technology},
  year         = 2003,
  month        = feb,
}
REX: Secure, modular remote execution through file descriptor passing Michael Kaminsky, Eric Peterson, Kevin Fu, David Mazières, and M. Frans Kaashoek. MIT LCS technical report, January 2003.

The ubiquitous SSH package has demonstrated the importance of secure remote login and execution. This paper presents a new system, REX, designed to provide remote login and execution in the context of the SFS secure distributed file system. REX departs from traditional remote login design and is built around two main mechanisms---file descriptor passing and a user agent process.

File descriptor passing allows REX to be split into several smaller pieces; privileged code can run as its own process to provide enhanced security guarantees. REX also emulates secure file descriptor passing over network connections, allowing users to build extensions to REX outside of the core REX software.

REX uses and extends SFS's agent mechanism to provide a transparent distributed computing environment to users. The agent stores private keys, server nicknames, and other per-user configuration state; REX makes the SFS agent available to programs that it executes on remote machines.

We have an implementation of REX and demonstrate that its flexibility does not come at the cost of performance. Initial REX connections are comparable to those of SSH in speed, while subsequent connections are much faster because REX exploits the SFS agent to cache connection state to avoid costly public-key operations.

@techreport{sfs:rextr03,
  title        = {{REX}: Secure, modular remote execution through file
    descriptor passing},
  author       = {Michael Kaminsky and Eric Peterson and Kevin Fu and
    David Mazi{\`e}res and M. Frans Kaashoek},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 2003,
  month        = jan,
  number       = {MIT-LCS-TR-884},
  note         = {http://www.pdos.lcs.mit.edu/papers/sfs:rextr03/},
}

2002

Ivy: A read/write peer-to-peer file system Athicha Muthitacharoen, Robert Morris, Thomer Gil, and Benjie Chen. OSDI 2002.
@inproceedings{ivy:osdi02,
  title        = {Ivy: A Read/Write Peer-to-peer File System},
  author       = {Athicha Muthitacharoen and Robert Morris and Thomer
    Gil and Benjie Chen},
  booktitle    = {Proceedings of the 5th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '02)},
  year         = 2002,
  month        = dec,
  address      = {Boston, Massachusetts},
}
A session-based architecture for Internet mobility Alex C. Snoeren. Ph.D. thesis, Massachusetts Institute of Technology, December 2002.
@phdthesis{snoeren-phd,
  title        = {A Session-Based Architecture for {Internet} Mobility},
  author       = {Alex C. Snoeren},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = dec,
}
Tarzan: A peer-to-peer anonymizing network layer Michael J. Freedman and Robert Morris. CCS 2002.

Tarzan is a peer-to-peer anonymous IP network overlay. Because it provides IP service, Tarzan is general-purpose and transparent to applications. Organized as a decentralized peer-to-peer overlay, Tarzan is fault-tolerant, highly scalable, and easy to manage.

Tarzan achieves its anonymity with layered encryption and multi-hop routing, much like a Chaumian mix. A message initiator chooses a path of peers pseudo-randomly through a restricted topology in a way that adversaries cannot easily influence. Cover traffic prevents a global observer from using traffic analysis to identify an initiator. Protocols toward unbiased peer-selection offer new directions for distributing trust among untrusted entities.

Tarzan provides anonymity to either clients or servers, without requiring that both participate. In both cases, Tarzan uses a network address translator (NAT) to bridge between Tarzan hosts and oblivious Internet hosts.

Measurements show that Tarzan imposes minimal overhead over a corresponding non-anonymous overlay route.
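The layered-encryption idea can be illustrated with a toy example (XOR stands in for the real per-hop cipher, and key establishment is omitted entirely): the initiator applies the hops' layers in reverse path order, so each relay in path order peels exactly one layer.

```python
def xor(data, key):
    """Stand-in 'cipher': XOR with a repeating key (symmetric, so the
    same call both adds and removes a layer)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def wrap(message, path_keys):
    """Initiator: encrypt for the last hop first, innermost layer last
    to be applied, so the first relay's layer is outermost."""
    for key in reversed(path_keys):
        message = xor(message, key)
    return message

def route(message, path_keys):
    """Each relay along the path removes its own layer; the final hop
    recovers the cleartext."""
    for key in path_keys:
        message = xor(message, key)
    return message
```

No single relay sees both the initiator's identity and the cleartext destination payload, which is the property the layering is after; a real mix additionally pads and re-keys per hop.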

@inproceedings{tarzan:ccs9,
  title        = {Tarzan: A Peer-to-Peer Anonymizing Network Layer},
  author       = {Michael J. Freedman and Robert Morris},
  booktitle    = {Proceedings of the 9th {ACM} Conference on Computer
    and Communications Security ({CCS-9})},
  year         = 2002,
  month        = nov,
  address      = {Washington, D.C.},
}
Programming language optimizations for modular router configurations Eddie Kohler, Robert Morris, and Benjie Chen. ASPLOS 2002.
@inproceedings{click:asplos02,
  title        = {Programming Language Optimizations for Modular
    Router Configurations},
  author       = {Eddie Kohler and Robert Morris and Benjie Chen},
  booktitle    = {Proceedings of the 10th Conference on Architectural
    Support for Programming Languages and Operating Systems (ASPLOS)},
  month        = oct,
  year         = 2002,
}
Performance of multihop wireless networks: Shortest path is not enough Douglas S. J. De Couto, Daniel Aguayo, Benjamin A. Chambers, and Robert Morris. HotNets 2002.

Existing wireless ad hoc routing protocols typically find routes with the minimum hop-count. This paper presents experimental evidence from two wireless test-beds which shows that there are usually multiple minimum hop-count paths, many of which have poor throughput. As a result, minimum-hop-count routing often chooses routes that have significantly less capacity than the best paths that exist in the network. Much of the reason for this is that many of the radio links between nodes have loss rates low enough that the routing protocol is willing to use them, but high enough that much of the capacity is consumed by retransmissions. These observations suggest that more attention be paid to link quality when choosing ad hoc routes; the paper presents measured link characteristics likely to be useful in devising a better path quality metric.

@inproceedings{grid:hotnets02,
  title        = {Performance of Multihop Wireless Networks: Shortest
    Path is Not Enough},
  author       = {Douglas S. J. {De Couto} and Daniel Aguayo and
    Benjamin A. Chambers and Robert Morris},
  booktitle    = {Proceedings of the First {W}orkshop on {H}ot
    {T}opics in {N}etworks ({HotNets-I})},
  year         = 2002,
  month        = oct,
  organization = {{ACM SIGCOMM}},
  address      = {Princeton, New Jersey},
}
Packrat parsing: Simple, powerful, lazy, linear time Bryan Ford. ICFP 2002.
@inproceedings{packrat-parsing:icfp02,
  author       = {Bryan Ford},
  title        = {Packrat Parsing: Simple, Powerful, Lazy, Linear Time},
  booktitle    = {Proceedings of the 2002 International Conference on
    Functional Programming},
  year         = 2002,
  month        = oct,
}
Event-driven programming for robust software Frank Dabek, Nickolai Zeldovich, Frans Kaashoek, David Mazières, and Robert Morris. SIGOPS European Workshop 2002.

Events are a better means of managing I/O concurrency in server software than threads: events help avoid bugs caused by the unnecessary CPU concurrency introduced by threads. Event-based programs also tend to have more stable performance under heavy load than threaded programs. We argue that our libasync-smp non-blocking I/O library makes event-based programming convenient and evaluate extensions to the library that allow event-based programs to take advantage of multi-processors. We conclude that events provide all the benefits of threads, with substantially less complexity; the result is more robust software.

@inproceedings{events:sigops,
  title        = {Event-Driven Programming for Robust Software},
  author       = {Frank Dabek and Nickolai Zeldovich and Frans
    Kaashoek and David Mazi{\`e}res and Robert Morris},
  booktitle    = {Proceedings of the 2002 SIGOPS European Workshop},
  year         = 2002,
  month        = sep,
  address      = {Saint-Emilion, France},
}
Span: An energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks Benjie Chen, Kyle Jamieson, Hari Balakrishnan, and Robert Morris. ACM Wireless Networks 8(5), September 2002.

This paper presents Span, a power saving technique for multi-hop ad hoc wireless networks that reduces energy consumption without significantly diminishing the capacity or connectivity of the network. Span builds on the observation that when a region of a shared-channel wireless network has a sufficient density of nodes, only a small number of them need be on at any time to forward traffic for active connections.

Span is a distributed, randomized algorithm where nodes make local decisions on whether to sleep, or to join a forwarding backbone as a coordinator. Each node bases its decision on an estimate of how many of its neighbors will benefit from it being awake, and the amount of energy available to it. We give a randomized algorithm where coordinators rotate with time, demonstrating how localized node decisions lead to a connected, capacity-preserving global topology.

Improvement in system lifetime due to Span increases as the ratio of idle-to-sleep energy consumption increases. Our simulations show that with a practical energy model, system lifetime of an 802.11 network in power saving mode with Span is a factor of two better than without. Additionally, Span also improves communication latency and capacity.
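The eligibility check behind the coordinator election can be sketched as follows (a simplified version of the announcement rule, ignoring the energy term and the randomized backoff that rotates coordinators; names and data shapes are assumptions):

```python
def should_volunteer(neighbors, links, coordinators):
    """A node volunteers as coordinator if some pair of its neighbors
    cannot communicate directly or via one existing coordinator.
    links: set of frozenset({a, b}) undirected edges."""
    def connected(a, b):
        if frozenset((a, b)) in links:
            return True
        return any(frozenset((a, c)) in links and frozenset((c, b)) in links
                   for c in coordinators)

    ns = sorted(neighbors)
    return any(not connected(a, b)
               for i, a in enumerate(ns) for b in ns[i + 1:])
```

A node whose neighbors are already mutually reachable through the existing backbone gains nothing by staying awake, which is exactly when Span lets it sleep.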

@article{span:wireless01,
  title        = {Span: An Energy-Efficient Coordination Algorithm for
    Topology Maintenance in Ad Hoc Wireless Networks},
  author       = {Benjie Chen and Kyle Jamieson and Hari Balakrishnan
    and Robert Morris},
  volume       = 8,
  number       = 5,
  year         = 2002,
  month        = sep,
  journal      = {{ACM} Wireless Networks},
}
Packrat parsing: a practical linear-time algorithm with backtracking Bryan Ford. Master's thesis, Massachusetts Institute of Technology, September 2002.
@mastersthesis{packrat-parsing:ford-ms,
  author       = {Bryan Ford},
  title        = {Packrat Parsing: a practical linear-time algorithm
    with backtracking},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = sep,
}
Choosing Internet paths with high bulk transfer capacity Jacob Strauss. Master's thesis, Massachusetts Institute of Technology, September 2002.
@mastersthesis{chord:jastr-meng,
  title        = {Choosing {Internet} Paths with High Bulk Transfer
    Capacity},
  author       = {Jacob Strauss},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = sep,
}
Network layer support for overlay networks John Jannotti. Ph.D. thesis, Massachusetts Institute of Technology, August 2002.
@phdthesis{jannotti-phd,
  title        = {Network Layer Support for Overlay Networks},
  author       = {John Jannotti},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = aug,
}
Natron: Overlay routing to oblivious destinations Alexander Yip. Master's thesis, Massachusetts Institute of Technology, August 2002.
@mastersthesis{natron:yip-meng,
  title        = {NATRON: Overlay Routing to Oblivious Destinations},
  author       = {Alexander Yip},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = aug,
}
Access control lists for the self-certifying filesystem George Savvides. Master's thesis, Massachusetts Institute of Technology, August 2002.
@mastersthesis{sfs:savvides-meng,
  title        = {Access Control Lists for the Self-Certifying
    Filesystem},
  author       = {George Savvides},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = aug,
}
Self-certifying filesystem implementation for Windows David Euresti. Master's thesis, Massachusetts Institute of Technology, August 2002.
@mastersthesis{sfs:euresti-meng,
  title        = {Self-Certifying Filesystem Implementation for
    {Windows}},
  author       = {David Euresti},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = aug,
}
Simple and general statistical profiling with PCT Charles Blake and Steve Bauer. USENIX 2002.

The Profile Collection Toolkit (PCT) provides a novel generalized CPU profiling facility. PCT enables arbitrarily late profiling activation and arbitrarily early report generation. PCT usually requires no re-compilation, re-linking, or even re-starting of programs. Profiling reports gracefully degrade with available debugging data.

PCT uses its debugger controller, dbctl, to drive a debugger's control over a process. dbctl has a configuration language that allows users to specify context-specific debugger commands. These commands can sample general program state, such as call stacks and function parameters.

For systems or situations with poor debugger support, PCT provides several other portable and flexible collection methods. PCT can track most program code, including code in shared libraries and late-loaded shared objects. On Linux, PCT can seamlessly merge kernel CPU time profiles with user-level CPU profiles to create whole system reports.

@inproceedings{pct:usenix02,
  title        = {Simple and General Statistical Profiling with {PCT}},
  author       = {Charles Blake and Steve Bauer},
  pages        = {333--346},
  booktitle    = {Proceedings of the 2002 USENIX Annual Technical
    Conference (USENIX '02)},
  year         = 2002,
  month        = jun,
  address      = {Monterey, California},
}
A keyword set search system for peer-to-peer networks Omprakash D Gnawali. Master's thesis, Massachusetts Institute of Technology, June 2002.
@mastersthesis{chord:om_p-meng,
  title        = {A Keyword Set Search System for Peer-to-Peer
    Networks},
  author       = {Omprakash D Gnawali},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = jun,
}
Concurrency control for multi-processor event-driven systems Nickolai Zeldovich. Master's thesis, Massachusetts Institute of Technology, May 2002.
@mastersthesis{sfs:zeldovich-meng,
  title        = {Concurrency Control for Multi-Processor Event-Driven
    Systems},
  author       = {Nickolai Zeldovich},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = may,
}
The Grid Roofnet: a rooftop ad hoc wireless network Benjamin A. Chambers. Master's thesis, Massachusetts Institute of Technology, May 2002.
@mastersthesis{grid:bac-meng,
  title        = {The {Grid} {Roofnet}: a Rooftop Ad Hoc Wireless
    Network},
  author       = {Benjamin A. Chambers},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = may,
}
A peer-to-peer anonymizing network layer Michael J. Freedman. Master's thesis, Massachusetts Institute of Technology, May 2002.

Existing Internet systems implement anonymity at the application layer or through centralized components. A robust, decentralized infrastructure that anonymizes any Internet traffic could benefit a wide array of existing protocols and systems. This anonymous network layer could seamlessly replace the current communications channel, and it could continue to offer anonymity and availability even while components fail maliciously.

This thesis proposes Tarzan, a peer-to-peer anonymous IP network overlay. Because it provides IP service, Tarzan is general-purpose and transparent to applications. Organized as a decentralized peer-to-peer overlay, Tarzan is fault-tolerant, highly scalable, and easy to manage.

Tarzan achieves its anonymity with layered encryption and multi-hop routing, much like a Chaumian mix. A message initiator chooses a path of peers pseudo-randomly in a way that adversaries cannot easily influence. Cover traffic prevents a global observer from drawing conclusions based on traffic analysis as to an initiator's identity.

Tarzan provides anonymity to either clients or servers, without requiring that both participate, presenting the abstraction of a one-way anonymous tunnel. In both cases, Tarzan uses a network address translator (NAT) to bridge between Tarzan hosts and oblivious Internet hosts.

We quantify Tarzan's anonymity properties and show that Tarzan imposes minimal performance overhead over a corresponding non-anonymous overlay route.

@mastersthesis{tarzan:freedman-meng,
  title        = {A Peer-to-Peer Anonymizing Network Layer},
  author       = {Michael J. Freedman},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = may,
}
Herodotus: A peer-to-peer web archival system Timo Burkard. Master's thesis, Massachusetts Institute of Technology, May 2002.
@mastersthesis{chord:tburkard-meng,
  title        = {Herodotus: A Peer-to-Peer Web Archival System},
  author       = {Timo Burkard},
  school       = {Massachusetts Institute of Technology},
  year         = 2002,
  month        = may,
}
Efficient peer-to-peer lookup based on a distributed trie Michael J. Freedman and Radek Vingralek. IPTPS 2002.

Two main approaches have been taken for distributed key/value lookup operations in peer-to-peer systems: broadcast searches and location-deterministic algorithms. We describe a third alternative based on a distributed trie. This algorithm functions well in a very dynamic, hostile environment, offering security benefits over prior proposals. Our approach takes advantage of working-set temporal locality and global key/value distribution skews due to content popularity. Peers gradually learn system state during lookups, receiving the sought values and/or internal information used by the trie. The distributed trie converges to an accurate network map over time. We describe several modes of information piggybacking, and conservative and liberal variants of the basic algorithm for adversarial settings. Simulations show efficient lookups and low failure rates.

@inproceedings{trie:iptps02,
  title        = {Efficient Peer-To-Peer Lookup Based on a Distributed
    Trie},
  author       = {Michael J. Freedman and Radek Vingralek},
  booktitle    = {Proceedings of the 1st International Workshop on
    Peer-to-Peer Systems (IPTPS)},
  year         = 2002,
  month        = mar,
  address      = {Cambridge, MA},
}
Serving DNS using Chord Russ Cox, Athicha Muthitacharoen, and Robert Morris. IPTPS 2002.

The current domain name system (DNS) couples ownership of domains with the responsibility of serving data for them. The DNS security extensions (DNSSEC) allow verification of records obtained by alternate means, opening exploration of alternative storage systems for DNS records. We explore one such alternative using DHash, a peer-to-peer distributed hash table built on top of Chord. Our system inherits Chord's fault-tolerance and load balance properties, at the same time eliminating many administrative problems with the current DNS. Still, our system has significantly higher latencies and other disadvantages in comparison with conventional DNS. We use this comparison to draw conclusions about general issues that still need to be addressed in peer-to-peer systems and distributed hash tables in particular.
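The DHash-based alternative can be sketched concretely: each record set is stored and fetched through a put/get interface, keyed by a hash of the owner name and record type. The key derivation and the dict standing in for the distributed hash table below are illustrative assumptions, not the paper's implementation.

```python
import hashlib

# Illustrative sketch, not the paper's code: DNS record sets stored in a
# DHash-style put/get interface, keyed by a hash of name and record type.
# A plain dict stands in for the distributed hash table.
dht = {}

def dns_key(name, rtype):
    # Derive a content key from "name:type" (lowercasing and the SHA-1
    # choice are assumptions for this sketch).
    return hashlib.sha1(f"{name.lower()}:{rtype}".encode()).digest()

def publish(name, rtype, rdata):
    dht[dns_key(name, rtype)] = rdata      # dhash_put(key, value)

def resolve(name, rtype):
    return dht.get(dns_key(name, rtype))   # dhash_get(key)

publish("example.com", "A", ["93.184.216.34"])
```

A resolver that can reach any Chord node can answer the query; fault-tolerance and load balance come from DHash's replication and caching rather than from DNS's delegation hierarchy.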

@inproceedings{chord:dns02,
  title        = {Serving {DNS} using {Chord}},
  author       = {Russ Cox and Athicha Muthitacharoen and Robert
    Morris},
  booktitle    = {Proceedings of the 1st International Workshop on
    Peer-to-Peer Systems (IPTPS)},
  year         = 2002,
  month        = mar,
  address      = {Cambridge, MA},
}
Security considerations for peer-to-peer distributed hash tables Emil Sit and Robert Morris. IPTPS 2002.

Recent peer-to-peer research has focused on providing efficient hash lookup systems that can be used to build more complex systems. These systems have good properties when their algorithms are executed correctly but have not generally considered how to handle misbehaving nodes. This paper looks at what sorts of security problems are inherent in large peer-to-peer systems based on distributed hash lookup systems. We examine the types of problems that such systems might face, drawing examples from existing systems, and propose some design principles for detecting and preventing these problems.

@inproceedings{chord:security02,
  title        = {Security Considerations for Peer-to-Peer Distributed
    Hash Tables},
  author       = {Emil Sit and Robert Morris},
  booktitle    = {Proceedings of the 1st International Workshop on
    Peer-to-Peer Systems (IPTPS)},
  year         = 2002,
  month        = mar,
  address      = {Cambridge, MA},
}
Effects of loss rate on ad hoc wireless routing Douglas S. J. De Couto, Daniel Aguayo, Benjamin A. Chambers, and Robert Morris. MIT LCS technical report, March 2002.

This paper uses measurements from two deployed wireless ad hoc networks to illustrate the effects of link loss rates on routing protocol performance. Measurements of these networks show that the radio links between the majority of nodes have substantial loss rates. These loss rates are high enough to decrease forwarding performance, but not high enough to prevent existing ad hoc routing protocols from using the links. Link-level retransmission can mask high loss rates, at the cost of substantial decreases in throughput. Simulations, driven by the observed loss rates, show that the shortest paths chosen by existing routing protocols tend to find routes with much less capacity than is available along the best route.

Based on these observations, we present a routing metric intended to allow routing protocols to find good routes in wireless ad hoc networks. The metric is the expected total number of transmissions required to deliver a packet along a route. This metric favors routes with high throughput and low total impact on spectrum. It is expected to perform better than existing techniques that eliminate links based on loss rate thresholds.
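The proposed route metric can be sketched in a few lines. The per-link formula is an assumption taken from the authors' later ETX formulation, not stated in this report: a link with forward delivery ratio df and reverse ratio dr costs 1 / (df * dr) expected transmissions, since both the data packet and its link-level ACK must arrive.

```python
# Sketch of the metric described above: the expected total number of
# transmissions required to deliver a packet along a route.
# Assumption (from the authors' later ETX work): a link with forward
# delivery ratio df and reverse delivery ratio dr needs 1 / (df * dr)
# expected transmissions, counting link-level retransmissions.

def link_etx(df, dr):
    return 1.0 / (df * dr)

def route_etx(links):
    # links: iterable of (df, dr) delivery-ratio pairs along the route
    return sum(link_etx(df, dr) for df, dr in links)

# Two good hops can beat one lossy hop, which a minimum-hop-count
# protocol would prefer:
one_lossy_hop = route_etx([(0.5, 0.5)])              # 4.0
two_good_hops = route_etx([(0.9, 0.9), (0.9, 0.9)])  # about 2.47
```

This is why the metric favors high-throughput routes over shortest paths: it penalizes lossy links continuously instead of eliminating them at a loss-rate threshold.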

@techreport{grid:losstr01,
  title        = {Effects of Loss Rate on Ad Hoc Wireless Routing},
  author       = {Douglas S. J. {De Couto} and Daniel Aguayo and
    Benjamin A. Chambers and Robert Morris},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 2002,
  month        = mar,
  number       = {MIT-LCS-TR-836},
}
Introducing Tarzan, a peer-to-peer anonymizing network layer Michael J. Freedman, Emil Sit, Josh Cates, and Robert Morris. IPTPS 2002.

We introduce Tarzan, a peer-to-peer anonymous network layer that provides generic IP forwarding. Unlike prior anonymizing layers, Tarzan is flexible, transparent, decentralized, and highly scalable.

Tarzan achieves these properties by building anonymous IP tunnels between an open-ended set of peers. Tarzan can provide anonymity to existing applications, such as web browsing and file sharing, without change to those applications. Performance tests show that Tarzan imposes minimal overhead over a corresponding non-anonymous overlay route.

@inproceedings{tarzan:iptps02,
  title        = {Introducing {Tarzan}, a Peer-to-Peer Anonymizing
    Network Layer},
  author       = {Michael J. Freedman and Emil Sit and Josh Cates and
    Robert Morris},
  booktitle    = {Proceedings of the 1st International Workshop on
    Peer-to-Peer Systems (IPTPS)},
  year         = 2002,
  month        = mar,
  address      = {Cambridge, MA},
}
Fast and secure distributed read-only file system Kevin Fu, M. Frans Kaashoek, and David Mazières. ACM Transactions on Computer Systems 20(1), February 2002.
@article{sfsro:tocs2002,
  title        = {{F}ast and secure distributed read-only file system},
  author       = {Kevin Fu and M. Frans Kaashoek and David
    Mazi{\`e}res},
  volume       = 20,
  number       = 1,
  year         = 2002,
  month        = feb,
  pages        = {1--24},
  journal      = {{ACM} Transactions on Computer Systems},
}
Fast and flexible application-level networking on exokernel systems Gregory R. Ganger, Dawson R. Engler, M. Frans Kaashoek, Hector M. Briceno, Russell Hunt, and Thomas Pinckney. ACM Transactions on Computer Systems 20(1), February 2002.

Application-level networking is a promising software organization for improving performance and functionality for important network services. The Xok/ExOS exokernel system includes application-level support for standard network services, while at the same time allowing application writers to specialize networking services. This paper describes how Xok/ExOS's kernel mechanisms and library operating system organization achieve this flexibility, and retrospectively shares our experiences and lessons learned (both positive and negative). It also describes how we used this flexibility to build and specialize three network data services: the Cheetah HTTP server, the webswamp Web benchmarking tool, and an application-level TCP forwarder. Overall measurements show large performance improvements relative to similar services built on conventional interfaces, in each case reaching the maximum possible end-to-end performance for the experimental platform. For example, Cheetah provides factor of 2-4 increases in throughput compared to highly tuned socket-based implementations and factor of 3-8 increases compared to conventional systems. Webswamp can offer loads that are two to eight times heavier. The TCP forwarder provides 50-300% higher throughput while also providing end-to-end TCP semantics that cannot be achieved with POSIX sockets. With more detailed measurements and profiling, these overall performance improvements are also broken down and attributed to the specific specializations described, providing server writers with insights into where to focus their optimization efforts.

@article{exo:tocs2002,
  title        = {{F}ast and flexible Application-Level Networking on
    Exokernel Systems},
  author       = {Gregory R. Ganger and Dawson R. Engler and M. Frans
    Kaashoek and Hector M. Briceno and Russell Hunt and Thomas
    Pinckney},
  volume       = 20,
  number       = 1,
  year         = 2002,
  month        = feb,
  pages        = {49--83},
  journal      = {{ACM} Transactions on Computer Systems},
}
DNS performance and the effectiveness of caching Jaeyeon Jung, Emil Sit, Hari Balakrishnan, and Robert Morris. IEEE/ACM Transactions on Networking 10(5), 2002.
@article{dns:ton,
  author       = {Jaeyeon Jung and Emil Sit and Hari Balakrishnan and
    Robert Morris},
  title        = {{DNS} performance and the effectiveness of caching},
  volume       = 10,
  number       = 5,
  year         = 2002,
  pages        = {589--603},
  publisher    = {IEEE Press},
  address      = {Piscataway, NJ, USA},
  journal      = {{IEEE/ACM} Transactions on Networking},
}

2001

DNS performance and the effectiveness of caching Jaeyeon Jung, Emil Sit, Hari Balakrishnan, and Robert Morris. SIGCOMM 2001.
@inproceedings{dnscache:sigcommimw01,
  title        = {{DNS} Performance and the Effectiveness of Caching},
  author       = {Jaeyeon Jung and Emil Sit and Hari Balakrishnan and
    Robert Morris},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} Internet
    Measurement Workshop '01},
  year         = 2001,
  month        = nov,
  address      = {San Francisco, California},
}
Wide-area cooperative storage with CFS Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. SOSP 2001.

The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers provide a distributed hash table (DHash) for block storage. CFS clients interpret DHash blocks as a file system. DHash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness, and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers.

CFS is implemented using the SFS file system toolkit and runs on Linux, OpenBSD, and FreeBSD. Experience on a globally deployed prototype shows that CFS delivers data to clients as fast as FTP. Controlled tests show that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.

@inproceedings{cfs:sosp01,
  title        = {Wide-area cooperative storage with {CFS}},
  author       = {Frank Dabek and M. Frans Kaashoek and David Karger
    and Robert Morris and Ion Stoica},
  booktitle    = {Proceedings of the 18th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '01)},
  year         = 2001,
  month        = oct,
  address      = {Chateau Lake Louise, Banff, Canada},
}
Resilient overlay networks David Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert Morris. SOSP 2001.
@inproceedings{ron:sosp01,
  title        = {Resilient Overlay Networks},
  author       = {David Andersen and Hari Balakrishnan and M. Frans
    Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 18th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '01)},
  year         = 2001,
  month        = oct,
  address      = {Chateau Lake Louise, Banff, Canada},
}
A low-bandwidth network file system Athicha Muthitacharoen, Benjie Chen, and David Mazières. SOSP 2001.

Users rarely consider running network file systems over slow or wide-area networks, as the performance would be unacceptable and the bandwidth consumption too high. Nonetheless, efficient remote file access would often be desirable over such networks---particularly when high latency makes remote login sessions unresponsive. Rather than run interactive programs such as editors remotely, users could run the programs locally and manipulate remote files through the file system. To do so, however, would require a network file system that consumes less bandwidth than most current file systems.

This paper presents LBFS, a network file system designed for low-bandwidth networks. LBFS exploits similarities between files or versions of the same file to save bandwidth. It avoids sending data over the network when the same data can already be found in the server's file system or the client's cache. Using this technique in conjunction with conventional compression and caching, LBFS consumes over an order of magnitude less bandwidth than traditional network file systems on common workloads.
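The similarity-exploiting transfer can be sketched. LBFS itself breaks files at content-defined boundaries using Rabin fingerprints over a 48-byte window (about 8 KB average chunks) and indexes chunks by SHA-1 hash; the toy rolling hash and small parameters below are stand-ins for illustration only.

```python
import hashlib
from collections import deque

# Toy parameters; LBFS uses a 48-byte window and ~8 KB average chunks.
WINDOW, MASK = 16, 0x3F

def chunk(data):
    """Split data at content-defined boundaries: wherever a hash of the
    last WINDOW bytes has its low bits set. Because boundaries depend on
    local content, not file offsets, an edit only disturbs nearby chunks.
    (Byte-sum stands in for LBFS's Rabin fingerprint.)"""
    out, start = [], 0
    win = deque(maxlen=WINDOW)
    for i, b in enumerate(data):
        win.append(b)
        if len(win) == WINDOW and (sum(win) & MASK) == MASK:
            out.append(data[start:i + 1])
            start = i + 1
            win.clear()
    if start < len(data):
        out.append(data[start:])
    return out

def transfer(data, server_has):
    """Bytes that must actually cross the network: only chunks whose
    SHA-1 the server does not already hold."""
    sent = 0
    for c in chunk(data):
        digest = hashlib.sha1(c).digest()
        if digest not in server_has:
            server_has.add(digest)
            sent += len(c)
    return sent
```

Re-sending an unchanged file transfers nothing, and inserting a few bytes re-sends only the chunks around the edit, which is where the order-of-magnitude bandwidth savings comes from.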

@inproceedings{lbfs:sosp01,
  title        = {A Low-bandwidth Network File System},
  author       = {Athicha Muthitacharoen and Benjie Chen and David
    Mazi{\`e}res},
  pages        = {174--187},
  booktitle    = {Proceedings of the 18th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '01)},
  year         = 2001,
  month        = oct,
  address      = {Chateau Lake Louise, Banff, Canada},
}
A cooperative file system Frank Dabek. Master's thesis, Massachusetts Institute of Technology, September 2001.
@mastersthesis{cfs:dabek-meng,
  title        = {A Cooperative File System},
  author       = {Frank Dabek},
  school       = {Massachusetts Institute of Technology},
  year         = 2001,
  month        = sep,
}
A case study of server selection Tina Tyan. Master's thesis, Massachusetts Institute of Technology, September 2001.
@mastersthesis{chord:tyan-meng,
  title        = {A Case Study of Server Selection},
  author       = {Tina Tyan},
  school       = {Massachusetts Institute of Technology},
  year         = 2001,
  month        = sep,
}
Chord: A scalable peer-to-peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. SIGCOMM 2001.

A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.
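The key-to-node mapping can be sketched as consistent hashing: nodes and keys share one identifier circle, and a key is assigned to its successor, the first node at or after the key's ID, wrapping around. Real Chord resolves the successor in O(log N) hops using finger tables; the sorted list below is an illustrative stand-in for the ring (Chord itself uses 160-bit SHA-1 identifiers).

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; illustrative (Chord uses 160-bit SHA-1 IDs)

def ident(name):
    # Hash a name onto the m-bit identifier circle.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** M)

def successor(node_ids, key_id):
    # node_ids must be sorted. The key's home is the first node ID at or
    # after key_id, wrapping around the circle if none is larger.
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

nodes = sorted(ident(f"node{i}") for i in range(8))
home = successor(nodes, ident("some-key"))
```

When a node joins or leaves, only the keys between it and its predecessor change hands, which is what lets Chord keep answering queries while the system is continuously changing.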

@inproceedings{chord:sigcomm01,
  title        = {Chord: A Scalable Peer-to-peer Lookup Service for
    {Internet} Applications},
  author       = {Ion Stoica and Robert Morris and David Karger and M.
    Frans Kaashoek and Hari Balakrishnan},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '01 Conference},
  year         = 2001,
  month        = aug,
  address      = {San Diego, California},
}
Dos and don'ts of client authentication on the web Kevin Fu, Emil Sit, Kendra Smith, and Nick Feamster. USENIX Security 2001.

Client authentication has been a continuous source of problems on the Web. Although many well-studied techniques exist for authentication, Web sites continue to use extremely weak authentication schemes, especially in non-enterprise environments such as store fronts. These weaknesses often result from careless use of authenticators within Web cookies. Of the twenty-seven sites we investigated, we weakened the client authentication on two systems, gained unauthorized access on eight, and extracted the secret key used to mint authenticators from one.

We provide a description of the limitations, requirements, and security models specific to Web client authentication. This includes the introduction of the interrogative adversary, a surprisingly powerful adversary that can adaptively query a Web site.

We propose a set of hints for designing a secure client authentication scheme. Using these hints, we present the design and analysis of a simple authentication scheme secure against forgeries by the interrogative adversary. In conjunction with SSL, our scheme is secure against forgeries by the active adversary.
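The proposed scheme authenticates cookies of the form exp=t&data=s&digest=MAC_k(exp=t&data=s), so the interrogative adversary can query the site adaptively but cannot mint or extend an authenticator without the server key k. A minimal sketch follows; the key handling, parsing, and choice of HMAC-SHA256 are illustrative assumptions, not the paper's code.

```python
import hashlib
import hmac
import time

KEY = b"server-secret-key"  # illustrative; use a long random key in practice

def mint(user, ttl=3600):
    # Authenticator: exp=t&data=s&digest=MAC_k(exp=t&data=s).
    payload = f"exp={int(time.time()) + ttl}&data={user}"
    mac = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}&digest={mac}"

def verify(cookie):
    payload, _, mac = cookie.rpartition("&digest=")
    good = hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, good):
        return None                       # forged or tampered authenticator
    exp = int(payload.split("&", 1)[0].removeprefix("exp="))
    if time.time() > exp:
        return None                       # expired authenticator
    return payload.rpartition("&data=")[2]
```

Putting the expiration inside the MAC'd payload is the point: an adversary who captures a cookie cannot splice a fresh timestamp onto it without invalidating the digest.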

@inproceedings{webauth:sec10,
  title        = {{D}os and Don'ts of Client Authentication on the Web},
  author       = {Kevin Fu and Emil Sit and Kendra Smith and Nick
    Feamster},
  note         = {An extended version is available as MIT-LCS-TR-818},
  booktitle    = {Proceedings of the 10th {USENIX} {S}ecurity
    {S}ymposium},
  year         = 2001,
  month        = aug,
  address      = {Washington, D.C.},
}
Span: An energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks Benjie Chen, Kyle Jamieson, Hari Balakrishnan, and Robert Morris. MobiCom 2001.

This paper presents Span, a power saving technique for multi-hop ad hoc wireless networks that reduces energy consumption without significantly diminishing the capacity or connectivity of the network. Span builds on the observation that when a region of a shared-channel wireless network has a sufficient density of nodes, only a small number of them need be on at any time to forward traffic for active connections.

Span is a distributed, randomized algorithm where nodes make local decisions on whether to sleep, or to join a forwarding backbone as a coordinator. Each node bases its decision on an estimate of how many of its neighbors will benefit from it being awake, and the amount of energy available to it. We give a randomized algorithm where coordinators rotate with time, demonstrating how localized node decisions lead to a connected, capacity-preserving global topology.

Improvement in system lifetime due to Span increases as the ratio of idle-to-sleep energy consumption increases, and increases as the density of the network increases. For example, our simulations show that with a practical energy model, system lifetime of an 802.11 network in power saving mode with Span is a factor of two better than without. Span integrates nicely with 802.11---when run in conjunction with the 802.11 power saving mode, Span improves communication latency, capacity, and system lifetime.

@inproceedings{span:mobicom01,
  title        = {Span: An Energy-Efficient Coordination Algorithm for
    Topology Maintenance in Ad Hoc Wireless Networks},
  author       = {Benjie Chen and Kyle Jamieson and Hari Balakrishnan
    and Robert Morris},
  pages        = {85--96},
  booktitle    = {Proceedings of the 7th {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '01)},
  year         = 2001,
  month        = jul,
  address      = {Rome, Italy},
}
Capacity of ad hoc wireless networks Jinyang Li, Charles Blake, Douglas S. J. De Couto, Hu Imm Lee, and Robert Morris. MobiCom 2001.

Early simulation experience with wireless ad hoc networks suggests that their capacity can be surprisingly low, due to the requirement that nodes forward each others' packets. The achievable capacity depends on network size, traffic patterns, and detailed local radio interactions. This paper examines these factors alone and in combination, using simulation and analysis from first principles. Our results include both specific constants and general scaling relationships helpful in understanding the limitations of wireless ad hoc networks.

We examine interactions of the 802.11 MAC and ad hoc forwarding and the effect on capacity for several simple configurations and traffic patterns. While 802.11 discovers reasonably good schedules, we nonetheless observe capacities markedly less than optimal for very simple chain and lattice networks with very regular traffic patterns. We validate some simulation results with experiments.

We also show that the traffic pattern determines whether an ad hoc network's per node capacity will scale to large networks. In particular, we show that for total capacity to scale up with network size the average distance between source and destination nodes must remain small as the network grows. Non-local traffic patterns in which this average distance grows with the network size result in a rapid decrease of per node capacity. Thus the question ``Are large ad hoc networks feasible?'' reduces to a question about the likely locality of communication in such networks.

@inproceedings{grid:mobicom01,
  title        = {Capacity of Ad Hoc Wireless Networks},
  author       = {Jinyang Li and Charles Blake and Douglas S. J. {De
    Couto} and Hu Imm Lee and Robert Morris},
  pages        = {61--69},
  booktitle    = {Proceedings of the 7th {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '01)},
  year         = 2001,
  month        = jul,
  address      = {Rome, Italy},
}
Location proxies and intermediate node forwarding for practical geographic forwarding Douglas S. J. De Couto and Robert Morris. MIT LCS technical report, June 2001.

Two main problems prevent the deployment of geographic forwarding in real systems: geographic forwarding requires that all nodes know their locations, and it has trouble routing around local dead ends. This paper presents practical solutions to each problem.

The location proxy technique allows a node that does not know its location to find a nearby location aware node to use as a proxy for geographic forwarding. The technique works well over a large range of densities of location aware nodes, and allows a tradeoff between bandwidth used for routing information and expense of providing location information.

The intermediate node forwarding (INF) mechanism is a probabilistic solution for routing around bad geographic topologies via intermediate geographic locations. Existing solutions unrealistically assume that nodes have identical radio propagation; INF works on a restricted set of situations but makes assumptions that better match reality.

Experiments using the ns simulator show that location proxies and INF are effective enough to make geographic forwarding practical. We believe geographic forwarding will enable scalable ad hoc networking.

@techreport{grid:proxytr01,
  title        = {Location Proxies and Intermediate Node Forwarding
    for Practical Geographic Forwarding},
  author       = {Douglas S. J. {De Couto} and Robert Morris},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 2001,
  month        = jun,
  number       = {MIT-LCS-TR-824},
}
Flexible control of parallelism in a multiprocessor PC router Benjie Chen and Robert Morris. USENIX 2001.

SMP Click is a software router that provides both flexibility and high performance on stock multiprocessor PC hardware. It achieves high performance using device, buffer, and queue management techniques optimized for multiprocessor routing. It allows vendors or network administrators to configure the router in a way that indicates parallelizable packet processing tasks, and adaptively load-balances those tasks across the available CPUs.

SMP Click's absolute performance is high: it can forward 494,000 64-byte IP packets per second on a 2-CPU 500 MHz Intel Xeon machine, compared to 302,000 packets per second for uniprocessor Click. SMP Click also scales well for CPU intensive tasks: 4-CPU SMP Click can encrypt and forward 87,000 64-byte packets per second using IPSec 3DES, compared to 23,000 packets per second for uniprocessor Click.

@inproceedings{click:usenix01,
  title        = {Flexible Control of Parallelism in a Multiprocessor
    {PC} Router},
  author       = {Benjie Chen and Robert Morris},
  pages        = {333--346},
  booktitle    = {Proceedings of the 2001 USENIX Annual Technical
    Conference (USENIX '01)},
  year         = 2001,
  month        = jun,
  address      = {Boston, Massachusetts},
}
Building peer-to-peer systems with Chord, a distributed lookup service Frank Dabek, Emma Brunskill, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, and Hari Balakrishnan. HotOS 2001.

We argue that the core problem facing peer-to-peer systems is locating documents in a decentralized network and propose Chord, a distributed lookup primitive. Chord provides an efficient method of locating documents while placing few constraints on the applications that use it. As proof that Chord's functionality is useful in the development of peer-to-peer applications, we outline the implementation of a peer-to-peer file sharing system based on Chord.

@inproceedings{chord:hotos,
  title        = {Building Peer-to-Peer Systems With {Chord}, a
    Distributed Lookup Service},
  author       = {Frank Dabek and Emma Brunskill and M. Frans Kaashoek
    and David Karger and Robert Morris and Ion Stoica and Hari
    Balakrishnan},
  booktitle    = {Proceedings of the 8th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-VIII})},
  year         = 2001,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Schloss Elmau, Germany},
}
Resilient overlay networks David Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert Morris. HotOS 2001.
@inproceedings{ron:hotos8,
  title        = {Resilient Overlay Networks},
  author       = {David Andersen and Hari Balakrishnan and M. Frans
    Kaashoek and Robert Morris},
  booktitle    = {Proceedings of the 8th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-VIII})},
  year         = 2001,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Schloss Elmau, Germany},
}
Dos and don'ts of client authentication on the web Kevin Fu, Emil Sit, Kendra Smith, and Nick Feamster. MIT LCS technical report, May 2001.
@techreport{webauth:tr,
  title        = {{D}os and Don'ts of Client Authentication on the Web},
  author       = {Kevin Fu and Emil Sit and Kendra Smith and Nick
    Feamster},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 2001,
  month        = may,
  number       = {MIT-LCS-TR-818},
}
Reconsidering Internet mobility Alex C. Snoeren, Hari Balakrishnan, and M. Frans Kaashoek. HotOS 2001.
@inproceedings{migrate:hotos8,
  title        = {Reconsidering Internet Mobility},
  author       = {Alex C. Snoeren and Hari Balakrishnan and M. Frans
    Kaashoek},
  booktitle    = {Proceedings of the 8th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-VIII})},
  year         = 2001,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Schloss Elmau, Germany},
}

2000

Modular components for network address translation Eddie Kohler, Robert Morris, and Massimiliano Poletto. MIT LCS Click Project technical report, December 2000.

We present a general-purpose toolkit for network address translation in a modular, component-based networking system. Network address translation is a powerful, general technique for building network applications, such as allowing disparate address realms to communicate, load balancing among servers, and changing ordinary proxies into transparent proxies. The components of our toolkit can be combined in a variety of ways to implement these applications and others. The context of this work, the Click modular networking system, makes the NAT components simpler and more understandable. For example, the NAT components concern themselves solely with address translation; related functions, such as classification, are implemented by separate components. This design is more flexible than most existing NAT implementations. The user can choose where network address translation takes place in relation to other router functions, and can use multiple translators in a single configuration. These components have been in use in a production environment for several months.

We describe our design approach, demonstrate its flexibility by presenting a range of examples of its use, and evaluate its performance.

@techreport{click:rewritertr,
  title        = {Modular components for network address translation},
  author       = {Eddie Kohler and Robert Morris and Massimiliano
    Poletto},
  institution  = {{MIT} Laboratory for Computer Science Click Project},
  year         = 2000,
  month        = dec,
  note         = {http://www.pdos.lcs.mit.edu/papers/click-rewriter/},
}
The Click modular router Eddie Kohler. Ph.D. thesis, Massachusetts Institute of Technology, November 2000.
@phdthesis{click:kohler-phd,
  title        = {The {Click} modular router},
  author       = {Eddie Kohler},
  school       = {Massachusetts Institute of Technology},
  year         = 2000,
  month        = nov,
}
Fast and secure distributed read-only file system Kevin Fu, M. Frans Kaashoek, and David Mazières. OSDI 2000.

Internet users increasingly rely on publicly available data for everything from software installation to investment decisions. Unfortunately, the vast majority of public content on the Internet comes with no integrity or authenticity guarantees. This paper presents the self-certifying read-only file system, a content distribution system providing secure, scalable access to public, read-only data.

The read-only file system makes the security of published content independent from that of the distribution infrastructure. In a secure area (perhaps off-line), a publisher creates a digitally-signed database out of a file system's contents. The publisher then replicates the database on untrusted content-distribution servers, allowing for high availability. The read-only file system protocol furthermore pushes the cryptographic cost of content verification entirely onto clients, allowing servers to scale to a large number of clients. Measurements of an implementation show that an individual server running on a 550 MHz Pentium III with FreeBSD can support 1,012 connections per second and 300 concurrent clients compiling a large software package.

@inproceedings{sfsro:osdi2000,
  title        = {{F}ast and secure distributed read-only file system},
  author       = {Kevin Fu and M. Frans Kaashoek and David
    Mazi{\`e}res},
  pages        = {181--196},
  booktitle    = {Proceedings of the 4th {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} 2000)},
  year         = 2000,
  month        = oct,
  address      = {San Diego, California},
}
CarNet: A scalable ad hoc wireless network system Robert Morris, John Jannotti, Frans Kaashoek, Jinyang Li, and Douglas S. J. De Couto. SIGOPS European Workshop 2000.

CarNet is an application for a large ad hoc mobile network system that scales well without requiring a fixed network infrastructure to route messages. CarNet places radio nodes in cars, which communicate using Grid, a novel scalable routing system. Grid uses geographic forwarding and a scalable distributed location service to route packets from car to car without flooding the network. CarNet will support IP connectivity as well as applications such as cooperative highway congestion monitoring, fleet tracking, and discovery of nearby points of interest.

@inproceedings{grid:sigops-euro9,
  title        = {{C}ar{N}et: A Scalable Ad Hoc Wireless Network
    System},
  author       = {Robert Morris and John Jannotti and Frans Kaashoek
    and Jinyang Li and Douglas S. J. {De Couto}},
  note         = {The published version incorrectly lists Douglas De
    Couto's name},
  booktitle    = {Proceedings of the 9th {ACM} {SIGOPS} {E}uropean
    workshop: Beyond the {PC}: New Challenges for the Operating System},
  year         = 2000,
  month        = sep,
  address      = {Kolding, Denmark},
}
A scalable location service for geographic ad hoc routing Jinyang Li, John Jannotti, Douglas S. J. De Couto, David R. Karger, and Robert Morris. MobiCom 2000.

GLS is a new distributed location service which tracks mobile node locations. GLS combined with geographic forwarding allows the construction of ad hoc mobile networks that scale to a larger number of nodes than possible with previous work. GLS is decentralized and runs on the mobile nodes themselves, requiring no fixed infrastructure. Each mobile node periodically updates a small set of other nodes (its location servers) with its current location. A node sends its position updates to its location servers without knowing their actual identities, assisted by a predefined ordering of node identifiers and a predefined geographic hierarchy. Queries for a mobile node's location also use the predefined identifier ordering and spatial hierarchy to find a location server for that node.
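The server-selection rule can be illustrated with a toy sketch. This is an assumption-laden simplification, not GLS's code: it models a single hierarchy level, treats each sibling square as a plain set of node IDs, and applies the rule that a node's location server within a square is the node with the least ID greater than its own, wrapping around the ID space. Because the rule depends only on the predefined ID ordering, the updating node never needs to know server identities in advance, which is the property the abstract describes.

```python
# Sketch of GLS-style location-server selection (one hierarchy level,
# hypothetical node-ID sets; real GLS recurses over a geographic hierarchy).
def location_server(node_id: int, square_ids) -> int:
    """Server for node_id in one square: least ID greater than node_id,
    wrapping around to the smallest ID if none is greater."""
    greater = [i for i in sorted(square_ids) if i > node_id]
    return greater[0] if greater else min(square_ids)

def location_servers(node_id: int, sibling_squares):
    # One location server per sibling square at this hierarchy level;
    # queries re-derive the same servers from the same ID ordering.
    return [location_server(node_id, ids) for ids in sibling_squares]
```

A query for the node's location walks the same ordering and hierarchy, so it converges on one of these servers without any lookup directory.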

Experiments using the ns simulator for up to 600 mobile nodes show that the storage and bandwidth requirements of GLS grow slowly with the size of the network. Furthermore, GLS tolerates node failures well: each failure has only a limited effect and query performance degrades gracefully as nodes fail and restart. The query performance of GLS is also relatively insensitive to node speeds. Simple geographic forwarding combined with GLS compares favorably with Dynamic Source Routing (DSR): in larger networks (over 200 nodes) our approach delivers more packets while consuming fewer network resources.

@inproceedings{grid:mobicom00,
  title        = {A Scalable Location Service for Geographic Ad Hoc
    Routing},
  author       = {Jinyang Li and John Jannotti and Douglas S. J. {De
    Couto} and David R. Karger and Robert Morris},
  pages        = {120--130},
  booktitle    = {Proceedings of the 6th {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '00)},
  year         = 2000,
  month        = aug,
  address      = {Boston, Massachusetts},
}
Programming language techniques for modular router configurations Eddie Kohler, Benjie Chen, M. Frans Kaashoek, Robert Morris, and Massimiliano Poletto. MIT LCS technical report, August 2000.

This paper applies programming language techniques to a high-level system description, both to optimize the system and to prove useful properties about it. The system in question is Click, a modular software router framework. Click routers are built from components called elements. Elements are written in C++, but the user creates a configuration using a simple, declarative data flow language. This language is amenable to data flow analysis and other conventional programming language techniques. Applied to a router configuration, these techniques have high-level results---for example, optimizing the router or verifying its high-level properties. This paper describes several programming language techniques that have been useful in practice, including optimization tools that remove virtual function calls from router definitions and remove redundant parts of adjacent routers. We also present performance results for an extensively optimized standards-compliant IP router. On conventional PC hardware, this router can forward up to 456,000 64-byte packets per second.
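The flavor of these optimizations can be sketched in a few lines. This is a hypothetical toy, not the paper's tools: a configuration is reduced to a linear pipeline of element names, and the pass drops elements known to be identities and collapses immediately repeated idempotent elements. The element names and the identity/idempotence classifications below are illustrative assumptions, not claims about real Click elements.

```python
# Toy data-flow pass over a router "configuration" modeled as a pipeline.
# Which elements are identities or idempotent is assumed for illustration.
IDENTITY = {"Null"}                 # assumed: forwards packets unchanged
IDEMPOTENT = {"CheckIPHeader"}      # assumed: applying twice == applying once

def optimize(pipeline):
    out = []
    for elt in pipeline:
        if elt in IDENTITY:
            continue                            # dead element: remove it
        if out and out[-1] == elt and elt in IDEMPOTENT:
            continue                            # redundant adjacent element
        out.append(elt)
    return out
```

The real analyses operate on a directed graph rather than a list and also devirtualize element method calls, but the shape is the same: a declarative configuration is data that a compiler pass can rewrite.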

@techreport{click:lcstr00,
  title        = {Programming language techniques for modular router
    configurations},
  author       = {Eddie Kohler and Benjie Chen and M. Frans Kaashoek
    and Robert Morris and Massimiliano Poletto},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 2000,
  month        = aug,
  number       = {MIT-LCS-TR-812},
}
The Click modular router Eddie Kohler, Robert Morris, Benjie Chen, John Jannotti, and M. Frans Kaashoek. ACM Transactions on Computer Systems 18(3), August 2000.

Click is a new software architecture for building flexible and configurable routers. A Click router is assembled from packet processing modules called elements. Individual elements implement simple router functions like packet classification, queueing, scheduling, and interfacing with network devices. A router configuration is a directed graph with elements at the vertices; packets flow along the edges of the graph. Several features make individual elements more powerful and complex configurations easier to write, including pull connections, which model packet flow driven by transmitting hardware devices, and flow-based router context, which helps an element locate other interesting elements.

Click configurations are modular and easy to extend. A standards-compliant Click IP router has sixteen elements on its forwarding path; some of its elements are also useful in Ethernet switches and IP tunneling configurations. Extending the IP router to support dropping policies, fairness among flows, or Differentiated Services simply requires adding a couple of elements in the right place. On conventional PC hardware, the Click IP router achieves a maximum loss-free forwarding rate of 333,000 64-byte packets per second, demonstrating that Click's modular and flexible architecture is compatible with good performance.
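The push/pull distinction can be modeled in miniature. This is a toy Python sketch, not Click's C++ API: the class names echo real Click elements (Queue, Counter, ToDevice) but their behavior is simplified to show the two connection styles, where upstream elements push packets into a queue and the transmitting device pulls them out when it is ready.

```python
# Toy model of a Click push path feeding a pull path through a Queue.
from collections import deque

class Queue:
    """Boundary between push and pull: accepts pushes, services pulls."""
    def __init__(self):
        self.packets = deque()
    def push(self, pkt):
        self.packets.append(pkt)
    def pull(self):
        return self.packets.popleft() if self.packets else None

class Counter:
    """Push element: counts packets, then pushes them downstream."""
    def __init__(self, downstream):
        self.count, self.downstream = 0, downstream
    def push(self, pkt):
        self.count += 1
        self.downstream.push(pkt)              # push connection

class ToDevice:
    """Driven by the 'hardware': pulls a packet when ready to transmit."""
    def __init__(self, upstream):
        self.upstream, self.sent = upstream, []
    def run(self):
        pkt = self.upstream.pull()             # pull connection
        if pkt is not None:
            self.sent.append(pkt)
```

A configuration is then just wiring these objects into a graph; the queue decouples the rate at which packets arrive from the rate at which the device drains them.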

@article{click:tocs00,
  title        = {The {Click} modular router},
  author       = {Eddie Kohler and Robert Morris and Benjie Chen and
    John Jannotti and M. Frans Kaashoek},
  volume       = 18,
  number       = 3,
  year         = 2000,
  month        = aug,
  pages        = {263--297},
  journal      = {{ACM} Transactions on Computer Systems},
}
MULTOPS: a data structure for denial-of-service attack detection Thomer M. Gil. Master's thesis, Vrije Universiteit, August 2000.
@mastersthesis{click:gil-ms,
  title        = {MULTOPS: a data structure for denial-of-service
    attack detection},
  author       = {Thomer M. Gil},
  school       = {Vrije Universiteit},
  year         = 2000,
  month        = aug,
}
Self-certifying file system David Mazières. Ph.D. thesis, Massachusetts Institute of Technology, May 2000.
@phdthesis{sfs:mazieres-phd,
  title        = {Self-certifying File System},
  author       = {David Mazi{\`e}res},
  school       = {Massachusetts Institute of Technology},
  year         = 2000,
  month        = may,
}
A study of caching in the internet domain name system Emil Sit. Master's thesis, Massachusetts Institute of Technology, May 2000.
@mastersthesis{click:sit-ms,
  title        = {A Study of Caching in the Internet Domain Name
    System},
  author       = {Emil Sit},
  school       = {Massachusetts Institute of Technology},
  year         = 2000,
  month        = may,
}
Flexible key management with SFS agents Michael Kaminsky. Master's thesis, Massachusetts Institute of Technology, May 2000.
@mastersthesis{sfs:kaminsky-ms,
  title        = {Flexible Key Management with {SFS} Agents},
  author       = {Michael Kaminsky},
  school       = {Massachusetts Institute of Technology},
  year         = 2000,
  month        = may,
}
Multiprocessing with the Exokernel operating system Benjie Chen. Master's thesis, Massachusetts Institute of Technology, February 2000.

Exokernel is a minimal operating system kernel that safely multiplexes hardware resources, while leaving all system abstractions to applications. An exokernel exhibits better performance and offers more functionality because applications can provide optimized system abstractions, at the user level, based on their needs. The current design of the exokernel system, however, does not support multiprocessor architectures. This thesis presents a symmetric multiprocessing exokernel and demonstrates that unprivileged library implementation of operating system abstractions is viable on a multiprocessor system.

This thesis focuses on three issues. First, it presents the synchronization strategies used in the kernel. Second, this thesis describes three new exokernel interfaces: message passing, kernel support for multithreading, and multiprocessor scheduling. Third, because exokernel applications do not trust each other, traditional synchronization primitives used to guard system abstractions, such as voluntary memory locks, do not function well. This thesis presents and evaluates a strategy for synchronization among untrusted processes. A multiprocessor exokernel and a synchronized library operating system result from this thesis. Performance analysis shows that the overheads of synchronization in both the kernel and the library operating system are small.

@mastersthesis{exo:chen-meng,
  title        = {Multiprocessing with the {Exokernel} Operating
    System},
  author       = {Benjie Chen},
  school       = {Massachusetts Institute of Technology},
  year         = 2000,
  month        = feb,
}

1999

The Click modular router Robert Morris, Eddie Kohler, John Jannotti, and M. Frans Kaashoek. SOSP 1999.

Please consider citing the journal version of this paper, which has improved explanations, more detail and examples, and significantly better performance results.

Click is a new software architecture for building flexible and configurable routers. A Click router is assembled from packet processing modules called elements. Individual elements implement simple router functions like packet classification, queueing, scheduling, and interfacing with network devices. Complete configurations are built by connecting elements into a graph; packets flow along the graph's edges. Several features make individual elements more powerful and complex configurations easier to write, including pull processing, which models packet flow driven by transmitting interfaces, and flow-based router context, which helps an element locate other interesting elements.

We demonstrate several working configurations, including an IP router and an Ethernet bridge. These configurations are modular---the IP router has 16 elements on the forwarding path---and easy to extend by adding additional elements, which we demonstrate with augmented configurations. On commodity PC hardware running Linux, the Click IP router can forward 64-byte packets at 73,000 packets per second, just 10% slower than Linux alone.

@inproceedings{click:sosp99,
  title        = {The {C}lick modular router},
  author       = {Robert Morris and Eddie Kohler and John Jannotti and
    M. Frans Kaashoek},
  pages        = {217--231},
  booktitle    = {Proceedings of the 17th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '99)},
  year         = 1999,
  month        = dec,
  address      = {Kiawah Island, South Carolina},
}
Separating key management from file system security David Mazières, Michael Kaminsky, M. Frans Kaashoek, and Emmett Witchel. SOSP 1999.
@inproceedings{sfs:sosp99,
  title        = {{S}eparating key management from file system
    security},
  author       = {David Mazi{\`e}res and Michael Kaminsky and M. Frans
    Kaashoek and Emmett Witchel},
  pages        = {124--139},
  booktitle    = {Proceedings of the 17th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '99)},
  year         = 1999,
  month        = dec,
  address      = {Kiawah Island, South Carolina},
}
Linear scan register allocation Massimiliano Poletto, and Vivek Sarkar. ACM Transactions on Programming Languages and Systems 21(5), September 1999.
@article{linearscan,
  title        = {Linear scan register allocation},
  author       = {Massimiliano Poletto and Vivek Sarkar},
  volume       = 21,
  number       = 5,
  year         = 1999,
  month        = sep,
  pages        = {895--913},
  journal      = {{ACM} Transactions on Programming Languages and
    Systems},
}
A readable TCP in the Prolac protocol language Eddie Kohler, M. Frans Kaashoek, and David R. Montgomery. SIGCOMM 1999.

Prolac is a new statically-typed, object-oriented language for network protocol implementation. It is designed for readability, extensibility, and real-world implementation; most previous protocol languages, in contrast, have been based on hard-to-implement theoretical models and have focused on verification. We present a working Prolac TCP implementation directly derived from 4.4BSD. Our implementation is modular---protocol processing is logically divided into minimally-interacting pieces; readable---Prolac encourages top-down structure and naming intermediate computations; and extensible---subclassing cleanly separates protocol extensions like delayed acknowledgements and slow start. The Prolac compiler uses simple global analysis to remove expensive language features like dynamic dispatch, resulting in end-to-end performance comparable to an unmodified Linux 2.0 TCP.

@inproceedings{prolac:sigcomm99,
  title        = {A readable {TCP} in the {Prolac} protocol language},
  author       = {Eddie Kohler and M. Frans Kaashoek and David R.
    Montgomery},
  pages        = {3--13},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '99 Conference:
    Applications, Technologies, Architectures, and Protocols for
    Computer Communication},
  year         = 1999,
  month        = aug,
  address      = {Cambridge, Massachusetts},
}
Language and compiler support for dynamic code generation Massimiliano Poletto. Ph.D. thesis, Massachusetts Institute of Technology, June 1999.
@phdthesis{tickc:poletto-phd,
  title        = {Language and compiler support for dynamic code
    generation},
  author       = {Massimiliano Poletto},
  school       = {Massachusetts Institute of Technology},
  year         = 1999,
  month        = jun,
}
The secure file system under Windows NT Matthew Rimer. Master's thesis, Massachusetts Institute of Technology, June 1999.
@mastersthesis{sfs:rimer-ms,
  title        = {The Secure File System under {Windows NT}},
  author       = {Matthew Rimer},
  school       = {Massachusetts Institute of Technology},
  year         = 1999,
  month        = jun,
}
Evolving software with an application-specific language Eddie Kohler, Massimiliano Poletto, and David R. Montgomery. WCSSS 1999.

Software systems can be developed through evolution (gradual change) or revolution (reimplementation from scratch). Both approaches have advantages and disadvantages. An evolutionary approach keeps the system working throughout, allowing early problem detection, but tends to retain ingrained design flaws and can result in complex, ad hoc systems. A revolutionary approach is required to change the basic architecture of a system, but many more resources must be invested before the system can be evaluated. In this paper, we describe how we used a little application-specific language to combine these approaches' advantages.

The context of our work is CTAS, the next-generation air traffic control automation system developed originally by NASA. The overall goal was to redesign and reimplement one of the CTAS processes in Java, while retaining its ability to communicate with unmodified processes---a project complicated by CTAS's ad hoc message formats. To address this, we designed a language that combines C code copied from CTAS source, to express the message formats, with new Java code for message actions. A compiler then automatically generates code for marshalling and unmarshalling. The result is a system with both evolutionary and revolutionary properties, exemplified by the use of both old CTAS code and new Java code in the message language.

This paper discusses the language and compiler and evaluates some of the engineering tradeoffs inherent in their design.
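The core idea of deriving marshalling code from a single message description can be sketched as follows. This is a hypothetical illustration in Python, not the CTAS language or its generated code: the message format string below is invented, and Python's struct module stands in for the layout information the compiler extracts from copied C declarations.

```python
# Sketch: one declarative layout, both marshal and unmarshal derived from it.
# The AIRCRAFT_POS format (uint32 id, two doubles, little-endian, packed) is
# an invented example, not an actual CTAS message.
import struct

AIRCRAFT_POS = struct.Struct("<Idd")   # id: uint32, lat/lon: doubles

def marshal(msg_id: int, lat: float, lon: float) -> bytes:
    return AIRCRAFT_POS.pack(msg_id, lat, lon)

def unmarshal(buf: bytes) -> dict:
    msg_id, lat, lon = AIRCRAFT_POS.unpack(buf)
    return {"id": msg_id, "lat": lat, "lon": lon}
```

Because both directions are generated from the same declaration, the wire format cannot drift between the old C processes and the new Java process, which is the property the paper's language is designed to preserve.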

@inproceedings{evolving-software:wcsss99,
  title        = {Evolving software with an application-specific
    language},
  author       = {Eddie Kohler and Massimiliano Poletto and David R.
    Montgomery},
  pages        = {94--102},
  booktitle    = {Workshop Record of {WCSSS} '99: The 2nd {ACM}
    {SIGPLAN} Workshop on Compiler Support for Systems Software},
  year         = 1999,
  month        = may,
  address      = {Atlanta, Georgia},
}
A fast Prolac TCP for the real world David Rogers Montgomery, Jr. Master's thesis, Massachusetts Institute of Technology, May 1999.
@mastersthesis{prolac:montgomery-meng,
  title        = {A fast {Prolac} {TCP} for the real world},
  author       = {Montgomery, Jr., David Rogers},
  school       = {Massachusetts Institute of Technology},
  year         = 1999,
  month        = may,
}
An x86 protected mode virtual machine monitor for the MIT Exokernel Charles L. Coffing. Master's thesis, Massachusetts Institute of Technology, May 1999.
@mastersthesis{exo:coffing-meng,
  title        = {An x86 Protected Mode Virtual Machine Monitor for
    the {MIT} {Exokernel}},
  author       = {Charles L. Coffing},
  school       = {Massachusetts Institute of Technology},
  year         = 1999,
  month        = may,
}
PAN: a high-performance active network node supporting multiple mobile code systems Erik L. Nygren, Stephen J. Garland, and M. Frans Kaashoek. OpenArch 1999.

A capsule-based active network transports capsules containing code to be executed on network nodes through which they pass. Active networks facilitate the deployment of new protocols, which can be used without any changes to the underlying network infrastructure. This paper describes the design, implementation, and evaluation of a high-performance active network node which supports multiple mobile code systems. Experiments, using capsules executing unsafe native Intel ix86 object code, indicate that active networks may be able to provide significant flexibility relative to traditional networks with only a small performance overhead (as little as 13% for 1500-byte packets). However, capsules executing JavaVM code performed far worse (with over three times the performance overhead of native code for 128-byte packets), indicating that mobile code system performance is critical to overall node performance.

@inproceedings{pan:openarch99,
  title        = {{PAN}: a high-performance active network node
    supporting multiple mobile code systems},
  author       = {Erik L. Nygren and Stephen J. Garland and M. Frans
    Kaashoek},
  pages        = {78--89},
  booktitle    = {Proceedings of the 2nd {IEEE} Conference on Open
    Architectures and Network Programming ({OpenArch} '99)},
  year         = 1999,
  month        = mar,
  address      = {New York, New York},
}
`C and tcc: A language and compiler for dynamic code generation Massimiliano Poletto, Wilson C. Hsieh, Dawson R. Engler, and M. Frans Kaashoek. ACM Transactions on Programming Languages and Systems 21(2), March 1999.
@article{tickc:toplas,
  title        = {{`C} and {tcc}: A language and compiler for dynamic
    code generation},
  author       = {Massimiliano Poletto and Wilson C. Hsieh and Dawson
    R. Engler and M. Frans Kaashoek},
  volume       = 21,
  number       = 2,
  year         = 1999,
  month        = mar,
  pages        = {324--369},
  journal      = {{ACM} Transactions on Programming Languages and
    Systems},
}

1998

The design, implementation and operation of an email pseudonym server David Mazières, and M. Frans Kaashoek. CCS 1998.
@inproceedings{nymserver:ccs5,
  title        = {The design, implementation and operation of an email
    pseudonym server},
  author       = {David Mazi{\`e}res and M. Frans Kaashoek},
  pages        = {27--36},
  booktitle    = {Proceedings of the 5th {ACM} Conference on Computer
    and Communications Security ({CCS-5})},
  year         = 1998,
  month        = nov,
  address      = {San Francisco, California},
}
The exokernel operating system architecture Dawson R. Engler. Ph.D. thesis, Massachusetts Institute of Technology, October 1998.
@phdthesis{exo:engler-phd,
  title        = {The exokernel operating system architecture},
  author       = {Dawson R. Engler},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = oct,
}
Escaping the evils of centralized control with self-certifying pathnames David Mazières, and M. Frans Kaashoek. SIGOPS European Workshop 1998.
@inproceedings{sfs:sigops-euro8,
  title        = {Escaping the evils of centralized control with
    self-certifying pathnames},
  author       = {David Mazi{\`e}res and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 8th {ACM} {SIGOPS} {E}uropean
    workshop: Support for composing distributed applications},
  year         = 1998,
  month        = sep,
  address      = {Sintra, Portugal},
}
Framework for implementing file systems in Windows NT Danilo Almeida. Master's thesis, Massachusetts Institute of Technology, May 1998.
@mastersthesis{sfs:almeida-ms,
  title        = {Framework for Implementing File Systems in {Windows
    NT}},
  author       = {Danilo Almeida},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = may,
}
Flexible and efficient sharing of protected abstractions George M. Candea. Master's thesis, Massachusetts Institute of Technology, May 1998.

Traditional operating systems are overly restrictive and do not allow user-level applications to modify operating system abstractions. The exokernel operating system architecture safely gives untrusted applications efficient control over hardware and software resources by separating management from protection. Decentralized control, however, makes it very difficult for mutually distrustful applications to share system abstractions.

This thesis presents the design, implementation, and evaluation of the protected abstraction mechanism (PAM), a novel way to safely share user-level abstractions in an exokernel. PAM enables unprivileged, untrusted applications to define and securely share generic abstractions at run-time. PAM achieves a good flexibility-performance combination by eliminating the need for context switches and optimizing for the common case, in which the same abstraction is invoked repeatedly. PAM's design emphasizes simplicity and provable correctness, which makes it easy to understand and use: a couple of manual pages are sufficient for the average programmer to start using PAM.

We report measurements of PAM's performance on null method calls. In spite of the fact that such invocations do not take advantage of PAM's context switch-free operation, the PAM version of a simple abstraction outperforms the equivalent LRPC implementation by over 15% on null method calls. It is also considerably easier to write an abstraction using PAM. We therefore believe the protected abstraction mechanism is a viable solution to the problem of safely sharing user-level abstractions in the exokernel.

@mastersthesis{exo:candea-meng,
  title        = {Flexible and efficient sharing of protected
    abstractions},
  author       = {George M. Candea},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = may,
}
Performance evaluation of the Orca shared-object system Henri Bal, Raoul Bhoedjang, Rutger Hofman, Ceriel Jacobs, Koen Langendoen, Tim Ruhl, and M. Frans Kaashoek. ACM Transactions on Computer Systems 16(1), February 1998.
@article{orca:tocs,
  title        = {Performance evaluation of the {Orca} shared-object
    system},
  author       = {Henri Bal and Raoul Bhoedjang and Rutger Hofman and
    Ceriel Jacobs and Koen Langendoen and Tim Ruhl and M. Frans
    Kaashoek},
  pages        = {1--40},
  month        = feb,
  year         = 1998,
  volume       = 16,
  number       = 1,
  journal      = {{ACM} Transactions on Computer Systems},
}
An efficient virtual network interface in the FUGU scalable workstation Kenneth M. Mackenzie. Ph.D. thesis, Massachusetts Institute of Technology, February 1998.
@phdthesis{fugu:mackenzie-phd,
  title        = {An efficient virtual network interface in the {FUGU}
    scalable workstation},
  author       = {Kenneth M. Mackenzie},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = feb,
}
Applying exokernel principles to conventional operating systems John Jannotti. Master's thesis, Massachusetts Institute of Technology, February 1998.
@mastersthesis{exo-os:jj-meng,
  title        = {Applying Exokernel Principles to Conventional
    Operating Systems},
  author       = {John Jannotti},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = feb,
}
The design and implementation of a high-performance active network node Erik L. Nygren. Master's thesis, Massachusetts Institute of Technology, February 1998.

A capsule-oriented active network transports capsules containing code to be executed on the network nodes that they pass through. This approach makes networks more extensible by allowing new networking protocols to be deployed and used without any changes to the underlying network infrastructure. This thesis project describes the design, implementation, and evaluation of a high-performance practical active network node that can serve as a testbed for research into active network performance and resource management issues. Nodes provide resources to executing capsules containing Intel ix86 object code. Although the current implementation does not yet provide safety or interoperability, the results of experiments performed on the system implemented for this thesis indicate that an active network architecture may be able to provide significant flexibility while only incurring a small performance overhead relative to traditional networks.

@mastersthesis{pan:nygren-meng,
  title        = {The design and implementation of a high-performance
    active network node},
  author       = {Erik L. Nygren},
  school       = {Massachusetts Institute of Technology},
  year         = 1998,
  month        = feb,
}

1997

Mobile computing with the Rover toolkit Anthony Joseph. Ph.D. thesis, Massachusetts Institute of Technology, November 1997.
@phdthesis{rover:adj-phd,
  title        = {Mobile computing with the {Rover} toolkit},
  author       = {Anthony Joseph},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = nov,
}
Application performance and flexibility on exokernel systems M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, Héctor M. Briceño, Russell Hunt, David Mazières, Thomas Pinckney, Robert Grimm, John Jannotti, and Kenneth Mackenzie. SOSP 1997.

The exokernel operating system architecture safely gives untrusted software efficient control over hardware and software resources by separating management from protection. This paper describes an exokernel system that allows specialized applications to achieve high performance without sacrificing the performance of unmodified UNIX programs. It evaluates the exokernel architecture by measuring end-to-end application performance on Xok, an exokernel for Intel x86-based computers, and by comparing Xok's performance to the performance of two widely-used 4.4BSD UNIX systems (FreeBSD and OpenBSD). The results show that common unmodified UNIX applications can enjoy the benefits of exokernels: applications either perform comparably on Xok/ExOS and the BSD UNIXes, or perform significantly better. In addition, the results show that customized applications can benefit substantially from control over their resources (e.g., a factor of eight for a Web server). This paper also describes insights about the exokernel approach gained through building three different exokernel systems, and presents novel approaches to resource multiplexing.

@inproceedings{exo:sosp97,
  title        = {Application performance and flexibility on exokernel
    systems},
  author       = {M. Frans Kaashoek and Dawson R. Engler and Gregory
    R. Ganger and H{\'e}ctor M. Brice{\~n}o and Russell Hunt and David
    Mazi{\`e}res and Thomas Pinckney and Robert Grimm and John
    Jannotti and Kenneth Mackenzie},
  pages        = {52--65},
  booktitle    = {Proceedings of the 16th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '97)},
  year         = 1997,
  month        = oct,
  address      = {Saint-Mal{\^o}, France},
}
Shared libraries in an exokernel operating system Douglas Karl Wyatt. Master's thesis, Massachusetts Institute of Technology, September 1997.
@mastersthesis{exo:wyatt-meng,
  title        = {Shared libraries in an exokernel operating system},
  author       = {Douglas Karl Wyatt},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = sep,
}
Prolac: a language for protocol compilation Eddie Kohler. Master's thesis, Massachusetts Institute of Technology, September 1997.
@mastersthesis{prolac:kohler-ms,
  title        = {Prolac: a language for protocol compilation},
  author       = {Eddie Kohler},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = sep,
}
Security and decentralized control in the SFS global file system David Mazières. Master's thesis, Massachusetts Institute of Technology, August 1997.
@mastersthesis{sfs:mazieres-ms,
  title        = {Security and decentralized control in the {SFS}
    global file system},
  author       = {David Mazi{\`e}res},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = aug,
}
tcc: a system for fast, flexible, and high-level dynamic code generation Massimiliano Poletto, Dawson R. Engler, and M. Frans Kaashoek. PLDI 1997.
@inproceedings{tickc:pldi97,
  title        = {tcc: a system for fast, flexible, and high-level
    dynamic code generation},
  author       = {Massimiliano Poletto and Dawson R. Engler and M.
    Frans Kaashoek},
  pages        = {109--121},
  booktitle    = {Proceedings of the {ACM} {SIGPLAN} '97 Conference on
    Programming Language Design and Implementation ({PLDI} '97)},
  year         = 1997,
  month        = jun,
  address      = {Las Vegas, Nevada},
}
Secure applications need flexible operating systems David Mazières, and M. Frans Kaashoek. HotOS 1997.
@inproceedings{secure-apps:hotos6,
  title        = {Secure applications need flexible operating systems},
  author       = {David Mazi{\`e}res and M. Frans Kaashoek},
  pages        = {56--61},
  booktitle    = {Proceedings of the 6th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-VI})},
  year         = 1997,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Chatham, Cape Cod, Massachusetts},
}
Mobile computing with the Rover toolkit Anthony D. Joseph, Joshua A. Tauber, and M. Frans Kaashoek. IEEE Transactions on Computers 46(3), March 1997.
@article{rover:ieee-toc,
  title        = {Mobile computing with the {Rover} toolkit},
  author       = {Anthony D. Joseph and Joshua A. Tauber and M. Frans
    Kaashoek},
  volume       = 46,
  number       = 3,
  year         = 1997,
  month        = mar,
  pages        = {337--352},
  journal      = {{IEEE} Transactions on Computers},
}
Operating system extensibility through event capture Thomas Pinckney III. Master's thesis, Massachusetts Institute of Technology, February 1997.
@mastersthesis{exo:pinckney-meng,
  title        = {Operating system extensibility through event capture},
  author       = {Thomas {Pinckney III}},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = feb,
}
Decentralizing UNIX abstractions in the exokernel architecture Héctor Manuel Briceño Pulido. Master's thesis, Massachusetts Institute of Technology, February 1997.
@mastersthesis{exo:briceno-meng,
  title        = {Decentralizing {UNIX} abstractions in the exokernel
    architecture},
  author       = {H{\'e}ctor Manuel {Brice{\~n}o Pulido}},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = feb,
}
Embedded inodes and explicit grouping: exploiting disk bandwidth for small files Gregory R. Ganger, and M. Frans Kaashoek. USENIX 1997.

Small file performance in most file systems is limited by slowly improving disk access times, even though current file systems improve on-disk locality by allocating related data objects in the same general region. The key insight for why current file systems perform poorly is that locality is insufficient: exploiting disk bandwidth for small data objects requires that they be placed adjacently. We describe C-FFS (Co-locating Fast File System), which introduces two techniques, embedded inodes and explicit grouping, for exploiting what disks do well (bulk data movement) to avoid what they do poorly (repositioning to new locations). With embedded inodes, the inodes for most files are stored in the directory with the corresponding name, removing a physical level of indirection without sacrificing the logical level of indirection. With explicit grouping, the data blocks of multiple small files named by a given directory are allocated adjacently and moved to and from the disk as a unit in most cases. Measurements of our C-FFS implementation show that embedded inodes and explicit grouping have the potential to increase small file throughput (for both reads and writes) by a factor of 5-7 compared to the same file system without these techniques. The improvement comes directly from reducing the number of disk accesses required by an order of magnitude. Preliminary experience with software-development applications shows performance improvements ranging from 10-300 percent.

@inproceedings{cffs:usenix97,
  title        = {Embedded inodes and explicit grouping: exploiting
    disk bandwidth for small files},
  author       = {Gregory R. Ganger and M. Frans Kaashoek},
  pages        = {1--17},
  booktitle    = {Proceedings of the 1997 {USENIX} Annual Technical
    Conference (USENIX '97)},
  year         = 1997,
  month        = jan,
  address      = {Anaheim, California},
}
High-performance application-specific networking Deborah Anne Wallach. Ph.D. thesis, Massachusetts Institute of Technology, January 1997.
@phdthesis{app-specific-networking:wallach-phd,
  title        = {High-performance application-specific networking},
  author       = {Deborah Anne Wallach},
  school       = {Massachusetts Institute of Technology},
  year         = 1997,
  month        = jan,
}
Building reliable mobile-aware applications using the Rover toolkit Anthony D. Joseph, and M. Frans Kaashoek. ACM Wireless Networks 3(5), 1997.
@article{rover:winet,
  title        = {Building reliable mobile-aware applications using
    the {Rover} toolkit},
  author       = {Anthony D. Joseph and M. Frans Kaashoek},
  volume       = 3,
  number       = 5,
  year         = 1997,
  pages        = {405--419},
  journal      = {{ACM} Wireless Networks},
}

1996

Building reliable mobile-aware applications using the Rover toolkit Anthony D. Joseph, Joshua A. Tauber, and M. Frans Kaashoek. MobiCom 1996.
@inproceedings{rover:mobicom,
  title        = {Building reliable mobile-aware applications using
    the {Rover} toolkit},
  author       = {Anthony D. Joseph and Joshua A. Tauber and M. Frans
    Kaashoek},
  booktitle    = {Proceedings of the 2nd {ACM} International
    Conference on Mobile Computing and Networking ({MobiCom} '96)},
  year         = 1996,
  month        = nov,
  address      = {Rye, New York},
}
Dynamic computation migration in distributed shared memory systems Wilson C. Hsieh, M. Frans Kaashoek, and William E. Weihl. Supercomputing 1996.
@inproceedings{dynamic-migration:supercomp96,
  title        = {Dynamic computation migration in distributed shared
    memory systems},
  author       = {Wilson C. Hsieh and M. Frans Kaashoek and William E.
    Weihl},
  booktitle    = {Supercomputing '96 Conference Proceedings: The
    international conference on high performance computing and
    communications},
  organization = {{ACM}},
  year         = 1996,
  month        = nov,
  address      = {Pittsburgh, Pennsylvania},
}
Server operating systems M. Frans Kaashoek, Dawson R. Engler, Gregory R. Ganger, and Deborah A. Wallach. SIGOPS European Workshop 1996.

We introduce server operating systems, which are sets of abstractions and runtime support for specialized, high-performance server applications. We have designed and are implementing a prototype server OS with support for aggressive specialization, direct device-to-device access, an event-driven organization, and dynamic compiler-assisted ILP. Using this server OS, we have constructed an HTTP server that outperforms servers running on a conventional OS by more than an order of magnitude and that can safely timeshare the hardware platform with other applications.

@inproceedings{server-os:sigops-euro,
  title        = {Server operating systems},
  author       = {M. Frans Kaashoek and Dawson R. Engler and Gregory
    R. Ganger and Deborah A. Wallach},
  pages        = {141--148},
  booktitle    = {Proceedings of the 7th {ACM} {SIGOPS} {E}uropean
    workshop: Systems support for worldwide applications},
  year         = 1996,
  month        = sep,
  address      = {Connemara, Ireland},
}
ASHs: application-specific handlers for high-performance messaging Deborah A. Wallach, Dawson R. Engler, and M. Frans Kaashoek. SIGCOMM 1996.
@inproceedings{ash:sigcomm96,
  title        = {{ASHs}: application-specific handlers for
    high-performance messaging},
  author       = {Deborah A. Wallach and Dawson R. Engler and M. Frans
    Kaashoek},
  pages        = {40--52},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '96 Conference:
    Applications, Technologies, Architectures, and Protocols for
    Computer Communication},
  year         = 1996,
  month        = aug,
  address      = {Stanford, California},
}
DPF: fast, flexible message demultiplexing using dynamic code generation Dawson R. Engler, and M. Frans Kaashoek. SIGCOMM 1996.
@inproceedings{dpf:sigcomm96,
  title        = {{DPF}: fast, flexible message demultiplexing using
    dynamic code generation},
  author       = {Dawson R. Engler and M. Frans Kaashoek},
  pages        = {53--59},
  booktitle    = {Proceedings of the {ACM} {SIGCOMM} '96 Conference:
    Applications, Technologies, Architectures, and Protocols for
    Computer Communication},
  year         = 1996,
  month        = aug,
  address      = {Stanford, California},
}
The Rover NNTP proxy Constantine Cristakos. Advanced Undergraduate Project, Massachusetts Institute of Technology, June 1996.
@mastersthesis{rover:nntp,
  title        = {The {Rover} {NNTP} proxy},
  author       = {Constantine Cristakos},
  school       = {Massachusetts Institute of Technology},
  year         = 1996,
  month        = jun,
  type         = {Advanced Undergraduate Project},
}
An evaluation of the Amoeba group communication system M. Frans Kaashoek, and Andrew S. Tanenbaum. ICDCS 1996.
@inproceedings{amoeba-eval:dcs16,
  title        = {An evaluation of the {Amoeba} group communication
    system},
  author       = {M. Frans Kaashoek and Andrew S. Tanenbaum},
  pages        = {436--448},
  booktitle    = {Proceedings of the 16th International Conference on
    Distributed Computing Systems},
  organization = {{IEEE} {C}omputer {S}ociety},
  year         = 1996,
  month        = may,
  address      = {Hong Kong},
}
Atomic recovery units: failure atomicity for logical disks Robert Grimm, Wilson C. Hsieh, Wiebren de Jonge, and M. Frans Kaashoek. ICDCS 1996.
@inproceedings{arus:dcs16,
  title        = {Atomic recovery units: failure atomicity for logical
    disks},
  author       = {Robert Grimm and Wilson C. Hsieh and Wiebren de
    Jonge and M. Frans Kaashoek},
  pages        = {26--37},
  booktitle    = {Proceedings of the 16th International Conference on
    Distributed Computing Systems},
  organization = {{IEEE} {C}omputer {S}ociety},
  year         = 1996,
  month        = may,
  address      = {Hong Kong},
}
VCODE: a retargetable, extensible, very fast dynamic code generation system Dawson R. Engler. PLDI 1996.
@inproceedings{vcode:pldi96,
  title        = {{VCODE}: a retargetable, extensible, very fast
    dynamic code generation system},
  author       = {Dawson R. Engler},
  pages        = {160--170},
  booktitle    = {Proceedings of the {ACM} {SIGPLAN} '96 Conference on
    Programming Language Design and Implementation ({PLDI} '96)},
  year         = 1996,
  month        = may,
  address      = {Philadelphia, Pennsylvania},
}
Exodisk: maximizing application control over storage management Robert Grimm. Master's thesis, Massachusetts Institute of Technology, May 1996.
@mastersthesis{exo:grimm-ms,
  title        = {Exodisk: maximizing application control over storage
    management},
  author       = {Robert Grimm},
  school       = {Massachusetts Institute of Technology},
  year         = 1996,
  month        = may,
}
Issues in building mobile-aware applications with the Rover toolkit Joshua A. Tauber. Master's thesis, Massachusetts Institute of Technology, May 1996.
@mastersthesis{rover:tauber-ms,
  title        = {Issues in building mobile-aware applications with
    the {Rover} toolkit},
  author       = {Joshua A. Tauber},
  school       = {Massachusetts Institute of Technology},
  year         = 1996,
  month        = may,
}
tcc: a template-based compiler for `C Massimiliano Poletto, Dawson R. Engler, and M. Frans Kaashoek. WCSSS 1996.
@inproceedings{tickc:wcsss96,
  title        = {tcc: a template-based compiler for {`C}},
  author       = {Massimiliano Poletto and Dawson R. Engler and M.
    Frans Kaashoek},
  pages        = {1--7},
  booktitle    = {Workshop Record of {WCSSS} '96: The Inaugural
    Workshop on Compiler Support for Systems Software},
  organization = {{ACM} {SIGPLAN}},
  year         = 1996,
  month        = feb,
  address      = {Tucson, Arizona},
}
`C: A language for efficient, machine-independent dynamic code generation Dawson R. Engler, Wilson C. Hsieh, and M. Frans Kaashoek. POPL 1996.
@inproceedings{tickc:popl96,
  title        = {{`C}: A language for efficient, machine-independent
    dynamic code generation},
  author       = {Dawson R. Engler and Wilson C. Hsieh and M. Frans
    Kaashoek},
  pages        = {131--144},
  note         = {An earlier version is available as MIT-LCS-TM-526},
  booktitle    = {Proceedings of the 23rd {ACM} {SIGPLAN}-{SIGACT}
    Symposium on Principles of Programming Languages ({POPL} '96)},
  year         = 1996,
  month        = jan,
  address      = {St. Petersburg Beach, Florida},
}

1995

Rover: a toolkit for mobile information access Anthony D. Joseph, Alan F. deLespinasse, Joshua A. Tauber, David K. Gifford, and M. Frans Kaashoek. SOSP 1995.
@inproceedings{rover:sosp95,
  title        = {{Rover}: a toolkit for mobile information access},
  author       = {Anthony D. Joseph and {deLespinasse}, Alan F. and
    Joshua A. Tauber and David K. Gifford and M. Frans Kaashoek},
  pages        = {156--171},
  booktitle    = {Proceedings of the 15th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '95)},
  year         = 1995,
  month        = dec,
  address      = {Copper Mountain Resort, Colorado},
}
Exokernel: an operating system architecture for application-level resource management Dawson R. Engler, M. Frans Kaashoek, and James O'Toole Jr. SOSP 1995.
@inproceedings{exo:sosp95,
  title        = {{E}xokernel: an operating system architecture for
    application-level resource management},
  author       = {Dawson R. Engler and M. Frans Kaashoek and James
    {O'Toole Jr.}},
  pages        = {251--266},
  booktitle    = {Proceedings of the 15th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '95)},
  year         = 1995,
  month        = dec,
  address      = {Copper Mountain Resort, Colorado},
}
CRL: high-performance all-software distributed shared memory Kirk L. Johnson, M. Frans Kaashoek, and Deborah A. Wallach. SOSP 1995.
@inproceedings{crl:sosp95,
  title        = {{CRL}: high-performance all-software distributed
    shared memory},
  author       = {Kirk L. Johnson and M. Frans Kaashoek and Deborah A.
    Wallach},
  pages        = {213--226},
  note         = {An earlier version of this work appeared as
    Technical Report MIT-LCS-TM-517, MIT Laboratory for Computer
    Science, March 1995},
  booktitle    = {Proceedings of the 15th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '95)},
  year         = 1995,
  month        = dec,
  address      = {Copper Mountain Resort, Colorado},
}
High-performance all-software distributed shared memory Kirk L. Johnson. Ph.D. thesis, Massachusetts Institute of Technology, December 1995.
@phdthesis{crl:johnson-phd,
  title        = {High-performance all-software distributed shared
    memory},
  author       = {Kirk L. Johnson},
  school       = {Massachusetts Institute of Technology},
  year         = 1995,
  month        = dec,
}
Dynamic computation migration in distributed shared memory systems Wilson C. Hsieh. Ph.D. thesis, Massachusetts Institute of Technology, September 1995.
@phdthesis{dyn-comp-migration:hsieh-phd,
  title        = {Dynamic computation migration in distributed shared
    memory systems},
  author       = {Wilson C. Hsieh},
  school       = {Massachusetts Institute of Technology},
  year         = 1995,
  month        = sep,
  note         = {Also available as MIT LCS tech report
    MIT-LCS-TR-665.},
}
Optimistic active messages: a mechanism for scheduling communication with computation Deborah A. Wallach, Wilson C. Hsieh, Kirk Johnson, M. Frans Kaashoek, and William E. Weihl. PPoPP 1995.
@inproceedings{oam:ppopp95,
  title        = {Optimistic active messages: a mechanism for
    scheduling communication with computation},
  author       = {Deborah A. Wallach and Wilson C. Hsieh and Kirk
    Johnson and M. Frans Kaashoek and William E. Weihl},
  pages        = {217--226},
  booktitle    = {Proceedings of the 5th {ACM} {SIGPLAN} Symposium on
    Principles and Practice of Parallel Programming ({PPoPP} '95)},
  year         = 1995,
  month        = jul,
  address      = {Santa Barbara, California},
}
Implementing sequentially consistent shared objects using broadcast and point-to-point communication Alan Fekete, M. Frans Kaashoek, and Nancy Lynch. ICDCS 1995.
@inproceedings{formal-sequential-consistent:dcs15,
  title        = {Implementing sequentially consistent shared objects
    using broadcast and point-to-point communication},
  author       = {Alan Fekete and M. Frans Kaashoek and Nancy Lynch},
  pages        = {439--449},
  booktitle    = {Proceedings of the 15th International Conference on
    Distributed Computing Systems},
  organization = {{IEEE} {C}omputer {S}ociety},
  year         = 1995,
  month        = jun,
  address      = {Vancouver, British Columbia},
}
Implementing sequentially consistent shared objects using broadcast and point-to-point communication Alan Fekete, M. Frans Kaashoek, and Nancy Lynch. MIT LCS technical report, June 1995.
@techreport{formal-sequential-consistent:tr,
  title        = {Implementing sequentially consistent shared objects
    using broadcast and point-to-point communication},
  author       = {Alan Fekete and M. Frans Kaashoek and Nancy Lynch},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 1995,
  month        = jun,
  number       = {MIT-LCS-TM-518},
}
Rover Mosaic: e-mail communication for a full-function Web browser Alan F. deLespinasse. Master's thesis, Massachusetts Institute of Technology, June 1995.
@mastersthesis{rover-mosaic:delespinasse-thesis,
  title        = {{Rover} {Mosaic}: e-mail communication for a
    full-function {Web} browser},
  author       = {Alan F. {deLespinasse}},
  school       = {Massachusetts Institute of Technology},
  year         = 1995,
  month        = jun,
}
Exterminate all operating system abstractions Dawson R. Engler, and M. Frans Kaashoek. HotOS 1995.
@inproceedings{exo:hotos5,
  title        = {Exterminate all operating system abstractions},
  author       = {Dawson R. Engler and M. Frans Kaashoek},
  pages        = {78--83},
  booktitle    = {Proceedings of the 5th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-V})},
  year         = 1995,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Orcas Island, Washington},
}
AVM: application-level virtual memory Dawson R. Engler, Sandeep K. Gupta, and M. Frans Kaashoek. HotOS 1995.
@inproceedings{avm:hotos5,
  title        = {{AVM}: application-level virtual memory},
  author       = {Dawson R. Engler and Sandeep K. Gupta and M. Frans
    Kaashoek},
  pages        = {72--77},
  booktitle    = {Proceedings of the 5th {W}orkshop on {H}ot {T}opics
    in {O}perating {S}ystems ({HotOS-V})},
  year         = 1995,
  month        = may,
  organization = {{IEEE} {C}omputer {S}ociety},
  address      = {Orcas Island, Washington},
}
The operating system kernel as a secure programmable machine Dawson R. Engler, M. Frans Kaashoek, and James W. O'Toole Jr. Operating Systems Review 29(1), January 1995.
@article{exo:osr,
  title        = {The operating system kernel as a secure programmable
    machine},
  author       = {Dawson R. Engler and M. Frans Kaashoek and {O'Toole
    Jr.}, James W.},
  year         = 1995,
  month        = jan,
  volume       = 29,
  number       = 1,
  pages        = {78--82},
  journal      = {Operating Systems Review},
  organization = {{ACM}},
}

1994

Dynamic documents: mobile wireless access to the WWW M. Frans Kaashoek, Tom Pinckney, and Joshua A. Tauber. WMCSA 1994.

We propose dynamic documents as an approach to extending and customizing the WWW/Mosaic for mobile computing platforms. Dynamic documents are programs executed on a mobile platform to generate a document; they are implemented as Tcl scripts. We have modified the NCSA Mosaic web client to run the dynamic documents it retrieves through a modified Tcl interpreter. The interpreter is designed to execute only commands that do not violate safety.

To hide the latencies of slow links we have modified the Mosaic client to perform caching and prefetching. The policies for caching and prefetching can be under control of dynamic documents, allowing the strategies to be document-specific. Using dynamic documents, we have built an adaptive email browser that employs application-specific caching and prefetching strategies. Both the browser and the displayed email messages are dynamically customized to the mobile computing environment in which they run.

@inproceedings{dynamic-documents:wmcsa,
  title        = {Dynamic documents: mobile wireless access to the
    {WWW}},
  author       = {M. Frans Kaashoek and Tom Pinckney and Joshua A.
    Tauber},
  pages        = {179--184},
  booktitle    = {Proceedings of the Workshop on Mobile Computing
    Systems and Applications ({WMCSA} '94)},
  organization = {{IEEE} {C}omputer {S}ociety},
  year         = 1994,
  month        = dec,
  address      = {Santa Cruz, California},
}
Storage alternatives for mobile computers Fred Douglis, Ramón Cáceres, M. Frans Kaashoek, Kai Li, Brian Marsh, and Joshua A. Tauber. OSDI 1994.
@inproceedings{mobile-storage-alt:osdi1,
  title        = {Storage alternatives for mobile computers},
  author       = {Fred Douglis and Ram{\'o}n C{\'a}ceres and M. Frans
    Kaashoek and Kai Li and Brian Marsh and Joshua A. Tauber},
  pages        = {25--37},
  booktitle    = {Proceedings of the 1st {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '94)},
  year         = 1994,
  month        = nov,
  address      = {Monterey, California},
}
The exokernel approach to extensibility (panel statement) Dawson R. Engler, M. Frans Kaashoek, and James W. O'Toole Jr. OSDI 1994.
@inproceedings{exo:osdi1,
  title        = {The exokernel approach to extensibility (panel
    statement)},
  author       = {Dawson R. Engler and M. Frans Kaashoek and {O'Toole
    Jr.}, James W.},
  pages        = 198,
  booktitle    = {Proceedings of the 1st {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '94)},
  year         = 1994,
  month        = nov,
  address      = {Monterey, California},
}
Software prefetching and caching for translation lookaside buffers Kavita Bala, M. Frans Kaashoek, and William Weihl. OSDI 1994.
@inproceedings{software-tlb-prefetch:osdi1,
  title        = {Software prefetching and caching for translation
    lookaside buffers},
  author       = {Kavita Bala and M. Frans Kaashoek and William Weihl},
  pages        = {243--253},
  booktitle    = {Proceedings of the 1st {USENIX} {S}ymposium on
    {O}perating {S}ystems {D}esign and {I}mplementation ({OSDI} '94)},
  year         = 1994,
  month        = nov,
  address      = {Monterey, California},
}
Dynamic documents: extensibility and adaptability in the WWW M. Frans Kaashoek, Tom Pinckney, and Joshua A. Tauber. WWW 1994.
@inproceedings{dynamic-documents:www94,
  title        = {Dynamic documents: extensibility and adaptability in
    the {WWW}},
  author       = {M. Frans Kaashoek and Tom Pinckney and Joshua A.
    Tauber},
  edition      = {developers' day track},
  booktitle    = {Proceedings of the 2nd International {WWW}
    Conference: Mosaic and the Web},
  year         = 1994,
  month        = oct,
  address      = {Chicago, Illinois},
}
DCG: an efficient, retargetable dynamic code generation system Dawson R. Engler, and Todd A. Proebsting. ASPLOS 1994.
@inproceedings{dcg:asplos6,
  title        = {{DCG}: an efficient, retargetable dynamic code
    generation system},
  author       = {Dawson R. Engler and Todd A. Proebsting},
  pages        = {263--272},
  booktitle    = {Proceedings of the 6th International Conference on
    Architectural Support for Programming Languages and Operating
    Systems ({ASPLOS-VI})},
  year         = 1994,
  month        = oct,
  address      = {San Jose, California},
}
The operating system kernel as a secure programmable machine Dawson R. Engler, M. Frans Kaashoek, and James W. O'Toole Jr. SIGOPS European Workshop 1994.
@inproceedings{exo:sigops-euro,
  title        = {The operating system kernel as a secure programmable
    machine},
  author       = {Dawson R. Engler and M. Frans Kaashoek and {O'Toole
    Jr.}, James W.},
  pages        = {62--67},
  booktitle    = {Proceedings of the 6th {ACM} {SIGOPS} {E}uropean
    workshop: Matching operating systems to application needs},
  year         = 1994,
  month        = sep,
  address      = {Dagstuhl Castle, Wadern, Germany},
}
Efficient implementation of high-level languages on user-level communication architectures Wilson C. Hsieh, Kirk L. Johnson, M. Frans Kaashoek, Deborah A. Wallach, and William E. Weihl. MIT LCS technical report, May 1994.
@techreport{user-level-comm:tr,
  title        = {Efficient implementation of high-level languages on
    user-level communication architectures},
  author       = {Wilson C. Hsieh and Kirk L. Johnson and M. Frans
    Kaashoek and Deborah A. Wallach and William E. Weihl},
  institution  = {{MIT} Laboratory for Computer Science},
  year         = 1994,
  month        = may,
  number       = {MIT-LCS-TR-616},
}

1993

The logical disk: a new approach to improving file systems Wiebren de Jonge, M. Frans Kaashoek, and Wilson C. Hsieh. SOSP 1993.
@inproceedings{logicaldisk:sosp14,
  title        = {The logical disk: a new approach to improving file
    systems},
  author       = {Wiebren de Jonge and M. Frans Kaashoek and Wilson C.
    Hsieh},
  pages        = {15--28},
  booktitle    = {Proceedings of the 14th {ACM} {S}ymposium on
    {O}perating {S}ystems {P}rinciples ({SOSP} '93)},
  year         = 1993,
  month        = dec,
  address      = {Asheville, North Carolina},
}
The persistent relevance of IPC performance Wilson C. Hsieh, M. Frans Kaashoek, and William E. Weihl. WWOS 1993.
@inproceedings{ipc-persistent-relevance:wwos4,
  title        = {The persistent relevance of {IPC} performance},
  author       = {Wilson C. Hsieh and M. Frans Kaashoek and William E.
    Weihl},
  pages        = {186--190},
  booktitle    = {Proceedings of the 4th Workshop on Workstation
    Operating Systems},
  organization = {{IEEE} {C}omputer {S}ociety},
  year         = 1993,
  month        = oct,
  address      = {Napa, California},
}