Q: Has RadixVM been adopted by other operating systems?
A: Not that I know of. Replacing the complete VM system, which is what the paper proposes, is a gigantic change. Lots of Linux subsystems have their fingers in the VM code, so it is not just a matter of replacing one implementation of an interface with another. This kind of change would involve quite a bit of discussion among the maintainers of the VM system, and some exploration to see if there are any serious downsides. These kinds of changes do happen, but slowly, and mostly when there is a burning problem. Although the paper addresses a real problem in Linux, it is unclear whether it is a burning problem that needs to be addressed right now.

Q: How do researchers decide whether to base their implementations on Linux versus a research kernel such as xv6?
A: In OS conferences you see both papers that modify existing OSes and papers that prototype ideas in simpler kernels. It depends on the research question being investigated. If the question is relevant to existing OSes, then it is more convincing to modify an existing OS to demonstrate the solution. However, in some cases it is difficult for a small team to modify an existing OS to implement their solution (as in this paper), and researchers resort to prototyping in a simpler kernel. In that case, the hope is that the prototype might convince developers of an existing OS to implement the idea in their system, either to evaluate it further or to adopt it. If the question is less relevant to existing OSes (e.g., researchers are exploring a radical new overall design for an OS), then there is little choice other than to prototype the design with a simple kernel. Of course, it would be ideal to build a fully functional OS in the new design style, but that is typically impossible with a small team of researchers.

Q: Why does Metis scale near-linearly on RadixVM but very poorly on Linux?
A: Because in Linux the data structure for a process's address space is protected by a single lock, which becomes a bottleneck if many threads of the same process want to modify the address space.

Q: What is a good length for an epoch? A longer length would avoid flushing to the global reference count, but it seems like it would cause memory to be freed up much more slowly, which might be a problem.
A: Indeed, that is the key trade-off. Linux also uses epochs for RCU-based concurrent data structures, and has a similar trade-off. In practice, 10 ms seems to be a reasonable number to use: not so short that the overhead of the epoch scheme dominates, and not so long that memory reclamation is noticeably delayed. I don't know if anyone has studied this carefully with realistic application workloads.

Q: What kinds of applications are likely to benefit from RadixVM?
A: The observation in the RadixVM paper is that any multithreaded application that calls mmap/munmap a lot runs the risk of not scaling, because of the lock that protects the VM data structures for a process. All the threads in a process share the same VM data structures. In practice, developers have worked around this bottleneck by modifying their applications to allocate a lot of memory at once and hold on to it, so that there are few mmap/munmap calls. This is also the case for the application where Bonsai does well: it allocates memory in large chunks (8 MB). If it allocates in more standard sizes (64 KB), then the application suffers with Bonsai. RadixVM spares developers the work of modifying their applications. Whether the pain of modifying applications justifies a new VM design is unclear at this point.

Q: Is RadixVM's worse sequential performance compared to Linux due solely to its high memory overhead?
A: No. The Linux VM system also spends less time executing instructions.
If I remember correctly, the Linux page-fault handler is much more streamlined than RadixVM's page-fault handler. In general, the Linux developers have spent a significant amount of energy making the VM system execute fast on a single core, which the RadixVM developers didn't do.

Q: In real life applications, would memory consumption be a problem for RadixVM?
A: I don't know. The paper's design is careful about memory consumption and the experimental results bear that out. However, the VM system hasn't been used in the real world; there might be hidden problems (including memory use) that show up with some real life applications. Engineers who want to adopt RadixVM in their OSes would first do some more exploration with real life applications before committing to the paper's design.

Q: What is reference delta caching?
A: This refers to the fact that each core doesn't update the shared ref count directly. Instead, each core maintains a delta that reflects its changes to the shared ref count (e.g., if the core increments 3 times and decrements once, then the delta is +2). The caching part refers to the idea that RadixVM does this only for frequently accessed ref counts, since scalability matters for those, and not for all ref counts (which keeps the memory pressure of ref counts under control).

Q: What is the reason for having weak reference counts?
A: Here's an example. Consider the buffer cache. Each entry in the buffer cache has a pointer to an in-memory page that contains a disk block. When no kernel thread is using that disk block, the ref count in the entry is zero. The buffer cache may not want to delete the page, however, in case a thread comes along and wants to read the disk block. Weak references make it possible to support this scenario. The entry keeps a weak reference to the page along with the ref count. If a thread wants to read the disk block, then the buffer cache can try to revive the weak reference by calling tryget.
If successful, tryget returns a reference to the page to the calling thread, after clearing the dying bit and incrementing the ref count (since one thread now holds a reference).

Q: How does one judge the complexity of an implementation? RadixVM was about 4000 lines of code, and might be complicated to implement correctly. But the paper seems to downplay this complexity.
A: The paper makes an implicit comparison with an alternative design that isn't documented but only hinted at in the paper. In this alternative design, the VM system makes use of lock-less concurrent data structures (such as concurrent skip lists). With those data structures it is more difficult to implement mmap, munmap, etc. correctly, because those designs try to avoid locking mappings for pages. RadixVM's implementation of mmap and munmap is easier than this alternative design because it does use locks for mappings. That doesn't mean that RadixVM is a simple design; it is just simpler than an even more complicated design.

Q: How does RadixVM keep track of which cores have a particular VA in their TLB?
A: It doesn't keep track directly, but approximates conservatively by allocating per-core page tables. If a core has mapped a page, it is possible that the page is in its local TLB. If a core hasn't mapped a page, then the page is certainly not in that core's TLB. When performing TLB shootdown, RadixVM sends shootdown messages only to the cores that have a page mapped, since they could have it in their TLBs.
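
To make the conservative approximation in the last answer concrete, here is a minimal sketch in Python. It is a toy model, not RadixVM's code: the class and method names (PerCorePageTables, fault_in, shootdown_targets, unmap) are made up for illustration, and a per-core set of virtual page numbers stands in for a real per-core page table.

```python
class PerCorePageTables:
    """Toy model: one 'page table' per core, modeled as a set of mapped
    virtual page numbers (VPNs). All names here are hypothetical."""

    def __init__(self, num_cores):
        self.tables = [set() for _ in range(num_cores)]

    def fault_in(self, core, vpn):
        # A page fault on `core` installs the mapping only in that core's
        # own page table, so other cores' tables are unaffected.
        self.tables[core].add(vpn)

    def shootdown_targets(self, vpns):
        # Conservative approximation: a core that has mapped one of the
        # pages MIGHT have it in its TLB; a core that never mapped any of
        # them cannot, so it needs no shootdown message.
        wanted = set(vpns)
        return [core for core, table in enumerate(self.tables)
                if table & wanted]

    def unmap(self, vpns):
        # Remove the mappings and return the cores that would be sent a
        # shootdown message (the message itself is not modeled).
        wanted = set(vpns)
        targets = self.shootdown_targets(wanted)
        for core in targets:
            self.tables[core] -= wanted
        return targets


vm = PerCorePageTables(num_cores=4)
vm.fault_in(0, 0x10)   # cores 0 and 2 touch page 0x10
vm.fault_in(2, 0x10)
vm.fault_in(3, 0x20)   # core 3 touches a different page
targets = vm.unmap([0x10])
print(targets)         # cores 1 and 3 are not interrupted
```

In this model, unmapping page 0x10 only involves the cores that faulted it in. With a single page table shared by all cores, this information would not be available, and an unmap would have to assume that any core running the process might have the page in its TLB.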