Q: In Singularity, is it possible to run programs written in unsafe languages, such as C/C++/assembly?

A: The system that the paper describes could not run programs written in anything other than Sing#. To support programs written in unsafe languages, Singularity would have to use hardware VM protection for isolation. Also, since IPC relies on all programs following the channel contracts, that would somehow have to be verified or enforced for programs not written in Sing#.

Q: A problem with this new way of designing systems is backwards compatibility. Most of the systems we have seen so far have major differences in design, but at a certain point can all be made to act very similarly (POSIX has been implemented for all major systems, and could be implemented for JOS, even if inefficiently). Singularity, however, is too different, which really prevents platform-independent software. Why use such a system if the vast majority of preexisting software cannot be easily ported to it? Isn't this an insurmountable hurdle to adoption?

A: Singularity only runs programs written in Sing# -- it doesn't run programs written in C. Most people run lots of C programs on their laptops, web servers, &c, and the effort required to re-implement all that software in Sing# would be huge. For most people, that cost would be far greater than the benefits they'd get from switching to Singularity.

Q: In general, operating systems on typical computers don't seem to be catered to a particular language. Typical programming languages are built based on the properties of the OS, not the other way around. From a normal user's perspective, this enables freedom of choice in what language to use (although some are better than others). When would it be a good idea to use what basically seems like an OS + language combo? It seems a little impractical, although the considerations in designing the OS are interesting.

A: An O/S that can support many languages does seem more generally useful than one that's tied to a single language. For that reason (and probably others) I suspect it would be difficult to persuade users of laptops, desktops, and web servers to adopt Singularity. On the other hand, someone who cared a lot about reliability might think Singularity's approach was great. For example, if you are designing the embedded controller for a car's brake system, you might be very willing to write all the software in a single type-safe language, and care not at all about support for other languages. Singularity was a research project to explore questions like "what might the benefits be if programs were required to be written in a single type-safe language?" Now we know a lot about the answers, and we can make more informed decisions on these topics. In that sense Singularity is a successful research project. The authors are not trying to persuade anyone to use Singularity; at most they are trying to persuade other O/S designers to think about their ideas.

Q: If Singularity is not being used for performance reasons, what aspects of Singularity are used in modern operating systems?

A: Microsoft and others use lots of automated tools to help verify correctness and find bugs in systems code, an approach which Singularity espoused. The linear type system idea has been picked up and extended by Rust, a language intended for (among other things) kernel development.
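To make the Rust connection concrete, here is a minimal sketch in ordinary Rust (not Sing#, and not Singularity's API) of how move semantics approximate a linear type: once a value is handed off, the original name can no longer be used, which is how Singularity treats blocks passed over channels.

    fn transfer(buf: Vec<u8>) -> usize {
        // this function now owns buf; the caller's binding is no longer usable
        buf.len()
    }

    fn main() {
        let buf = vec![0u8; 16];
        let n = transfer(buf); // ownership moves, like sending a block on a channel
        // buf[0] = 1;         // compile error: buf was moved above
        println!("transferred {} bytes", n);
    }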
The general idea of signed software components with verified properties (the manifests) is older than Singularity, but has been gradually creeping into operating systems for a long time.

Q: When can we expect an OS like Singularity to be in use?

A: I expect a gradual shift to type-safe languages, but it's not clear that that implies there will be a need or desire for operating systems like Singularity. I think one lesson from the paper is that the ideas, while interesting and thought-provoking, do not provide benefits so compelling that anyone is going to be motivated to switch their laptop or web server from Linux to Singularity. It's more likely that future systems will adopt individual techniques (e.g. channel contracts for IPC).

Q: Is it very practical / used outside of research?

A: I don't think Singularity is (or was) used outside the Singularity research group. It probably would not be practical for any of us to use Singularity instead of, say, Linux, because we'd have to first re-write our favorite programs from C to Sing#. It would be more practical if you wanted to build a small self-contained system, like an embedded controller.

Q: Have there been other "memory safe" OSs developed since Singularity? Has this kind of work not seen attention from the community because of the performance overheads imposed by type and memory safety?

A: I'm not aware of a subsequent single-address-space operating system. There was a previous project you might be interested in called Spin. The idea of using high-level languages in the kernel is alive and well; have a look at the Rust language. I think Singularity's idea of using a single safe language for everything hasn't caught on because 1) it imposes significant limitations, such as not being able to run existing programs written in C &c, and 2) the benefits are only modest, so people aren't very motivated to switch from whatever they are currently doing. The general idea of using automated analysis tools to look for bugs in systems code, on the other hand, has become a lot more popular in the last 10 or 15 years.

Q: Does the reliance on 1970s OS designs also stem in large part from the prevailing CPU architecture coming from the 1970s? If so, how does one address the chicken-and-egg problem of the cooperation of new OS and architecture technology?

A: I don't think O/S architecture is significantly limited by hardware design right now. Most operating system services and abstractions are much higher-level than the hardware. For example, file systems require a disk (or SSD or perhaps RAM) to store the files, but file system architecture is not really constrained by what the disk can do. Similarly, network protocols like TCP require a network, but are architecturally fairly independent of the details of the network. I think O/S architectural innovation is driven mostly by high-level demand (e.g. the recent need for virtualization and containers in cloud computing). Of course the O/S changes all the time at a more detailed level in response to hardware changes. There has been a huge amount of this over the decades: operating systems have added support for paging, networks, graphical displays, big address spaces, multi-core, virtualization hardware, &c. A situation in which a new incompatible O/S might be attractive is a brand-new platform for which compatibility isn't an issue. Smart phones were like this when they first arrived.

Q: How hard is it to write something for Singularity?
A: I don't know -- I've never used Singularity; my only basis for opinion is what's in the paper. My guess is it wouldn't be harder to write a given piece of code in Sing# than in C. In practice I think you would be more limited by lack of libraries &c than by Singularity's design.

Q: How does garbage collection provide memory safety?

A: GC is one part of Singularity's overall plan for memory safety. The Sing# byte-code verifier does much of the work for memory safety, by checking that the program never fabricates pointers. That is, it's OK for the program to use a pointer it got from the memory allocator, or to store pointers in data structures and then read them and use them later, but the verifier checks that the program never creates invalid pointers like

    char *p = (char *) 9999; // cast an integer to a pointer

One of the things that might go wrong is that the program might have a valid pointer, but then the memory the pointer points to is freed and re-used for an object with a different type. If that could happen, the program could dereference the pointer and perhaps use an arbitrary integer as a pointer. This is a common bug in C code, because C requires the program to explicitly free() memory. For example:

    #include <stdlib.h>

    struct T1 { char *p; };
    struct T2 { int x; };

    void fn(void) {
        struct T1 *t1 = malloc(sizeof(struct T1));
        free(t1);
        struct T2 *t2 = malloc(sizeof(struct T2)); // maybe now t2 == t1
        t2->x = 9999;
        t1->p[0] = 1; // use after free: writes to address 9999 -- illegal!
    }

The garbage collector guarantees that memory is never freed and re-used while there is still a valid pointer pointing to it. The collector does this by looking at all the live pointers (on the stack, in registers, in reachable data), and only freeing memory that isn't reachable by any chain of pointers. So, in Sing#, the above code would have no free(t1), and the collector would notice that the program still had a reference to *t1 at the end of the function, and not free/reuse it until after the last use.

Q: Why is doing everything in software faster? I've heard that hardware acceleration generally makes things faster/better.

A: For process isolation, the traditional hardware approach involves paging, which costs a little bit of time (maybe a cycle) on each memory reference while the CPU hardware checks the page table (or TLB) to see if the access is allowed. Singularity instead performs static verification of programs before they run, which allows Singularity (in many cases) to do nothing at all at run-time to enforce isolation. So in this case, the software approach of doing nothing at run-time is faster than the hardware approach of checking each memory access against the page table.

Q: How are larger or more demanding programs loaded into SIPs? Given that you cannot dynamically load code into a SIP after it is created, how would you circumvent the restriction on program size when using programs whose binaries are too large to all fit in memory at once?

A: I agree that Singularity programs have to fit in RAM. If you were desperate, you could split a program into pieces, and run only one piece at a time.

Q: At first I thought that several SIPs could access an endpoint on a channel, but given the mutual-exclusion guarantee mentioned in the last paragraph of section 3.2.1, does this mean that only one SIP can access an endpoint of a channel (and therefore a locking scheme is not required to perform mutual exclusion)?

A: Only one SIP can have a reference to each endpoint. There is never a situation where two SIPs could try to send on the same endpoint (or receive on it) at the same time, so I think you're right that SIPs don't have to lock endpoints. On the other hand the sender might send at the same time that the receiver receives, so the Singularity designers must have taken steps to make this potential race work out correctly. The paper says that channel contracts are required to enforce alternation in message direction, which probably helps. Maybe also the send and receive system call implementations (inside the kernel) use locks.
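A rough way to see the "exactly one owner, so no locks" argument, again in ordinary Rust rather than Sing# (the Endpoint type here is hypothetical): because the endpoint is moved rather than shared, the compiler rejects any second user.

    use std::thread;

    // hypothetical endpoint type; deliberately not Clone, so it cannot be aliased
    struct Endpoint {
        id: u32,
    }

    impl Endpoint {
        fn send(&mut self, msg: u32) {
            println!("endpoint {} sends {}", self.id, msg);
        }
    }

    fn main() {
        let mut ep = Endpoint { id: 1 };
        let handle = thread::spawn(move || {
            ep.send(42); // this thread is now the endpoint's only owner
        });
        // ep.send(7);   // compile error: ep was moved into the thread above
        handle.join().unwrap();
    }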
Q: Why do the authors of Singularity view excessive SIP creation as an issue? One SIP is associated with one security principal, which seems to be an appropriate usage pattern.

A: The authors are thinking of situations where a single server needs to act as different users (principals) at different times. Since there's no way in Singularity to change a SIP's principal, the only way to act as a different principal is to create a new SIP with the different principal.

Q: Could the exchange heap become a bottleneck for communication because of the zero-copy allocation mechanism?

A: I suspect that use of the exchange heap could be very efficient, but the coding style might be a bit awkward. The exchange heap is in shared memory, so it has the potential to be very fast by avoiding copies. But it might be hard in practice to avoid copying data from the sending SIP's heap to the exchange heap, and from the exchange heap into the receiving SIP's heap. I think applications that care a lot about performance in this area might, through careful coding, be able to manipulate data directly in the exchange heap, without copying. On the other hand, computers can copy data from memory to memory at gigabytes per second, so few applications are likely to be limited by the need to copy data into and out of the exchange heap.
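As an analogy for the zero-copy handoff (this uses Rust's standard channels, not Singularity's exchange heap): sending a heap allocation moves the pointer, not the bytes, which is the effect the exchange heap is designed to get.

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        let block: Box<[u8]> = vec![7u8; 1 << 20].into_boxed_slice();
        let sender = thread::spawn(move || {
            tx.send(block).unwrap(); // moves the pointer; the megabyte is not copied
        });
        let block = rx.recv().unwrap(); // the receiver now owns the same allocation
        println!("received {} bytes without copying them", block.len());
        sender.join().unwrap();
    }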
Q: How is the safety of an MBP verified? What if there is an error in the manifest?

A: Have a look at this paper for more about manifests and sealing: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2006-51.pdf

Q: Doesn't the MBP feature limit portability across systems?

A: I imagine the authors intend MBPs to allow programs to be portable from one Singularity computer to another, as long as both provide the API and server versions that the program needs.

Q: It seems that linked stacks could be inefficient if one is calling several short-lived functions in a row from a stack frame whose stack pointer is near the end of one of the linked segments, because the OS would constantly be linking and unlinking a new stack segment to make space for each new function call. The benefit of linked stacks is stated in the paper as reducing thread memory overhead, but it seems they could have more overhead, from linking/unlinking and fragmentation, than a standard dedicated-stack approach in situations like this. Are linked stacks actually worth it in practice, or do such situations prevent their adoption?

A: You are right, there is a potential bad case in which there are repeated calls right at the boundary of a stack segment. I don't know if Sing# has a solution. Maybe the problem doesn't come up very often for Singularity programs. Fixed-size thread stacks are also a potential problem, due to memory use. They are fine if you have only hundreds of threads. But there are styles of programming that involve thousands of threads, and frequent thread creation/deletion, and then big fixed-size stacks can be a disaster. People use both schemes. Most programs are happy either way; a few programs really need one or the other.

Q: Does it result in unusually verbose code? For example, the contract-based channels seem to require a very detailed and complete contract, covering exactly every possible state of the channel.

A: Sometimes things like contracts turn out to make programming easier, because they make interfaces clearer to programmers, or they allow use of automated tools. Ordinary type-checking is like that -- I'm glad that I have to tell the C compiler the types of my variables. And, having written lots of IPC and RPC code, I have found that automated support for typed messages is a huge win. But I don't know for the specific case of Singularity contracts.

Q: I don't quite get the difference between a manifest and a contract. They both seem to be imposing restrictions/rules on how programs should behave.

A: A contract declares the types of messages sent via IPC between SIPs. E.g. a contract might say "an OPEN request to the file server must consist of a string (the file name)." Manifests describe all the different parts of a program that it needs in order to run, e.g. which libraries it uses, what devices (if any) it needs access to, which servers it will talk to. When you run a program, the O/S uses the manifest to get the program started.
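This isn't Sing# syntax, but a contract's message declarations can be pictured as plain algebraic data types. A hypothetical file-server contract might pin down something like the following; a real contract also specifies the legal ordering of messages, e.g. that each Request is answered by exactly one Response.

    // hypothetical file-server message types, sketched in Rust
    #[allow(dead_code)]
    enum Request {
        Open { name: String }, // "an OPEN request must consist of a string (the file name)"
        Read { nbytes: usize },
    }

    #[allow(dead_code)]
    enum Response {
        Data(Vec<u8>),
        Error(i32),
    }

    fn main() {
        let _req = Request::Open { name: String::from("README") };
    }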
Q: In the second-to-last paragraph on page 2, the authors mention that context switches between software-isolated processes (SIPs) have very low overhead because the TLB and virtually-addressed caches need not be flushed. I understand that they don't need to be flushed because the two processes being switched between are running in the same virtual address space (because memory protection between processes is software- rather than hardware-based). It seems that even though the TLB and cache don't need to be flushed, as the two processes are working in non-overlapping areas of memory (since they can't share), there will still be a lot of cache misses (and possibly TLB misses) when the switched-to process starts running. Does invalidating the TLB and cache (the actual operations, not any fallout that might come from them) actually have a high enough cost that removing those operations significantly increases the speed of process switching? (It seems that most of the fallout that normally comes from flushing the TLB and cache would happen anyway on a process switch, as the two processes are not working from the same space in virtual/physical memory.)

A: For a pagetable-per-process O/S like xv6 or Linux, flushing the TLB is not itself an expensive operation. But the consequence is that there will be lots of TLB misses immediately after the flush, each of which takes dozens of cycles. Singularity doesn't use the TLB at all, and thus doesn't suffer from TLB misses. Singularity and its applications will still incur data and instruction cache misses, and the miss rates will probably be higher after context switches. So context switch will still have some indirect cost.

Q: In the first paragraph of section 3.3.2 (Scheduler) on page 5, it says: "Whenever a scheduling timer interrupt occurs, all threads in the unblocked list are moved to the end of the preempted list, followed by the thread that was running when the timer fired. Then, the first thread from the unblocked list is scheduled and the scheduling timer is reset." I don't understand how this works: if, when a scheduling timer interrupt occurs, we move all threads on the unblocked list to the preempted list (so that the unblocked list is empty), how can we then schedule the first thread from the unblocked list? Does a scheduling timer interrupt do more than just signal for another process to be run? For that matter, why do all the unblocked threads move to the preempted list? Do they somehow lose priority over the preempted threads once the timer interrupt happens? Or is it because the threads in the unblocked list are threads that were unblocked in the last quantum, between this scheduling interrupt and the last one, and so theoretically don't have priority over threads in the preempted list, which might have been unblocked the quantum before, etc.? That doesn't quite explain the "first thread from the (empty) unblocked list" part, though.

A: I don't understand the paper's discussion of scheduling queues.

Q: What are some of the limitations on the types of applications that can be built on top of Singularity, given the many restrictions imposed by the system?

A: The main limitation is that applications have to be written in Sing#. They can't be written in C, for example. So Singularity cannot run any of the actual applications I currently use. If one is willing to re-write applications in Sing#, I'm not aware of any serious limitations. You have to live with Sing#'s rules (e.g. type safety), but they seem no more limiting than those of many mainstream languages (e.g. Java). Singularity doesn't support some O/S features, such as read/write shared memory; but that feature is a convenience or performance trick, not a necessity for any application.

Q: With respect to section 2.3 (last line), why is it inconvenient to handle error conditions at send operations? Wouldn't it be easier to track the cause of the error and handle it at send operations than when it is received?

A: I do not understand this aspect of the paper. Maybe part of the justification is that all programs have to handle the case in which they never get a response because the target died while handling the request -- i.e. a receive error. So it's most convenient if programs have to handle only this kind of error.
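One way to see why the receive side has to handle failure anyway: even a send that succeeds can be followed by the peer dying before it replies. A small illustration with Rust's standard channels (an analogy, not Singularity's IPC):

    use std::sync::mpsc;

    fn main() {
        let (tx, rx) = mpsc::channel::<u32>();
        drop(tx); // the "server" dies before ever replying
        match rx.recv() {
            Ok(reply) => println!("reply: {}", reply),
            Err(e) => println!("no reply is coming: {}", e), // the error surfaces at the receive
        }
    }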