Singularity is a Microsoft Research experimental O/S many people, many papers, reasonably high profile choice of problems maybe influenced by msft experience w/ windows we can speculate about influence on msft products Stated goals increase robustness, security particularly w.r.t. extensions decrease unexpected interactions incorporate modern techniques High level structure microkernel: kernel, processes, IPC they claim to have factored services into user processes (page 5) NIC, TCP/IP, FS, disk driver (sealing paper) kernel: processes, memory, some IPC, nameserver UNIX compatibility is not a goal, so avoiding some Mach pitfalls on the other hand there are 192 system calls (page 5) Most radical part of design: Only one address space (paging turned off, no use of segments) kernel and all processes User processes run w/ full h/w privs (CPL=0) Why is that useful? Performance Fast process switching: no page table switch Fast system calls: CALL not INT Fast IPC: no copying Direct user program access to h/w, for e.g. device drivers Table 1 shows they are a lot faster at microbenchmarks But their main goal wasn't performance! robustness, security, interactions Is *not* using pagetable protection consistent w/ goal of robustness? unreliability comes from *extensions* browser plug-ins, loadable kernel modules, &c typically loaded into host program's address space for speed and convenience so VM h/w already not relevant can we just do without hardware protection? How would an extension work in Singularity? e.g. device driver, new network protocol, browser plug-in Separate process, communicate w/ host process via IPC What do we think the challenges will be for single address space? Prevent evil or buggy programs from writing each other or kernel Support kill and exit -- avoid entangling general SIP philosophy: "sealed" No modification from outside: none of JOS calls that take target envid argument (except start/stop) probably no debugger only IPC No modification from within: no JIT, no class loader, no dynamically loaded libraries SIP rules only pointers to your own data no pointers to other SIP data or into kernel thus no sharing despite shared address space! limited exception for IPC messages in exchange heap SIP can allocate pages of memory from kernel different allocations are not contiguous Why so crucial that SIPs can't be modified? Can't even modify themselves? What are the benefits? no code insertion attacks probably easier to reason about correctness probably easier to optimize, inline e.g. delete unused functions SIP can be a security principle, own files Is it worth the pain? Why not like Java VM, can share all data? SIPs rule out all inter-process interactions except explicit via IPC SIPs more robust SIPs let every process have its own language run-time, GC scheme, &c though they are trusted and better not have bugs equivalent in sensitivity to kernel code so will be much harder for people to cook up their own SIPs make it easy for kernel to clean up after kill or exit How to keep SIPs from reading/writing other SIPs? Only read/write memory the kernel has given you Have compiler generate code to check every access? "does this point to memory the kernel gave us?" Would slow code down (esp since mem isn't contig) We don't trust compiler Why the overall structure: 1. compile to bytecodes 2. verify bytecodes during install 3. compile bytecodes -> machine code during install 4. run the verified machine code w/ trusted runtime Why not compile to machine code? Why not JIT at run time? Why not verify at compile time? Why not verify at run time? What does bytecode verification buy Singularity? Does it verify "only r/w memory kernel gave us"? Not exactly, but related: Only use reachable pointers [draw diagram] Cannot create a new pointer only trusted runtime can create pointers So if kernel/runtime never supply out-of-SIP pointers verified SIP can only use its own memory What does the verifier have to check to establish that? A. Don't cook up pointers (only use pointers someone gives you) B. Don't change mind about type Would allow violation of A, e.g. interpret int as pointer C. Don't use after free Re-use might change type, violate B Enforced with GC (and exchange heap linearity) D. Don't use uninitialized variables D. In general, don't trick the verifier Example? R0 <- new SomeClass; jmp L1 ... R0 <- 1000 jmp L1 ... L1: mov (R0) -> R1 potential problem: last mov is OK if via 1st jmp (assuming ptr legitimate) reads first element of SomeClass not OK if via 2nd jmp 0x1000 may point into kernel verifier tries to deduce type for every register by pretending to execute along each code path requires that all paths to a reg use result in same type check that all reg uses OK for type would decide R0 has type int, or type SomeClass * either way, verifier would say "no" Bytecode verification seems to do *more* than Singularity needs e.g. cooking up pointers might be OK, as long as within SIP's memory verifier may forbid some programs that might have been OK on Singularity Benefits of full verification: Fast execution, often don't need runtime checks at all Though some still needed: array bounds, OO casts, stack expansion Type check IPC types Need to allow r/w of exchange heap, but it is not SIP's memory Stack page allocation Do sys calls run on stack in SIP's memory? prevent thread X from wrecking thread Y's kernel syscall stack You could put an interpreter in a SIP to evade ban on self-modifying code Would that cause trouble? What parts are trusted vs untrusted? That is: All s/w has bugs Trusted s/w: if it has bugs, it can crash Singularity or wreck other SIPs Untrusted s/w: if it has bugs, can only wreck itself Let's consider some ordinary app, not a server. compiler. compiler output. verifier. verifier output. GC. Paper also talks about IPC How do SIPs communicate? endpoints, channels recv endpoint is a queue of messages message bodies are in "exchange heap" cool: no copy Exchange heap is shared memory! What are the dangers? send the wrong type of data modify my msg to you while you are using it modify a totally unrelated message use up all exchange heap memory and don't free How do they prevent abuse via exchange heap? verifier ensures SIP bytecodes keep only one ptr to anything in exchange heap never e.g. two and that SIP doesn't keep ptr after send() single-ptr rule helps here verifier knows when last ptr goes away via send via making another exchange heap obj point to it via delete single ptr rule prevents change-after-send and also ensures delete when done delete is explicit, no GC, but it's OK since verifier guarantees only a single ptr to each block runtime maintains owning-SIP entry in each exchg heap block updates on send() &c used to clean up in exit() What are channel contracts for? Are they just nice to have, or do other parts of Singularity rely on them? The type signatures clearly are important. bytecode verifier (or something similar) must check them. The state machine part guarantees finite queues, no blocking send(). and also catches protocol implementation errors e.g. sending msg when not expected How does receive work? checks endpoints in shared mem, block on condition variable if no msgs so send must do a wakeup syscall How do system calls into the kernel work? INT? CALL? what stack? since same stack, how does GC know? can a SIP pass pointers to kernel? Endpoints function as capabilities Can pass them Can't talk to other SIPs w/o a channel Page 5 says they use channels to restrict access to e.g. files Does evaluation support their claims? Robustness? Good model for extensions? Performance? e.g. real win from single address space, cheap syscall, switch, IPC Table 1, but only microbenchmarks Figure 5: unsafe code tax physical memory -- means paging disabled -- is this Singularity? Add 4KB pages -- means turn on paging, but single page table, all CPL=0 separate domain -- separate page table for one of the SIPs, so switching costs ring 3 -- CPL=3 thus INT costs (for just one of the SIPs) full microkernel -- pgtable+INT for each of three SIPs