6.828 2012 Lecture 17: Language/OS Co-Design / Singularity Why are we looking at this paper? completely different approach to isolation, protection language type-checking rather than hardware page protection Singularity is a Microsoft Research experimental O/S many people, many papers, reasonably high profile stated goals: better robustness, security ground-up design w/ modern techniques High level structure microkernel: SIP/thread mgmt, memory, IPC setup not so micro: 192 system calls (page 5) user-level service processes NIC, TCP/IP, FS, disk driver (sealing paper) not UNIX compatible Most radical part of design: No hardware protection! Paging is turned off, so all memory visible to all instructions CPL=0, so can always run privileged instructions Instead: programming language protections Why is that useful? Performance Fast process switching: no page table switch Fast system calls: CALL not INT Fast IPC: no copying Direct user program access to h/w, for e.g. device drivers Table 1 shows they are a lot faster at microbenchmarks Q: why does no paging contribute to their main goal, robustness? Remember what paging buys us: Protection Contiguous address space (starts at zero &c) Big arrays Contiguous stack Flexible address layout via mapping No fragmentation of physical memory Sharing/IPC via multiple mappings Tricks like copy-on-write fork, paging to disk Challenges for no paging / CPL=0 design? Read/write only to appropriate memory And define what inappropriate means! Allow allocation and freeing of memory Allow interaction (IPC) But not too much entangling, so kill/exit can work How to ensure SIP reads/writes only its own memory? Q: why not have compiler generate check code before each load/store? speed, trust The paper's approach: source compiler bytecodes install: verify and compile machine code trusted run-time running sip Q: why not compile source to machine code? Q: why verify/compile at install time? why not at run time -- JIT? What properties does verification establish? Only use reachable pointers [draw diagram] only trusted runtime can create pointers So if kernel/runtime never supply out-of-SIP pointers verified SIP can only use its own memory How does verification work? What does the verifier have to check? A. Don't invent or modify pointers B. Don't change mind about type Would allow violation of A, e.g. interpret int as pointer C. Don't use after free Re-use might change type, violate B Enforced with GC (and exchange heap linearity) D. Don't use uninitialized variables E. In general, don't trick the verifier Example bytecodes: R0 <- new SomeClass; jmp L1 ... R0 <- 1000 jmp L1 ... L1: mov (R0) -> R1 Q: is this code OK? verifier tries to deduce type for every register by pretending to execute along each code path requires that all paths to a reg use result in same type check that all reg uses OK for type verifying the example: R0 has type SomeClass at first jmp to L1 R0 has type integer at second jmp to L1 so verifier would reject this code Bytecode verification seems to do *more* than Singularity needs e.g. cooking up pointers might be OK, as long as within SIP's memory so verifier may forbid some OK programs this style of verification is off-the-shelf enforcing exactly what Singularity needs is not Singularity may actually need full verification can't allow jump to data, even if data is in process's memory since then executing unchecked code What parts of verification scheme are trusted vs untrusted? That is: All s/w has bugs Trusted s/w: if it has bugs, it can crash Singularity or wreck other SIPs Untrusted s/w: if it has bugs, can only wreck itself Source? Compiler? Compiler output? Verifier? Machine code output of bytecode compiler? Runtime / GC? IPC: what would we want? shared memory for efficiency send complex data structures but still have isolation, type checking How do SIPs communicate? IPC messages "exchange heap" -- memory shared among all SIPs thus zero-copy -- efficient msgs can have pointers &c thus can send complex data structures each receiver has a queue in the exchange heap send() system call to wake up receiving SIP receiver blocks in recv() sys call, then checks queue Q: dangers of shared-memory exchange heap? write someone else's message send the wrong type of data modify my msg while you are reading it use up all exchange heap memory and don't free How do they prevent abuse via exchange heap? verifier ensures SIP can only use ptrs someone gives it i.e. only if you allocated mem, or found it in your recv queue verifier ensures SIP bytecodes keep only one ptr to anything in exchange heap never e.g. two and that SIP doesn't keep ptr after send() single-ptr rule helps here verifier knows when last ptr goes away via send via making another exchange heap obj point to it via delete single ptr rule prevents change-after-send and also ensures delete when done delete is explicit, no GC, but it's OK since verifier guarantees only a single ptr to each block runtime maintains owning-SIP entry in each exchg heap block updates on send() &c used to clean up in exit() Limitations of exchange heap idea IPC can't carry existing language object -- not in exchange heap Single-pointer rule limits the code you write Need to use different types/functions for exchange heap data What are channel contracts for? Section 2.2 Are they just nice to have, or do other parts of Singularity rely on them? The type signatures clearly are important they probably mesh with verified language types perhaps you can't talk to a SIP that isn't verified to follow contract The state machine part guarantees finite queues, no blocking send(). and also catches protocol implementation errors e.g. sending msg when not expected How do system calls into the kernel work? INT? CALL? what stack? can a SIP pass pointers to kernel? how does SIP GC know to not examine kernel part of stack? Q: SIP allocates single pages -- how to have stack > 4096 bytes? Q: How to have array > 4096 bytes? 2.1 says SIPs are "sealed" Outlawed: JIT, dynamic library loader, self-modifying, debugger (?) Q: Why is this important? no code insertion attacks maybe easier to reason about correctness maybe easier to optimize, inline e.g. delete unused functions SIP can be a security principle, own files You could put an interpreter in a SIP to evade ban on self-modifying code Would that cause trouble? Why not use a Java VM as your operating system? Java has verification -- you can't make up pointers &c Could have a Java thread for each running application Singularity vs Java VM: One SIP can *never* affect another SIP's memory Not even with IPC Would be easy to have such bugs w/ interacting Java threads Exiting/killing a SIP releases all resources Java must at least wait for a GC SIPs let every process have its own language run-time, GC scheme, &c What should the evaluation show? What were their goals? To gain robustness -- perhaps better than paging / CPL=3 To re-examine traditional design choices What does the evaluation show? Mostly about performance Table 1 shows microbenchmark performance 10x reduction in sys calls -- why? 2x faster thread switch -- why? 5x faster IPC -- why? 2x-10x faster process creation -- why? Figure 5: unsafe code tax How much do they gain by static verification rather than run-time (or h/w) checks? Simple file reading benchmark -- client SIP, file server SIP, device driver SIP Figure 5 compare run-times; lower is better physical memory -- Singularity (no paging, CPL=0, static verification, &c) No runtime checks -- Singularity but no array bounds checking Add 4KB pages -- paging enabled, but single page table, all CPL=0 separate domain -- separate page table for one of the SIPs, so switching costs ring 3 -- CPL=3 thus INT costs (for just one of the SIPs) full microkernel -- pgtable+INT for each of three SIPs Figure 5 is useful: shows costs of various x86 features What did we learn? Are 1960s and 70s techniques now inadequate? Should we use verification &c instead of paging hardware? We *did* learn how to build O/S w/o paging -- very interesting! A few open questions: What are manifests for? Can IPC carry a capability? How does kernel learn? Does IPC receiver have to check msg format at run-time? Why is the exchange heap data reference counted? When can the count be > 1? When is it OK for SIP user code to execute a CPL=0 privileged instruction?