Address Spaces

The primary way programs can interact is by reading and writing memory.
  Isolation is achievable by restricting memory access.
  Sharing means allowing controlled memory access.
We'll start with a very clean and simple virtual memory model that hides
  everything behind an "address space" abstraction.
Over the next few lectures we'll come to regret that simplicity;
  it turns out programs can do cool stuff by knowing what's really going on.

Why isolate?
  [picture 1: physical address space, kernel and processes packed together.]
  600,000 lines of code in GNU cc -- surely it has bugs.
    Do I want it writing to random places in the kernel or my emacs?
  Other users on Athena dialup machines are developing TCP proxies,
    or trying to read my secret data.
  We want to force clean interfaces;
    programmers would take advantage of unrestricted sharing... yuck.

Why a contiguous address space starting at 0?
  Easier to program if addresses are predictable.
  Hard to build large data structures if memory is non-contiguous;
    what if Emacs wants to expand? [picture]
  The virtual machine model is easy to understand.
  Isolation is natural -- a program can't even name other programs' memory.
  [picture 2: multiple virtual address spaces]

Alternate plan: Java.
  The implementation is a single address space;
    the language enforces isolation.
  Think about why this might be good or bad.

How do we implement the virtual address space model?
  Make the hardware support a level of addressing indirection.
  Programs use virtual addresses; the CPU translates them to physical addresses.
  The translation can implement any mapping,
    including our isolated contiguous address spaces.
  Add some state to the CPU: a Page Table Register (PTR),
    pointing to an array of Page Table Entries (PTEs).
    Each PTE maps e.g. 4096 bytes of address space.
  [picture 3: cpu regs, emacs and gcc PTEs, phys mem]
  [picture 4: PTR + v page # -> PTE -> + offset -> phys addr]
  Example code 1: CPU address translation hardware.
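The handout's Example code 1 is not reproduced here, but a minimal C model of
what the translation hardware does might look like this (PAGE_SIZE, PTE_VALID,
phys_read, fault, and translate are names invented for the sketch; it assumes
a 32-bit machine, a one-level page table, and 4-byte PTEs with the physical
page number in the upper 20 bits and a valid bit in the lowest bit):

  #include <stdint.h>

  #define PAGE_SIZE 4096u    /* bytes of address space mapped by one PTE */
  #define PTE_VALID 0x1u     /* low bit of a PTE: is the mapping present? */

  uint32_t ptr;                    /* Page Table Register: physical address of the PTE array */
  uint32_t phys_read(uint32_t pa); /* assumed primitive: fetch a word of physical memory */
  void     fault(void);            /* assumed primitive: raise a translation exception */

  /* Translate a virtual address the way the MMU would: split it into
     (virtual page #, offset), fetch the PTE for that page, and splice the
     physical page # back together with the offset. */
  uint32_t translate(uint32_t va)
  {
      uint32_t vpn    = va / PAGE_SIZE;    /* virtual page number */
      uint32_t offset = va % PAGE_SIZE;    /* byte within the page */
      uint32_t pte    = phys_read(ptr + vpn * sizeof(uint32_t));
      if (!(pte & PTE_VALID))
          fault();                         /* no mapping for this page */
      uint32_t ppn = pte >> 12;            /* physical page number */
      return ppn * PAGE_SIZE + offset;
  }

Every load, store, and instruction fetch at virtual address va really touches
physical address translate(va); that is why the extra PTE fetch matters.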
What about efficiency?
  Every memory reference seems to require an extra fetch (of the PTE).
  The CPU contains a translation lookaside buffer (TLB) to cache mappings.
    Typically small (say 64 entries), with two columns:
      virtual page #, physical page #.
    Each entry caches the mapping for one 4096-byte page.
    Conceptually associative, so it is searched, not indexed.

How does the hardware enforce the isolation?
  Don't map the PTEs into the process' address space.
  Don't let the program modify the Page Table Register.
  Don't let the program see the TLB.

So we have perfect isolation.
  But then how does a program perform I/O or talk to other processes?
  We have too much isolation!
  We need to be able to switch to the kernel address space:
    system calls, e.g. the x86 INT instruction.

What is the kernel's address space?
  We need a convention, supported by the O/S and the hardware.
  Another protected CPU register: the Kernel Page Table Register (KPTR).
  And we need a CPU kernel/user mode flag.
    It decides whether the PTR or the KPTR is used,
    decides whether the PTR and KPTR can be written,
    and probably controls other functions, like I/O.
  [picture 5: new registers.]

Exactly what has to happen during a system call?
  Example code 2. Note the TLB flushes:
    we have to flush the cached TLB mappings whenever the address space changes.
  The flush itself is cheap,
    but it causes many subsequent memory accesses to be slow (TLB misses).

How much hardware support do we need for this?
  We can't implement step 3 as a user-accessible instruction:
    it effectively changes the address space to the kernel's.
  We'd better not execute user instructions while kernel memory is accessible,
    so we need a combined 3+5 (and 16+18).
  But we can't let the user specify the kernel entry point:
    the hardware must allow jumps only to approved kernel code.
  General point about transitions between address spaces:
    the destination must control what instructions are executed.
    This is a "protected procedure call."

What about context switch between user processes?
  Much like a system call return,
    but setting the Page Table Register actually does something.
  Note that the current process' state has already been saved.
  Example code 3: context switch.
    (A sketch appears at the end of these notes.)

What about communication with other processes?
  We're still completely isolated from other processes.
  Easy solution: communicate indirectly, through the kernel.
    We can do this entirely with kernel system calls.
  Basic facility: send a message. Two of these make an RPC.
  Example code 4: send() and recv() system calls.
    (A sketch also appears at the end of these notes.)

Good aspects of this mechanism?
  Very clean semantics.
  Sender and receiver are fairly isolated,
    yet achieve carefully controlled sharing.
  Processes can't interact unless the other process agrees,
    and each process controls how it is affected.

Bad aspects of this mechanism?
  How many TLB flushes in an RPC?
    Two per system call; two send()s and two recv()s, so eight in total.
  How many data copies?
    Four: two for the RPC argument, two for the return value.
  It's easy to get stuck waiting on message_waiting.
  These problems are caused by our insistence on isolation.
  But they are not inevitable...
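The handout's Example code 3 is not reproduced here either; the following is
a minimal sketch of a context switch using the registers described above
(struct proc, set_ptr, flush_tlb, restore_regs, and return_to_user are
invented names for assumed kernel primitives):

  #include <stdint.h>

  struct regs;                        /* saved user register set (details omitted) */

  void set_ptr(uint32_t pt);          /* assumed primitive: write the Page Table Register */
  void flush_tlb(void);               /* assumed primitive: discard cached translations */
  void restore_regs(struct regs *r);  /* assumed primitive: reload saved user registers */
  void return_to_user(void);          /* assumed primitive: clear the kernel mode flag and resume */

  struct proc {
      uint32_t     page_table;        /* physical address of this process' PTE array */
      struct regs *saved_regs;        /* registers saved when it last entered the kernel */
  };

  /* Switch the CPU to another user process. The current process' state has
     already been saved on kernel entry, so this looks much like a system
     call return, except that setting the PTR actually changes the address
     space, so the TLB must be flushed. */
  void context_switch(struct proc *next)
  {
      set_ptr(next->page_table);
      flush_tlb();                    /* the old process' cached mappings are now wrong */
      restore_regs(next->saved_regs);
      return_to_user();
  }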
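And to make the flush and copy accounting concrete, here is a minimal sketch
in the spirit of Example code 4 (not the handout's code: a single global
one-message buffer, with MSG_SIZE, msg_buf, copy_from_user, and copy_to_user
invented for the sketch; message_waiting is the flag referred to above):

  #define MSG_SIZE 64                 /* fixed message size, for simplicity */

  static char msg_buf[MSG_SIZE];      /* kernel-owned message buffer */
  static int  message_waiting;        /* is there an undelivered message? */

  /* assumed primitives: copy data across the user/kernel boundary */
  void copy_from_user(void *dst, const void *src, unsigned n);
  void copy_to_user(void *dst, const void *src, unsigned n);

  /* Kernel half of send(): entered via a system call (one TLB flush),
     returns to user mode afterwards (another flush). */
  void sys_send(const char *user_msg)
  {
      while (message_waiting)
          ;                           /* a real kernel would run another process here */
      copy_from_user(msg_buf, user_msg, MSG_SIZE);  /* copy 1 of 2 per message */
      message_waiting = 1;
  }

  /* Kernel half of recv(): easy to get stuck spinning here if no one sends. */
  void sys_recv(char *user_msg)
  {
      while (!message_waiting)
          ;
      copy_to_user(user_msg, msg_buf, MSG_SIZE);    /* copy 2 of 2 per message */
      message_waiting = 0;
  }

Each message crosses the user/kernel boundary twice; an RPC is two such
messages (request and reply), which is where the four copies and eight TLB
flushes counted above come from.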