6.824 2001 Lecture 6: Address Spaces, VM hardware and software

Address spaces
  One of the main ingredients in our notion of a program.
  Often the end-point of communication, e.g. RPC.
  Important boundary: programming language inside, communication outside.
  How to implement a basic address space?
  How to make the boundary more flexible?
    Support advanced ideas described in next paper (Appel & Li).

Why isolate?
  [picture 1: physical address space, kernel and processes packed together]
  My buggy TCP proxy might scribble on your emacs.
  I might read your private emacs buffers.
  Want to force clean interfaces:
    programmers would take advantage of unrestricted sharing... yuck.

Why address spaces?
  Starts at 0, contiguous.
  Easier for compiler to generate code if addresses are predictable.
  Easier to make large data structures of contiguous memory:
    matrices, stack.
  Can't name other programs' memory -- so private.
  [picture 2: multiple virtual address spaces]

Alternate plan: Java.
  Implementation is a single address space.
  Language enforces isolation.
  Think about why this might be good or bad.

How do we implement the virtual address space model?
  Make the hardware support a level of addressing indirection.
  Programs use virtual addresses.
  CPU translates to physical addresses.
  Can implement any mapping, including our isolated contiguous spaces.
  Add some state to the CPU:
    Page Table Register (PTR).
    Points to array of Page Table Entries (PTEs).
    Each PTE maps e.g. 4096 bytes of address space.
  [picture 3: cpu regs, emacs and gcc PTEs, phys mem]
  [picture 4: PTR + v page # -> PTE -> + offset -> phys addr]
  Example code 1: CPU address translation hardware.

What about efficiency?
  Every memory reference seems to require an extra fetch.
  CPU contains a translation lookaside buffer (TLB) to cache mappings.
  Typically small, e.g. 64 entries, two columns: virtual page #, physical page #.
  Each entry caches the mapping for a 4096-byte page.
  Conceptually associative, so searched, not indexed.
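The translation-plus-TLB scheme above can be sketched in C as a small simulation of what the hardware does on every memory reference. This is an illustrative sketch, not the lecture's actual Example code 1: the names (PTE, TLBEntry, tlb, ptr, translate) and the direct-indexed TLB replacement policy are assumptions; real hardware searches all TLB entries in parallel and raises a fault on an invalid PTE.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SIZE 4096
#define TLB_SIZE  64

typedef struct { uint32_t phys_page; bool valid; } PTE;
typedef struct { uint32_t vpage, ppage; bool valid; } TLBEntry;

static TLBEntry tlb[TLB_SIZE];  /* small cache of recent translations */
static PTE *ptr;                /* Page Table Register: base of PTE array */

uint32_t translate(uint32_t vaddr) {
    uint32_t vpage  = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    /* Fast path: search the TLB associatively (a loop here; all
       entries compared at once in real hardware). */
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].vpage == vpage)
            return tlb[i].ppage * PAGE_SIZE + offset;

    /* Miss: fetch the PTE from memory -- this is the "extra fetch"
       that makes the TLB worth having.  (A real CPU would fault
       here if !pte.valid.) */
    PTE pte = ptr[vpage];
    tlb[vpage % TLB_SIZE] = (TLBEntry){ vpage, pte.phys_page, true };
    return pte.phys_page * PAGE_SIZE + offset;
}
```

Note that the same virtual page referenced twice costs one page-table fetch, not two: the second reference hits in the TLB.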
How does hardware enforce the isolation?
  Don't map the PTEs into the process' address space.
  Don't let the program modify the Page Table Register.
  Don't let the program see the TLB.
  So we have perfect isolation.

How does a program perform I/O or talk to other processes?
  We have too much isolation!
  We need to be able to switch to the kernel address space.
  System calls. For example, the x86 INT instruction.

What is the kernel's address space?
  Need a convention, supported by O/S and hardware.
  Another protected CPU register: Kernel Page Table Register (KPTR).
  And we need a CPU kernel/user mode flag.
    Decides whether PTR or KPTR is used.
    Decides if PTR and KPTR can be written.
    Probably controls other functions, like I/O.
  [picture 5: new registers: KPTR, mode flag]

Exactly what has to happen during a system call?
  Example code 2.
  Note the TLB flushes.
    We have to flush the cached TLB mappings whenever the address space changes.
    The flush operation itself is cheap.
    But it causes many subsequent memory accesses to be slow.

How much hardware support do we need for this?
  Can't implement step 3 as a user-accessible instruction.
    It effectively changes the address space to the kernel's.
  Better not execute user instructions when kernel memory is accessible.
    We need a combined 3+5 (and 16+18).
  But can't let the user specify the kernel entry point.
    Hardware must allow only a jump to allowed kernel code.
  General point about transitions between address spaces:
    Destination must control what instructions are executed.
    "Protected procedure call."

What about context switch between user address spaces?
  Always via the kernel; system call or interrupt.
  Much like system call return.
  Setting the Page Table Register actually does something.
  Note the current process' state has already been saved.
  Example code 3: context switch.

This is how many O/S's handled VM for a long time.
  Simple linear address spaces.
  Hardware does most of the work and defines the data structures.
  VM may be nearly invisible to the O/S.
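The system-call entry and return sequence above can be sketched as a simulation in C. This is a hedged sketch, not the lecture's Example code 2: the register names (PTR, KPTR, mode flag) follow the lecture, but the struct layout and function names are illustrative, and what must really be a single atomic hardware operation (enter kernel mode and jump to a fixed entry point together) is here just marked with a comment.

```c
#include <stdint.h>
#include <string.h>

enum { USER = 0, KERNEL = 1 };

struct cpu {
    uintptr_t ptr;        /* user Page Table Register */
    uintptr_t kptr;       /* Kernel Page Table Register (protected) */
    int       mode;       /* kernel/user mode flag: selects PTR vs KPTR */
    uintptr_t tlb[64];    /* cached translations (contents elided) */
    uintptr_t saved_pc;   /* where to resume the user program */
};

static void tlb_flush(struct cpu *c) {
    /* Any change of address space invalidates cached mappings:
       the flush is cheap, but the refill misses afterward are not. */
    memset(c->tlb, 0, sizeof c->tlb);
}

/* Must behave as one indivisible hardware operation (the "combined
   3+5"): switching to kernel mode and jumping to a fixed kernel
   entry point -- never letting the user pick the destination. */
void syscall_entry(struct cpu *c, uintptr_t user_pc) {
    c->saved_pc = user_pc;  /* remember where the user program was */
    c->mode = KERNEL;       /* KPTR is now the active page table */
    tlb_flush(c);           /* user translations no longer apply */
    /* ...execution continues at the fixed kernel entry point... */
}

void syscall_return(struct cpu *c) {
    tlb_flush(c);           /* kernel translations must not leak out */
    c->mode = USER;         /* back to the user's page table */
    /* ...jump to c->saved_pc... */
}
```

A context switch between two user processes (Example code 3) is the same dance with one addition: while in the kernel, load the next process' page table base into PTR before the return path runs.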
Why advanced VM / page table management?
  VM hardware provides an efficient level of indirection for every memory access.
  Indirection often turns out to be very useful:
    lets one play neat tricks for efficiency, makes it easier to program.
  Both O/S and user programs could benefit.
  Appel & Li paper...

Desirable VM-related O/S features
  (Keep in mind whether the simple PTE array model could support these.)
  Processes larger than physical memory.
    Not really mapping virtual to physical memory -- some memory is on disk.
  Mapped access to files.
    Program text, for example.
  Laziness for better response time and maybe to save work.
    Demand fill for instant program start-up.
    Just don't map the pages in the VM hardware; wait for the fault.
  Efficient copying.
    Avoid copying all memory during a fork().
    Just mark PTEs read-only, and copy-on-write.
    Do the work during the write fault.
    Implement UNIX pipes with re-mapping.
  Sharing to conserve memory.
    I.e. same physical pages mapped into multiple processes.
    Program text (r/o) and initialized data (r/w with copy-on-write).
  Avoid using lots of physical memory for page tables.
    4GB of 4k pages requires 4 megabytes of 32-bit PTEs.
    Sparse mappings -- e.g. stack is at the top.

O/S and hardware design depend on each other.
  Not a simple layered abstraction:
    O/S is "under" the hardware during a page fault.
  So the O/S-vs-hardware split is fluid.
  A portable O/S must be sophisticated about VM hardware management.
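The copy-on-write fork described above can be sketched in C with a toy PTE format and a simulated physical memory. All names here (cow_fork, write_fault, the cow bit, physmem, refcount) are illustrative assumptions, not any real kernel's interface; a real implementation would also reclaim the page without copying when the faulting process is the last remaining sharer.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NPAGES    8

typedef struct {
    uint32_t phys_page;
    bool writable;
    bool cow;            /* shared page; copy on first write */
} PTE;

static uint8_t physmem[NPAGES][PAGE_SIZE];  /* simulated physical memory */
static int refcount[NPAGES];                /* sharers per physical page */
static int next_free = 0;                   /* trivial page allocator */

/* fork(): share every page read-only instead of copying it now. */
void cow_fork(PTE *parent, PTE *child, int n) {
    for (int i = 0; i < n; i++) {
        parent[i].writable = false;   /* writes will fault... */
        parent[i].cow = true;
        child[i] = parent[i];         /* ...in either process */
        refcount[parent[i].phys_page]++;
    }
}

/* Called on a write fault: do the deferred copy now. */
void write_fault(PTE *pt, int vpage) {
    PTE *pte = &pt[vpage];
    if (pte->cow) {
        int newp = next_free++;       /* grab a fresh physical page */
        memcpy(physmem[newp], physmem[pte->phys_page], PAGE_SIZE);
        refcount[pte->phys_page]--;
        pte->phys_page = newp;
        refcount[newp] = 1;
    }
    pte->writable = true;
    pte->cow = false;
}
```

The point of the sketch: fork() costs O(number of pages) PTE edits instead of O(memory size) copying, and each page is copied at most once, only if someone actually writes it.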