Address spaces using segments

This lecture is about virtual memory, focusing on address spaces. It is the first in a series of lectures that use xv6 as a case study.

Address spaces

Recall the goals of the address spaces abstraction:
- Give each process a private memory area for code, data, stack.
- Prevent one process from reading/writing outside its address space.
- Allow sharing when needed.
Usually the implementation is split between the O/S and the hardware.
The O/S manages address spaces:
- Allocate physical memory for them (for creation, growth, deletion).
- Keep track of them when they are not executing.
- Switch between them (to switch processes).
- Configure the hardware.
The hardware performs address translation and protection:
- Translate user addresses to physical addresses.
- Detect and prevent attempts to use memory outside the address space.
- Allow cross-space transfers (system calls, interrupts).
Also:
- O/S has its own address space.
- O/S must be able to conveniently read/write user memory.
Hardware support may or may not correspond well to what the O/S wants.
There are two main approaches to implementing address spaces: segments and page tables. Systems that use segments often use page tables as well, but the reverse does not hold: paging without segmentation is common.

Example hardware for address spaces: x86 segments

PC block diagram without virtual memory support:

physical address
base, IO hole, extended memory
Physical address == what is on CPU's address pins

The x86 starts out in real mode and translation is as follows:

segment*16+offset ==> physical address
no protection: program can load anything into seg reg
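
The real-mode formula above can be sketched in a few lines of C. This is only an illustration: the function names are made up for this example, and the sketch does not model the 20-bit address wraparound of the original 8086.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode translation: physical = segment * 16 + offset.
   (Illustrative helper; the name is not from xv6.) */
static uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Many segment:offset pairs name the same physical byte.
   Both pairs below refer to 0x7c00, where the BIOS loads the boot sector. */
static void real_mode_alias_check(void)
{
    assert(real_mode_phys(0x07c0, 0x0000) == real_mode_phys(0x0000, 0x7c00));
}
```

Because a program can put any value in a segment register, it can form any address this function can produce, which is why real mode offers no protection.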

The operating system can switch the x86 to protected mode, which allows the operating system to create address spaces. Translation in protected mode is as follows:

selector:offset (logical addr)
==SEGMENTATION==>
linear address
==PAGING ==>
physical address

Next lecture covers paging; now we focus on segmentation.

Protected-mode segmentation works as follows:

protected-mode segments add protection and 32-bit addresses
segment register holds segment selector
selector indexes into global descriptor table (GDT)
segment descriptor holds 32-bit base, limit, type, protection
la = va + base ; assert(va < limit);
seg register usually implicit in instruction
- ESP uses SS, EIP uses CS, others (mostly) use DS
- many instructions can take far addresses:
  - ljmp $selector, $offset
LGDT instruction loads CPU's GDT register
you turn on protected mode by setting PE bit in CR0 register
what happens with the next instruction? CS now has different meaning...
What about protection?
- can o/s limit what memory an application can read or write?
- app can load any selector into a seg reg...
- but can only mention indices into GDT
- app can't change GDT register (requires privilege)
- Current privilege level (CPL) is in the low 2 bits of CS
- CPL=0 is privileged O/S, CPL=3 is user
- why can't app write the descriptors in the GDT?
- what about system calls? how do they transfer to kernel?
- app cannot just lower the CPL
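
A rough software model of the selector lookup and the base/limit check described above, as a sketch only. The struct segdesc here is a simplified, hypothetical stand-in (real descriptors pack these fields into 8 bytes and also carry type and granularity bits), the table-indicator bit of the selector is ignored, and the privilege checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical descriptor; not the hardware layout. */
struct segdesc {
    uint32_t base;
    uint32_t limit;  /* treated as an exclusive bound, as in the notes */
    uint8_t  dpl;
};

/* A selector is (index << 3) | table bit | 2-bit privilege level. */
static unsigned sel_index(uint16_t sel) { return sel >> 3; }
static unsigned sel_rpl(uint16_t sel)   { return sel & 3; }

/* la = va + base, refused when va falls outside the segment.
   Returns false where the hardware would raise a fault. */
static bool seg_translate(const struct segdesc *gdt, unsigned ngdt,
                          uint16_t sel, uint32_t va, uint32_t *la)
{
    assert(gdt && la);
    unsigned i = sel_index(sel);
    if (i == 0 || i >= ngdt)   /* entry 0 is unused; index must be in table */
        return false;
    if (va >= gdt[i].limit)
        return false;
    *la = gdt[i].base + va;
    return true;
}
```

Under this model an application can name only descriptors that exist in the table, and every access is confined to [base, base + limit) of the descriptor it names.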

Case study (xv6)

xv6 is a reimplementation of Unix 6th edition.

v6 is an early Unix operating system for DEC PDP11
- Thompson and Ritchie, 1976
- PDP11: 16 bit data and addresses, 18 bit physical addresses
- ancestor of Linux &c but much smaller
- recognizable: shell, multi-user, directories
- written in C
- 6.828 used to use it instead of xv6
- Unix papers.
xv6 written for 6.828:
- even smaller than v6, maybe not useable as is
- preserves basic structure (processes, files, pipes, &c)
- you don't have to learn PDP11 and x86
- runs on symmetric multiprocessing PCs (SMPs).

Newer Unixes have inherited many of these conceptual ideas, even though they have added paging, networking, and graphics, improved performance, etc.

You will need to read most of the source code multiple times. Your goal is to explain every line to yourself.

Overview of address spaces in xv6

In today's lecture we see how xv6 creates the kernel address space, creates the first user address space, and switches to it. To understand how this happens, we also need to understand in detail the state on the stack. This may be surprising, but in xv6 a thread of control and an address space are tightly bundled in a concept called a process. The kernel address space is the only address space with multiple threads of control. We will study context switching and process management in detail in the coming weeks; the creation of the first user process (init) will give you a first taste.

xv6 uses only the segmentation hardware on the x86; it doesn't use paging. (In JOS you will use page-table hardware too, which we cover in the next lecture.)

The kernel address space:

  the code segment runs from 0 to 2^32 and is mapped X and R
  the data segment runs from 0 to 2^32 but is mapped W (read and write).

Each process has an address space, laid out as follows starting at virtual address zero:
```
  text
  original data and bss
  fixed-size stack
  expandable heap
```
A process's code, data, and stack segments all map this virtual address space to the same range of linear addresses. That is, all three segments are the same.

The x86 designers probably had in mind more interesting uses of segments. What might they have been?

In xv6, each process has a user stack and a kernel stack; when the user program switches to the kernel, it switches to its kernel stack. The switch is arranged with the TSS, which is covered later.

Since an xv6 process's address space is essentially a single segment, a process's physical memory must be contiguous. So xv6 may run into fragmentation if process sizes are a significant fraction of physical memory.

xv6 kernel address space

Let's see how xv6 creates the kernel address space by tracing xv6 from when it boots, focusing on address space management:

Start with bootasm.S, which the BIOS loads at 0x7c00 ( b 0x7c00, info reg )
0912: what segment is being used here?
0919: does the code really need to set up %ds etc?
0954: what values are loaded in the GDT?
- 0988: gdtr points to gdt
- 0984: entry 0 unused
- 0985: entry 1 (X + R, base = 0, limit = 0xffffffff, DPL = 0)
- 0986: entry 2 (W, base = 0, limit = 0xffffffff, DPL = 0)
- look at GDT after lgdt with info gdt 0 2
- how does address translation change right after lgdt completes?
0957: what is the immediate effect of setting CR0_PE_ON?
0961: far jump, load 8 in CS. why?
0967-0971: set up other segment registers
0974: where is the stack?
1117: bootmain in the bootloader (see lab 1), which calls main
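
To make the two GDT entries above concrete, here is a sketch of how base, limit, and type bits are packed into an 8-byte x86 segment descriptor. The helper make_desc and the ACC_ and FLAGS_ names are invented for this example; they are not the macros bootasm.S uses. The limit field is 20 bits wide; with the granularity bit set it counts 4 KB units, so 0xfffff covers the full 0xffffffff range mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Access bytes: present, DPL 0, code/data descriptor, plus type bits. */
#define ACC_CODE_XR 0x9A  /* executable + readable */
#define ACC_DATA_W  0x92  /* writable data */
#define FLAGS_G_32  0xC   /* 4 KB granularity, 32-bit segment */

/* Pack one descriptor (illustrative helper, not from xv6). */
static uint64_t make_desc(uint32_t base, uint32_t limit20,
                          uint8_t access, uint8_t flags)
{
    assert(limit20 <= 0xFFFFF);
    uint64_t d = 0;
    d |= (uint64_t)(limit20 & 0xFFFF);             /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base 23:0   */
    d |= (uint64_t)access << 40;                   /* P, DPL, type */
    d |= (uint64_t)((limit20 >> 16) & 0xF) << 48;  /* limit 19:16 */
    d |= (uint64_t)(flags & 0xF) << 52;            /* G, D/B      */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;    /* base 31:24  */
    return d;
}
```

For a flat kernel code segment, make_desc(0, 0xFFFFF, ACC_CODE_XR, FLAGS_G_32) yields 0x00CF9A000000FFFF, and the matching data segment yields 0x00CF92000000FFFF; these can be compared against what info gdt prints.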

1211: main ( b 0x102558 )

main prepares hardware and kernel data structures to run o/s.
where is the stack? (sp = 0x7bec)

what is on it? ( print-stack 6 )

   00007bec [00007bec]  7ce3  // return address in bootmain
   00007bf0 [00007bf0]  0080  // callee-saved ebx
   00007bf4 [00007bf4]  7369  // callee-saved esi
   00007bf8 [00007bf8]  0000  // callee-saved ebp
   00007bfc [00007bfc]  7c4a  // return address for bootmain: spin
   00007c00 [00007c00]  c031fcfa  // first instruction at 7c00 (start)

1228-1229: switch to cpu stack ( b 0x1025d8 )

why -32?

what values are now in ebp and esp?

esp: 0x10ad7c   1092988
ebp: 0x10ad9c   1093020

what is on the stack?

   0010ad7c [0010ad7c]  0000
   0010ad80 [0010ad80]  0000
   0010ad84 [0010ad84]  0000
   0010ad88 [0010ad88]  0000
   0010ad8c [0010ad8c]  0000
   0010ad90 [0010ad90]  0000
   0010ad94 [0010ad94]  0000
   0010ad98 [0010ad98]  0000
   0010ad9c [0010ad9c]  0000
   0010ada0 [0010ada0]  0001
   0010ada4 [0010ada4]  0001
   0010ada8 [0010ada8]  0000

what is 1 in 0x10ada0? is it on the stack?

1231: is it safe to reference bcpu? where is it allocated?
1249: main() calls userinit() to create first process
1254: then scheduler() to start running processes

creating the first process's address space

1757: userinit() calls copyproc(0) to set up kernel part of a new process
1704: copyproc() usually copies, for fork(), but not this time
1720: every process has a proc[] table entry
1714: allocate stack to use while in kernel (sys call or interrupt)
1718: space on stack for trap frame later needed to get into user space
1720: (if fork(), would copy parent's user memory &c)
1740: where will new process run in kernel? where is its kernel stack?
1758: allocate memory to hold new process's user code, data, stack
1762: trap frame holds initial user register contents
1762: what does the DPL_USER do?
1766: why set FL_IF?
1767: where will the user stack be?
1771: where in memory is this writing?
1774: where will the user program start executing?
1775: what is being copied here? to what address? (sheet 66)
about to run scheduler(). the new process has:
- entry in proc[] array
- p->context indicates where to start in kernel
- p->kstack points to phys mem for kernel stack
- kernel stack holds trap frame w/ initial user registers
- p->mem points to phys mem with user code/data/stack
- we are missing user segment descriptors...

running the first process

remember that main calls scheduler() after userinit()
we'll see how scheduler() works in a few lectures, overview now
1817: look for a RUNNABLE process, finds ours immediately
1826: setupsegs(p) sets up the segment descriptors
1679: make interrupts use process's kernel stack
1683-1693: set up gdt
1684: why phys 0? why size 0x100000 + 64*1024?
1684: why does the process need kernel segment descriptors at all?
1689: is p->mem logical or physical?
1689: what will the user program's address 0 refer to?
1689: why DPL_USER (= 3) instead of previous 0?
1696: use this new gdt now! why does this work?
1696: what's in the segment table? ( b 0x1030a3, info gdt 0 5 )
back to scheduler()...
1828: switches to eip/esp in p->context (forkret, p->kstack)

1884: let's look at the stack in forkret() ( b 0x1034d8, print-stack 17 )

   0020efbc [0020efbc]  0000
   0020efc0 [0020efc0]  0000
   0020efc4 [0020efc4]  0000
   0020efc8 [0020efc8]  0000
   0020efcc [0020efcc]  0000
   0020efd0 [0020efd0]  0000
   0020efd4 [0020efd4]  0000
   0020efd8 [0020efd8]  0000
   0020efdc [0020efdc]  0023
   0020efe0 [0020efe0]  0023
   0020efe4 [0020efe4]  0000
   0020efe8 [0020efe8]  0000
   0020efec [0020efec]  0000
   0020eff0 [0020eff0]  001b
   0020eff4 [0020eff4]  0200
   0020eff8 [0020eff8]  0ffc
   0020effc [0020effc]  0023

what are we looking at? what are 23, 1b, 200, ffc, and 23?
2484: let's look at forkret1 ( b 0x104bb8 )
2485: esp now points to trapframe
2475: "restore" user registers, es, ds
2479: what does the iret do?
now where are we?

Managing physical memory

To create an address space we must allocate physical memory, which will be freed when an address space is deleted (e.g., when a user program terminates). xv6 implements a first-fit memory allocator (see kalloc.c, sheet 22).

It maintains a list of ranges of free memory. The allocator finds the first range that is larger than the amount of requested memory. It splits that range in two: one range of the size requested and one of the remainder. It returns the first range. When memory is freed, kfree will merge ranges that are adjacent in memory.
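
The scheme just described can be sketched as follows. This is a simplified illustration, not the code in kalloc.c: the names ff_alloc and ff_free are made up, there is no locking, and sizes are assumed to be multiples of the header size so that every range stays aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Header stored at the start of each free range, in the free memory itself. */
struct run {
    struct run *next;
    size_t len;              /* bytes in this range, header included */
};

static struct run *freelist; /* kept sorted by address */

/* Return [v, v+len) to the list, merging with adjacent free ranges. */
static void ff_free(void *v, size_t len)
{
    assert(len >= sizeof(struct run) && len % sizeof(struct run) == 0);
    struct run *prev = NULL, *cur = freelist;
    while (cur && (char *)cur < (char *)v) {
        prev = cur;
        cur = cur->next;
    }
    struct run *r = v;
    r->len = len;
    r->next = cur;
    if (prev) prev->next = r; else freelist = r;
    if (cur && (char *)r + r->len == (char *)cur) {      /* merge with next */
        r->len += cur->len;
        r->next = cur->next;
    }
    if (prev && (char *)prev + prev->len == (char *)r) { /* merge with prev */
        prev->len += r->len;
        prev->next = r->next;
    }
}

/* First fit: take the first range that is big enough, split off the
   remainder, and hand out the front part. Returns NULL if nothing fits. */
static void *ff_alloc(size_t n)
{
    assert(n >= sizeof(struct run) && n % sizeof(struct run) == 0);
    struct run **pp = &freelist;
    for (struct run *r = freelist; r; pp = &r->next, r = r->next) {
        if (r->len == n) {           /* exact fit: unlink the whole range */
            *pp = r->next;
            return r;
        }
        if (r->len > n) {            /* split: remainder stays on the list */
            struct run *rest = (struct run *)((char *)r + n);
            rest->len = r->len - n;
            rest->next = r->next;
            *pp = rest;
            return r;
        }
    }
    return NULL;
}
```

Note that the caller must pass the size back to ff_free, since an allocated range carries no header of its own in this sketch.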

Under what scenarios is a first-fit memory allocator undesirable?

Growing an address space

How can a user process grow its address space? growproc.

1657: allocate a new segment of old size plus n
1660: copy the old segment into the new (ouch!)
1661: why bother zeroing the rest?
1664: free the old physical memory
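
The four steps above amount to the following pattern, sketched here in user-level C. The struct proc_mem and the function grow are hypothetical stand-ins, and malloc/free take the place of the kernel's allocator; this is not the xv6 source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the memory fields of a process. */
struct proc_mem {
    char  *mem;  /* start of the single contiguous segment */
    size_t sz;   /* size of that segment in bytes */
};

/* Grow the segment by n bytes. Returns 0 on success, -1 if out of memory. */
static int grow(struct proc_mem *p, size_t n)
{
    assert(p != NULL && n > 0);
    char *newmem = malloc(p->sz + n);   /* new segment of old size plus n */
    if (newmem == NULL)
        return -1;
    if (p->sz > 0)
        memcpy(newmem, p->mem, p->sz);  /* copy the old segment (ouch!) */
    memset(newmem + p->sz, 0, n);       /* zero the newly added part */
    free(p->mem);                       /* free the old memory */
    p->mem = newmem;
    p->sz += n;
    return 0;
}
```

The cost of the copy is proportional to the whole process size, not to n, and both the old and new segments must exist at the same time while the copy runs.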

We could do a lot better if segments didn't have to be contiguous in physical memory. How could we arrange that? Using page tables, which is our next topic. This is one place where page tables would be useful, but there are others too (e.g., in fork).