UNIX v6

Required reading: Chapter 6 and Chapter 7-1 through 7-4 of Lions' commentary, and the corresponding code.

Overview

UNIX 6th edition (for first half of the term):

Multi-user time-sharing operating system for the PDP-11
- PDP-11 (1972):
  - 16-bit processor, 18-bit physical addresses (11/40)
  - UNIBUS
  - memory-mapped I/O
  - performance: less than 1 MIPS
  - register-to-register transfer: 0.9 usec
  - memory: 56k-228k (11/40)
  - no paging, but some segmentation support
  - interrupts, traps
  - about $10K
  - rk disk with 2 MByte of storage
  - with cabinet, the 11/40 weighs 400 lbs
UNIX v6
- 1976; first widely available UNIX outside Bell Labs
- R&T (Ritchie and Thompson)
- simplicity (reaction to MULTICS)
- complete (used for real work)
- small (43 system calls)
- modular (composition through pipes; one had to split programs!!)
- compactly written (2 programmers, 9,000 lines of code)
- advanced UI (shell)
- introduced C (derived from B)
- distributed with source
- V7 was sold by Microsoft for a couple of years under the name Xenix
Lions' commentary
- suppressed because of copyright issues
- resurfaced in 1996

Newer UNIXes have inherited many of the conceptual ideas, even though they added paging, networking, graphics, etc.

You will need to read most of the source code multiple times. Your goal is to be able to explain every line to yourself without using the commentary. Read the code once or several times with Lions' commentary until you reach that goal.

Address spaces

An OS consists of a kernel program and user-level programs. For fault isolation, each program runs in a separate address space. The kernel address space is like a user address space, except that it runs in kernel mode. A program in kernel mode can execute privileged instructions (e.g., writing the PDP-11's segment registers).
One job of the kernel is to manage address spaces (creating, growing, deleting, and switching between them):
- each address space (including the kernel's) consists of the binary representation of the text of the program, the data part of the program, and the stack area.
- the kernel address space runs the kernel program, which manages all hardware and provides an API to user programs. In v6, this program is called the scheduler (proc[0]).
- each user address space contains a program. In v6, each program has a user stack and a kernel stack; when the user program switches to the kernel, it switches to its kernel stack. The kernel stack is stored in the process's u structure.
The main operations:
- Creation. Allocate physical memory to store the program. Load the program into physical memory. Fill the address space with references to physical memory. (Example: see the accompanying picture of how v6 lays out the kernel address space and a user address space, and how they are mapped to physical memory.)
- Deletion. Remove mappings, free up physical memory.
- Switching from one program to another:
  - Switch address spaces; ask the MMU to point to the new address space. On the PDP-11, this means loading the segment registers, and perhaps the PSW. To switch from the kernel address space to a user address space, the v6 kernel loads the user segmentation registers (PDRs and PARs).
  - Unload the current program's state from processor and reload it with the new program's state. On the PDP-11, this means switching sp (pointing it to the new program's stack) and pc (pointing it to the new program's point of execution).
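The address-space half of a switch can be made concrete with a small model of PDP-11 address translation. This is a minimal sketch in Python, not v6 code. It assumes eight segments of 8 KB each, with every PAR holding a physical base counted in 64-byte blocks, and it ignores the length and access checks that the PDRs provide. The names `translate`, `MMU`, `PROGRAM_A`, and `PROGRAM_B`, and the PAR values, are made up for illustration.

```python
SEGMENT_BITS = 13   # low 13 bits of a virtual address are the offset in a segment
BLOCK_BYTES = 64    # a PAR holds a physical base counted in 64-byte blocks


def translate(va, par):
    """Map a 16-bit virtual address to an 18-bit physical address."""
    if not 0 <= va < (1 << 16):
        raise ValueError("virtual addresses are 16 bits")
    segment = va >> SEGMENT_BITS             # top 3 bits pick one of 8 PARs
    offset = va & ((1 << SEGMENT_BITS) - 1)
    return (par[segment] * BLOCK_BYTES + offset) & 0o777777


class MMU:
    """Holds the PAR set of the address space that is currently active."""

    def __init__(self):
        self.par = [0] * 8

    def switch_to(self, par):
        # Switching address spaces amounts to reloading the segment registers.
        self.par = list(par)

    def read_address(self, va):
        return translate(va, self.par)


# Hypothetical PAR sets for two programs; only segment 0 is mapped.
PROGRAM_A = [0o1000, 0, 0, 0, 0, 0, 0, 0]
PROGRAM_B = [0o2000, 0, 0, 0, 0, 0, 0, 0]
```

After `switch_to(PROGRAM_B)`, the same virtual address resolves to a different physical address than it did under `PROGRAM_A`, which is why the two programs cannot touch each other's memory.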

Case study (Lions' book)

In today's lecture we see how v6 creates the kernel address space and the first user address space, and how it switches to the latter. To understand how this happens, we need to understand in detail the state on the stack. This may be surprising, but thread switching and address space creation are tightly bundled in v6, in a concept called the process. We will study thread management in detail next week, but we need to understand some of it now to follow the creation of, and the switch to, the first address space. (In future lectures we will return in more detail to creating, growing/shrinking, and switching address spaces.)

C calling conventions

PDP-11 assembly: 8 general registers, including pc (r7), sp (r6), and the environment pointer (r5). r0 and r1 are used for results.

The compiler generates the following line at the beginning of the body of every C function: jsr r5, csv
- conceptual stack layout:
```
	   args
	   ra (return address)
	   saved r5  <--- r5
	   saved r4
	   saved r3
	   saved r2  <--- sp
	   (local variables)
```
- with what instruction can a C function reference its first of n arguments? (be careful: arguments pushed in reverse order)
- the precise layout is more complicated, because the saving and restoring of registers is done by csv and cret:
```
	   args
	   ra (return address of where to return when C-func completes)
	   saved r5  <--- r5
	   saved r4
	   saved r3
	   saved r2
	   ra (cret) <--- sp
```
- Is it important that cret directly follows csv in m40.s? No, the "ra (cret)" entry is used for the first local variable (see this digression).
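The precise layout above can be reproduced with a small simulation of the words that the caller and csv push. This is a sketch in Python under stated assumptions, not the m40.s code: the `Machine` class, the helper names, and the starting stack address are invented, and return addresses are represented as strings. Words are 2 bytes and the stack grows downward, as on the PDP-11.

```python
class Machine:
    """Tiny model of the registers and stack words touched by csv and cret."""

    def __init__(self, sp=0o1000):
        self.mem = {}
        self.r = {"r2": 2, "r3": 3, "r4": 4, "r5": 0, "sp": sp}

    def push(self, value):
        self.r["sp"] -= 2
        self.mem[self.r["sp"]] = value

    def pop(self):
        value = self.mem[self.r["sp"]]
        self.r["sp"] += 2
        return value


def call(m, args, ra="ra"):
    """Build the frame a C function sees after its csv prologue."""
    for a in reversed(args):       # caller pushes arguments right to left
        m.push(a)
    m.push(ra)                     # jsr pc, _func pushes the return address
    m.push(m.r["r5"])              # jsr r5, csv saves the caller's r5
    m.r["r5"] = m.r["sp"]          # csv: r5 now points at the saved r5
    for reg in ("r4", "r3", "r2"):
        m.push(m.r[reg])           # csv saves r4, r3, r2
    m.push("ra (cret)")            # csv jumps back into the function via jsr


def arg(m, i):
    """Argument i (0-based) sits above the saved r5 and the return address."""
    return m.mem[m.r["r5"] + 4 + 2 * i]


def cret(m):
    """Restore r4, r3, r2 relative to r5, then unwind and return."""
    fp = m.r["r5"]
    m.r["r4"], m.r["r3"], m.r["r2"] = m.mem[fp - 2], m.mem[fp - 4], m.mem[fp - 6]
    m.r["sp"] = fp
    m.r["r5"] = m.pop()
    return m.pop()                 # rts pc
```

In this model the first argument is found at offset 4 from r5, which is one way to check an answer to the question about referencing the first of n arguments.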

Booting v6 (chapter 6)

Lines 0612 through 0669 set up the kernel address space. These lines are explained by Lions in Chapter 6, but the accompanying picture may be helpful, since it depicts the end result. The MMU registers of the PDP-11 may also be helpful.
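As a rough illustration of the end result, the sketch below computes a set of kernel PAR values. It is written in Python and rests on assumptions: kernel segments 0 through 5 map one-to-one onto the lowest 48 KB of physical memory, segment 6 maps the current process's u area, and segment 7 maps the I/O page at the top of the 18-bit physical space. The function names and the u-area block number used in examples are hypothetical; check the actual values against the code and Lions' Chapter 6.

```python
SEGMENT_BLOCKS = 0o200   # an 8 KB segment is 0o200 blocks of 64 bytes
IO_PAGE_BLOCK = 0o7600   # block number of the top 8 KB of physical memory


def kernel_pars(u_block):
    """Return the eight kernel PAR values for a u area at block u_block."""
    pars = [seg * SEGMENT_BLOCKS for seg in range(6)]   # identity-mapped
    pars.append(u_block)                                # segment 6: u area
    pars.append(IO_PAGE_BLOCK)                          # segment 7: I/O page
    return pars


def kernel_physical(va, pars):
    """Translate a kernel virtual address using the PAR values above."""
    return pars[va >> 13] * 64 + (va & 0o17777)
```

Under these assumptions only segment 6 has to change when the kernel switches processes, because it is the only kernel segment whose physical base depends on the current process.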

The first user-level address space

We simplify the stack layout a bit: we ignore the precise layout described above, and we ignore the allocation of local variables.

0612: where is sp pointing?
0646: what is the address in sp? what is the content?
0664: what is the content now?
0669: what is the content of the stack after this instruction completes?
```
	      | pc |   <- sp
```

1551: the stack is:

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

1827: the stack is:

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp
	      | pc (=1627)|
	      | r5 (=usize+64. -2) | <- r5
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

sp and r5 are saved in u_rsav in the u area for proc[0]
1917: what is the stack in u for proc[1]? an identical copy of the one above (thus, we have two copies now)
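The effect of saving sp and r5 and then copying the u area can be traced with a toy model. This is a sketch in Python, assuming the simplified five-word frames drawn above. Stack positions are list indices rather than addresses, and `UArea`, `call`, `ret`, and `boot_scenario` are names invented for the example. `savu` and `retu` are named after the v6 routines but are only loose models of them.

```python
import copy


class UArea:
    """Per-process u area: kernel stack words plus the u_rsav save slot."""

    def __init__(self):
        self.stack = []      # oldest word first; the last element is at sp
        self.u_rsav = None   # (r5, sp) recorded by savu


def call(u, regs, return_line):
    """Push one simplified frame: return pc, saved r5, r4, r3, r2."""
    u.stack.append(("pc", return_line))
    u.stack.append(("r5", regs["r5"]))
    regs["r5"] = len(u.stack) - 1            # r5 points at the saved r5
    u.stack += [("r4", None), ("r3", None), ("r2", None)]
    regs["sp"] = len(u.stack) - 1


def ret(u, regs):
    """Undo one frame and return the line number execution resumes at."""
    del u.stack[regs["r5"] + 1:]             # drop saved r4, r3, r2
    regs["r5"] = u.stack.pop()[1]            # restore the caller's r5
    pc = u.stack.pop()[1]                    # pop the return pc
    regs["sp"] = len(u.stack) - 1
    return pc


def savu(u, regs):
    u.u_rsav = (regs["r5"], regs["sp"])


def retu(u):
    """Resume on u's stack at the point recorded by savu."""
    r5, sp = u.u_rsav
    del u.stack[sp + 1:]
    return {"r5": r5, "sp": sp}


def boot_scenario():
    regs = {"r5": 0, "sp": -1}
    u0 = UArea()
    call(u0, regs, 670)                # main
    call(u0, regs, 1627)               # newproc
    savu(u0, regs)
    u1 = copy.deepcopy(u0)             # proc[1] gets a copy of the u area
    ret(u0, regs)                      # proc[0] returns from newproc
    for line in (1638, 1969, 2094):    # proc[0] goes on to sched, sleep, swtch
        call(u0, regs, line)
    child_regs = retu(u1)              # the switch resumes proc[1]
    return u0, u1, ret(u1, child_regs)
```

In the model, proc[0]'s stack keeps growing after the copy, yet proc[1] still resumes inside the newproc frame that was live when the copy was made, so its next return goes to 1627.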

1637: proc[0]'s stack is:

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

2189: proc[0]'s stack is:

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp
	      | pc (=1638)|
	      | r5 (=usize+64. -2) | <- r5
	      | r4 |
	      | r3 |
	      | r2 |
	      | pc (=1969)|
	      | r5 (=prev r5) | <- r5
	      | r4 |
	      | r3 |
	      | r2 |
	      | pc (=2094)|
	      | r5 (=prev r5) | <- r5
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

what is saved in u_rsav?
what is restored in 2193?

2228: what stack are we pointing to? (proc[1]'s!) what is its stack? (the copy we made earlier of proc[0]'s):

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp
	      | pc (=1627)|
	      | r5 (=usize+64. -2) | <- r5
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

2248: the stack is:

	      | pc (=670)|  
	      | r5 (=0) |  <- r5 (=usize+64. - 2)
	      | r4 |
	      | r3 |
	      | r2 |  <- sp

and proc[1] is about to return to 1627 (the if branch)

1634: what is proc[1]'s stack at the end of 1635? empty! and pc is 670?
0672: what is proc[1]'s stack:
```
	      | 0170000 |
	      | 0 | <- sp
```
0672 (end): proc[1]'s:
- PSW: current mode is user, previous mode is user (0170000 sets bits 15 through 12)
- sp is the user-mode stack pointer: 0 (probably cleared when the machine starts)
- pc is 0 (the instruction at address 0 traps back into the kernel; see the first word of icode, 1518)
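The two words on proc[1]'s stack can be decoded with a few lines of Python. This is a sketch that assumes the usual PDP-11 PSW layout: bits 15-14 hold the current mode, bits 13-12 the previous mode, and bits 7-5 the priority, with 00 meaning kernel and 11 meaning user. `decode_psw` and `rtt` are illustrative helpers, and `rtt` models only the two pops done by the instruction of that name.

```python
MODES = {0b00: "kernel", 0b11: "user"}


def decode_psw(psw):
    """Split a processor status word into its mode and priority fields."""
    return {
        "current": MODES.get((psw >> 14) & 0b11, "other"),
        "previous": MODES.get((psw >> 12) & 0b11, "other"),
        "priority": (psw >> 5) & 0b111,
    }


def rtt(stack):
    """Pop pc, then the PSW, from a stack whose last element is at sp."""
    pc = stack.pop()
    psw = stack.pop()
    return pc, decode_psw(psw)


# The stack as drawn at 0672: 0170000 pushed first, then 0.
pc, status = rtt([0o170000, 0])
```

With this layout, `pc` comes out as 0 and both mode fields of `status` decode as user, so execution continues at virtual address 0 of the first user address space.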