Required reading: xv6 trapasm.S, trap.c, syscall.c, initcode.S, usys.S.
Skim vectors.S, lapic.c, ioapic.c, picirq.c.
You will need to consult IA32 System Programming Guide chapter 5 (skip 5.7.1, 5.8.2, 5.12.2).
Unrelated to lecture: Lab2 due tomorrow at 11:59pm, Lab3 is out.
last week we transferred from kernel to user
today: how to get from user to kernel

three reasons for transitions:
  system calls
  program faults (div by zero, page fault)
  external device interrupts

why do we need to take special care for user -> kernel?
  security/isolation
  only kernel can touch devices, MMU, FS, other process' state, &c
  think of user program as a potential malicious adversary

what has to happen?
  save user state for future transparent resume
  set up for execution in kernel (stack, segments)
  choose a place to execute in kernel
  get at system call arguments
  do it all securely

it's neat that interrupts, faults, system call use same mechanism!
Execute the int. Now where are we? How did we get here?
The INT instruction takes the following steps (the steps are similar for all interrupts and faults, though there are slight differences):
xv6 set up the IDT in tvinit(), set the IDTR in idtinit(), and set the SS and ESP in the TSS in setupsegs(). Run info idt 48 to see how the IDT is set up to handle vector 0x30.
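To make the output of info idt easier to read, here is a minimal sketch of how a 32-bit IDT gate descriptor is laid out, and of the privilege check the CPU applies when software executes INT. The function names (make_gate, gate_dpl, int_allowed) are invented for this example and are not xv6 code; xv6 itself fills in its gates in tvinit(). The selector, gate type, and DPL values in the usage note are assumptions for illustration.

```c
#include <stdint.h>

/* Gate type field values for 32-bit gates. */
enum { GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };

/* Pack a present 32-bit gate descriptor into one 64-bit value.
 * Hypothetical helper, not part of xv6. */
uint64_t make_gate(uint32_t offset, uint16_t selector,
                   unsigned type, unsigned dpl)
{
    uint64_t g = 0;
    g |= (uint64_t)(offset & 0xFFFF);        /* bits 0-15: handler offset, low half */
    g |= (uint64_t)selector << 16;           /* bits 16-31: code segment selector */
    g |= (uint64_t)(type & 0xF) << 40;       /* bits 40-43: gate type */
    g |= (uint64_t)(dpl & 0x3) << 45;        /* bits 45-46: descriptor privilege level */
    g |= (uint64_t)1 << 47;                  /* bit 47: present */
    g |= (uint64_t)(offset >> 16) << 48;     /* bits 48-63: handler offset, high half */
    return g;
}

/* Extract the DPL field from a packed gate. */
unsigned gate_dpl(uint64_t g)
{
    return (unsigned)((g >> 45) & 0x3);
}

/* For a software INT, the CPU requires CPL <= gate DPL;
 * otherwise it raises a general protection fault instead. */
int int_allowed(uint64_t g, unsigned cpl)
{
    return cpl <= gate_dpl(g);
}
```

For example, make_gate(handler, 0x08, GATE_TRAP, 3) describes a gate that user code (CPL 3) may invoke with INT, while a gate built with DPL 0 fails int_allowed(g, 3).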
What is the current CPL? How was it set? Could the user abuse the INT instruction to exercise privilege or break the kernel?
print-stack 5 in order to see what int put on the stack. Compare to Figure 5-4. What stack is being used?
vector48 pushes a few items on the stack and then jumps to alltraps. Why not have vector 48 in the IDT point directly to alltraps?
Single-step until the call to trap. print-stack 18. Compare with struct trapframe.
At the start of trap(), what is tf->trapno? How was it set?
syscall() dispatches to a function it finds by indexing into the syscalls array. It uses the eax from the trap frame as the index. What is in that eax? Where was it set?
Now we're in sys_open(). Where are the arguments the user program originally passed to open()? How can the kernel get at them?
sys_open() calls argint() to get its 2nd argument. argint() calculates the value cp->tf->esp + 4 + 4*n. What is this? Why the first 4? Why the 4*n?
fetchint() checks that the address is not beyond the end of user memory. But addr was just calculated by kernel code (in argint()); since the kernel code is trustworthy, is this check really necessary?
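As a rough illustration of that address arithmetic, the sketch below models the user stack as a byte array. It assumes the usual 32-bit C calling convention: the caller pushes arguments right to left, and the call into the usys.S stub pushes a return address, so the saved user esp points at that return address when int executes. The helper names (arg_addr, load_word, store_word) and the 64-byte memory size are made up for this example; they are not xv6 functions.

```c
#include <stdint.h>
#include <string.h>

enum { USER_MEM_SIZE = 64 };   /* arbitrary size for the toy user memory */

/* User-space address of the n-th 32-bit argument (n counts from 0).
 * The first 4 skips the return address; each argument is 4 bytes. */
uint32_t arg_addr(uint32_t user_esp, unsigned n)
{
    return user_esp + 4 + 4 * n;
}

/* Read a 32-bit word from the toy user memory. */
uint32_t load_word(const uint8_t *mem, uint32_t addr)
{
    uint32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Write a 32-bit word into the toy user memory. */
void store_word(uint8_t *mem, uint32_t addr, uint32_t v)
{
    memcpy(mem + addr, &v, sizeof v);
}
```

With esp = 32, a return address stored at 32, and two arguments stored at 36 and 40, load_word(mem, arg_addr(32, 1)) reads the word at address 40, which is the 2nd argument.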
Why do we do seemingly redundant checks for addr and then addr+4? Can't we just check addr+4?
Why does fetchint() add p->mem to addr?
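The checks discussed above can be tried out in a small stand-alone sketch. It is not the xv6 source: struct proc_sketch and fetchint_sketch are hypothetical names, and the sz field (size of the process's memory in bytes) is an assumption, since these notes only mention p->mem. The sketch assumes process memory is one contiguous region whose first byte the kernel reaches through p->mem, so user address addr corresponds to kernel pointer p->mem + addr.

```c
#include <stdint.h>
#include <string.h>

struct proc_sketch {
    uint8_t *mem;   /* kernel pointer to the start of the process's memory */
    uint32_t sz;    /* assumed field: size of that memory in bytes */
};

/* Copy the 32-bit word at user address addr into *ip.
 * Returns 0 on success, -1 if any byte of the word lies outside [0, sz). */
int fetchint_sketch(const struct proc_sketch *p, uint32_t addr, int32_t *ip)
{
    /* addr + 4 is computed in 32-bit unsigned arithmetic and can wrap
     * around to a small number, so checking addr + 4 alone is not enough. */
    if (addr >= p->sz || addr + 4 > p->sz)
        return -1;
    memcpy(ip, p->mem + addr, sizeof *ip);
    return 0;
}
```

With sz = 16, address 12 is accepted and address 13 is rejected. Address 0xFFFFFFFE is rejected only by the first test, because 0xFFFFFFFE + 4 wraps to 2.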
Back to sys_open(). It does its job (which we will talk about later) and finally returns a file descriptor using the ordinary C return statement. syscall() puts that return value in cp->tf->eax. Why?
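The dispatch and return path described above might be sketched as follows. This is a simplified model rather than the xv6 code: the trap frame is reduced to its eax field, the two handlers are stubs that return fixed values, and the system call numbers are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Trap frame reduced to the one register this sketch needs. */
struct trapframe_sketch {
    uint32_t eax;
};

typedef int (*syscall_fn)(void);

/* Stub handlers with fixed return values (hypothetical). */
static int sys_getpid_stub(void) { return 7; }
static int sys_open_stub(void)   { return 3; }   /* pretend file descriptor */

/* Made-up system call numbers; slot 0 is deliberately left empty. */
enum { SYS_getpid_stub = 1, SYS_open_stub = 2 };

static syscall_fn syscall_table[] = {
    NULL,
    sys_getpid_stub,
    sys_open_stub,
};

/* Use the saved eax as the table index, then overwrite the saved eax
 * with the handler's result. When the saved registers are restored on
 * the way back to user space, eax holds that result, which is where a
 * C caller looks for a function's return value. */
void syscall_sketch(struct trapframe_sketch *tf)
{
    uint32_t num = tf->eax;
    size_t n = sizeof(syscall_table) / sizeof(syscall_table[0]);

    if (num < n && syscall_table[num] != NULL)
        tf->eax = (uint32_t)syscall_table[num]();
    else
        tf->eax = (uint32_t)-1;   /* unknown system call number */
}
```

The range check matters because the number in eax comes from the user program and cannot be trusted as an array index.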
single-step until iret, print-stack 5, single-step into user space. Print the registers and stack. What will the return value to the original call to open() be?
What would happen if a user program divided by zero? What if kernel code divided by zero?
In Unix, traps often get translated into signals to the process. Some traps, though, are (partially) handled internally by the kernel -- which ones?
Some traps push an extra error code onto the stack (typically containing the selector of the segment that caused the fault). But this error code isn't pushed by the INT instruction. Can the user confuse the kernel by invoking INT 0xc (or any other vector that usually pushes an error code)? Why not?
info idt 32, then set a breakpoint at vector32 (vb 0x8:...)
print-stack 5. What was the CPU doing at the time of the interrupt? What stack is being used?
The interrupt will have pushed different numbers of words on the stack depending on whether the CPU was in user-space or the kernel; how does iret know how many words to pop?
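One way to reason about this question is to model the decision in a few lines. The sketch below follows the protected-mode rule that iret always pops EIP, CS, and EFLAGS, and additionally pops ESP and SS when the popped CS selector names a less privileged level than the current one. The function name iret_words is made up for this example.

```c
#include <stdint.h>

/* Number of 32-bit words iret pops, given the CS value saved on the
 * stack and the privilege level the CPU is running at when it
 * executes iret. The low two bits of a selector are its RPL. */
unsigned iret_words(uint16_t saved_cs, unsigned cpl)
{
    unsigned rpl = saved_cs & 0x3;
    /* Returning to a less privileged level: also restore ESP and SS. */
    return (rpl > cpl) ? 5 : 3;
}
```

With the selectors seen in this session, iret_words(0x1b, 0) gives 5 (return to user code) and iret_words(0x08, 0) gives 3 (return to interrupted kernel code).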
What prevents lots of interrupts from coming in all at once and overflowing the kernel stack? Print the registers; IF=0x200. info idt 32, info idt 48.
trap(), when it's called for a timer interrupt, does just two things: increment the ticks variable, and call yield(). The latter, as we will see, may cause the interrupt to return in a different process!
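The shape of that dispatch can be sketched as below. It is a toy model, not trap.c: the vector numbers 32 and 48 are taken from the vector32 and vector48 entries examined earlier, while the counters, the stub functions, and the -1 result for other vectors are inventions of this example and say nothing about what xv6 does with unexpected traps.

```c
#include <stdint.h>

enum { VEC_TIMER = 32, VEC_SYSCALL = 48 };

/* Trap frame reduced to the trap number. */
struct tf_sketch {
    uint32_t trapno;
};

/* Counters so the effect of each path can be observed. */
unsigned ticks_sketch;
unsigned yields_sketch;
unsigned syscalls_sketch;

/* Stand-ins for yield() and syscall(). */
void yield_stub(void)   { yields_sketch++; }
void syscall_stub(void) { syscalls_sketch++; }

/* Returns 0 if the vector was handled, -1 otherwise. */
int trap_sketch(struct tf_sketch *tf)
{
    switch (tf->trapno) {
    case VEC_SYSCALL:
        syscall_stub();
        return 0;
    case VEC_TIMER:
        ticks_sketch++;     /* count the timer tick */
        yield_stub();       /* give up the CPU; may resume another process */
        return 0;
    default:
        return -1;          /* not modeled here */
    }
}
```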
Turns out our kernel had a subtle security bug in the way it handled traps... vb 0x1b:0x11, run movdsgs, step over breakpoints that aren't mov ax, ds, dump_cpu and single-step. dump_cpu after mov gs, then vb 0x1b:0x21 to break after sbrk returns, dump_cpu again.
Since JOS does not use segmentation, where do traps vector in JOS?
JOS also has a very different kernel architecture: only one kernel stack, as opposed to one per process in xv6. The kernel is not re-entrant (cannot be interrupted), so all IDT entries are interrupt gates in JOS.