6.828 2017 Lecture 8: System calls, Interrupts, and Exceptions

Let's start with the homework

alarmtest.c
  alarm(10, periodic)
  asks kernel to call periodic() every 10 "ticks" in this process
  that is, every 10 ticks of CPU time that this process consumes
  three pieces:
    add a new system call
    count ticks as the program runs (timer interrupt)
    kernel "upcall" to periodic()
  the call to periodic() is a simplified UNIX signal

glue for a new system call
  syscall.h: #define SYS_alarm 22
  usys.S: SYSCALL(alarm)
    alarmtest.asm -- mov $0x16,%eax -- 0x16 is SYS_alarm
  syscall.c syscalls[] table
  sysproc.c sys_alarm()
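
sketch of the table lookup that the glue above feeds. This is a host-side C model, not xv6 source: NSYSCALL, dispatch(), and the stub handler bodies are made up for illustration; only the numbers 16 and 22 are taken from syscall.h.

```c
#include <assert.h>

#define SYS_write 16   /* xv6 numbering */
#define SYS_alarm 22   /* the homework's new number */
#define NSYSCALL  23   /* hypothetical table size for this sketch */

/* Stand-ins for the kernel handlers. The real ones take no C arguments
   either; they dig their arguments out of the trapframe. */
static int sys_write(void) { return 1; }
static int sys_alarm(void) { return 0; }

/* Table indexed by system call number, like syscalls[] in syscall.c. */
static int (*syscalls[NSYSCALL])(void) = {
  [SYS_write] = sys_write,
  [SYS_alarm] = sys_alarm,
};

/* num plays the role of the saved user %eax (tf->eax). */
int dispatch(int num)
{
  assert(SYS_alarm == 0x16);   /* alarmtest.asm loads $0x16 into %eax */
  if (num > 0 && num < NSYSCALL && syscalls[num])
    return syscalls[num]();
  return -1;                   /* unknown system call */
}
```

the user program never names sys_alarm directly; it can only hand the kernel a number, and the kernel decides what that number means.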

why all this machinery?
  at a high level, alarmtest just wants to make a function call to sys_alarm
  it has to be indirect (via INT, SYS_alarm) to maintain isolation

break sys_alarm
  where
  how did syscall know which system call?
    trapframe, on kernel stack, has saved user eax
    print myproc()->tf->eax
  where does sys_alarm find the arguments, ticks and handler?
    on the user stack
    x/4x myproc()->tf->esp
  does the handler value make sense? look in alarmtest.asm
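
sketch of how the arguments are found on the user stack. A byte array stands in for the process's memory; USER_SIZE and push_alarm_call() are assumptions of this model. fetchint() and argint() follow the shape of the xv6 functions of the same name but are simplified (they take the stack pointer as a parameter instead of reading myproc()->tf).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define USER_SIZE 4096u               /* hypothetical process size (p->sz) */
static uint8_t user_mem[USER_SIZE];   /* stand-in for the user address space */

/* Read a 32-bit word at a user address; refuse out-of-range addresses. */
int fetchint(uint32_t addr, uint32_t *ip)
{
  if (addr >= USER_SIZE || addr + 4 > USER_SIZE)
    return -1;
  memcpy(ip, &user_mem[addr], 4);
  return 0;
}

/* Argument n sits above the return address pushed by the user-level
   call to the alarm() stub, hence the "+ 4". */
int argint(uint32_t user_esp, int n, uint32_t *ip)
{
  return fetchint(user_esp + 4 + 4 * (uint32_t)n, ip);
}

/* Helper for the model: lay out the stack as alarm(ticks, handler)
   would leave it at the INT instruction. Returns the resulting esp. */
uint32_t push_alarm_call(uint32_t ticks, uint32_t handler, uint32_t ret)
{
  uint32_t esp = USER_SIZE - 12;
  assert(esp + 12 <= USER_SIZE);
  memcpy(&user_mem[esp], &ret, 4);          /* return address */
  memcpy(&user_mem[esp + 4], &ticks, 4);    /* argument 0 */
  memcpy(&user_mem[esp + 8], &handler, 4);  /* argument 1 */
  return esp;
}
```

this matches what x/4x on the saved esp shows: a return address first, then the two arguments.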

now we need to take some action whenever the timer h/w interrupts
  decrement ticksleft
  if expired
    upcall to handler (periodic())
    reset ticksleft
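
the countdown above can be sketched as follows. ticksleft is the name used in these notes; struct alarm_state, alarmticks, and alarm_tick() are hypothetical names for this model, and the upcall itself is reduced to a return value.

```c
#include <assert.h>

struct alarm_state {
  int alarmticks;   /* interval requested by alarm(); 0 means disabled */
  int ticksleft;    /* ticks remaining until the next upcall */
};

/* Call once per timer tick charged to the process.
   Returns 1 when the handler should be upcalled on this tick. */
int alarm_tick(struct alarm_state *a)
{
  if (a->alarmticks == 0)
    return 0;                      /* no alarm registered */
  assert(a->ticksleft > 0);
  if (--a->ticksleft > 0)
    return 0;                      /* not expired yet */
  a->ticksleft = a->alarmticks;    /* reset for the next period */
  return 1;
}
```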

device interrupts arrive just like INT and pagefault
  h/w pushes esp and eip (plus ss, eflags, cs) on kernel stack
  s/w saves other registers, into a trapframe
  vector, alltraps, trap()

timer interrupts served by IRQ_TIMER case in trap()
  original IRQ_TIMER task is to keep track of wall-clock time, in ticks

execute to trap without an implementation
  break vector32
  where
  print/x tf->eip
  print/x tf->esp
  x/4x tf->esp

what was the user program doing at this point?
  tf->eip in alarmtest.asm
  user code could have been interrupted anywhere
    so we can't rely on anything about the user stack
    and we need to restore registers exactly, since program didn't save anything

Q: how to arrange for upcall to alarm handler?
   call myproc()->alarmhandler() ?
   tf->eip = myproc()->alarmhandler ?
Q: how to ensure handler returns to interrupted user code?

add our code...
run alarmtest without gdb
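
one possible shape for the added code, as a host-side model rather than the lecture's actual trap() change. The trapframe here has only the three fields the sketch touches, user memory is a byte array, and deliver_alarm() is a hypothetical name. The idea: push the interrupted eip on the user stack as if the user code had executed a call, then point eip at the handler, so the handler's ret resumes the interrupted code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define USER_SIZE 4096u
static uint8_t user_mem[USER_SIZE];   /* stand-in for user memory */

/* Only the trapframe fields this sketch touches. */
struct trapframe {
  uint32_t eip;   /* where the user code was interrupted */
  uint16_t cs;    /* low 2 bits: privilege level when the trap happened */
  uint32_t esp;   /* user stack pointer (meaningful only if CPL was 3) */
};

/* Returns 0 if the trapframe was rewritten, -1 if it was left alone. */
int deliver_alarm(struct trapframe *tf, uint32_t handler)
{
  if ((tf->cs & 3) != 3)
    return -1;                     /* trap came from the kernel */
  if (tf->esp < 4 || tf->esp > USER_SIZE)
    return -1;                     /* esp is user-controlled: bounds check */
  tf->esp -= 4;
  assert(tf->esp + 4 <= USER_SIZE);
  memcpy(&user_mem[tf->esp], &tf->eip, 4);   /* old eip = return address */
  tf->eip = handler;               /* trap return lands in the handler */
  return 0;
}
```

registers other than eip and esp are not saved here, so this sketch shares the open question about a handler that modifies them.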

let's run with gdb
  list trap to find breakpoint
  print/x tf->eip before assignment
  print/x tf->eip after assignment
  break *0x74
  c
  info reg
  will it return somewhere reasonable in alarmtest.asm?
  x/4x $esp

Q: what's the security problem in my new trap() code?

Q: what if trap() directly called alarmhandler()?
   it's a bad idea
   but what exactly would go wrong?
   let's try it
     it doesn't crash!
     but it doesn't print alarm! either. why not?
     fetchint...
   apparently it gets back to user space (to print .) -- how?
     program, timer trap, alarmhandler(), INT, sys_write("alarm!"), return...
     stack diagram

it is disturbing how close this came to working!
  why can kernel code directly jump to user instructions?
  why can user instructions modify the kernel stack?
  why do system calls (INT) work from the kernel?
  none of these are intended properties of xv6!
  the x86 h/w does *not* directly provide isolation
    x86 has many separate features (page table, INT, &c)
    it's possible to configure these features to enforce isolation
    but isolation is not the default!

Q: what happens if just tf->eip = alarmhandler, but don't push old eip?
   let's try it
   user stack diagram

Q: what if trap() didn't check for CPL 3?
   let's try it -- seems to work!
   how could tf->cs&3 == 0 ever arise from alarmtest?
   let's force the situation with (tf->cs&3)==0
     and making alarmtest run forever
     unexpected trap 14 from cpu 0 eip 801067cb (cr2=0x801050cf)
   what is eip 0x801067cb in kernel.asm?
     the store *(uint*)tf->esp = tf->eip in trap().
   what happened?
     it was a CPL=0 to CPL=0 interrupt
     so the h/w didn't switch stacks
     so it didn't save %esp
     so tf->esp contains garbage
     (see comment at end of trapframe in x86.h)
   the larger point is that interrupts can occur while in the kernel (in xv6, not JOS)
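
a small model of why tf->esp is garbage in the CPL=0 case. The word counts are the x86's behavior for a trap into a CPL 0 handler; the function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit words the x86 pushes on entry to a CPL 0 handler,
   not counting an error code. cs is the selector of the interrupted code. */
int hw_pushed_words(uint16_t cs)
{
  int cpl = cs & 3;
  assert(cpl == 0 || cpl == 3);   /* the only levels xv6 uses */
  if (cpl == 3)
    return 5;   /* ss, esp, eflags, cs, eip: stack switch happened */
  return 3;     /* eflags, cs, eip: same stack, esp and ss not saved */
}

/* tf->esp and tf->ss hold real values only when the stack was switched. */
int tf_esp_is_valid(uint16_t saved_cs)
{
  return hw_pushed_words(saved_cs) == 5;
}
```

with xv6's selectors, user code runs with cs 0x1b and kernel code with cs 0x08, so the (tf->cs&3) test separates the two cases.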

Q: what will happen if user-supplied alarm handler fn points into the kernel?
   (with the correct trap() code)

Q: what if another timer interrupt goes off while in user handler?
   works, but confusing, and will eventually run out of user stack
   maybe kernel shouldn't re-start timer until handler function finishes

Q: is it a problem if periodic() modifies registers?
   how could we arrange to restore registers before returning?

let's step back and talk about interrupts a bit more generally

the general topic: h/w wants attention now!
  s/w must set aside current work and respond

where do traps come from?
  (I use "trap" as a general term)
  device -- data ready, or completed an action, ready for more
  exception/fault -- page fault, divide by zero, &c
  INT -- system call
  IPI -- kernel CPU-to-CPU communication, e.g. to flush TLB

where do device interrupts come from?
  diagram:
    CPUs, LAPICs, IOAPIC, devices
    data bus
    interrupt bus
  the interrupt tells the kernel the device hardware wants attention
  the driver (in the kernel) knows how to tell the device to do things
  often the interrupt handler calls the relevant driver
    but other arrangements are possible (schedule a thread; poll)

how does trap() know which device interrupted?
  i.e. where did tf->trapno == T_IRQ0 + IRQ_TIMER come from?
  kernel tells LAPIC/IOAPIC what vector number to use, e.g. timer is vector 32
    page faults &c also have vectors
    LAPIC / IOAPIC are standard pieces of PC hardware
    one LAPIC per CPU
  IDT associates an instruction address with each vector number
    IDT format is defined by Intel, configured by kernel
  each vector jumps to alltraps
  CPU sends many kinds of traps through IDT
    low 32 IDT entries have special fixed meaning
  xv6 sets up system calls (INT) to use IDT entry 64 (0x40)
  the point: the vector number reveals the source of the interrupt

diagram:
  IRQ or trap, IDT table, vectors, alltraps
  IDT:
    0: divide by zero
    13: general protection
    14: page fault
    32-255: device IRQs
    32: timer
    33: keyboard
    46: IDE
    64: INT
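
the diagram's numbering, written out as a sketch. The constants are the ones in xv6's traps.h; enum trap_kind and classify() are hypothetical, and the 32-255 "device" range simply follows the diagram above.

```c
#include <assert.h>

/* Vector numbers as in xv6's traps.h. */
#define T_DIVIDE    0
#define T_GPFLT    13
#define T_PGFLT    14
#define T_IRQ0     32
#define T_SYSCALL  64
#define IRQ_TIMER   0
#define IRQ_KBD     1
#define IRQ_IDE    14

enum trap_kind { KIND_EXCEPTION, KIND_DEVICE, KIND_SYSCALL, KIND_OTHER };

/* Recover the source of a trap from tf->trapno alone. */
enum trap_kind classify(int trapno)
{
  assert(T_IRQ0 + IRQ_TIMER == 32);   /* timer vector from the diagram */
  if (trapno == T_SYSCALL)
    return KIND_SYSCALL;              /* INT $64 */
  if (trapno >= 0 && trapno < T_IRQ0)
    return KIND_EXCEPTION;            /* fixed meanings defined by Intel */
  if (trapno >= T_IRQ0 && trapno <= 255)
    return KIND_DEVICE;               /* vector chosen by the kernel */
  return KIND_OTHER;                  /* not a valid vector */
}
```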

let's look at how xv6 sets up the interrupt vector machinery
  lapic.c / lapicinit() -- tells LAPIC hardware to use vector 32 for timer
  trap.c / tvinit() -- initializes IDT, so entry i points to code at vector[i]
    this is mostly mechanical, IDT entries correspond blindly to vectors
    BUT T_SYSCALL's istrap argument of 1 (vs 0) tells CPU to leave interrupts enabled during system calls
    but not during device interrupts
    Q: why allow interrupts during system calls?
    Q: why disable interrupts during interrupt handling?
  vectors.S (generated by vectors.pl)
    first push fakes "error" slot in trapframe, since h/w doesn't push for some traps
    second push is just the vector number
      this shows up in trapframe as tf->trapno
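
two of the decisions above, reduced to predicates. The gate type values are from xv6's mmu.h and the list of vectors with a hardware error code mirrors the test in vectors.pl; gate_type() and hw_pushes_error() are names invented for this sketch.

```c
#include <assert.h>

#define STS_IG32  0xE   /* 32-bit interrupt gate: CPU clears IF on entry */
#define STS_TG32  0xF   /* 32-bit trap gate: CPU leaves IF unchanged */
#define T_SYSCALL 64

/* Gate type tvinit() ends up with: istrap is 1 only for T_SYSCALL. */
int gate_type(int vector)
{
  assert(vector >= 0 && vector < 256);
  int istrap = (vector == T_SYSCALL);
  return istrap ? STS_TG32 : STS_IG32;
}

/* Does the CPU itself push an error code for this vector?
   The generated stub pushes a fake 0 exactly when this is false,
   so every trapframe has the same layout. */
int hw_pushes_error(int vector)
{
  return vector == 8 || (vector >= 10 && vector <= 14) || vector == 17;
}
```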

how does the hardware know what stack to use for an interrupt?
  when it switches from user space to the kernel
  hardware-defined TSS (task state segment) lets kernel tell CPU which stack to switch to
    one per CPU
    so each CPU can run a different process, take traps on different stacks
  proc.c / scheduler()
    one per CPU
  vm.c / switchuvm()
    tells CPU what kernel stack to use
    tells CPU what page table to use (loads %cr3)
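
an outline of what switchuvm() arranges, as a host-side model. The structs keep only the fields used here, addresses are plain integers, the cr3 field stands in for the real register load, and switchuvm_sketch() is a hypothetical name. KSTACKSIZE and NCPU use xv6's values; 2 << 3 is SEG_KDATA << 3.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACKSIZE 4096   /* as in xv6's param.h */
#define NCPU       8

struct taskstate { uint32_t esp0; uint16_t ss0; };   /* fields used here */
struct proc      { uint32_t kstack; uint32_t pgdir; };
struct cpu       { struct taskstate ts; uint32_t cr3; };

static struct cpu cpus[NCPU];   /* one TSS per CPU */

/* Point this CPU at process p: where to put the stack on the next trap
   from user space, and which page table to translate with. */
void switchuvm_sketch(struct cpu *c, const struct proc *p)
{
  assert(p->kstack != 0 && p->pgdir != 0);
  c->ts.ss0  = 2 << 3;                    /* kernel data segment selector */
  c->ts.esp0 = p->kstack + KSTACKSIZE;    /* top of p's kernel stack */
  c->cr3     = p->pgdir;                  /* stands in for loading %cr3 */
}
```

because each CPU has its own TSS, two CPUs can take traps at the same time on two different kernel stacks.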

Q: what eip should the CPU save when trapping to the kernel?
   eip of the instruction that was executing?
   eip of the next instruction?
   suppose the trap is a page fault?

some design notes
  * interrupts used to be relatively fast; now they are slow
    old approach: every event causes an interrupt, simple h/w, smart s/w
    new approach: h/w completes lots of work before interrupting
  * an interrupt takes on the order of a microsecond
    save/restore state
    cache misses
  * some devices generate events faster than one per microsecond
    e.g. gigabit ethernet can deliver 1.5 million small packets / second
  * polling rather than interrupting, for high-rate devices
    if events are always waiting, no need to keep alerting the software
  * interrupt for low-rate devices, e.g. keyboard
    constant polling would waste CPU
  * switch between polling and interrupting automatically
    interrupt when rate is low (and polling would waste CPU cycles)
    poll when rate is high  (and interrupting would waste CPU cycles)
  * faster forwarding of interrupts to user space
    for page faults and user-handled devices
    h/w delivers directly to user, w/o kernel intervention?
    faster forwarding path through kernel?
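
the arithmetic behind the polling argument, plus one possible switching rule. The threshold policy in choose_mode() is an assumption of this sketch, not something the notes prescribe; real drivers use more elaborate heuristics.

```c
#include <assert.h>

enum mode { MODE_INTERRUPT, MODE_POLL };

/* Time budget per event, in nanoseconds, at a given event rate. */
double ns_per_event(double events_per_sec)
{
  assert(events_per_sec > 0);
  return 1e9 / events_per_sec;
}

/* Hypothetical policy: poll once events arrive faster than a single
   interrupt can be taken and returned from. */
enum mode choose_mode(double events_per_sec, double interrupt_cost_ns)
{
  if (events_per_sec <= 0)
    return MODE_INTERRUPT;          /* idle device: wait to be told */
  if (ns_per_event(events_per_sec) < interrupt_cost_ns)
    return MODE_POLL;               /* interrupts alone would saturate CPU */
  return MODE_INTERRUPT;
}
```

at 1.5 million packets per second there are about 667 ns per packet, less than a roughly 1000 ns interrupt, so this rule picks polling; a keyboard at a few events per second stays on interrupts.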

we will be seeing many of these topics later in the course