6.S081/6.828 2019 Lecture 9: Processes, threads, and scheduling

Topic: more "under the hood" with xv6
Previously: system calls, interrupts, page tables, locks
Today: process/thread switching

Why support multiple tasks?
  time-sharing among many users, as on Athena dialups
  multiple programs on a laptop
  concurrent activities on a controller

Potential multi-tasking goals
  Isolation vs bugs, malice
  Sharing of hardware resources (CPU, memory, disk, net)
  Mutual ignorance / transparency
  Efficiency

There are two main approaches to multi-tasking:
  Threads: program execution appears sequential
  Events: programs are driven by the arrival of events, like interrupts
UNIX and xv6 use a thread model, embodied in processes.

Examples of sequential code suitable for the thread model:
  1: while(1){ x = wait for input; x = x + 1; print(x); }
  2: while(1){ compute; }

Thread design challenges
  how to interleave threads if there are few CPUs?
  how to make the interleaving transparent?
  what to do while the first example waits for input?
  how to prevent the second example from hogging a CPU?

First, how to cope with compute-bound threads?
  the kernel isn't even running -- how can it switch among threads?
  all CPUs have timer hardware, which can be told to interrupt periodically
  the kernel uses timer interrupts to grab control from looping threads
  this is "pre-emptive" scheduling -- a forced yield of oblivious code

Second, what to do with a thread that isn't running?
  we need to set aside its state: registers, stack, memory
  no need to worry about memory: it won't go anywhere
  but we need to re-use the CPU registers, so we must save them
  and we need to preserve the thread's stack (call chain, local variables)
  so the implementation provides each thread with saved registers and a stack

xv6 provides threads for both user and kernel code:
  [diagram]
  1 user thread per process, to execute the application
  1 kernel thread per process, to execute system calls
  1 kernel scheduler thread per CPU
A process consists of two threads (user and kernel),
  plus lots of other stuff -- memory, file descriptors, &c

Overview of xv6 process switching:
  user -> kernel thread (via system call or timer interrupt)
  kernel thread yields, due to pre-emption or waiting for I/O
  kernel thread -> scheduler thread
  scheduler thread finds a RUNNABLE kernel thread
  scheduler thread -> kernel thread
  kernel thread -> user

struct proc in proc.h:
  trapframe holds the user thread's saved registers
  context holds the kernel thread's saved registers
  kstack holds the kernel thread's stack

Terminology: I will often use "thread" as shorthand for "a process's
kernel thread"; an xv6 process is really a kernel thread, plus a user
thread, plus user memory, plus an array of file descriptors, &c.

Each xv6 process has a proc->state:
  RUNNING
  RUNNABLE
  SLEEPING
  ZOMBIE
  UNUSED

Notes:
  xv6 has lots of kernel threads sharing the single kernel address space
  xv6 has only one user thread per process
  more serious O/S's (e.g. Linux) support multiple user threads per process

Context-switching was one of the hardest things to get right in xv6:
  multi-core
  locking
  interrupts
  process termination

# Code

Pre-emptive switch demonstration
  user/spin.c -- two CPU-bound processes
  my qemu has only one CPU
  let's watch xv6 switch between them:
    $ make qemu-gdb
    $ gdb
    (gdb) c
  show user/spin.c
  run spin in the xv6 shell
  you can see that the two processes alternate, despite running
    continuously: xv6 is switching its one CPU between them.
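Here is a minimal sketch of what user/spin.c might look like (the demo
program isn't part of the stock xv6 tree, so the character choices and
loop bound here are guesses):

  // hypothetical user/spin.c: fork two CPU-bound processes that each
  // print a distinctive character now and then, making the scheduler's
  // alternation between them visible on the console.
  #include "kernel/types.h"
  #include "user/user.h"

  int
  main(void)
  {
    int i;
    char c;

    if(fork() == 0)
      c = '/';           // child prints slashes
    else
      c = '\\';          // parent prints backslashes
    for(i = 0; ; i++){
      if(i % 1000000 == 0)
        write(2, &c, 1); // print occasionally while burning CPU
    }
    exit(0);             // never reached
  }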
How does the switching work?
  I'm going to cause a break-point at the timer interrupt.
    (gdb) finish
    (gdb) where
  we're in usertrap(), handling a device interrupt from the timer
    (timerinit() in kernel/start.c configures the RISC-V timer hardware)
  what was running when the timer interrupt happened?
    (gdb) print p->name
    (gdb) print p->pid
    (gdb) print/x *(p->tf)
    (gdb) print/x p->tf->epc
  let's look for the saved epc in user/spin.asm
    the timer interrupted the user code at a pretty random location
  (gdb) step  -- into yield() in proc.c
  (gdb) next
  (gdb) print p->state
    p->state = RUNNABLE: giving up the CPU, but wanting to run again
  note: yield() acquires p->lock,
    since it modifies p->state,
    and to prevent another CPU from running this RUNNABLE thread!
  (gdb) step  -- into sched()
  sched() just calls swtch(), after some sanity checks
    this is the context switch from a process's kernel thread
      to the scheduler thread
    we want to save away one set of registers (including PC and SP)
      and restore another set
  swtch():
    saves the current callee-saved registers (ra, sp, s0..s11) in its
      first argument (p->context); the C calling convention makes the
      compiler save any live caller-saved registers before the call
    restores previously-saved registers from its 2nd argument
      (mycpu()->scheduler)
  let's see what register values swtch() will restore:
    (gdb) next
    (gdb) print/x cpus[0].scheduler  -- this is a context
    where is cpus[0].scheduler.ra? i.e. where will swtch() return to?
      kernel.asm says it's in the scheduler() function in proc.c
  (gdb) tbreak swtch
  (gdb) c
  we're in kernel/swtch.S:
    a0 is the first argument, p->context
    a1 is the second argument, cpus[0].scheduler
    swtch() saves the current registers in xx(a0) (p->context)
    swtch() then restores registers from xx(a1) (cpus[0].scheduler)
    then swtch() returns
  Q: note that swtch() neither saves nor restores the PC (program counter)!
     so how does it know where to start executing in the target thread?
  (gdb) print $pc
  (gdb) print $ra
  (gdb) print $sp
  (gdb) stepi  -- repeat until the ret
  (gdb) print $pc
  (gdb) print $ra
  (gdb) print $sp
  (gdb) where
  (gdb) stepi
  we're in scheduler() now, in the "scheduler thread", on the
    scheduler's stack
    scheduler() just returned from a call to swtch()
    it made that call a while ago, to switch to our process's kernel thread
    that previous call saved scheduler()'s registers
    our process's call to swtch() restored those saved registers
  p still refers to the interrupted process:
    (gdb) print p->name
    (gdb) print p->pid
    (gdb) print p->state
  remember that sched() acquired the process's lock; now scheduler()
    releases it
    it may *look* like scheduler() acquires and then releases,
    but in fact scheduler() acquires and yield() releases,
    and yield() acquires and scheduler() releases
    unusual: the lock is released by a different thread than acquired it!
  Q: why does yield() acquire p->lock, but scheduler() release it?
     what if another core's scheduler() sees the RUNNABLE process?
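The cross-thread acquire/release pairing is easiest to see in the code.
Here are yield() and scheduler() in abridged form, lightly trimmed from
kernel/proc.c (error checks omitted; field names may differ slightly
across xv6 versions):

  void
  yield(void)
  {
    struct proc *p = myproc();

    acquire(&p->lock);  // released on the other side, by scheduler()
    p->state = RUNNABLE;
    sched();            // swtch() to the scheduler thread
    release(&p->lock);  // releases the lock scheduler() acquired for us
  }

  void
  scheduler(void)
  {
    struct proc *p;
    struct cpu *c = mycpu();

    c->proc = 0;
    for(;;){
      intr_on();  // let devices interrupt, in case no thread is RUNNABLE
      for(p = proc; p < &proc[NPROC]; p++){
        acquire(&p->lock);  // released by the process, in yield()
        if(p->state == RUNNABLE){
          p->state = RUNNING;
          c->proc = p;
          swtch(&c->scheduler, &p->context);  // run p's kernel thread
          c->proc = 0;      // p has swtch()ed back to us
        }
        release(&p->lock);  // releases the lock yield() acquired
      }
    }
  }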
scheduler()'s loop looks at all processes, and finds a RUNNABLE one
  it will likely be the *other* spin process
let's fast-forward to when scheduler() finds a RUNNABLE process:
  (gdb) tbreak swtch
  (gdb) c
now we're swtch()ing from scheduler() to a different thread
  we can tell it's the other spin process:
  (gdb) where
  (gdb) up
  (gdb) print p->name
  (gdb) print p->pid
  (gdb) print p->state
let's see where the new thread will start executing after swtch(),
  by looking at $ra (the return address) in its context
  (gdb) print/x p->context
  look for ra in kernel/kernel.asm
  it's in sched(), just after a call to swtch()
look at kernel/swtch.S (again)
  (gdb) stepi 28  -- now just about to execute swtch()'s ret
  (gdb) print $ra
  (gdb) where
  (gdb) step
  (gdb) print p->pid
  (gdb) print p->name
  (gdb) print/x p->tf->epc
now we're in a timer interrupt in the *other* spin process
  at some point in the past, it yielded the CPU and called swtch()
  (then other threads ran)
  now swtch() is returning, and the kernel will soon return from the
    timer interrupt back to the other spin process, restoring the user
    registers from the trapframe

Q: what is the scheduling policy?
   will the thread that called yield() run again immediately?
   is it a good policy?

Q: why does scheduler() briefly enable interrupts, with intr_on()?
   There may be no RUNNABLE threads;
     they may all be waiting for I/O, e.g. disk or console.
   Enable interrupts so a device has a chance to signal completion
     and thus wake up a thread.
   Otherwise, the system will freeze.

sched() and scheduler() are "co-routines"
  the caller knows what it is swtch()ing to
  the callee knows where the switch is coming from
  e.g. yield() and scheduler() cooperate about p->lock
  different from ordinary thread switching, where neither party
    typically knows which thread comes before/after

Q: how does sched() know the scheduler() thread is ready for it to
   swtch() into? could it be anywhere other than swtch()?

Q: could sched() directly swtch() to a new thread,
   i.e. get rid of the separate scheduler thread,
   so that sched() itself looks for the next process to run?
   yes, and it would be faster
   but care would be needed to maintain the invariants above,
     i.e. some tricky locking and stack switching -- give it a try!

Q: is there pre-emptive scheduling of kernel threads?
   yes -- a timer interrupt and yield() can occur while in the kernel
     (yield() is called by kerneltrap() in kernel/trap.c)
   where to save the registers of the interrupted kernel code?
     can't save them in the p->tf trapframe, since it already holds
       the user registers
     kernelvec.S pushes them on the kernel stack
       (since we're already in the kernel)
   is pre-emption in the kernel useful?
     not critical in xv6
     valuable if some system calls have lots of compute,
       or if we need a strict notion of thread priority

p->lock protects multiple invariants
  (invariant = true when p->lock is not held, maybe not true while held)
  if RUNNING, the CPU registers hold the values (they are not in p->context)
  if RUNNABLE, p->context holds its saved registers
  if RUNNABLE, no CPU is using its stack
  holding p->lock from yield() all the way into scheduler() enforces:
    another CPU can't execute p until after this CPU stops using p's stack
  interrupts are off, so there's no nested yield()+swtch() during
    swtch()'s save/restore

Fact: xv6 always turns off interrupts while a thread holds any spinlock
  motivating example: consoleread() looking for keyboard input
    acquire(cons.lock), look at cons.buf, release(cons.lock)
  what if a console interrupt arrives while consoleread() holds the lock?
    consoleintr() tries to acquire cons.lock
    but the interrupt runs on the same stack as the interrupted kernel code,
    so consoleread() can't release until consoleintr() returns
    that's a deadlock!
  so: acquire() disables interrupts, and release() re-enables them,
    just on the local CPU; other CPUs can still take interrupts
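In code, the fix lives in acquire() and release() themselves. Abridged
from kernel/spinlock.c (comments added; push_off() and pop_off() disable
and re-enable interrupts while maintaining noff, a per-CPU count of
locks held):

  void
  acquire(struct spinlock *lk)
  {
    push_off();  // disable interrupts on this CPU; increments noff
    if(holding(lk))
      panic("acquire");
    // atomic test-and-set: spin until the lock was observed free
    while(__sync_lock_test_and_set(&lk->locked, 1) != 0)
      ;
    __sync_synchronize();  // fence: critical section comes after acquire
    lk->cpu = mycpu();
  }

  void
  release(struct spinlock *lk)
  {
    if(!holding(lk))
      panic("release");
    lk->cpu = 0;
    __sync_synchronize();  // fence: critical section comes before release
    __sync_lock_release(&lk->locked);
    pop_off();   // re-enable interrupts only when noff drops back to 0
  }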
Q: why does sched() forbid locks (other than p->lock) from being held
   when yielding the CPU? i.e. why does it check that noff == 1?
   suppose process P1, holding lock L1, yields the CPU
   process P2 runs, and tries acquire(L1)
   P2's acquire() spins with interrupts turned off,
     so timer interrupts won't occur,
     so P2 won't yield the CPU,
     so P1 can't execute,
     so P1 won't ever release L1

Summary
  xv6 provides a convenient thread model, for user and kernel code
    pre-emptive via timer interrupts
    transparent via switching registers and stacks
  multi-core requires careful handling of stacks and locks
  next lecture: mechanisms for threads to wait for each other