Required reading: proc.c (focus on scheduler() and sched()), swtch.S.
Also process creation: sys_fork() and copyproc().
Big picture: more programs than CPUs. How to share the
limited number of CPUs among the programs?
Idea: have the kernel transparently switch the CPU(s) among
programs, giving them a "virtual CPU" abstraction.
Terminology: I'm going to call a switch from one program to
another a context switch.
The program's view of context switching
For this discussion we're going to assume that the O/S runs each
program in a "process", which couples an address space with a single
thread of control. For the most part each process appears to have a
dedicated CPU, but if processes compare notes they can see that the CPU
is actually multiplexed, with this process-visible behavior:
- When a process makes a system call that blocks, the kernel
arranges to take the CPU away and run a different process.
For example, if read() needs to wait for the hard disk to
finish reading a block, the kernel will cause a different
process to run until the disk has finished.
- The kernel time-slices the CPU among processes that
compute a lot.
- Both of the above are largely transparent to the process,
in the sense that the kernel resumes a process with the
same register and memory state it had when the kernel suspended it.
(Except that a system call may explicitly modify state required
to express its return value or data read by e.g. read().)
The kernel also exposes system calls for handling
processes; here are the main XV6 process system calls:
- fork: create a new process, which is a copy of the parent
- exec: execute a program file
- exit: terminate the calling process
- wait: wait for a child process to call exit
- kill: kill process
- sbrk: grow the address space of a process.
The kernel's view of context switching
As you know, the xv6 kernel maintains a kernel stack for each process,
on which to run system calls for the process. At any given time,
multiple processes may be executing system calls inside the kernel.
This is a common kernel arrangement -- multiple threads of control in
the kernel address space, one per user-level process. A kernel thread
must be capable of suspending and giving the CPU away, because the
implementation of many system calls involves blocking to wait for
I/O. That's the reason why each xv6 process has its own kernel stack.
Terminology: if a process has entered the kernel, the kernel stack
and CPU registers of the resulting thread of control is called the
processes kernel thread.
In this arrangment, the kernel doesn't directly switch CPUs
among processes. Instead, the kernel has a number of mechanisms
which combine to switch among processes:
- User->kernel transitions, which we've seen when we looked at
system calls and interrrupts.
If a process is running and makes a user->kernel
transition, we're then running in the process's kernel thread.
This transition saves the state of the
user process -- saving its register in a struct trapframe
on the process's kernel stack. A user->kernel transition
initializes fresh register state for the process's kernel thread.
- Switching between kernel threads. If a kernel thread calls
sched() or yield() or sleep(), the kernel saves that kernel
thread's registers, picks another runnable kernel thread, and
restores its registers -- restoring ESP and EIP cause a
stack switch and control switch.
scheduler() also switches user segments; this has no immediate effect,
it only affects what happens on a future return from kernel to user.
- Kernel->user transitions. A return to user space restores
the user registers from the trapframe, and implicitly loses
the register state of the process's kernel thread.
- Timer interrupts. The timer interrupt handler implements
time-slicing by calling yield(). At a low level yield() only
switches between kernel threads, but the interrupt may have caused
a user->kernel transition, so the net effect is often to
switch between user processes.
xv6 code examples
Let's watch what happens when a timer interrupt causes an
involuntary context switch between two compute-bound user
processes. The user process:
Break in yield(). On kernel stack of one of the processes.
The return EIP should be trap() (but might be alltraps; why?).
Why does yield() need to acquire proc_table_lock?
When will the lock be released?
Why state = RUNNABLE? What was state's previous value?
yield() calls sched() which calls swtch(). Look at definition
of struct context: it's a set of saved registers. swtch()
will save into one context, restore from another. What are
the two contexts?
Where do we think swtch() will return to?
Why doesn't swtch() save EAX in struct context?
Why doesn't swtch() just leave the return EIP on the
stack? Why does it copy the return EIP from the stack
to the struct context, then back from the struct context to
At end of swtch(), what stack are we running on?
Where does print-stack show swtch() will return to? Why did
scheduler() is in the middle of a loop that checks to see whether
each process would like to run (is RUNNABLE).
If scheduler() finds no RUNNABLE process, it finishes the
loop, releases the lock, but then immediately re-acquires
the lock at the start of the outer loop. Why release
just to re-acquire?
That is, suppose for a while there are no RUNNABLE processes.
What will the CPUs spend their time doing?
scheduler() has its own stack (in cpus[x]). Why? Why not just
call it directly from sched()? I.e. why not have scheduler()
run on the stack of the last process that gave up the CPU?
What policy does scheduler() use to give CPU time to processes?
We know scheduler() is going to run the other cpu-bound process,
by calling setupsegs(p) and swtch() to restore other process's registers
and stack. Set state to RUNNING to prevent another CPU's
scheduler() from also running this process.
Break in swtch(), and check that we are indeed switching to the
other process's stack, and that we are returning from its
call to yield() from trap().
More sophisticated operating systems usually allow multiple threads of
control in a process. How could we do this for xv6? Each thread would
have its own user-space stack in the process address space. There
would be a few more system calls: perhaps thread_fork(),
thread_exit(), and thread_wait(). The kernel would have to split
struct proc into a structure describing overall process
resources (address space and open files), and a kernel stack and set
of saved registers per thread. That way multiple threads from a
process could be executing system calls inside the kernel.
It's worth comparing xv6's fairly conventional context switches with
those of JOS. JOS has only one thread active in the kernel at a time,
and thus only a single kernel stack (not one per process) and no
thread switching inside the kernel. It only has an analogue to xv6's
kernel entry/exit to save/restore user thread state. But as a direct
consequence JOS 1) can't run the kernel concurrently on multiple CPUs
and 2) cannot have system calls that block in the middle (e.g. to look
up a multi-component path name).
We've seen the complete story of how xv6 switches between user
processes. This is the concrete realization of the virtual CPU
abstraction. The core mechanism is swtch(), which saves and restores
thread register state and thus switches stacks (ESP) and changes the
control flow (EIP). We also saw a bit of what it takes to integrate
thread switching with the concurrency brought by multiple CPUs and by