6.824 2001 Lecture 9: Scheduling What is scheduling? A system usually needs to allocate resources to consumers. Usually many types of resources and consumers. Many issues common to all types, e.g. fairness. Typically want to enforce a high-level policy. Often hard to get the different schedulers to cooperate to get that policy. Resource / consumer examples. CPU time / threads. CPU time / async callbacks. Physical memory / address spaces. Disk movement / processes. Network bandwidth / connections. System bus bandwidth / devices. Many policies are possible; examples for CPU: Divide fairly among processes. Divide fairly among users. Divide according to how much users are willing to pay. Earliest deadline first, for real time systems. Favor interactive over batch (i.e. emacs over gcc). General plan 1. Realize that scheduling is occuring. 2. Make scheduling decisions visible. 3. Provide knobs / hooks to set policy. Example: thread scheduler easy to make round-robin. Is it always that easy? Pitfall: Priority Inversion Assume policy is strict priority. Thread T1: low priority. Thread T2: medium priority. Thread T3: high priority. T1: acquire(l) context switch to T3 T3: acquire(l)... must wait for T1 to release(l)... context switch to T2 T2 computes for a while T3 is indefinitely delayed despite high priority. Can solve if T3 lends its priority to holder of lock it is waiting for. So T1 runs, not T2. [this is really a multiple scheduler problem.] [since locks schedule access to locked resource.] Pitfall: Efficiency Efficiency often conflicts with fairness (or any other policy). Time quantum in CPU scheduling (vs low delay). Shortest seek vs FIFO disk scheduling. Contiguous read-ahead vs data needed now. Swap out my idle emacs to let gcc run faster w/ more phys mem. What happens when I type a key? These don't fit well into a "who gets to go next" scheduler framework. Inefficient scheduling may make *everybody* slower, including hi pri users. Pitfall: Multiple Interacting Schedulers Suppose you want your emacs to have priority over everything else. Give it high CPU priority. Does that mean nothing else will run if emacs wants to run? Disk scheduler might not know to favor emacs's disk I/Os. Typical UNIX disk scheduler favors disk efficiency, not process prio. Suppose emacs needs more memory. Other processes have dirty pages; emacs must wait. Does disk scheduler know these other processes' writes are high prio? Pitfall: Server Processes Suppose emacs uses X windows to display. The X server must serve requests from many clients. Does it know that emacs' requests should be given priority? Does the O/S know to raise X's priority when it is serving emacs? Similarly for DNS and NFS. Does the network know to give emacs' NFS requests priority? Pitfall: Hardware Schedulers The hardware decides which CPU gets to use memory next. The hardware decides which device gets to use I/O bus next. The hardware decides which interrupt is serviced next. But the hardware doesn't know about O/S scheduling policy. And the O/S may not realize that the hardware is scheduling. Scheduling is a system problem There are many schedulers; they interact. The CPU/thread/process scheduler is usually the easy part. The hardest part is system structure. I.e. the *existence* of interrupts is bad for scheduling. Conflicting goals may limit effectiveness. Case study: UNIX Goals: Simplicity (e.g. avoid complex locking regimes). Quick response to device interrupts. Favor interactive response. UNIX has a number of execution environments. We care about scheduling transitions among them. Some transitions aren't possible, some can'be be controlled. UNIX Execution Environments Process, user half Process, kernel half Soft interrupts: timer, network Device interrupts Hard timer interrupt [diagram] UNIX: Process User Half Runs in process address space, on per-process stack. Interruptible. Pre-emptible: interrupt may cause context switch. We don't trust user processes to yield CPU. Voluntarily enters kernel half via system calls and faults. UNIX: Process Kernel Half Runs in kernel address space, on per-process kernel stack. Executes system calls and faults for its process. Interruptible (but can defer interrupts in critical sections). Not pre-emptible. Only yields voluntarily, when waiting for an event. E.g. disk I/O done. This simplifies concurrency control; locks often not required. No user process runs if any kernel half wants to run. Many process' kernel halfs may be sleeping in the kernel. UNIX: Device Interrupts Hardware asks CPU for an interrupt to ask for attention. Disk read/write completed, or network packet received. Runs in kernel space, on special interrupt stack. Interrupt routine cannot block; must return. Interrupts are interruptible. They nest on the one interrupt stack. Interrupts are not pre-emptible, and cannot really yield. The real-time clock is a device and interrupts every 10ms (or whatever). Process scheduling decisions can be made when interrupt returns. E.g. wake up the process waiting for this event. You want interrupt processing to be very fast, since it has priority. Don't do any more work than you have to. You're blocking processes and other interrupts. UNIX: Soft Interrupts Used when device handling is expensive. But no obvious process context in which to run. Examples: IP forwarding, TCP input processing. Runs in kernel space, on interrupt stack. Interruptable. Not pre-emptable, can't really yield. Triggered by hardware interrupt. Called when outermost hardware interrupt returns. Periodic scheduling decisions are made in timer s/w interrupt. Scheduled by hardware timer interrupt. I.e. current process has run long enough, switch. Is this good software structure? Let's talk about Mogul and Ramakrishnan...