System call interface: microkernels

Required reading: Improving IPC by kernel design

Overview

This lecture looks at the microkernel organization. In a microkernel, services that a monolithic kernel implements in the kernel run as user-level programs. For example, the file system, UNIX process management, pager, and network protocols each run in a separate user-level address space. The microkernel itself supports only the services that are necessary to allow system services to run well in user space; a typical microkernel provides at least address spaces, threads, and interprocess communication (IPC).

The potential advantages of a microkernel are simplicity (the kernel is small), isolation of operating system components (each runs in its own user-level address space), and flexibility (we can have both a file server and a database server). One potential disadvantage is performance loss: an operation that requires a single system call in a monolithic kernel may require multiple system calls and context switches in a microkernel.
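To make the performance concern concrete, the following is a minimal sketch that counts kernel entries and address-space switches for one file read. The `Cost` type and both functions are hypothetical names introduced for illustration, and the counts assume the simplest case: one client and one user-level file server, with no further servers (such as a disk driver) involved.

```python
from dataclasses import dataclass


@dataclass
class Cost:
    kernel_entries: int = 0          # traps from user mode into the kernel
    address_space_switches: int = 0  # switches between user address spaces


def monolithic_read() -> Cost:
    # The application traps into the kernel once; the file system code
    # runs inside the kernel, so no other address space is entered.
    return Cost(kernel_entries=1, address_space_switches=0)


def microkernel_read() -> Cost:
    cost = Cost()
    # Request: the client traps into the kernel with an IPC send, and the
    # kernel switches to the file server's address space.
    cost.kernel_entries += 1
    cost.address_space_switches += 1
    # Reply: the server traps into the kernel with an IPC send, and the
    # kernel switches back to the client's address space.
    cost.kernel_entries += 1
    cost.address_space_switches += 1
    return cost
```

Under these assumptions the microkernel path doubles the kernel entries and adds two address-space switches, which is why the cost of a single IPC matters so much.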

One way in which microkernels differ from each other is the exact kernel API they implement. For example, Mach (a system developed at CMU, which influenced a number of commercial operating systems) has system calls for the following abstractions: processes (create, terminate, suspend, resume, priority, assign, info, threads), threads (fork, exit, join, detach, yield, self), ports and messages (a port is a unidirectional communication channel with a message queue and supporting primitives to send, destroy, etc.), and regions/memory objects (allocate, deallocate, map, copy, inherit, read, write).
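The port abstraction can be sketched as a one-way message queue. This is a toy model, not Mach's actual interface: the `Port` class, its method names, and the `PortDestroyed` exception are assumptions made for illustration.

```python
from collections import deque


class PortDestroyed(Exception):
    """Raised when a destroyed port is used (hypothetical error type)."""


class Port:
    """Toy model of a port: a one-way channel backed by a message queue."""

    def __init__(self):
        self._queue = deque()
        self._alive = True

    def send(self, message):
        if not self._alive:
            raise PortDestroyed("send on destroyed port")
        self._queue.append(message)

    def receive(self):
        """Return the oldest queued message, or None if the queue is empty."""
        if not self._alive:
            raise PortDestroyed("receive on destroyed port")
        return self._queue.popleft() if self._queue else None

    def destroy(self):
        self._alive = False
        self._queue.clear()
```

Because each port carries messages in one direction only, a request/reply exchange in this model needs two ports: one on which the server receives requests and one on which the client receives the reply.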

Some microkernels are more "microkernel" than others. For example, some microkernels implement the pager in user space but the basic virtual memory abstractions in the kernel (e.g., Mach); others are more extreme and implement most of the virtual memory system in user space (L4). Yet others are less extreme: many servers run in their own address space, but in kernel mode (Chorus).

All microkernels support multiple threads per address space. V6 and, until recently, UNIX didn't; why? Because in UNIX, system services are typically implemented in the kernel, and those are the primary programs that need multiple threads to handle events concurrently (waiting for disk and processing new I/O requests). In microkernels, these services are implemented in user-level address spaces, and so they need a mechanism for handling operations concurrently. (Of course, UNIX supporters will also argue that if you make fork efficient enough, there is no need to have threads.)
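As a rough illustration of why a user-level server wants several threads in one address space, here is a sketch of a server whose worker threads share one request queue and one result table. The function name `run_server`, the request format, and the "upper-case the payload" work are all hypothetical stand-ins for real file-server operations.

```python
import queue
import threading


def run_server(requests, num_workers=4):
    """Handle (request_id, payload) pairs using threads in one address space."""
    inbox = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            item = inbox.get()
            if item is None:          # sentinel: no more requests
                return
            request_id, payload = item
            reply = payload.upper()   # stand-in for real work (e.g. disk I/O)
            with results_lock:        # workers share the result table
                results[request_id] = reply

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for request in requests:
        inbox.put(request)
    for _ in threads:
        inbox.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

While one worker is blocked waiting for a slow operation, the others can keep taking requests from the queue, which is the concurrency the paragraph above refers to.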

L3/L4

L3 is the predecessor of L4. L3 provides data persistence, DOS emulation, and an ELAN runtime system. L4 is a reimplementation of L3, but without the data persistence. L4KA is a project at sourceforge.net, and you can download the code for the latest incarnation of L4 from there.

L4 is a "second-generation" microkernel with 7 calls: IPC (of which there are several types), id_nearest, fpage_unmap, thread_switch, lthread_ex_regs, thread_schedule, and task_new. These calls provide address spaces, tasks, threads, interprocess communication, and unique identifiers. An address space is a set of mappings. Multiple threads may share mappings, and a thread may grant mappings to another thread. A task is the set of threads sharing an address space.
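The difference between sharing a mapping, granting it, and unmapping it can be sketched with a small model. Everything here is an assumption made for illustration: the class and function names are invented, pages are plain integers, and the bookkeeping is far simpler than a real kernel's. The model follows the usual L4 description: map leaves the page in the sender's address space, grant moves it to the receiver, and unmap revokes every mapping derived from the caller's page.

```python
class Mapping:
    """One page mapping plus the mappings that were derived from it."""

    def __init__(self, space, vpage, frame):
        self.space = space
        self.vpage = vpage
        self.frame = frame
        self.children = []


class AddressSpace:
    """An address space is modelled as a set of mappings: vpage -> Mapping."""

    def __init__(self, name):
        self.name = name
        self.pages = {}

    def install(self, vpage, frame):
        self.pages[vpage] = Mapping(self, vpage, frame)


def map_page(src, src_vpage, dst, dst_vpage):
    """Share a page: src keeps its mapping, dst gets a derived one."""
    parent = src.pages[src_vpage]
    child = Mapping(dst, dst_vpage, parent.frame)
    parent.children.append(child)
    dst.pages[dst_vpage] = child


def grant_page(src, src_vpage, dst, dst_vpage):
    """Give a page away: the mapping moves from src to dst."""
    mapping = src.pages.pop(src_vpage)
    mapping.space, mapping.vpage = dst, dst_vpage
    dst.pages[dst_vpage] = mapping


def _revoke(mapping):
    for child in mapping.children:
        _revoke(child)
    mapping.children = []
    del mapping.space.pages[mapping.vpage]


def unmap_page(space, vpage):
    """Revoke all mappings derived from this page; the caller keeps its own."""
    mapping = space.pages[vpage]
    for child in mapping.children:
        _revoke(child)
    mapping.children = []
```

For example, if a pager maps a page to task A and A grants it to task B, then A no longer has the page, B does, and an unmap by the pager removes it from B as well.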

A thread is the execution abstraction; it belongs to an address space and has a UID, a register set, a page fault handler, and an exception handler. The UID of a thread is its task number plus the number of the thread within that task.
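One way to read "task number plus thread number" is as two fields packed into a single integer. The sketch below assumes a 7-bit thread field purely for illustration; the constant `LTHREAD_BITS` and both function names are hypothetical, and the real L4 identifier layout contains additional fields not modelled here.

```python
LTHREAD_BITS = 7  # assumed width of the per-task thread number (illustrative)
LTHREAD_MASK = (1 << LTHREAD_BITS) - 1


def make_uid(task, lthread):
    """Pack a task number and a thread-within-task number into one UID."""
    if task < 0:
        raise ValueError("task number must be non-negative")
    if not 0 <= lthread <= LTHREAD_MASK:
        raise ValueError("thread number does not fit in the thread field")
    return (task << LTHREAD_BITS) | lthread


def split_uid(uid):
    """Recover (task, lthread) from a UID built by make_uid."""
    return uid >> LTHREAD_BITS, uid & LTHREAD_MASK
```

With this layout, all threads of a task share the same high-order bits, so the task a thread belongs to can be read directly off its UID.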

IPC passes data by value or by reference to another address space. It also provides for sequence coordination. It is used for communication between clients and servers, to pass interrupts to a user-level exception handler, and to pass page faults to an external pager. In L4, device drivers are implemented as user-level processes with the device mapped into their address space. Linux runs as a user-level process.
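The external-pager use of IPC can be sketched as follows. This is a simulation under stated assumptions: the `Pager` and `Thread` classes and the `access` function are invented names, a direct method call stands in for the IPC that the kernel sends on the faulting thread's behalf, and the pager's reply is modelled as a frame number rather than a real mapping.

```python
class Pager:
    """User-level pager: resolves faults from its own table (vpage -> frame)."""

    def __init__(self, backing):
        self.backing = backing
        self.faults = 0

    def handle_fault(self, vpage):
        # In a real system this would be the receive side of an IPC; the
        # reply message would carry a mapping for the faulting page.
        self.faults += 1
        return self.backing[vpage]


class Thread:
    def __init__(self, name, pager):
        self.name = name
        self.pager = pager       # the thread's page fault handler
        self.page_table = {}     # mappings established so far


def access(thread, vpage):
    """Simulate a memory access; a missing mapping becomes a message to the pager."""
    if vpage not in thread.page_table:
        # Page fault: forward it to the thread's pager and install the
        # mapping returned in the reply.
        thread.page_table[vpage] = thread.pager.handle_fault(vpage)
    return thread.page_table[vpage]
```

The first access to a page costs a round trip to the pager; later accesses to the same page find the mapping already installed and involve no IPC.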

L4 provides quite a range of message types: inline-by-value, strings, and virtual memory mappings. The send and receive descriptors specify how many of each, if any, a message carries.
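The idea of descriptors that count each kind of item can be sketched as below. The `Message` and `ReceiveDescriptor` types, their field names, and the `accepts` check are hypothetical. In particular, what the kernel does when a receiver accepts fewer items than were sent is not covered in these notes; the sketch only reports whether the message fits.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    words: list = field(default_factory=list)     # inline data, passed by value
    strings: list = field(default_factory=list)   # byte strings to be copied
    mappings: list = field(default_factory=list)  # virtual memory mappings


@dataclass
class ReceiveDescriptor:
    max_strings: int = 0    # how many strings the receiver will accept
    max_mappings: int = 0   # how many mappings the receiver will accept


def send_descriptor(msg):
    """Summarise how many items of each kind the sender is passing."""
    return (len(msg.words), len(msg.strings), len(msg.mappings))


def accepts(recv, msg):
    """True if the message carries no more than the receiver agreed to take."""
    _, n_strings, n_mappings = send_descriptor(msg)
    return n_strings <= recv.max_strings and n_mappings <= recv.max_mappings
```

A receiver that expects only short register-style messages would use a descriptor with both limits at zero, so a message containing a string or a mapping would not fit.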

In addition, there is a system call for timeouts and for controlling thread scheduling.

L3/L4 paper discussion