6.828 2017 Lecture 4: Shell & OS organization

Lecture Topic:
  kernel system call API
  both details and design
  illustrate via shell and homework 2

Overview Diagram
  user / kernel
  process = address space + thread(s)
  app -> printf() -> write() -> SYSTEM CALL -> sys_write() -> ...
  user-level libraries are the app's private business
  kernel internal functions are not callable by user
  xv6 has a few dozen system calls; Linux a few hundred
  details today are mostly about the UNIX system-call API
    basis for xv6, Linux, OSX, the POSIX standard, &c
  jos has very different system calls; you'll build UNIX calls over jos

Homework solution

* Let's review Homework 2 (sh.c)

* exec
  why two execv() arguments?
  what happens to the arguments?
  what happens when the exec'd process finishes?
  can execv() return?
  how is the shell able to continue after the command finishes?

* redirect
  how does the exec'd process learn about redirects? [kernel fd tables; sketch below]
  does the redirect (or error exit) affect the main shell?

* pipe
  ls | wc -l   [sketch below]
  what if ls produces output faster than wc consumes it?
  what if ls is slower than wc?
  how does each command decide when to exit?
  what if the reader didn't close the write end? [try it]
  what if the writer didn't close the read end?
  how does the kernel know when to free the pipe buffer?

* how does the shell know a pipeline is finished?
  e.g. ls | sort | tail -1

* what's the tree of processes?
  sh parses as: ls | (sort | tail -1)
    sh
      sh1
        ls
      sh2
        sort
        tail

* does the shell need to fork so many times?
  - what if sh didn't fork for pcmd->left? [try it]
    i.e. called runcmd() without forking?
  - what if sh didn't fork for pcmd->right? [try it]
    would user-visible behavior change?
    sleep 10 | echo hi

* why wait() for pipe processes only after both are started?
  what if sh wait()ed for pcmd->left before the 2nd fork? [try it]
    ls | wc -l
    cat < big | wc -l

* the point: the system calls can be combined in many ways
  to obtain different behaviors.

Let's look at the challenge problems

* How to implement sequencing with ";"?
  gcc sh.c ; ./a.out
  echo a ; echo b
  why wait() before scmd->right? [try it]

* How to implement "&"?
  $ sleep 5 &
  $ wait
  the implementation of & and wait is in main -- why?
  what if a background process exits while sh waits for a foreground process?

* How to implement nesting?
  $ (echo a; echo b) | wc -l
  my ( ... ) implementation is only in sh's parser, not in runcmd()
  it's neat that sh's pipe code doesn't have to know it's applying to a sequence

* How do these differ?
  echo a > x ; echo b > x
  ( echo a ; echo b ) > x
  what's the mechanism that avoids overwriting?

UNIX system call observations

* The fork/exec split looks wasteful -- fork() copies memory, exec() discards it.
  why not e.g. pid = forkexec(path, argv, fd0, fd1) ?
  the fork/exec split is useful:
    fork(); I/O redirection; exec()
    or fork(); complex nested command; exit.
      as in ( cmd1 ; cmd2 ) | cmd3
    fork() alone: parallel processing
    exec() alone: /bin/login ... exec("/bin/sh")
  fork is cheap for small programs -- on my machine:
    fork+exec takes 400 microseconds (2500 / second)
    fork alone takes 80 microseconds (12000 / second)
    some tricks are involved -- you'll implement them in jos!

* The file descriptor design:
  * FDs are a level of indirection
    - a process's real I/O environment is hidden in the kernel
    - preserved over fork and exec
    - separates I/O setup from use
    - imagine writefile(filename, offset, buf, size)
  * FDs help make programs more general purpose:
    don't need special cases for files vs console vs pipe
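
* To make the homework mechanisms concrete, here are two standalone sketches.
  First, the redirect case from above ("echo hello > out"), which also shows why
  the fork/exec split is useful: the child edits its own fd table between
  fork() and exec(), so the exec'd program just writes to fd 1 and never knows
  it was redirected, and the parent shell's fds are untouched.  This is a
  minimal sketch, not sh.c verbatim: it uses the POSIX open()/execv() found on
  Linux/OSX rather than xv6's user library, it assumes /bin/echo exists, and
  the output file name "out" is arbitrary.

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int
  main(void)
  {
    if(fork() == 0){                         // child: will become "echo"
      close(1);                              // free fd 1 (stdout)
      // open() returns the lowest unused fd, which is now 1,
      // so the new file becomes the child's stdout
      if(open("out", O_WRONLY|O_CREAT|O_TRUNC, 0666) != 1){
        fprintf(stderr, "open out failed\n");
        exit(1);
      }
      char *argv[] = { "echo", "hello", 0 };
      execv("/bin/echo", argv);              // on success, never returns
      fprintf(stderr, "exec echo failed\n"); // reached only if execv() fails
      exit(1);
    }
    wait(0);                                 // parent: fds untouched, just waits
    exit(0);
  }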
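
* Second, the pipe case from above ("ls | wc -l"), again a sketch using POSIX
  calls (execvp() searches $PATH) rather than xv6's exec().  The details the
  homework questions poke at are visible here: every copy of the pipe's write
  end must be closed -- in both children and in the parent -- or wc would never
  see end-of-file, and the shell wait()s only after both children are started.

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int
  main(void)
  {
    int p[2];

    if(pipe(p) < 0){
      fprintf(stderr, "pipe failed\n");
      exit(1);
    }
    if(fork() == 0){                    // writer: ls
      close(1);
      dup(p[1]);                        // fd 1 now refers to the pipe's write end
      close(p[0]);                      // drop the original pipe fds
      close(p[1]);
      char *lsargv[] = { "ls", 0 };
      execvp("ls", lsargv);
      fprintf(stderr, "exec ls failed\n");
      exit(1);
    }
    if(fork() == 0){                    // reader: wc -l
      close(0);
      dup(p[0]);                        // fd 0 now refers to the pipe's read end
      close(p[0]);                      // crucial: if p[1] stayed open here,
      close(p[1]);                      //   wc would never see end-of-file
      char *wcargv[] = { "wc", "-l", 0 };
      execvp("wc", wcargv);
      fprintf(stderr, "exec wc failed\n");
      exit(1);
    }
    close(p[0]);                        // the shell keeps neither pipe end
    close(p[1]);
    wait(0);                            // wait for both children, in either order
    wait(0);
    exit(0);
  }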

* Philosophy: small set of conceptually simple calls that combine well
  e.g. fork(), open(), dup(), exec()
  command-line design has a similar approach
    ls | wc -l

* Why must the kernel support pipes -- why not have sh simulate them, e.g.
    ls > tempfile ; wc -l < tempfile

* The system call interface is simple: just ints and char buffers.
  why not have open() return a pointer to a kernel file object?

* The core UNIX system calls are ancient; have they held up well?
  yes; very successful and evolved well over many years
  history: the design caters to command-line use and s/w development
    the system call interface is easy for programmers to use
    command-line users like named files, pipelines, &c
    important for development, debugging, server maintenance
  but the UNIX ideas are not perfect:
    programmer convenience is often not very valuable for a system-call API
      programmers use libraries e.g. Python that hide sys call details
      apps may have little to do with files &c, e.g. on a smartphone
    some UNIX abstractions aren't very efficient
      fork() for a multi-GB process is very slow
      FDs hide specifics that may be important
        e.g. block size for on-disk files
        e.g. timing and size of network messages
  so there has been lots of work on alternate plans
    sometimes new system calls and abstractions for existing UNIX-like kernels
    sometimes entirely new approaches to what a kernel should do
    ask "why this way? wouldn't design X be better?"

OS organization

* How to implement a system-call interface?

* Why not just a library?
  i.e. no kernel, just run app+library directly on the hardware
  flexible: apps can bypass the library if it's not right
    apps can directly interact with hardware
  a library is OK for a single-purpose device
  but what if the computer is used for multiple activities?

* Key requirements for kernels:
  isolation
  multiplexing
  interaction

* helpful approach: abstract resources rather than raw hardware
  File system, not raw disk
  Processes, not raw CPU/memory
  TCP, not ethernet packets
  abstractions often ease isolation, multiplexing, and interaction
  also more convenient and portable

* Start with isolation, since that's often the most constraining requirement.

* Isolation goals:
  apps cannot directly interact with hardware
  apps cannot harm the operating system
  apps cannot directly affect each other
  apps can only interact with the world via the OS interface

* Processors provide mechanisms that help with isolation

* Hardware provides user mode and kernel mode
  - some instructions can only be executed in kernel mode
    device access, processor configuration, isolation mechanisms

* Hardware forbids apps from executing privileged instructions
  - instead, the attempt traps to kernel mode
  - the kernel can clean up (e.g., kill the process)

* Hardware lets kernel mode configure various constraints on user mode
  most critical: page tables that limit user s/w to its own address space

* Kernel builds on these hardware isolation mechanisms

* The operating system runs in kernel mode
  - the kernel is a big program
    services: processes, file system, net
    low-level: devices, virtual memory
  - all of the kernel runs with full hardware privilege (convenient)

* Applications run in user mode
  - the kernel sets up a per-process isolated address space
  - system calls switch between user and kernel mode
    the application executes a special instruction to enter the kernel
    hardware switches to kernel mode, but only at an entry point
      specified by the kernel
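
* To make the entry-point idea concrete: once a trap reaches the kernel's
  system-call entry point, dispatch is typically table-driven.  Below is a
  small self-contained toy -- ordinary user-space C, runnable anywhere -- that
  mimics that structure.  The names and numbers echo xv6's syscall.c/syscall.h
  but are used only for illustration; dispatch() is a made-up stand-in for the
  kernel's trap-time dispatcher, where the call number and return value really
  travel in a saved user register (%eax on the x86).

  #include <stdio.h>

  #define NELEM(x) (sizeof(x)/sizeof((x)[0]))

  // illustrative call numbers (the real ones live in the kernel's syscall.h)
  enum { SYS_getpid = 11, SYS_write = 16 };

  // stand-ins for the kernel's implementation of each call
  static int sys_getpid(void) { return 42; }
  static int sys_write(void)  { return printf("hello from sys_write\n"); }

  // the whole mechanism: one table, indexed by call number
  static int (*syscalls[])(void) = {
    [SYS_getpid] = sys_getpid,
    [SYS_write]  = sys_write,
  };

  static int
  dispatch(int num)
  {
    if(num > 0 && num < (int)NELEM(syscalls) && syscalls[num])
      return syscalls[num]();
    fprintf(stderr, "unknown sys call %d\n", num);  // bad numbers just fail
    return -1;
  }

  int
  main(void)
  {
    printf("SYS_write  -> %d\n", dispatch(SYS_write));
    printf("SYS_getpid -> %d\n", dispatch(SYS_getpid));
    printf("bogus 33   -> %d\n", dispatch(33));
    return 0;
  }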

* What to put in the kernel?

* xv6 follows a traditional design: all of the OS runs in kernel mode
  - one big program with file system, drivers, &c
  - this design is called a monolithic kernel
  - kernel interface == system call interface
  - good: easy for subsystems to cooperate
    one cache shared by file system and virtual memory
  - bad: interactions are complex
    leads to bugs
    no isolation within kernel

* microkernel design
  - many OS services run as ordinary user programs
    file system in a file server
  - kernel implements minimal mechanism to run services in user space
    processes with memory
    inter-process communication (IPC)
  - kernel interface != system call interface
  - good: more isolation
  - bad: may be hard to get good performance

* exokernel: no abstractions
  apps can use hardware semi-directly, but O/S isolates
    e.g. app can read/write own page table, but O/S audits
    e.g. app can read/write disk blocks, but O/S tracks block owners
  good: more flexibility for demanding applications
  jos will be a mix of microkernel and exokernel

* Can one have process isolation WITHOUT h/w-supported kernel/user mode?
  yes! see Singularity O/S, later in semester
  but h/w user/kernel mode is the most popular plan

Next lecture: x86 hardware isolation mechanisms and xv6's use of them