O/S overview

Overview

6.828 goals:
- Understand operating systems in detail by designing and implementing a small O/S
- Hands-on experience with building systems ("Applying 6.033")
What do applications want from an O/S?
- Abstract the hardware for convenience and portability
- Multiplex the hardware among multiple applications
- Isolate applications to contain bugs
- Allow sharing among applications
How is an O/S built?
- e.g. OSX, Windows, Linux
- the small view: a h/w management library
- the big view: physical machine -> abstract one w/ better properties
- layer picture:
  - h/w: CPU, mem, disk
  - kernel: [various services]
  - user: applications, e.g. vi and gcc
- we care a lot about the interfaces and internal kernel structure
what services does an O/S kernel typically provide?
- processes
- memory
- file contents
- directories and file names
- security
- many others: users, IPC, network, time, terminals
What does an O/S abstraction look like?
- Applications only see them via system calls
- Examples, from UNIX / Linux:
- fd = open("out", 1);
- write(fd, "hello\n", 6);
- pid = fork();
Why is O/S design/implementation hard/interesting?
- the environment is unforgiving: weird h/w, no debugger
- it must be efficient (thus low-level?)
- ...but abstract/portable (thus high-level?)
- powerful (thus many features?)
- ...but simple (thus a few composable building blocks?)
- features interact: fd = open(); ...; fork();
- behaviors interact: CPU priority vs memory allocator.
- open problems: security, multi-core
You'll be glad you learned about operating systems if you...
- want to work on the above problems
- care about what's going on under the hood
- have to build high-performance systems
- need to diagnose bugs or security problems

Class structure

http://pdos.lcs.mit.edu/6.828
Lectures
- basic O/S ideas
- extended inspection of xv6, a traditional O/S
- a series of recent topics
Homework to prepare for each lecture
Lab: JOS, a small O/S for x86 in an exokernel style
- you build it, six labs, final project of your choice
- kernel interface: expose hardware, but protect -- no abstractions!
- unprivileged library: fork, exec, pipe, ...
- applications: file system, shell, ..
- development environment: gcc, qemu
- lab 1 is out
Code review
Two quizzes
- mid-term (in class)
- final (during exam week)

Case study: the shell (simplified)

6.828 is largely about the design and implementation of the system call interface. Let's start by looking at how programs use that interface. Example: the Unix shell.
the shell is the Unix command UI
the shell is also a programming/scripting language
typically handles login session, runs other processes
look at some simple examples of shell operations, how they use different O/S abstractions, and how those abstractions fit together. See Unix paper if you are unfamiliar with the shell.

Basic structure:

	while (1) {
	    write (1, "$ ", 2);
	    readcmd (cmd, args);   // parse user input
	    if ((pid = fork ()) == 0) {  // child?
		exec (cmd, args, 0);
	    } else if (pid > 0) {   // parent?
		wait (0);   // wait for child to terminate
	    } else {
		perror ("fork");
	    }
	}

system calls: read, write, fork, exec, wait. conventions: -1 return value signals error, error code stored in errno, perror prints out a descriptive error message based on errno.
what does fork do?
copies user memory
copies process kernel state (e.g. user id)
child gets a different PID
child state contains parent PID
returns twice, with different values
what does exec do?
replaces memory of current process with instrs/data from file
i.e. runs a file created by compiler/linker
still the same process, keeps most state (e.g. user id)
what does wait do?
waits for any child to exit
what if child exits before parent calls wait?
the fork/exec split looks wasteful, but it turns out to be useful.
Example:
```
	$ ls
```
how does ls know which directory to look at?
cwd in kernel-maintained process state, copied during fork
how does it know what to do with its output?
I/O: process has file descriptors, numbered starting from 0.
index into table in process's kernel state
system calls: open, read, write, close
numbering conventions:
- file descriptor 0 for input (e.g., keyboard).
```
     read (0, buf, bufsize)
```
- file descriptor 1 for output (e.g., terminal)
```
     write (1, "hello\n", strlen("hello\n"))
```
- file descriptor 2 for error (e.g., terminal)
so ls writes output to file descriptor 1
on fork, child inherits open file descriptors from parent (show in process diagram).
on exec, process retains file descriptors.
This shell command sends ls's output to the file out:
```
     $ ls > out
```
Q: how could our simple shell implement output redirection?
A: just before exec insert:
```
	close(1);
	creat("out", 0666);   // fd will be 1
```
The kernel always allocates the lowest-numbered free file descriptor, 1 in this case. Could use dup2() to clone a file descriptor to a chosen number.
Good illustration of why it's nice to have separate fork and exec.

Many commands use 0/1 by default, so they work with redirection. Here's a simplified cat:

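#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
  int fd, n;
  char buf[512];
  if(argc > 1)
    fd = open(argv[1], O_RDONLY);   // O_RDONLY is 0
  else
    fd = 0;                         // no argument: read standard input
  if(fd < 0)
    exit(1);                        // open failed
  while((n = read(fd, buf, sizeof(buf))) > 0)
    write(1, buf, n);
  close(fd);
  exit(0);
}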

system call interface is very simple, just ints and char buffers. why not have open() return a pointer reference to a kernel file object?
You can run the shell, redirect its stdin/stdout, etc.
I'll run this shell script with sh < script:
```
     echo one
     echo two
```
Q: what will this shell command do?
```
     $ sh < script > out
```
the script itself didn't redirect the echo output, but it did inherit a fd 1 that was redirected to out.
Q: I'll run the following directly; is it the same as above?
```
     echo one > out
     echo two > out
```
The shell lets you redirect any FD:
```
    $ ls script xxx 2> out
```
after creat, insert:
```
	close(2);
	creat("out", 0666);   // fd will be 2
```
But this feature can interact poorly with itself, e.g. if you want stdout and stderr to go to the same file:
```
    $ ls script xxx > out 2> out
```
why did this turn out badly?
here's a better plan:
```
    $ ls script xxx > out 2>&1
```
```
	close(2);
	dup(1);		       // fd will be 2
```
Linux has a nice representation of a process and its FDs, under /proc/PID/
- maps: VA range, perms (p=private, s=shared), offset, dev, inode, pathname
- fd: symlinks to files pointed to by each fd. (what's missing in this representation?)
- try exec 3>/tmp/xx ; ls -l /proc/$$/fd
one often wants to run a series of programs on some data:
```
	$ sort < in > out
	$ uniq out > out2
	$ wc out2
	$ rm out out2
```
the shell supports this more concisely with "piping" of "filters":
```
        $ sort < in | uniq | wc
```

A pipe is a one-way communication channel. Here is a simple example:

        int fds[2];
        char buf[512];
        int n;

        pipe(fds);
        write(fds[1], "hello", 5);
        n = read(fds[0], buf, sizeof(buf));
        // buf[] now contains 'h', 'e', 'l', 'l', 'o'

file descriptors are inherited across fork(), so this also works:

        int fds[2];
        char buf[512];
        int n, pid;

        pipe(fds);
        pid = fork();
        if(pid > 0){
          write(fds[1], "hello", 5);
        } else {
          n = read(fds[0], buf, sizeof(buf));
          exit(0);
        }

How does the shell implement pipelines (i.e., cmd 1 | cmd 2 |..)? We want to arrange that the output of cmd 1 is the input of cmd 2. The way to achieve this goal is to manipulate stdout and stdin.

The shell creates processes for each command in the pipeline, hooks up their stdin and stdout, and waits for the last process of the pipeline to exit. Here's a sketch of what the shell does, in the child process of the fork() we already have, to set up a pipe:

	    
	    int fds[2];

	    if (pipe(fds) < 0) panic ("error");
	    if ((pid = fork ()) == 0) {  // child (left end of pipe)
	       close (1);
	       dup (fds[1]);   // fds[1] is the write end, ret will be 1
	       close (fds[0]);       // close read end
	       close (fds[1]);       // close fds[1]
	       exec (command1, args1, 0);
	    } else if (pid > 0) {        // parent (right end of pipe)
	       close (0);
	       dup (fds[0]);   // fds[0] is the read end, ret will be 0
	       close (fds[0]);
	       close (fds[1]);       // close write end
	       exec (command2, args2, 0);
	    } else {
	       printf ("Unable to fork\n");
	    }

Who waits for whom? (draw a tree of processes)
Why close read-end and write-end? To ensure that every process starts with just 3 file descriptors, and that reading from the pipe returns end of file after the first command exits: read() reports end of file only once every copy of the write end has been closed.
Notes about the file descriptor design:
- nice interaction with fork
- FDs help make programs more general purpose: don't need special cases for files vs console vs pipe
- shell pipelines only work for programs w/ common formats (lines of text)
How do you create a background job?
```
        $ sleep 2 &
```
Q: How does the shell implement "&"?
Q: What if a background process exits while sh waits for a foreground process?