6.824 2006 Lecture 1: Introduction and O/S Review

Opening
  building distributed systems
    construction techniques, robustness, good performance
  lectures on design, paper reading for case studies
  you'll build real systems
  why take the course?
    synthesize many different areas in order to build working systems
    Internet has made area much more attractive and timely
    hard/unsolved: not many deployed sophisticated distrib systems

Example: how to build HotMail?
  mail arrives from outside world
  store it until...
  user's Outlook/Eudora reads/deletes/saves it

Simple Solution:
  One server w/ disk to store mail-boxes
  [picture: MS, sending "clients", reading clients]
  What happens as your mail service gets popular?

Topic: Stable performance under high load
  Example: Starbucks.
  5 seconds to write down incoming request. 10 seconds to make it.
  [graph: x=requests, y=output]
  max thruput at 4 drinks/minute.
  what happens at 6 req/min?
  thruput goes to zero at 12 requests/minute.
  Efficiency *decreases* with load -- bad.
  Careful system design to avoid this -- flat line at 4 drinks.
    Peets, for example.
  Better: build systems whose efficiency *increases* w/ load
    w/ e.g. batching, disk scheduling

Topic: scalable performance
  What if more clients than one Hotmail server can handle?
  How to use more servers to handle more clients?
  Idea: partition users across servers
  bottlenecks: how to ensure incoming mail arrives at the right server?
  scaling: will 10 servers allow us to handle 10x as many users?
  load balance: what if some users get much more mail than others?
  layout: what if we want to detect spam by looking at all mailboxes?

Topic: high availability
  Can I get at my HotMail mailbox if some servers / networks are down?
  Yes: replicate the data.
  Problem: replica consistency. delete mail, re-appears.
  Problem: physical independence vs communication latency
  Problem: partition vs availability. airline reservations.
  Tempting problem: can 2 servers yield 2x availability AND 2x performance?
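
Aside on the Starbucks example (stable performance under high load): the 4, 6 and 12 req/min figures can be reproduced with a small model. This is only a sketch. It assumes a single barista who always writes down incoming requests before making drinks; the notes do not state that rule, but it is one that yields the quoted numbers. The function and parameter names are illustrative.

```python
def drinks_per_minute(requests_per_minute, order_secs=5.0, make_secs=10.0):
    """Completed drinks per minute for one barista who takes orders first."""
    # Seconds of each minute spent writing down requests (assumed to have
    # priority over making drinks), capped at the whole minute.
    order_time = min(60.0, requests_per_minute * order_secs)
    # Whatever is left of the minute can be spent making drinks.
    capacity = (60.0 - order_time) / make_secs
    # The barista cannot complete more drinks than were requested.
    return min(requests_per_minute, capacity)


for rate in (2, 4, 6, 12):
    print(rate, "req/min ->", drinks_per_minute(rate), "drinks/min")
```

Under this assumption, 6 req/min leaves only 30 seconds for making drinks, so output drops to 3 per minute; at 12 req/min all 60 seconds go to writing down requests and nothing gets made.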

Topic: global scalability
  this is really an opportunity
  we have the entire Internet as a resource
  what neat new big systems can we build that take advantage?
  are there any principles to be discovered?
  finding objects
  storing objects "out there"
  serving same objects to many consumers
  widely distributed computing (e.g. grid computing)

Topic: security
  old view:
    secrecy via encryption (msg to Moscow embassy)
    user authentication via passwords &c
    all parties know each other!
  Internet has changed focus.
    global exposure to random attacks
      from millions of bored students and serious hackers,
      e.g. intrusions for spam bot nets
    you fetch a new Firefox binary, how do you know it hasn't been hacked?
    how do you know that was Amazon you gave your credit card number to?
    how does Amazon know it was you?
  no purely technical approach is likely to solve these problems

We want to understand the individual techniques, and how to assemble them.

--------------

Course structure

URL

meetings: 1/2 lectures, 1/2 paper discussions

research papers on working systems, starting next week
  must read papers before class
  otherwise boring, and you can't pick it up by listening
  we will post paper questions 24 hours in advance
  hand in answer on paper in class, one or two paragraphs

two in-class quizzes (no final)

Labs: build a real cluster file server, cache consistency, locking

Project. look at the project information page!
  design, implement, report
  teams
  proposal
  conferences
  two drafts
  demo
  report

Emil is TA, office hours TBA

Look at the web site:
  sign up for course machine accounts
  look at the first lab, due in a week

--------------

O/S kernel overview
  context in which you build distributed systems
  o/s has big impact on design, robustness, performance
    sometimes because of o/s quirks
    mostly because o/s solves some hard problems
  This should be review for most of you
    Want to tell what I think is important
    Give you a chance to ask questions

What problems does o/s solve?
  sharing hardware resources
  protection
  communication
  hardware independence
  (everyone faces these problems)

Approach to solutions?
  o/s designers think like programmers, abstractions + interfaces

UNIX abstractions
  (we'll be programming UNIX in labs, my favorite O/S)
  process
    address space
    thread of control
    user ID
  file system
  file descriptor
    on-disk file
    pipe
    network connection
    device
  All this is implemented by a "kernel" with hardware privileges

Note we're partially virtualizing
  o/s multiplexes physical resource among multiple processes
    CPU, memory, disk, network
  to share, to control, to provide a simple model to apps
  abstraction helps virtualization: easier to share TCP conns than enet
  Can't completely virtualize
    file system and network stack not the same as physical foundation
    the differences make sharing possible
  abstractions interact, must form a coherent set
    if o/s can start programs, it must know how to read files

System call interface to kernel abstractions
  looks like function call, but special
  fork, exec
  open, read, creat

Standard picture
  app (show two of them, mark addresses from zero)
  libraries
  -----
  FS
  disk driver
  (mention address spaces, protection boundaries)
  (mention h/w runs kernel address space w/ special permissions)

Why Big Kernels have been successful.
  easy for kernel subsystems to cooperate
    disk buffer shares phys mem with virtual mem system
  all kernel code is 100% privileged
    very simple security model
  easy to implement sophisticated and efficient services

Why UNIX abstractions are not perfect
  kernel is big
    kernel has room for lots of bugs; it's all privileged
  kernel limits flexibility
    multiple threads per process?
    single thread crossing into a different address space?
    control disk layout of files for performance?
    don't like the kernel's TCP implementation?
  we'll discuss a number of improved abstractions

Alternate set of abstractions: micro-kernel
  Move complex abstractions to server processes
    Talk to FS server, rather than FS module in kernel
  Kernel mostly handles IPC
    also grants h/w access to privileged servers
    e.g. FS server can read/write disk h/w
  Looks like a miniature distributed system!
    Move FS server to a different machine, via network?
    Lots of overlap with our concerns in this class.

Let's review some basics which will come up a lot:
  process / kernel communication
  how processes and kernel wait for events (disk and network i/o)

Life-cycle of a simple UNIX system call
  [diagram. process, kernel]
  See the handout...
  Interesting points:
    protected transfer
      h/w allows process to get kernel permissions
      but only by jumping to *known* entry point in kernel
    process suspended until system call finishes

What if the system call needs to wait, e.g. for the disk?
  We care: this is what busy servers do
  sys_open(path)
    for each pathname component
      start read of directory from disk
      sleep waiting for the disk read
      process the directory contents
  sleep()
    save *kernel* registers to PCB1 (including SP)
    find runnable PCB2
    restore PCB2 kernel registers (SP...)
    return
  Note: each user process has its own kernel stack [draw in diagram]
    kernel stack contains state of partially executed system call
    "kernel half"
    trap handler must execute on the right stack
  "blocking system call"

What happens when disk completion interrupt occurs?
  Device interrupt routine finds the process waiting for that I/O.
  Marks process as runnable.
  Returns from interrupt.
  Someday process scheduler will switch to the waiting process.

Now let's look at how services use this kernel structure.

Explain server_1 web server in handout

Problem [draw this time-line]
  Time-lines for CPU, disk, network
  Server alternates waiting for each of them
  CPU, disk, network are each idle much of the time
  OK if only one client.
  Not OK if there are clients waiting for service.
  We may have lots of work AND idle resources. Not good.
  s/w structure forces one-at-a-time processing
  How can we use the system's resources more efficiently?
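
To put rough numbers on the time-line problem, here is a minimal sketch of a server that handles one request at a time. The per-request costs are hypothetical, not measurements from the handout's server_1, and the "overlapped" figure is only an assumed upper bound for a structure that could keep the CPU, disk and network busy on different requests at once.

```python
def serial_server_profile(cpu_ms, disk_ms, net_ms):
    """Utilization and throughput when each request runs start to finish alone."""
    stages = {"cpu": cpu_ms, "disk": disk_ms, "net": net_ms}
    per_request_ms = sum(stages.values())
    # While one resource works on the request, the other two sit idle,
    # so each resource is busy only for its own share of the request.
    utilization = {name: ms / per_request_ms for name, ms in stages.items()}
    serial_rps = 1000.0 / per_request_ms
    # Assumed upper bound if work for different requests could overlap:
    # the slowest resource becomes the only limit.
    overlapped_rps = 1000.0 / max(stages.values())
    return utilization, serial_rps, overlapped_rps


# Hypothetical per-request costs in milliseconds (not from the lecture).
util, serial_rps, overlapped_rps = serial_server_profile(2, 6, 2)
for name, busy in util.items():
    print(f"{name}: busy {busy:.0%} of the time")
print(f"serial: {serial_rps:.0f} req/s, overlapped bound: {overlapped_rps:.0f} req/s")
```

With these made-up costs every resource is idle at least 40% of the time even when clients are queued, which is the "lots of work AND idle resources" situation described above.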