6.824 2001 lecture 1

Opening
  design and construction of robust, high-performance server systems
  not just study -- also actually build real systems
  understand and synthesize many O/S and distributed systems ideas

Picture
  examples: airline reservation system, hotmail
  clients, lots of them
  abstract service box
  what's inside the service?
    hardware and s/w modules
    DB back end, web formatter front ends
  easy for small systems

Challenges (blueprint for course content)

Performance on one machine
  s/w structure: modules vs interaction complexity
  flow of control among all the activities
  example: serve client2 while waiting for client1's disk data

Performance on clusters
  again, modules and interaction; s/w structure
  example: bank xfer
  danger of perf being dominated by comms costs, not CPU time
    more servers -> less performance

Performance under high load
  Example: Starbucks
    5 seconds to write down an incoming request, 10 seconds to make the drink
    max thruput of 4 drinks/minute
    thruput goes to zero at 12 requests/minute
      (each request costs 5 seconds just to take, so at 12 requests/minute
      all 60 seconds go to writing down orders, none to making drinks)
    [graph: x=requests, y=output]
  Efficiency *decreases* with load -- bad.
  Careful system design can avoid this -- a flat line at 4 drinks/minute
    (Peets, for example).
  Better: build systems whose efficiency *increases* with load,
    e.g. via batching, disk scheduling

Fault tolerance
  hardest problem is partial failure
  client failures must not affect the service
  hardware or s/w modules fail: 1. give up, 2. wait for a fix, 3. tolerate
  tolerance itself leads to complexity, bugs, potential failure

Consistency
  fault tolerance and performance are often addressed with replication
  what if replicas diverge? (airline seat reservations)
    prevent or repair
  what if a multi-step operation is interrupted by a crash?
    (bank xfer between machines)

Security
  systems with good security are often *harder* to use
  want your service to be accessible by everybody,
    yet hardened against outside attack while staying open to legitimate users
    e.g. accept deposits vs view balance
  who are you? authentication
    hard: what does identity mean?
    example: pub-key signed e-mail; verifying the sig is not enough
  what are you allowed to do? authorization
  privacy and anonymity
    send me complaints/comments -- I may want to reply!
    a curious mixture of authentication and anonymity

We want to understand the individual techniques, and how to assemble them.

Course structure
  meetings: 1/2 lectures on fundamentals, 1/2 reading discussions
    research papers on ideas and working systems
    must read papers before class
  two exams
  Labs: build real servers, 5 of them, on our machines
  Frank's office hours
  Project: programming and a paper (in the style of the readings),
    with a program committee
  Don't forget to sign up, read the paper, do the assignment
  Enrollment may be limited
    depends on how many people end up taking it
    we'll see after the first assignment comes back

Topic: basic server software structure

Life-cycle of a typical request, in the server [draw this]
  accept connection from client
  read request from client
  find data on disk (or talk to back-end DB)
  read data from disk
  write data over network to client
Could just wrap this in a big while() loop: a single program, very simple!
  Look at the code in the hand-out.
Consequences?

Problem [draw this]
  Draw a time-line of when the CPU is busy
    actually CPU, disk, and network interface
  Explain "blocking" system calls
  We're wasting CPU! We may have lots of work AND an idle system. Not good.
  s/w structure forces one-at-a-time processing

How can we use the system's resources more efficiently?

Typical concurrency solutions:
  This is about s/w structure; there are any number of potential structures.
    0. one process
    1. multiple processes
    2. one process, many threads
    3. asynchronous
  The choice depends on O/S facilities.

Concurrency with multiple processes
  Start a new UNIX process for each client connection / request
  O/S scheduler knows if one process blocks, and runs another one
    This is important...
  [draw picture of kernel, process addr spaces, each w/ doit() code]
  Master process hands out connections.
  Now plenty of work is available to keep the system busy
  Still simple: see handout. Preserves the original s/w structure.
  Isolated: a bug for one client does not crash the whole server
  Most interaction is hidden by the O/S, e.g. locking the disk queue
  Probably easy to take advantage of multiprocessors

Multiple-process problems
  Cost of starting a new process (fork()) may be high
    new address space &c.; 300 microseconds *minimum* on my computer
    perhaps use a static pool of processes:
      the master hands new connections to existing idle processes
  How many concurrent requests to serve?
    matters if the existence of a process is expensive,
    i.e. consumes memory, table space, &c.
  Processes are fairly isolated by default
    e.g. they do not share memory
    what if you want a web cache? it must be shared among processes
    or even just want to keep statistics?

Concurrency with threads
  Looks a bit like multiple processes,
    but thread_fork() leaves the address space alone,
    so all threads share memory
  [picture: thread boxes inside process boxes]
  Seems simple -- still preserves the single-process structure
  Potentially easier to have e.g. a shared web cache,
    but the programmer needs to know about some kind of locking,
    and it's easier for one thread to corrupt another

Thread details matter
  Where is the thread scheduler?
  Does it actually see all blocking operations?

Kernel threads
  The O/S kernel knows about each thread
  It knows a thread just blocked, e.g. waiting for a disk read,
    and can schedule another thread from that same process
  [picture: thread boxes dip down into the kernel]
  The kernel knows enough to schedule threads on different CPUs
  This sounds like just what we want for our server,
    BUT kernel threads are usually expensive, just like processes,
    and many O/Ses do not support kernel threads

User threads
  Implemented purely inside the program; the kernel does not know about them
  A user scheduler for threads inside the program,
    in addition to the kernel process scheduler
  [picture]
  How does the user scheduler know a thread blocked?
  How does the user scheduler gain control again to run another thread?
  Answer: the thread library has fake read(), write(), accept(), &c. system calls
    the scheduler remembers that a thread is waiting for some event,
    arranges with the O/S to be notified of the event,
    and resumes the thread when its event arrives
  Problem: the O/S usually doesn't support this event model well
    works for socket I/O, not for file operations

Threads are hard to program
  The point is to share data structures,
    and it's hard for programmers to use locks correctly
  Threads produce efficiency by making it look to the programmer
    as if many things are happening at once -- needlessly confusing,
    since in fact events occur one at a time

Alternate view
  Suggested by the user-threads implementation
  Organize the s/w around the arrival of events [list these]:
    new client connections
    client data
    disk read completions
    packet departures (i.e. buffer space is free)
  Library support to register interest in events
  Write s/w in state-machine style:
    when this event occurs, execute this function
  The point: this preserves the serial nature of the events
    the programmer sees events/functions occurring one at a time
    s/w never blocks, just registers for an event
  We get efficient use of the system w/o the confusing illusion of concurrency

Example:
  A very simple server: accepts connections, waits for data, ignores it.
  Note that many of each kind of event can be waiting,
    so we can handle multiple connections/clients at a time.
  Note that we could have different read callbacks, or whatever,
    for different stages of processing.

Why async programming is attractive
  Can get high utilization w/o concurrency
  The callback functions are called one at a time,
    thus no locking is required

Why async programming isn't perfect
  No story for multiprocessors
  You have to chop up your program at the points that might block,
    for example, assembling complete input
  The O/S cannot turn all blocking sys calls into events,
    particularly file system I/O
  Most libraries are not written for async I/O;
    if they block, they block the whole program

Don't forget to sign up at this URL.
Assignment due next Thurs, on the web site.
Reading for discussion on Tuesday.