6.824 Lecture 1: Introduction and lab overview

Introduction
  engineering distributed systems
    abstractions, implementation techniques, etc.
  lectures on general ideas, paper reading for case studies
  you'll build real systems (yfs, inspired by Frangipani)

Why take the course?
  synthesize many different areas in order to build working systems
  Internet has made area much more attractive and timely
  hard/unsolved: not many deployed sophisticated distributed systems
  background: 6.033 ideas and readings
  much more depth: build a real distributed system

What is a distributed system?
  multiple computers
  interconnections
  cooperate to provide some service
  [Example: Athena, Akamai CDN, Google file system, TinyDB, ...]
  [Counter example: shared UNIX time-sharing system]

Why care about distributed computing?
  conquer geographic separation
    Web
  build reliable systems out of unreliable components
    Argus, BFT, Frangipani
  aggregate many computers for high capacity
    aggregate cycles+memory (TreadMarks, Dryad)
    aggregate bandwidth (Coral, Shark)
    aggregate disks (Frangipani)
  isolate computers for specific tasks
    authentication server
    backup server

Challenges
  system design
    what does the client do, what does the server do? which servers?
    what are the right protocols?
    for example, today: Frangipani versus NFS
  consistency
    shared data with multiple readers and writers
  failures
    communication and hardware
    how do you tell the difference?
    which node is the last one to fail?
  security
    adversary may compromise machines or manipulate messages
  network properties
    sensor nets
    clusters
    wide-area
  implementation
    concurrency with servers and clients
    network is often scarce resource
    avoid disk writes
  easy to make distributed system less scalable, less reliable, etc.
    than a centralized system.
  (Lamport's definition: a distributed system is one where the failure of
    a computer you don't know about renders your own useless.)

6.824: distributed *systems*
  infrastructure for building servers and clients: threads+RPC
  distributed programming
  consistency
  fault tolerance
  peer-to-peer/decentralized
  security
  case studies
  [continuous thread: system design]

Example: how to build a distributed file system? say AFS.
  users should have same view of the file system and be able to share files

Simple System Design
  One server w/ disk to store directories and files
  [picture: file server, "clients", read file, write file, create, etc.]

Topic: implementation
  Server serves multiple clients concurrently
    To get good performance out of disk
  Concurrency challenges:
    Server threads may modify shared state (e.g., file cache) concurrently.
      Avoid race conditions!
    One thread's request may depend on another thread's.
      Avoid deadlock and livelock!

Topic: consistency & protocol design
  What if user operations require multiple file system operations?
    If I move a file from one directory to another, does another user
      see intermediate states?
    What if two users move a file to the same destination directory?
  To offload network and server, probably cache files at client
    When does a write on a client become visible to other clients?
    Do we want it to behave like a single-machine file system?

Topic: system design
  What happens if your file server must serve a large community (say MIT)?
    What if there are more clients than one server can handle?
    What if users have more files and directories than one server can store?
  How to use more servers to handle more clients?
  Idea: partition users across servers
    How to balance the load of users? Statically? Partition name space?
    What if some user suddenly needs a lot more space?
    What if some files are much more popular than others?

Topic: fault tolerance
  Can I get to my files when some servers are down or network fails?
  Yes: replicate the data.
  Problem: replica consistency. delete file, re-appears.
  Problem: physical independence vs communication latency
  Problem: partition vs availability. airline reservations.
  Tempting problem: can 2 servers yield 2x availability AND 2x performance?

Topic: security
  Internet provides global exposure to random attacks
    from millions of bored students and serious hackers,
    e.g. intrusions for spam botnets
  How does the server know that a request is from me?
  How is the file server protected?
  How much do I have to trust the system administrators?
  Etc.

Topic: system design
  What if we want to provide an Internet file system?
    aggregate all computers in a gigantic file system
  How would we do this?
    more failures, longer delays, multiple administrative domains

We want to understand the individual techniques, and how to assemble them.

COURSE STRUCTURE
  All announcements and info at http://www.pdos.csail.mit.edu/6.824
    look at the first lab, due in a week
  Meetings: 1/2 lectures, 1/2 paper discussions (or lab help)
  Research papers on working systems, starting today
    must read papers before class
    otherwise boring, and you can't pick it up by listening
    in future, we will post paper questions 24 hours in advance
    hand in answer on paper in class, one or two paragraphs
  Two class-length quizzes (one in finals week, sorry!)
  Labs: build a real cluster file system a la Frangipani
    Labs are due on Thursdays
  Project: extend lab in any way you like.
    short paper
    demo in last class meeting
  David Schultz is TA, office hours on Web.

LAB: YET ANOTHER FILE SYSTEM (YFS)
  Lab is inspired by Frangipani, a scalable distributed file system (see paper).
  You will build a simplified version of Frangipani.

Frangipani goals
  aggregate many disks into a single shared file system
  capacity can be incrementally added without stopping the system
    bricks for storage
  good balance of load/users across disks
    no manual assignment
  tolerates and recovers from machine, network, and disk failures
    without operator intervention
  consistent backups while running the system

Frangipani design
  Each machine runs a Petal server, a Frangipani server, and a lock service
    this is not strictly necessary, but simplifies thinking about the system
  Petal
    aggregates disks into one big virtual disk
    interface: put and get
    replicates blocks/extents
    add Petal servers to increase storage capacity and throughput
  Frangipani file server
    serves file system requests and uses shared Petal to store data
    inodes, directories, data blocks all stored on Petal
    all servers serve same file system
    add servers to scale up Frangipani
  Lock server
    file servers use lock server to provide consistency
      e.g., when creating a file, lock directory etc.
    locks are also replicated
  No security beyond traditional file system security
    intended for a cluster, not wide-area

Contrast: NFS (or AFS)
  Clients relay file system calls to server
    clients are simple; they don't implement a file system
    for performance client may implement cache (or use OS file system cache)
  Server implements the file system and exports its local disks
  How do you scale an NFS server?
    buy more disks
    partition file systems over servers
    or, buy file server "mainframe"
  How do you make an NFS server fault tolerant?
    buy RAID system or NAS

YFS
  Same basic structure as Frangipani, but single extent server
    i.e. you don't have to implement Petal.
  Draw picture
    yfs_client (interfaces with OS through fuse)
    extent_server
    lock_server
  Each server written in C++ and pthreads, our own RPC library
    Next two lectures cover infrastructure in detail

Labs: build YFS incrementally
  L1: simple lock server
    RPC semantics
    Programming with threads
  L2: yfs_server
    basic file server (no sharing, no storage)
  L3: yfsclient + extent_server
    reading/writing
  L4: yfsclient + extent_server + lock_server
    mkdir, unlink, and use lock server (sharing)
  L5: caching lock_server
    protocol design
  L6: caching extent_server
    consistency using lock_server
  L7: paxos library
    agreement protocol
  L8: fault tolerant lock server
    replicated state machine using paxos
  L9: your choice

Lab 1: simple lock server
  - let's look at handout (the tar ball from web site)
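
To make "simple lock server" concrete before opening the handout, here is a rough sketch of the core state such a server might keep: a table of held lock ids, with acquire blocking until the lock is free. This is not the handout code. The names LockTable, acquire, release, is_held, and lock_table_demo are hypothetical, there is no RPC layer, and it uses std::mutex and std::condition_variable from the C++ standard library in place of raw pthreads calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical in-process lock table (illustrative names, not the lab's API).
// A real lock server would expose acquire/release to clients over RPC.
class LockTable {
 public:
  // Block until lock `id` is free, then mark it as held.
  void acquire(std::uint64_t id) {
    std::unique_lock<std::mutex> guard(mu_);
    cv_.wait(guard, [&] { return held_.count(id) == 0; });
    held_.insert(id);
  }

  // Mark lock `id` as free and wake any waiters.
  // Returns false if the lock was not held.
  bool release(std::uint64_t id) {
    {
      std::lock_guard<std::mutex> guard(mu_);
      if (held_.erase(id) == 0) return false;
    }
    cv_.notify_all();
    return true;
  }

  // Report whether lock `id` is currently held.
  bool is_held(std::uint64_t id) {
    std::lock_guard<std::mutex> guard(mu_);
    return held_.count(id) != 0;
  }

 private:
  std::mutex mu_;                 // protects held_
  std::condition_variable cv_;    // signaled whenever a lock is released
  std::set<std::uint64_t> held_;  // ids of locks currently held
};

// Single-threaded walk-through of one acquire/release cycle.
inline void lock_table_demo() {
  LockTable table;
  table.acquire(7);
  assert(table.is_held(7));
  assert(table.release(7));
  assert(!table.release(7));  // second release: lock was not held
}
```

One mutex guards the whole table, which is the simplest way to avoid the race conditions on shared server state mentioned earlier. A single condition variable with notify_all is coarse (every waiter wakes on every release) but keeps the sketch short. A per-lock condition variable would be the obvious refinement.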