6.824 2001 Lecture 12: Distributed Operating Systems

The operating system is one of the great success stories of CS.
  Everybody uses them.
  They provide useful abstractions for programming single-machine systems.
  Why not extend the idea to multi-machine systems?

What's the distributed O/S vision?
  All the convenience and unity of a single timesharing machine.
    E.g. you type "ps" and see your processes on all workstations.
  But implemented on a LAN of workstations.
    So higher performance.
    Example: my processes are automatically run on idle machines.
      But I can list, kill, debug them.
      And my processes work the same as they would on a single machine.
        See the same files, pipes, devices, processes, &c.

The specific goal here is transparency.
  A distributed operating system should make a collection of computers
  behave like a single computer with better performance and reliability.

Transparency can be classified in different categories:
  1. Location transparency: users cannot tell from the name where resources are located.
  2. Migration transparency: resources can move at will without changing their names.
  3. Replication transparency: users cannot tell how many copies exist.
  4. Concurrency transparency: multiple users can share resources.
  5. Parallelism transparency: activities might happen in parallel without users knowing it.

Naming turns out to be particularly important.
  We often do the location binding during name lookup.
    And return a special handle bound to the actual location.
  Also we often do access checks during name lookup.

List of names for things that need to be resolved transparently?
  File names
  Device names (printer, X display)
  File descriptors
  Process IDs
  User/group IDs
  Memory addresses -- particularly for shared memory
  TCP/UDP port numbers

List of object types we need to be able to use transparently?
  Files - create, delete
  File blocks - read, write
  Directories - get a listing
  Devices - rewind tape, scroll region of graphics display
  Processes - create, kill, debug
  Pipes - read, write
  Sockets - create, connect, accept
  Memory pages - allocate, load, store, map

If we got all this right, would process migration be relatively easy?

We already know about transparency for some names/objects.
  Network file systems.
  X window protocol.
  Are they fully transparent?
    Is it easy for me to move a file from one server to another?

How would we implement, say, location-transparent PIDs?
  Process is running on a machine, so embed machine's IP addr in PID?
  What if the process moves to a different machine?
  Maybe we need a process location database?
    PID is just an index into that database.
    Update the database when the process moves.

How would we implement location-transparent shared memory?
  I.e. distributed shared memory.
  Need consistency.

Could we make a universal name binding system?
  To avoid re-implementing one for each type of name?
  Would probably require separation of naming from object access.
    In contrast, NFS ties them together. LOOKUP is an NFS op.
  Where is PID/page/socket/display XXX?

Could we make a universal object access protocol?
  Perhaps modeled on file descriptor I/O?
  Might give us UNIX stdio style composability.
  And would save us from having a huge number of slightly different protocols.
  Maybe, but there are subtly different I/O models.
    Assume we want to agree on one particular network protocol.
    Files: read and write send RPC to single server.
    Pipes: does the write do the RPC (to reader)?
           Or does the read do an RPC to the writer?
      What if multiple readers? (on different machines)
        cat script | ( sleep 10; /bin/tcsh )
      What if multiple writers?
        ( date ; ls ) | tee ls.out

How transparent is Athena?
  Yes: files, users, e-mail, X windows, printers.
  No: sound card, process IDs, TCP port numbers.
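
Sketch: the process location database suggested earlier for location-transparent PIDs might look roughly like this in Python. Everything here is an assumption made for illustration: the names ProcessLocationDB, register, migrate, and lookup are hypothetical, host names are plain strings, and a single in-memory dict stands in for what would really have to be a shared, fault-tolerant service.

```python
class ProcessLocationDB:
    """Hypothetical table mapping a location-independent PID to its current host."""

    def __init__(self):
        self._next_pid = 1   # PIDs are just indices; they carry no host information
        self._host_of = {}   # pid -> name of the host currently running the process

    def register(self, host):
        """Record a newly created process on `host` and return its global PID."""
        pid = self._next_pid
        self._next_pid += 1
        self._host_of[pid] = host
        return pid

    def migrate(self, pid, new_host):
        """Update the binding when the process moves; the PID itself never changes."""
        if pid not in self._host_of:
            raise KeyError(f"no such process: {pid}")
        self._host_of[pid] = new_host

    def lookup(self, pid):
        """Resolve a PID to its current host, e.g. before sending it a kill request."""
        return self._host_of[pid]


db = ProcessLocationDB()
pid = db.register("ws1")          # process starts on workstation ws1
db.migrate(pid, "ws7")            # later it moves to an idle machine
print(pid, "is now on", db.lookup(pid))
```

Because the PID is only an index, anyone holding it keeps a valid name across migration; the cost is an extra lookup (and a new point of failure) on every operation that needs to reach the process.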