6.824 Lecture 3: Threads

Thread is short for thread of control: a running program with its own
program counter, stack pointer, etc.  (For this class a process is one or
more threads executing in a single address space.)

The primary reasons to use concurrent programming with threads:
  exploit several processors to run an application faster
  hide long delays (e.g., while waiting for a disk, do something else on
    the processor)
  run long-running ops concurrently with short ones in user interfaces
  network servers and RPC
For example, in lab 1, if one client is waiting for lock a, the server may
want to process requests from other clients, in particular ones for
different locks.

Pitfalls of multithreaded programming:
  race conditions; can you give me an example?
    may be difficult to reproduce
  deadlock; can you give me an example?
    a better bug to have than a race; your program stops when it happens
  livelock; can you give me an example?
  starvation
  wrong lock granularity; can you give me an example?
  other potential problems: priority inversion

At a minimum, a thread interface must support:
  creating and managing threads
  ways of avoiding race conditions for updates to shared variables
    assume each thread runs on its own processor, sharing a memory
    instructions that appear to be atomic might not be (e.g., x = x + 1)
  ways of coordinating different threads

Pthread interface
  a standard interface, not unlike the one described in the paper, but
  programming-language independent.  we use it in the labs.

  Interface:
    threads
      create
      join
    mutex
      lock
      unlock (must typically be performed by the thread that did the acquire)
    condition variables
      wait
      signal
        wakes up one thread; is it the right one?
        only use it if there is only one thread waiting
      broadcast
        wakes up all waiting threads
        put the waited-for condition in a loop around wait if only one
        thread should proceed, e.g., if multiple clients are waiting on a
        lock from the lock server (see the fifo sketch at the end of the
        threads part of these notes)
    other stuff, which we don't use

The paper does a nice job of teaching how to program with threads, in
particular for programmers who write multithreaded servers---that is, you.
Worth rereading the paper as you gain more experience.

How to use the interface: fifo.cc

Scoped locks:

  int
  main(int ac, char **av)
  {
    scoped_pthread_lock _spl(&mu);  // the constructor acquires mu
    if (foo > 0)
      return foo--;                 // _spl.~scoped_pthread_lock releases mu
    return foo++;                   // _spl.~scoped_pthread_lock releases mu
  }

  Pitfalls:
    which statement would you remove to cause a race? a deadlock? livelock?
      starvation?
    is modularity broken? that is, does the caller need to know that this
      module uses locks, and which ones?

Invariants
  one way to think about a lock is that it protects an invariant:
    when the lock is not held, the invariant is true
    when the lock is acquired, the invariant is therefore still true
    while the lock is held, the holder may temporarily break the invariant
    when the lock is released, the invariant had better be true again
  what invariant is the mutex in fifo enforcing?

Recursive/reentrant locks
  generally a bad idea, because they don't obey the invariant view:
    re-acquiring the lock in the middle of an operation may observe a
    broken invariant
  better to hit the deadlock and fix the code.

Deadlock
  enumerate locks, acquire them in a fixed order
  will show up in yfs when using the lock server

Breaking of abstractions
  killing another thread is a mess:
    it might be waiting for a lock
    it might be sitting in a condition variable (e.g., in fifo)
      does it hold the lock after wait returns?
    the thread may have to do some cleanup while holding the lock
    the thread might be holding references to other objects
      cannot delete the thread until it releases those resources
  pthread_cancel is supposed to help with this, but doesn't really work
  the YFS RPC library is designed to avoid the problems with pthread_cancel
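To make the condition-variable rules above concrete (hold the mutex, wait in
a loop, signal when the invariant is restored), here is a minimal
producer/consumer queue in the spirit of fifo.cc.  This is an illustrative
sketch, not the actual lab code; the class and member names are made up.

  #include <pthread.h>
  #include <list>

  template <class T>
  class fifo {
   public:
    fifo() {
      pthread_mutex_init(&m_, 0);
      pthread_cond_init(&nonempty_, 0);
    }
    ~fifo() {
      pthread_mutex_destroy(&m_);
      pthread_cond_destroy(&nonempty_);
    }
    // enq: take the lock, restore the invariant, wake one waiter.
    void enq(T e) {
      pthread_mutex_lock(&m_);
      q_.push_back(e);
      pthread_cond_signal(&nonempty_);
      pthread_mutex_unlock(&m_);
    }
    // deq: wait in a loop; pthread_cond_wait releases the lock while
    // waiting, so the queue may be empty again by the time this thread
    // reacquires it.
    T deq() {
      pthread_mutex_lock(&m_);
      while (q_.empty())
        pthread_cond_wait(&nonempty_, &m_);
      T e = q_.front();
      q_.pop_front();
      pthread_mutex_unlock(&m_);
      return e;
    }
   private:
    pthread_mutex_t m_;        // protects q_
    pthread_cond_t nonempty_;  // "q_ is not empty"
    std::list<T> q_;
  };

signal (rather than broadcast) is enough here because any one woken consumer
can consume the newly enqueued element; the while loop around wait protects
against the case where another consumer got there first.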
Locking granularity
  one lock for all fifos?
  one lock for each instance of fifo?
  one lock for each element?
  danger of many locks---deadlocks and races

More challenging/larger case study: the RPC library

YFS RPC structure
  one poll manager, polling many connections
  each connection is a TCP connection (pipelined, reliable, etc.)
  each poll manager has a thread pool to run requests

YFS RPC library
  how to use: lock_demo.cc, lock_client.cc, lock_smain.cc, and lock_server.cc
    lock_client.cc calls bind() first---why? (we will see)
  rpc.h: the interface to the RPC system; let's look at it briefly
    rpcc (+ caller) and rpcs
    marshaling
  rpc.cc: the implementation
    rpcc: the client side
      rpcc::bind: a remote procedure to get a unique ID from the server
      rpcc::call1: an RPC!  must bind first
      rpcc::got_pdu: a reply
        how come replies can arrive out of order?
    rpcs: the server side
      rpcs::got_pdu: a new request
        a new thread for each request
      rpcs::dispatch: a switch statement
        how can the easy case happen?
        what should checkduplicate_and_update do?
        why remember replies?
  connection.cc
    uses TCP; why?
    one pollmgr for all connections; why?
  pollmgr.cc
    one thread polls the network sockets (select)
    when a packet arrives, start threads (but thr_pool limits the number of
    threads)
  thr_pool.cc

Interactions between threads and RPC
  can one hold locks across an RPC call?  should one?
  can one make an RPC in a handler?
  what is the risk of this kind of code in a handler?

    for (iterator i = beginlist(); !eol(); i++) {
      unlock(&l);
      call(client[i], ...);
      lock(&l);
    }
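The loop above drops the lock around each call, so other threads can modify
the list while the handler is still iterating over it, and each RPC may
block for a long time while the lock is repeatedly bounced.  One common way
out is to copy the list while holding the lock and then make the RPCs with
the lock released.  The following is a minimal sketch of that pattern; the
names (notifier, notify_all, the stub call) are made up for illustration,
and it assumes it is acceptable to notify a slightly stale snapshot of
clients.

  #include <pthread.h>
  #include <vector>

  // Stub standing in for an RPC to one client; a real call can block for a
  // long time (network delays, retransmissions), which is why the mutex is
  // not held while making it.
  static void call(int client_id) { (void)client_id; /* send RPC, wait */ }

  class notifier {
   public:
    notifier() { pthread_mutex_init(&m_, 0); }
    ~notifier() { pthread_mutex_destroy(&m_); }

    void add_client(int id) {
      pthread_mutex_lock(&m_);
      clients_.push_back(id);          // clients_ is protected by m_
      pthread_mutex_unlock(&m_);
    }

    // Copy the list under the lock, then do the slow RPCs without it,
    // instead of unlocking and relocking around each call while iterating.
    void notify_all() {
      pthread_mutex_lock(&m_);
      std::vector<int> snapshot = clients_;
      pthread_mutex_unlock(&m_);

      for (size_t i = 0; i < snapshot.size(); i++)
        call(snapshot[i]);
    }

   private:
    pthread_mutex_t m_;
    std::vector<int> clients_;
  };

This also bears on the earlier question about holding locks across an RPC:
you can, but a slow or lost reply then stalls every other thread that needs
that lock.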