6.828 2005 Lecture 21: Asbestos

introduction
  general topic is hardening software against attacks
  and how the o/s can help
  key problem is that you can't really trust the s/w you're running
  extreme example: running an applet in a browser
  our example: web services
    complex, keep private info, custom buggy s/w, exposed to Internet attack
  three stages: naive server, what can UNIX do, what can Asbestos do

what might a simple web server look like?
  perhaps a dating site (mention okcupid, experience motivated Asbestos)
  assume most/all dynamic content
  one big server, sitting on kernel network stack
  lots of modules: login
  keep persistent user state in files or in SQL DB
    e.g. name, e-mail address, credit card number, profile

what are the threats?
  store user's state in /state/username, can user create name ../etc/passwd?
  sql injection:
    select id from users where city = '...';
    xx'; update users set password = '
  buffer overflow
    char buf[512];
    gets(buf);
  buffer overflows result in server executing arbitrary malicious code!
  these accidental bugs are as bad as a malicious programmer
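  sketch (not from the lecture): the injection above, reproduced with Python's
  standard sqlite3 module. The table layout, the sample rows, and the helper
  names (make_db, lookup_unsafe, lookup_safe, passwords) are assumptions made
  for illustration. executescript stands in for a database API that accepts
  several statements in one string; a bound parameter is one common fix.

```python
import sqlite3

# The attacker-supplied "city" from the notes.
ATTACK = "xx'; update users set password = '"

def make_db():
    """Build a throwaway in-memory users table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("create table users (id integer, city text, password text)")
    db.executemany("insert into users values (?, ?, ?)",
                   [(1, "Boston", "alice-pw"), (2, "Austin", "bob-pw")])
    db.commit()
    return db

def lookup_unsafe(db, city):
    """Paste user input into the SQL text; input can end the string early."""
    query = "select id from users where city = '" + city + "';"
    # executescript runs every statement in the string, including injected ones.
    db.executescript(query)

def lookup_safe(db, city):
    """Pass user input as a bound parameter; it is only ever treated as data."""
    return db.execute("select id from users where city = ?", (city,)).fetchall()

def passwords(db):
    """Return all passwords in id order, to see whether they were clobbered."""
    return [row[0] for row in db.execute("select password from users order by id")]
```

  with the unsafe version the query text becomes two statements, and the second
  one blanks every password; the safe version just finds no city with that name.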

what went wrong?
  why can it read /etc/passwd?
  why can it read Alice's DB entries when executing Bob's request?
  why can it read Alice's state from memory when executing for Bob?
  server has much more privilege than it needs!

what do we want to achieve?
  goal: principle of least privilege (POLP)
    s/w should have no more power than it needs to do its job
  we need a concrete definition of privilege
    subjects, objects, actions
    for every combination, is it allowed?
  so we need:
    a complete enumeration of subjects, objects, and actions
    mechanisms to limit privilege despite bugs
    mechanisms to change privilege depending on context
      e.g. are we executing for Bob or Alice?
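  sketch (not from the lecture): the subjects/objects/actions definition written
  out as a tiny access matrix. The subject and object names and the helper
  functions are hypothetical. Anything not listed is denied, and the subject is
  chosen from the request context (Alice or Bob).

```python
# Every (subject, object, action) triple that is allowed.
# A triple that is not listed is denied, so the default is no privilege.
ALLOWED = {
    ("worker:alice", "state:alice", "read"),
    ("worker:alice", "state:alice", "write"),
    ("worker:bob", "state:bob", "read"),
    ("worker:bob", "state:bob", "write"),
}

def is_allowed(subject, obj, action):
    """Answer the question 'for this combination, is it allowed?'."""
    return (subject, obj, action) in ALLOWED

def subject_for(user):
    """Privilege depends on context: pick the subject for the current request."""
    return "worker:" + user

# While executing for Bob, Alice's state is out of reach.
print(is_allowed(subject_for("bob"), "state:alice", "read"))
```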

what can we do under unix?
  (look at okws.org for what okcupid does)
  what tools do we have?
    address spaces for isolation
    user IDs, file permissions
    subjects are processes and users, objects are files,
      actions are read/write memory and files
  isolate with processes
    per service
    per user
    now you can't read state from other processes
    and we can give processes UNIX user IDs
    and make sure files have appropriate permissions
  privilege separation
    demux that decides each connection's user
    front end that parses input and formats html
    back end that is allowed to read/write real state files
    different user IDs for back end and front end?
    (another example: ssh-agent vs ssh)
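  sketch (not from the lecture): since the back end is the only piece allowed to
  touch the real state files, it is a natural place to reject a user name like
  ../etc/passwd instead of trusting the front end's parsing. STATE_DIR and
  state_path are hypothetical names. The check is purely lexical and does not
  defend against symlinks inside the state directory.

```python
import posixpath

STATE_DIR = "/state"  # directory from the notes; one file per user is assumed

def state_path(username):
    """Map a user name to its state file, refusing names that escape STATE_DIR."""
    candidate = posixpath.normpath(posixpath.join(STATE_DIR, username))
    # After normalization the file must sit directly inside STATE_DIR.
    if posixpath.dirname(candidate) != STATE_DIR:
        raise ValueError("bad user name: %r" % username)
    return candidate

print(state_path("alice"))
```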

definitions
  privileged: has some power (typically a dangerous one)
  trusted: a bug can violate security goals
  so demux is fully trusted and privileged
  front end almost untrusted/unprivileged (has trust/priv for one user)
  DB may have no special privileges, but it's trusted to keep data in order
  POLP implies we want to minimize the "Trusted Computing Base"
    split trusted processes into trusted/untrusted parts
    move functionality into less trusted processes

why is POLP hard to apply on UNIX?
  we've only restricted a few specific actions (reading memory and some files)
    so we can't claim to have satisfied POLP
  still easy to divulge data via bugs
    cgi script stores something in a readable /tmp file
  there are other operations that we haven't restricted
    Linux has about 200 system calls!
    default is often +privilege (e.g. make network connections)
    can still read lots of random files, hard to keep all permissions correct
    IPC to database server or other unix services?
    can it make outgoing net connections and pretend to be server host?
  awkward to set up UNIX users
    only root can create new users, or start a process as a user
    don't want to run our buggy server as root!
  UNIX can do better: chroot, jail, SELinux reference monitor

what are we looking for, that might be better than UNIX?
  simpler set of privileges (subjects/objects/actions)
  control over who can talk to who
  track accidental/malicious disclosures
  focus on data as it moves around
  no privilege required to set up
  default is no privilege

how does asbestos support secure programming?
  only action is ipc to ports (like JOS) (no shared mem)
  can't send to a port unless someone gives you permission (capabilities)
  labels track/control flow of sensitive information

high-level view of Information Flow Control
  key idea: kernel tracks+controls flow of data, rather than limiting individual actions
    allows policies like "if P reads Alice's data, it can't send it to Bob"
  just show simple send labels: "taint"
  web server, cgis for alice and bob, state servers for alice and bob
  web server knows user for each connection
  alice's state server sends tainted data to alice's cgi process
  this taints alice's cgi process (changes its label to reflect taint)
  web server only allows alice's data to leave on alice's connection

Asbestos' send and receive labels on processes
  two state servers, two cgis, two http servers
  http recv labels
  state server send labels
  rules when P sends to Q:
    Rule 1) PS <= QR
    Rule 2) QS = QS U PS
  explain recv label is safeguard, for edges of system, and covert channel
  the point: app sets up initial labels
    then kernel automatically tracks data and enforces flow rules
    even buggy app s/w can't escape the rules
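  sketch (not from the lecture): Rules 1 and 2 as a toy model. As a
  simplification, a label here is a plain set of taint names, so "<=" is subset
  and "U" is union; levels and "*" are ignored. The Process class, the send
  function, and the process names are made up for illustration.

```python
class Process:
    def __init__(self, name, send=(), recv=()):
        self.name = name
        self.send = set(send)  # PS: taint this process carries
        self.recv = set(recv)  # PR: taint this process is willing to accept

def send(p, q):
    """Deliver a message from p to q if the flow rules allow it."""
    if not p.send <= q.recv:   # Rule 1: PS <= QR, otherwise the message is dropped
        return False
    q.send |= p.send           # Rule 2: QS = QS U PS, receiver picks up the taint
    return True

state_alice = Process("state-alice", send={"alice"})
cgi_alice = Process("cgi-alice", recv={"alice"})
http_alice = Process("http-alice", recv={"alice"})
http_bob = Process("http-bob", recv={"bob"})

send(state_alice, cgi_alice)       # cgi-alice is now tainted with "alice"
print(send(cgi_alice, http_bob))   # blocked by Rule 1
print(send(cgi_alice, http_alice)) # allowed
```

  the cgi code never checks anything itself; the taint follows the data and the
  rules are applied on every send.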

what is in an Asbestos label?
  really a set of components, to represent multiple taints
  components are 64-bit numbers, allocated uniquely by kernel
  handle:level
  x:3 means tainted with x's private data
  x:* means privileged w.r.t. x

can a process change its labels?
  only in ways that restrict information flow
  Rule 3) can raise (add taint to) send label, but not lower it
  Rule 4) can lower (remove taint from) receive label, but not raise it
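  sketch (not from the lecture): Rules 3 and 4 in the same simplified set model
  (a label is a set of taint names, no levels, no "*"). The class and method
  names are hypothetical. A process may only move its labels in the direction
  that restricts flow.

```python
class Process:
    def __init__(self, send=(), recv=()):
        self.send = set(send)  # taint carried
        self.recv = set(recv)  # taint accepted

    def set_send(self, new_send):
        """Rule 3: the send label may gain taint but never lose it."""
        new_send = set(new_send)
        if not self.send <= new_send:
            raise PermissionError("cannot remove taint from send label")
        self.send = new_send

    def set_recv(self, new_recv):
        """Rule 4: the receive label may accept less taint but never more."""
        new_recv = set(new_recv)
        if not new_recv <= self.recv:
            raise PermissionError("cannot add taint to receive label")
        self.recv = new_recv

p = Process(send={"alice"}, recv={"alice", "bob"})
p.set_send({"alice", "bob"})  # allowed: more taint
p.set_recv({"alice"})         # allowed: accepts less
```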

label component creation
  anyone can create a unique new handle
  P gets x:* in send label when it creates x
  holding x:* detaints any x:3 you receive (DECLASSIFICATION)
  you can explicitly give away x:* in a message
  you can explicitly set someone's recv label to x:3 if you have x:*
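  sketch (not from the lecture): handle creation and declassification as a toy
  model. Assumptions: only the levels "*", 0, and 3 are modeled, ordered
  * < 0 < 3, with 0 as the default in both labels; handles come from a counter
  rather than the kernel; all class and function names are hypothetical.

```python
import itertools

STAR, CLEAN, TAINTED = -1, 0, 3   # '*' < 0 < 3 (assumed ordering)
_next_handle = itertools.count(1)

class Process:
    def __init__(self, name):
        self.name = name
        self.send = {}  # handle -> level, missing means CLEAN
        self.recv = {}  # handle -> level, missing means CLEAN

    def new_handle(self):
        """Anyone can create a fresh handle; the creator gets it at '*'."""
        h = next(_next_handle)
        self.send[h] = STAR
        return h

def level(label, h):
    return label.get(h, CLEAN)

def allow_receive(owner, target, h):
    """Holder of h:* may raise someone's receive label to h:3."""
    if level(owner.send, h) != STAR:
        raise PermissionError("need h:* to raise a receive label")
    target.recv[h] = TAINTED

def deliver(p, q):
    """Rule 1, then Rule 2, except that a receiver holding h:* stays at '*'."""
    if any(level(p.send, h) > level(q.recv, h) for h in p.send):
        return False
    for h, lv in p.send.items():
        if level(q.send, h) != STAR:          # holding h:* detaints h:3
            q.send[h] = max(level(q.send, h), lv)
    return True

owner, worker, other = Process("owner"), Process("worker"), Process("other")
x = owner.new_handle()
allow_receive(owner, owner, x)   # owner is willing to accept x:3
worker.send[x] = TAINTED         # worker taints itself (Rule 3 allows raising)
```

  in this model worker can reach owner but not other, and owner can still talk
  to other afterwards without passing the taint on.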

why do the * rules make sense?
  initially, no-one else is using component x (it's unique and new)
    so having x:* (and giving it away) doesn't help you receive someone else's data
  creator of x (the holder of x:*) "owns" data with taint x:3
  so it's ok for creator to declassify x:3
  and ok for creator to give away that privilege

diagram of whole web server...
  (look at figure 1)
  netd, ok-demux, idd, ok-dbproxy, database

how does it all start running?
  launcher starts ... and gives them each others' ports
  tells netd to start listening for http TCP connections
  (look at figure 5)
  netd creates a port for that connection
  netd sends first few packets of new connection to ...
  extract login and password
  send to idd
  idd checks login and password
  asks kernel for new random taint handle uT, if not already cached
  idd gives uT:* to ok-demux
  ok-demux gives uT:* to netd (so it can declassify)
  ok-demux creates worker, taints w/ uT:3

which processes are privileged?

which processes are trusted?

what literally stops wrong data from being exposed via netd?
  each netd port (to worker) has its own "port label"
  Rule: PS <= PortLabel
  TCP connection to user u has port label uT:3, everything else :0
  so if process is tainted with vT:3, can't send to that netd port
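  sketch (not from the lecture): the port-label check PS <= PortLabel as a toy
  model. Labels are dicts from handle name to level, with 0 for anything not
  listed, as in the "everything else :0" line above. The connection names and
  helper functions are hypothetical.

```python
CLEAN, TAINTED = 0, 3

def level(label, handle):
    """Level of a handle in a label; unlisted handles are at the default 0."""
    return label.get(handle, CLEAN)

def port_accepts(send_label, port_label):
    """Rule: PS <= PortLabel, compared handle by handle."""
    handles = set(send_label) | set(port_label)
    return all(level(send_label, h) <= level(port_label, h) for h in handles)

# One netd port per TCP connection, each accepting only its own user's taint.
ports = {
    "conn-u": {"uT": TAINTED},
    "conn-v": {"vT": TAINTED},
}

def accepting_ports(send_label):
    """Which connection ports would take a message from this sender?"""
    return sorted(name for name, label in ports.items()
                  if port_accepts(send_label, label))

print(accepting_ports({"uT": TAINTED}))
print(accepting_ports({"vT": TAINTED}))
```

  a worker tainted with vT:3 has no way to push data out on u's connection, and
  a worker tainted with both uT:3 and vT:3 cannot send to either port.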

how does netd know which worker a message is from?
  worker has just the one netd handle
  only one netd port label will accept that worker's taint

why can't worker create a new TCP connection via netd?
  doesn't know netd control port
  you need control:* to send to netd's control port

how does ok-dbproxy know who is sending the data?
  is this worker allowed to overwrite this row?
  not a question of taint, but of identity
  additional "grant" handle allocated by idd
  given to worker at level *
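  sketch (not from the lecture): the identity check as a toy model. It assumes
  ok-dbproxy can see the sender's label for each request; how that label reaches
  ok-dbproxy is not covered in these notes. The handle names aG, bG, aT and the
  function name are hypothetical.

```python
STAR = "*"

# Hypothetical per-user grant handles, as allocated by idd.
GRANT_HANDLE = {"alice": "aG", "bob": "bG"}

def may_overwrite(sender_label, row_owner):
    """Allow the write only if the sender holds the row owner's grant handle at '*'.

    Taint (e.g. aT at level 3) says what data the sender has seen; the grant
    handle at '*' says on whose behalf the sender is acting.
    """
    grant = GRANT_HANDLE.get(row_owner)
    return grant is not None and sender_label.get(grant) == STAR

alice_worker = {"aT": 3, "aG": STAR}
print(may_overwrite(alice_worker, "alice"))
print(may_overwrite(alice_worker, "bob"))
```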