Secure Untrusted Data Repository (SUNDR) Li, Krohn, Mazieres, and Shasha OSDI 2004 goal: store your files on a server usually we "trust" the file server but what if it's outsourced over the internet maybe owning company doesn't apply Microsoft patches server may be completely corrupt users can verify that they got the correct data back so "secure" is about authentication, not privacy can we design our own? i sign what i write to the server put(filename, signed contents) get(filename) -> signed contents users know each others' public keys check signatures when you retrieve now the server cannot forge data! are we done? this might be OK for a single user what might go wrong? server shows me stale data: signed != fresh do i have to keep track of latest write TS for every file i own? server shows me stale data from other users i can't even know when other users wrote server omits or re-orders operations in general: server shows different users different contents what about directory contents if many users create files? or file contents if many users write parts of the same file? how can anyone sign the data? idea: ordinary FS protocol: operations to server server replies with contents you can secure the channel you have to trust the server to interpret the operations SUNDR approach: signed operations to server server replies with the same signed operations! the clients interpret the operations the server only stores operations the paper's straw man [explain log scheme] client asks server for a copy of the current log validates log (is my latest operation there? does last signature check?) appends new operation, signs over the whole new log writes log back to server server must implement a lock it's a log of operations, not really of contents so clients have to interpret (play) the log prevents server from showing me my own stale data prevents server from changing the order of operations since each operation signs its place in the log what *can* a malicious server do in straw man? it doesn't have to show me anything after my last operation i.e. it can conceal other users' recent operations suppose server conceals U2's last operation from U1 now U1 appends an operation to the log it got from the server can the server ever show U1's operation to U2? can the server ever show U1 any more of U2's operations? so: from now on the server can only show U1 its own new operations and can only show U2 its new operations this is a forking attack fork consistency: only attack is a fork attack: conceal operations all users see the same log before the first concealed op no user sees another user's ops after the first concealed op why is fork consistency good enough? the server *can* perform a forking attack! it's good that it's a violent attack after a while it will be obvious that we've been attacked easy to detect if users compare notes much better than allowing a concealed op, but showing subsequent ops why must fetches be in the log? fork consistency implies that if two clients A and B communicate with each other, they must be able to determine if they have been forked or not. Consider the scenario where a malicious server that hides a modification to file f by A from B, and B takes an action on the earlier version of f. A -> S: log with new modification to F A calls B on the phone, to alert B about the change B -> S: fetch log S -> B: log without last modification to F (fork attack) B: returns results to application B -> S: fetch log (B fetches again) S -> B: complete log (log looks fine to B) A -> B: get log When A fetches the log from B, assumes there is no fork attack since A's last mod is there. But, B has been forked. why is the straw man not enough? need to xfer the whole log to check signatures and you need to play/interpret the log entries caching optimizations possible, but expensive if you are away for a weekend can we get rid of log, just keep current directory tree? each i-node/directory block contains crypto hashes of children? i tell server just blocks changed by my operation? then i sign the root block? root block contains content-hash of previous root block? why is a signed directory tree not quite right? how can i check that server didn't drop one of my operations? would need a log of root blocks? how can U1 prevent U2 from writing U1's files/directories? e.g. /u owned by root, /u/rtm owned by rtm first, a user should sign only its own files/directories so we can't have a directory tree representation which requires change all the way up to the root for any file change i-handle points to current i-table i-table maps i-number to hash of i-node directory block maps name to user#/i-number thus my directories can hold your files i-number lets you modify those files w/o changing my directory now the file system is the collection of users' i-handles we really have a sequence of new i-handles arrange as a time-line per user second, use version vectors to verify that operations are uniquely ordered each user numbers successive i-handles, puts vers number into i-handle each i-handle also contains versions of all users i-handles at time of op how do version vectors evolve in correct operation? U1: 1,0 2,2 U2: 1,1 1,2 2,3 validation: server shows me my my recent i-handle i-handle version vectors can be totally ordered i.e. 2,2 < 2,3 what would version vectors look like if server hid an i-handle update? U1: 1,0 [2,1] U2: 1,1 1,2 server hides U1:2 do the version vectors give us fork consistency? can the server show future U1 i-handles to U2? e.g. 3,1 no: 3,1 and 1,2 cannot be ordered! can the server show future U2 i-handles to U1? e.g. 1,3 no: 1,3 and 2,1 cannot be ordered can the server show 2,1 to U2 at a later time? i.e. after U2 finishes 1,2, when it's about to start 1,3? no: 2,1 and 1,2 cannot be ordered why aren't we done at this point? the server still needs to serialize operations server will be idle between sending current VVs, and waiting for new one i.e. between UPDATE and matching COMMIT we cannot just allow concurrent operations: not orderable, would look like an attack