6.824 2013 Lecture 12: Eventual Consistency Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System Terry, Theimer, Petersen, Demers, Spreitzer, Hauser, SOSP 95 some material from Flexible Update Propagation for Weakly Consistent Replication, SOSP 97 Big picture Last lecture: file sync, optimistic consistency, detect conflicts This lecture: automatic conflict resolution update functions update log logical clocks eventual consistency Paper context: Early 1990s (like Ficus) Dawn of PDAs, laptops, tablets H/W clunky but clear potential Commercial devices did not have wireless No pervasive WiFi or cellular data Let's build a meeting scheduler Only one meeting allowed at a time (one room). Each entry has a time and a description. We want everyone to end up seeing the same set of entries. Traditional approach: one server Server processes requests one at a time Checks for conflicting time, says yes or no Updates DB Proceeds to next request Server implicitly chooses order for concurrent requests Why aren't we satisfied with central server? I want my calendar on my iPhone. I.e. database replicated in every node. Modify on any node, as well as read. Periodic connectivity to net. Periodic bluetooth contact with other calendar users. Straw man 1: merge DBs. Similar to iPhone calendar sync, or file sync. Might require lots of network b/w. What if there's a conflict? IE two meetings at same time. iPhone just schedules them both! But we want automatic conflict resolution. Idea: update functions Have update be a function, not a new value. Read current state of DB, decide best change. E.g. "Meet at 9 if room is free at 9, else 10, else 11." Rather than just "Meet at 9" Function must be deterministic Otherwise nodes will get different answers Challenge: A: staff meeting at 10:00 or 11:00 B: hiring meeting at 10:00 or 11:00 X syncs w/ A, then B Y syncs w/ B, then A Will X put A's meeting at 10:00, and Y put A's at 11:00? Goal: eventual consistency OK for X and Y to disagree initially But after enough syncing, everyone should agree Idea: ordered update log Ordered list of updates at each node. DB is result of applying updates in order. Syncing == ensure both nodes have same updates in log. How can nodes agree on update order? Update ID: