6.824 Lecture 7: Release consistency

What makes a good consistency model?
  There are no "right" or "wrong" models
  A model may make it harder or easier to program
    i.e. lead to more or less intuitive results
  A model may be harder or easier to implement efficiently

Last week:
  Strict consistency
  Sequential consistency

Today:
  Release consistency (eager and lazy)
  Lab 6 will implement lazy release consistency on extents.
  Lab 5 lays the groundwork by doing lazy releasing of locks.

TreadMarks high-level goals?
  Better DSM performance.
  Run existing sequentially consistent parallel code.

What specific problems with previous DSM are they trying to fix?
  false sharing of large pages
  only one writer of a page at a time
  write of irrelevant data may invalidate my page
  reduce overhead in particular when a variable is contended

What are write diffs? And what is the point?
  Are they just like object/variable granularity rather than page granularity?
    No sub-page invalidate, so you would need to send updates after each store.
  Do diffs make sense by themselves? I.e. w/o RC or LRC?

What is release consistency? And what is the point?
  When would RC be more efficient than just write diffs?

What is lazy release consistency? And what is the point?
  When would LRC be more efficient than RC?

Example of why LRC might speed up your program.
  (alN = acquire lock N, rlN = release lock N)
  CPU0: while (1) al1 many writes to x rl1
  CPU1: while (1) al1 many writes to x rl1
  What would happen under sequentially-consistent memory (e.g., IVY)?

Assume each variable on its own page.
Or that granularity is a single word...
  CPU0: al1 x=1 rl1 al2 z=99 rl2
  CPU1: al1 y=x rl1
  CPU2: al1 print x, y, z rl1
  How would an eager release consistent system handle this?
  What does TreadMarks do? And why is that faster?
  Why is it legal for z to be out-of-date at CPU2?
  How does TreadMarks *know* it is legal?

Example for why you need the vector timestamps.
  CPU0: al1 x=1 rl1
  CPU1: al1 y=x rl1
  CPU2: al1 print x, y rl1
  What's the "right" answer?
  How would an eager release consistent system handle this?
  How does lazy release consistency handle this?
  How does TreadMarks know what to do?

What if you get different values from different sources?
  CPU0: al1 x=1 rl1 al2 y=9 rl2
  CPU1: al1 x=2 rl1
  CPU2: al1 al2 z = x + y rl2 rl1
  CPU2 is going to hear "y=9" from CPU0, and "x=2" from CPU1.
  How does CPU2 know what to do?

What if the VTs for the two values are not ordered?
  Could this happen?
  CPU0: al1 x=1 rl1
  CPU1: al2 x=2 rl2
  CPU2: al1 al2 print x rl2 rl1

What model of consistency does the programmer need to have in mind?
  What rules does the programmer have to follow?
  What can the programmer expect from the memory system?

Example of when LRC might still do too much work.
  CPU0: al2 z=99 rl2 al1 x=1 rl1
  CPU1: al1 y=x rl1
  In this case, CPU1 didn't really need z.
  This suggests that further improvements might be possible, at what expense?
    Compiler or programmer notices dependencies?

What happens in this case?
  CPU0: al1 x=1 rl1 al1 x=2 rl1
  CPU1: al1 al2 y=x rl2 rl1
  CPU2: al2 z=y rl2
  Does CPU2 get the x=2 update? Should it? Does it make any difference?
  In most cases TreadMarks could avoid sending x=2 if it wanted to.
  But there might have been a GC, forcing CPU2 to know about x=2.

Are there programs that work under RC that break under LRC?
  RC is blind to which lock variable was involved.
  CPU0: al1 x=1 rl1 al1 if(y==0) ... rl1
  CPU1: al2 y=1 rl2 al2 if(x==0) ... rl2

TreadMarks keeps complete history of write notices.
  I.e. accurate record of what changed in each interval.
  Write notice and interval record lists in Figure 2.
  It does not just keep the latest data. Why?
  Is the history needed for LRC?

What's the point of the diffs?
  Why do they send write notices separately from the diffs?
    They can delay the diffs until you actually use the page: you might not.
  Maybe you already have the write notice?
    But VT indicates this?
  When you ask for diffs, how far back in the past do you ask for them?
  Does this imply unbounded amount of history?
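
A small sketch may help make the diff and vector-timestamp questions concrete. It is not TreadMarks code: the names (make_diff, apply_diff, happens_before, concurrent, apply_in_causal_order, demo) are hypothetical, a "page" is just a Python list of 8 words, and a diff is a dictionary of changed offsets standing in for whatever encoding the real system uses. Sorting by the sum of VT entries is a simplification chosen for this sketch; it works only because that sum strictly increases along happens-before.

```python
from typing import Dict, List, Tuple

VT = Tuple[int, ...]      # vector timestamp: one interval counter per CPU
Diff = Dict[int, int]     # word offset -> new value (illustrative encoding)


def make_diff(twin: List[int], page: List[int]) -> Diff:
    """Compare a modified page against its pristine twin; keep changed words only."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page: List[int], diff: Diff) -> None:
    """Patch a local copy of the page with someone else's modifications."""
    for offset, value in diff.items():
        page[offset] = value


def happens_before(a: VT, b: VT) -> bool:
    """a precedes b if a <= b in every entry and the two differ somewhere."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: VT, b: VT) -> bool:
    """Neither interval precedes the other: the VTs are not ordered."""
    return a != b and not happens_before(a, b) and not happens_before(b, a)


def apply_in_causal_order(page: List[int], tagged: List[Tuple[VT, Diff]]) -> None:
    """Apply diffs so that an earlier interval never overwrites a later one.

    If a happens-before b then sum(a) < sum(b), so sorting by the sum gives
    an order consistent with happens-before. Concurrent diffs land in an
    arbitrary relative order, which is harmless only if they touch
    different words (i.e. the program is free of data races).
    """
    for _, diff in sorted(tagged, key=lambda t: sum(t[0])):
        apply_diff(page, diff)


def demo():
    base = [0] * 8
    # CPU0, interval (1,0,0): writes word 0.
    twin0, page0 = list(base), list(base)
    page0[0] = 1
    d0 = ((1, 0, 0), make_diff(twin0, page0))
    # CPU1 has seen CPU0's interval, then writes words 0 and 5 in (1,1,0).
    twin1, page1 = list(page0), list(page0)
    page1[0] = 2
    page1[5] = 9
    d1 = ((1, 1, 0), make_diff(twin1, page1))
    # CPU2 receives the diffs out of order and must still end up with x=2.
    page2 = list(base)
    apply_in_causal_order(page2, [d1, d0])
    return d0[1], d1[1], page2


if __name__ == "__main__":
    print(demo())
```

In this sketch CPU2 ends with word 0 equal to 2 and word 5 equal to 9 even though the later diff arrived first, because (1,0,0) precedes (1,1,0). Two intervals such as (1,0,0) and (0,1,0) are concurrent, and nothing here decides which write to the same word should win; that is the situation the "VTs not ordered" example above asks about.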