Vigor and Wisdom: keeping past alive in the Immortal system

Liuba Shrira
Brandeis University

Decreasing storage costs and new efficient versioning techniques allow applications to retain virtually unlimited amounts of past states. From a casino that analyzes past table states to detect card counters, to a health-care system that audits past states to prove compliance with patient privacy policies, we are moving into a world that does not update, but only inserts new states [1]. At the same time, applications are becoming increasingly interconnected: isolated systems are the exception, and interconnected distributed systems are the norm. The question is how to organize a system (we call it Immortal) that allows applications to access interconnected distributed past states over unlimited time.

One promising approach is to capture past states as snapshots that support back-in-time execution, where read-only applications run over consistent past states. Recent results in snapshot systems [2,3] indicate that when a storage system is designed with a snapshot system in mind, even high-frequency snapshots can be captured in a non-disruptive way.

Storing past states over unlimited time, however, is a huge challenge compared to short-term storage. Some of the hard problems have to do with threats to the availability, reliability and security of the stored distributed states. Work in archival storage systems is addressing these problems, and promising solutions are in sight [5].

There is, however, a fundamental challenge that has not been addressed adequately: software upgrades. As software evolves over long time-scales, the challenge is twofold: to provide application access to past software versions, and to ensure that this does not compromise access to the current objects. The challenge is real. Its recognizable signs are data made inaccessible by obsolete software versions, and software versions bloated by support for legacy behavior.
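To make the idea concrete, here is a minimal sketch of back-in-time execution over snapshots that also records which software version wrote each state. It is an illustration under our own assumptions, not the design of Immortal or of the snapshot systems cited above: the names SnapshotStore, register_version, commit and read_back_in_time, and the two example state layouts, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical reader signature: (past state, key) -> value.
Reader = Callable[[Dict[str, Any], str], Any]


@dataclass(frozen=True)
class Snapshot:
    """A past state together with the software version that wrote it."""
    snapshot_id: int
    software_version: int
    state: Dict[str, Any]


class SnapshotStore:
    """Illustrative store: a commit never overwrites, it inserts a new state."""

    def __init__(self) -> None:
        self._snapshots: List[Snapshot] = []
        # One retained reader per software version (assumption of this sketch).
        self._readers: Dict[int, Reader] = {}

    def register_version(self, version: int, reader: Reader) -> None:
        self._readers[version] = reader

    def commit(self, version: int, state: Dict[str, Any]) -> int:
        # Shallow-copy the state so later changes to the caller's dict
        # do not alter the recorded past.
        snap = Snapshot(len(self._snapshots), version, dict(state))
        self._snapshots.append(snap)
        return snap.snapshot_id

    def read_back_in_time(self, snapshot_id: int, key: str) -> Any:
        # Read-only access: interpret the past state with the code
        # of the version that produced it.
        snap = self._snapshots[snapshot_id]
        return self._readers[snap.software_version](snap.state, key)


def read_v1(state: Dict[str, Any], key: str) -> Any:
    # Version 1 layout: the value is stored directly.
    return state[key]


def read_v2(state: Dict[str, Any], key: str) -> Any:
    # Version 2 layout (incompatible with v1): the value is nested in a record.
    return state[key]["chips"]


store = SnapshotStore()
store.register_version(1, read_v1)
store.register_version(2, read_v2)
s0 = store.commit(1, {"table_7": 120})
s1 = store.commit(2, {"table_7": {"chips": 95, "dealer": "d3"}})
old_value = store.read_back_in_time(s0, "table_7")
new_value = store.read_back_in_time(s1, "table_7")
```

In this sketch the version 2 reader carries no code for the version 1 layout; the older state stays readable only because the version 1 reader is retained next to the snapshot it describes.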
The reason the problem has been inadequately addressed is that its true importance only comes into focus when considering long time-scales. Over long time-scales, the software of the objects accessed by an application will be upgraded. To keep the past states of an upgraded object "alive", that is, accessible by back-in-time execution, Immortal will need to retain its past software versions. Most existing distributed software upgrade systems solve the problem of past software versions by mandating that upgrades be compatible, that is, that new software versions support legacy behavior. The compatible upgrade approach is simpler in the short term because it avoids the difficult problem of incompatible distributed upgrades, where some nodes run incompatible versions communicating by incompatible protocols. Over long time-scales, however, support for legacy behavior leads to software bloat, causing software versions to become complex and fragile. Many problems in today's software systems can be traced to software bloat.

We believe Immortal must adopt an automatic upgrade methodology that supports incompatible distributed upgrades [4] and controls software bloat. Such a methodology should be designed with long time-scales in mind. It should be based on precise specifications that reflect well-defined semantics of multi-version protocols. The upgrader should be able to define how long legacy behavior needs to be supported by defining the deployment schedule for incompatible upgrades [4].

The Immortal system is a grand challenge for distributed systems. Building Immortal is a hard, multifaceted problem that requires reconsidering many basic issues in today's distributed systems, of which we have outlined only one. Yet, if successful, Immortal would provide an important step forward: distributed systems would gain the "wisdom" of the past without the loss of "vigor" in the present.

References:

[1] J. Gray. Communications.
[2] H. Xu and L. Shrira, "SNAP: non-disruptive high-frequency snapshot system", ICDE 2005.
[3] R. Moh and B. Liskov, "Timeline: Efficient Distributed Snapshots", NSDI 2004.
[4] S. Ajmani, B. Liskov, L. Shrira and D. Curtis, "Automatic Software Upgrades in Distributed Systems", Technical Report MIT/CSAIL, March 2005.
[5] M. Baker, K. Keeton and S. Martin, "Why traditional storage systems do not help us to save stuff forever", HotDep 2005.