Parallel & Distributed Operating Systems Group

Noria: data-flow for web applications

Project overview

Noria is an attempt at designing a database specifically tailored for web applications, providing automatic caching, safe and effortless schema migrations, and native support for reactive use.

Noria observes that, by having developers provide the set of queries their application will make in advance, the database can be smarter about how to execute those queries. In particular, it can choose to pre-compute, and incrementally maintain, the results for queries. This allows Noria to answer those queries quickly, and essentially obviates the need for application caches.
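
To make the idea of pre-computing and incrementally maintaining a query result concrete, here is a minimal sketch in Python. It is not Noria's implementation or API: the `VoteCountView` class, its method names, and the `votes` example query are all hypothetical, chosen only to illustrate how a result can be updated on each write instead of recomputed on each read.

```python
class VoteCountView:
    """Hypothetical materialized view for a query declared in advance:
    SELECT article_id, COUNT(*) FROM votes GROUP BY article_id
    """

    def __init__(self):
        # article_id -> current count; this is the pre-computed result.
        self.counts = {}

    def on_insert(self, article_id):
        # Incremental maintenance: adjust one entry, no rescan of the table.
        self.counts[article_id] = self.counts.get(article_id, 0) + 1

    def on_delete(self, article_id):
        remaining = self.counts.get(article_id, 0) - 1
        if remaining > 0:
            self.counts[article_id] = remaining
        else:
            # Drop empty groups so the view matches the query result.
            self.counts.pop(article_id, None)

    def lookup(self, article_id):
        # Reads are a single dictionary lookup.
        return self.counts.get(article_id, 0)


view = VoteCountView()
for article_id in [1, 1, 2]:
    view.on_insert(article_id)
view.on_delete(1)
```

In this sketch the cost of a read does not depend on the number of votes, which is the property that makes a separate application cache unnecessary.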

Project components

Streaming data model

Noria is built from the bottom up as a streaming data system based on data-flow. This allows web applications to observe a stream of changes to the result sets of the queries they are interested in, which fits well with the new reactive-style web applications inspired by Meteor.
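
As an illustration of what observing a stream of changes to a result set could look like, the following Python sketch models a single data-flow operator that forwards signed deltas to subscribers. The `CountOperator` class, the `subscribe` and `process` methods, and the `("+", row)` / `("-", row)` delta encoding are assumptions made for this example, not Noria's interface.

```python
class CountOperator:
    """Hypothetical data-flow operator that counts records per key and
    emits the resulting changes to the query result as signed deltas."""

    def __init__(self):
        self.counts = {}
        self.subscribers = []

    def subscribe(self, callback):
        # A reactive client registers to be told about result changes.
        self.subscribers.append(callback)

    def process(self, key, sign):
        # sign is +1 for an inserted record and -1 for a removed one.
        old = self.counts.get(key, 0)
        new = old + sign
        deltas = []
        if old > 0:
            deltas.append(("-", (key, old)))  # retract the stale row
        if new > 0:
            deltas.append(("+", (key, new)))  # assert the updated row
            self.counts[key] = new
        else:
            self.counts.pop(key, None)
        for callback in self.subscribers:
            for delta in deltas:
                callback(delta)


received = []
op = CountOperator()
op.subscribe(received.append)
op.process("a", +1)
op.process("a", +1)
op.process("a", -1)
```

Each subscriber sees only what changed in the result, so a client can update its display without re-running the query.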

Distribution and scaling

The data-flow computation model used in Noria enables efficient multi-core and cross-machine implementation of the application’s set of queries. By carefully analyzing the data-flow graph, Noria can make strategic choices about which operators should be placed on which computers, what state should be materialized, and how to shard and partition the data for availability and performance.
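
One common way to shard operator state is to hash-partition rows by the key a query groups or looks up on. The Python sketch below shows that general technique only; the page does not say which partitioning scheme Noria uses, and the `shard_for` and `partition` functions and the sample rows are hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash, so every machine
    agrees on the placement (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(rows, key_column, num_shards):
    """Split rows into num_shards lists; rows with equal keys land together."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row[key_column], num_shards)].append(row)
    return shards


# Illustrative data: 20 votes spread over 5 articles, split across 4 shards.
rows = [{"article_id": i % 5, "user": f"u{i}"} for i in range(20)]
shards = partition(rows, "article_id", 4)
```

Because all rows for one `article_id` land on the same shard, a per-article aggregate can be maintained on that shard without coordinating with the others.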

Safe schema migrations

By re-using the state it has already materialized, Noria can also provide fast schema migrations. Since the raw data log is always kept, migrations can be undone easily, and queries written against the pre-migration schema can still be satisfied. Furthermore, since users only specify queries, not the underlying schema, the system can choose to internally implement those queries in whatever schema it deems most efficient.
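
The role of the raw data log in migrations can be sketched as follows. This is a simplified Python model under stated assumptions, not Noria's mechanism: the `Store` class, its methods, and the two example views are hypothetical. A new query is added by replaying the log into fresh state, existing views keep working, and undoing the migration amounts to dropping the new view.

```python
class Store:
    """Hypothetical store that keeps an append-only log plus derived views."""

    def __init__(self):
        self.log = []    # raw data log, never discarded
        self.views = {}  # view name -> (fold function, materialized state)

    def write(self, record):
        self.log.append(record)
        for fold, state in self.views.values():
            fold(state, record)  # keep every live view up to date

    def add_view(self, name, fold):
        # Migration: build the new view's state by replaying the log.
        state = {}
        for record in self.log:
            fold(state, record)
        self.views[name] = (fold, state)

    def drop_view(self, name):
        # Undo: the log is untouched, so nothing is lost.
        del self.views[name]

    def read(self, name):
        return self.views[name][1]


def count_by(field):
    def fold(state, record):
        state[record[field]] = state.get(record[field], 0) + 1
    return fold


store = Store()
store.add_view("votes_per_article", count_by("article"))
store.write({"article": 1, "user": "ann"})
store.write({"article": 1, "user": "bob"})

# Migrate: add a query that did not exist when the data was written.
store.add_view("votes_per_user", count_by("user"))
store.write({"article": 2, "user": "ann"})

per_article = dict(store.read("votes_per_article"))
per_user = dict(store.read("votes_per_user"))

# Undo the migration; the pre-migration view is still served.
store.drop_view("votes_per_user")
```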

People

Publications

Open source

Noria is available on GitHub at mit-pdos/noria.