Metis is an in-memory MapReduce library optimized for multicore architectures. Its efficiency comes from using hash tables to store intermediate key/value pairs. To keep key/value insertions and lookups O(1), Metis determines the size of each hash table by sampling the input. Metis also organizes the key/value pairs within each hash table slot as a B+Tree, so that an inaccurate sample does not degrade performance. For the Merge phase, Metis uses the Parallel Sorting by Regular Sampling (PSRS) algorithm to achieve high parallelism.
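The hash table design described above can be illustrated with a small sketch. This is not Metis code: the names `estimate_distinct_keys`, `SampledHashTable`, and `word_count` are hypothetical, the way the sample is scaled up is an assumption, and a sorted list stands in for the B+Tree in each slot. It only shows why an ordered structure per slot keeps lookups cheap when the sample underestimates the number of keys.

```python
import bisect
import random


def estimate_distinct_keys(records, key_of, sample_size=1000, seed=0):
    """Guess the number of distinct keys by scaling up a random sample.

    Assumption: this crude linear scaling is for illustration only and is
    not how Metis samples its input. The guess can be badly wrong, which
    is the case the ordered slots below are meant to absorb.
    """
    if not records:
        return 1
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    distinct = len({key_of(r) for r in sample})
    return max(1, distinct * len(records) // len(sample))


class SampledHashTable:
    """Fixed-size hash table whose slots keep their keys in sorted order.

    The sorted list is a stand-in for a B+Tree: lookups inside a slot are
    O(log n) by binary search. Unlike a real B+Tree, inserting into a
    Python list is O(n), so this models only the lookup behaviour.
    """

    def __init__(self, expected_keys):
        self.nslots = max(1, expected_keys)
        # Each slot holds two parallel lists: sorted keys and their values.
        self.slots = [([], []) for _ in range(self.nslots)]

    def _find(self, key):
        keys, values = self.slots[hash(key) % self.nslots]
        return keys, values, bisect.bisect_left(keys, key)

    def insert(self, key, value):
        keys, values, i = self._find(key)
        if i < len(keys) and keys[i] == key:
            values[i].append(value)      # existing key: group the value
        else:
            keys.insert(i, key)          # new key: keep the slot sorted
            values.insert(i, [value])

    def lookup(self, key):
        keys, values, i = self._find(key)
        if i < len(keys) and keys[i] == key:
            return values[i]
        return []


def word_count(words):
    """Map each word to a list of 1s, grouped by key in the table."""
    table = SampledHashTable(estimate_distinct_keys(words, key_of=lambda w: w))
    for w in words:
        table.insert(w, 1)
    return table
```

If the estimate is close, each slot holds about one key and the binary search is trivial. If the estimate is far too low, many keys share a slot, but a lookup still costs a logarithmic number of comparisons rather than a linear scan.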

New release!

Oct 23rd, 2012: We are happy to release the C++ version of Metis through GitHub.


Mar 7th, 2012: We have included matrix_mult2, a version of matrix multiply optimized by Mark Roth.

Metis code is available at:
The code comes with modified versions of the eight MapReduce applications from Phoenix, and a modified version of Streamflow. Metis is tested on Linux x86_64. To compile Metis, you may need to install the libnuma and GCC packages. See the README file in the top-level directory of the Metis source tree for detailed instructions.

The data for the tests described in the paper is available at (you need to run data/ to generate all the data files):

MIT CSAIL Parallel & Distributed Operating Systems Group