
The Cross-Platform
Data Processing

RHEEM is a system designed to fully support cross-platform data processing: it lets users run data analytics over multiple data processing platforms. To do so, it provides an abstraction layer on top of existing platforms, so that a data analytic task can run on any set of them. As a result, users can focus on the logic of their applications rather than on the intricacies of the underlying platforms.

Turning a Zoo into a Circus

Read more on how RHEEM tames the Zoo of existing data processing platforms so that they work together.


How we tame the jungle for you


Run a single data analytic task on top of any set of data processing platforms.


RHEEM selects the best available data processing platform for any incoming query.


User-defined functions (UDFs) are first-class citizens, enabling extensibility and adaptability.


A simple interface that allows developers to focus only on the logic of their application.

Cost Saving

Fast development of data analytic applications.

Open Source

All code is on GitHub under the Apache License.