RHEEM is a system designed to fully support cross-platform data processing. That is, it enables users to run data analytics over multiple data processing platforms. To do so, it provides an abstraction on top of existing platforms, so that a data analytic task can run on any set of them. As a result, users can focus on the logic of their applications rather than on the intricacies of the underlying platforms.
Read more on how RHEEM tames the zoo of existing data processing platforms to work together.
How we tame the jungle for you
Run a single data analytic task on top of any set of data processing platforms.
Selects the best available data processing platform for any incoming query.
User-defined functions (UDFs) as first-class citizens, enabling extensibility and adaptability.
A simple interface that allows developers to focus only on the logic of their application.
Fast development of data analytic applications.
All code is on GitHub under the Apache License.
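
To make the idea of platform selection concrete, here is a minimal, purely illustrative sketch in Python. It is not RHEEM's API or its optimizer: the `Platform` class, the `choose_platform` and `execute` functions, the platform names, and the cost numbers are all hypothetical, chosen only to show how a task written once as a chain of UDFs could be routed to whichever platform has the lowest estimated cost.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Platform:
    """Hypothetical stand-in for a data processing platform."""
    name: str
    startup_cost: float   # assumed fixed overhead of launching a job
    cost_per_item: float  # assumed cost of processing one data item

    def estimated_cost(self, n_items: int) -> float:
        # Toy linear cost model; a real optimizer would be far richer.
        return self.startup_cost + self.cost_per_item * n_items

    def run(self, data: List, udfs: List[Callable]) -> List:
        # A real platform would execute natively; this sketch simply
        # applies each UDF in order, in-process.
        result = list(data)
        for udf in udfs:
            result = [udf(x) for x in result]
        return result


def choose_platform(platforms: List[Platform], n_items: int) -> Platform:
    """Pick the platform with the lowest estimated cost for this input size."""
    return min(platforms, key=lambda p: p.estimated_cost(n_items))


def execute(data: Iterable, udfs: List[Callable],
            platforms: List[Platform]) -> Tuple[str, List]:
    """Run a platform-agnostic task and report which platform was chosen."""
    items = list(data)
    platform = choose_platform(platforms, len(items))
    return platform.name, platform.run(items, udfs)


# Made-up platforms: one cheap to start, one cheap per item.
PLATFORMS = [
    Platform("single-node", startup_cost=1.0, cost_per_item=0.01),
    Platform("cluster", startup_cost=50.0, cost_per_item=0.001),
]

if __name__ == "__main__":
    udfs = [lambda x: x + 1, lambda x: x * 2]
    print(execute([1, 2, 3], udfs, PLATFORMS))
    print(execute(range(10_000), udfs, PLATFORMS)[0])
```

In this toy model the same task lands on the hypothetical "single-node" platform for a tiny input and on "cluster" for a large one, without the application code changing.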