Evaluating End-to-End Optimization for Data Analytics Applications in Weld. S. Palkar et al. PVLDB 2018

Taro L. Saito
Database Journal Club
1 min read · Jul 13, 2018

Weld optimizes a whole program by using a common intermediate representation (IR) shared by libraries such as Spark SQL, TensorFlow, Pandas, and NumPy. The IR fragments generated by the individual libraries are combined so that the entire workload can be optimized as a single unit. This PVLDB paper is a follow-up to the initial paper published at CIDR 2017, and it evaluates the effectiveness of rule-based optimizations such as code fusion across libraries, loop unrolling, and vectorization. Weld generates multi-threaded LLVM code from the optimized IR, and the experimental results show performance comparable to specialized data processing engines such as HyPer (VLDB 2011), which also generates LLVM code from SQL.
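To make the idea of cross-library fusion concrete, here is a minimal sketch in plain Python. It is not Weld's actual IR or API: `Source`, `Map`, `lib_a_scale`, `lib_b_add`, `fuse`, `count_loops`, and `evaluate` are all hypothetical names invented for illustration. Two pretend libraries record their work as nodes of a shared toy IR instead of running eagerly, and a single rewrite rule merges adjacent element-wise loops into one pass.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Source:
    """Leaf node of the toy IR: the input data."""
    data: List[float]


@dataclass
class Map:
    """Element-wise operation that is recorded lazily, not executed."""
    fn: Callable[[float], float]
    child: Union["Source", "Map"]


# Two hypothetical "libraries" that emit IR nodes instead of computing eagerly.
def lib_a_scale(node, factor):
    return Map(lambda x: x * factor, node)


def lib_b_add(node, offset):
    return Map(lambda x: x + offset, node)


def fuse(node):
    """Rule-based rewrite: Map(f, Map(g, c)) becomes Map(f . g, c)."""
    if isinstance(node, Source):
        return node
    child = fuse(node.child)
    if isinstance(child, Map):
        f, g = node.fn, child.fn
        # Compose the two functions so the data is traversed only once.
        return Map(lambda x, f=f, g=g: f(g(x)), child.child)
    return Map(node.fn, child)


def count_loops(node):
    """Number of passes over the data that the plan would perform."""
    return 0 if isinstance(node, Source) else 1 + count_loops(node.child)


def evaluate(node):
    """Naive interpreter: one list comprehension (loop) per Map node."""
    if isinstance(node, Source):
        return list(node.data)
    return [node.fn(x) for x in evaluate(node.child)]


# The combined plan spans both libraries: (x * 2.0) + 1.0
plan = lib_b_add(lib_a_scale(Source([1.0, 2.0, 3.0]), 2.0), 1.0)
fused = fuse(plan)
print(count_loops(plan), count_loops(fused))  # 2 1
print(evaluate(fused))  # [3.0, 5.0, 7.0]
```

The unfused plan materializes an intermediate list between the two library calls, while the fused plan does not. That saving is the kind of effect the paper attributes to fusion across libraries, although Weld itself applies such rewrites on its own IR and then compiles to LLVM rather than interpreting.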


Ph.D., researcher and software engineer, pursuing database technologies for everyone http://xerial.org/leo