Evaluating End-to-End Optimization for Data Analytics Applications in Weld. S. Palkar et al. PVLDB 2018
Weld optimizes a whole analytics program by using a common intermediate representation (IR) across libraries such as Spark SQL, TensorFlow, Pandas, and NumPy. Each library emits a fragment of IR, and Weld combines these fragments so that the program can be optimized as a whole rather than one library call at a time. This PVLDB paper is a follow-up to the initial paper published at CIDR 2017, and it evaluates the effectiveness of rule-based optimizations such as code fusion across libraries, loop unrolling, and vectorization. Weld generates multi-threaded LLVM code from the optimized IR, and the experimental results show performance comparable to that of specialized data processing engines such as HyPer (VLDB 2011), which also generates LLVM code from SQL.
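To make the idea of fusion across library calls concrete, here is a minimal sketch in plain Python. It does not use Weld's API or IR; the function names (`filter_positive`, `square`, `total`, `unfused`, `fused`) are hypothetical stand-ins for calls into separate libraries. The unfused version materializes an intermediate list after every call, while the fused version does the same work in one pass, which is roughly the kind of rewrite a cross-library optimizer aims for.

```python
# Hypothetical illustration of fusion across library calls.
# This is NOT Weld's API or IR; all names here are made up for the example.

def filter_positive(values):
    """Stand-in for one library call: builds a new filtered list."""
    return [v for v in values if v > 0]

def square(values):
    """Stand-in for a second library call: builds another new list."""
    return [v * v for v in values]

def total(values):
    """Stand-in for a third library call: reduces a list to a number."""
    return sum(values)

def unfused(values):
    """Each call scans its input and materializes an intermediate result."""
    return total(square(filter_positive(values)))

def fused(values):
    """The same computation as a single loop with no intermediate lists."""
    acc = 0
    for v in values:
        if v > 0:
            acc += v * v
    return acc

data = [3, -1, 4, -1, 5]
print(unfused(data), fused(data))  # prints "50 50"
```

Both versions return the same result; the difference is that the fused loop reads the input once and allocates nothing in between, which is where the savings come from when the intermediate data is large.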
- Shoumik Palkar, James Thomas, Deepak Narayanan, Pratiksha Thaker, Rahul Palamuttam, Parimarjan Negi, Anil Shanbhag, Malte Schwarzkopf, Holger Pirk, Saman Amarasinghe, Samuel Madden, Matei Zaharia:
Evaluating End-to-End Optimization for Data Analytics Applications in Weld. PVLDB 2018.