sparkavro: Manipulate Apache Avro files with sparklyr

I created a simple sparklyr extension for handling Apache Avro files. It is just a thin wrapper around Databricks' spark-avro, and it is listed in the official documentation of sparklyr extensions.

Visualize your massive data with Impala and Redash

Redash is a popular open-source visualization tool that lets you visualize your data with SQL. It supports Apache Impala (incubating), a fast SQL-on-Hadoop engine suited to BI tools and exploratory analysis. With Impala, you can run SQL queries against tables on…

Building a predictive model with Ibis, Impala and scikit-learn


  • visualizing MovieLens 20M data (a well-known movie rating dataset) with Ibis
  • building a predictive model of movie preferences with scikit-learn
Democratizing Data
Think like an amateur, write as a professional
