My favorite Python libraries
In this article I want to jot down the libraries that I really like to use because of how they are designed.
Dask is a library that helps you run algorithms in a parallel and/or distributed way.
Most data scientists use Spark for distributed programming, but the problem with Spark is that there is no intuitive and efficient way to do matrix operations. For example, in my image processing work, I designed an algorithm using OpenCV to do feature extraction, and now I want to run it in a distributed fashion and do machine learning on the results. Spark is not flexible enough to integrate a process that uses numpy.array.
Dask helps me by expressing the work as a computation graph. I just write each step of my algorithm as a separate function and then feed them to Dask. Dask also has 3 data types that closely mimic the most popular Python libraries, including:
- dask.array == numpy.array
- dask.dataframe == pandas.DataFrame
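To show what "mimic" means in practice, here is a minimal sketch that runs the same expression with NumPy and with dask.array. The array contents and the chunk size are made up for illustration, and it assumes both numpy and dask are installed.

```python
import numpy as np
import dask.array as da

# Illustrative data only: a small 3x4 matrix.
x_np = np.arange(12, dtype=float).reshape(3, 4)

# Wrap the NumPy array in a Dask array split into two column chunks.
# The chunk shape (3, 2) is an arbitrary choice for this sketch.
x_da = da.from_array(x_np, chunks=(3, 2))

# Same syntax as NumPy, but nothing is computed yet: Dask only builds a graph.
lazy_result = (x_da + 1).sum(axis=0)

# compute() executes the graph and returns a regular NumPy array.
result = lazy_result.compute()
```

The point is that the expression itself is identical to the NumPy version; only the wrapping step and the final compute() call are Dask-specific.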
You can also run the computation graph with any of the 4 available techniques for running an application in parallel.
The Dask project was funded by DARPA XData and Continuum Analytics, so I think it has a promising future.
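The workflow described above, where each step is a separate function that gets handed to Dask, can be sketched with dask.delayed. This is only an illustration: extract_features and combine are hypothetical stand-ins (not my real OpenCV code), the "images" are tiny nested lists, and the threaded scheduler is just one possible choice.

```python
from dask import delayed


def extract_features(image):
    # Hypothetical stand-in for an OpenCV feature extractor:
    # here a "feature" is simply the sum of each image row.
    return [sum(row) for row in image]


def combine(feature_lists):
    # Flatten the per-image feature lists into one list.
    return [value for features in feature_lists for value in features]


# Made-up 2x2 "images" for illustration.
images = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

# Each delayed call adds a node to the graph instead of running immediately.
tasks = [delayed(extract_features)(image) for image in images]
graph = delayed(combine)(tasks)

# Execute the graph; scheduler="threads" selects the threaded scheduler.
features = graph.compute(scheduler="threads")
```

Because the functions are plain Python, the same code can be tested without Dask first and then wrapped in delayed once it works.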
Bottle is a library and/or framework for building web servers. Its design is similar to Flask, except that Bottle has global routing (so I don’t need to create blueprints like in Flask). Because of that, newbies can learn and develop faster ☺. A nice feature of Bottle is that you can choose which server engine you prefer. There are 9 server engines to choose from (including CherryPy and Tornado), ranging from synchronous, multithreaded, and fork-based to asynchronous. There are also many plugins available. In version 0.13, Bottle can use the server engine that Node.js uses.
MXNet in one sentence: a versatile and flexible deep learning framework. Want to run deep learning on a hybrid cluster (half GPU, half CPU)? She can do that. Want to deploy your model on Android? She can nail it (in a clever way, using amalgamation). Want a pretrained model? MXNet has a model zoo, so you can just download one. To finish: it has a promising future because it is now an Apache Software Foundation project (in the incubation phase).