Making Image Classification Simple With Spark Deep Learning

Zied Sellami
Jun 28, 2017

Introduction

Prerequisites

curl -O https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz
tar xzf spark-2.1.1-bin-hadoop2.7.tgz
Sample project and images: https://github.com/zsellami/images_classification
TensorFlow installation guide: https://www.tensorflow.org/install/
sudo pip install nose
sudo pip install pillow
sudo pip install keras
sudo pip install h5py
sudo pip install py4j
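
Before starting Spark, it can save time to confirm that the Python packages listed above actually import. The following is a minimal sketch, not part of the original setup: the helper name `missing_modules` and the `REQUIRED` list are illustrative, and the list assumes the usual import names (pillow is imported as `PIL`).

```python
import importlib


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names for the prerequisites above (assumed); pillow is imported as PIL.
REQUIRED = ["tensorflow", "nose", "PIL", "keras", "h5py", "py4j"]
```

Calling `missing_modules(REQUIRED)` in a Python interpreter should return an empty list once every prerequisite is installed; any name it returns still needs attention.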

Run pyspark with the spark-deep-learning library

export SPARK_HOME=PATH/TO/spark-2.1.1-bin-hadoop2.7
export JAVA_OPTS="-Xmx9G -XX:MaxPermSize=2G -XX:+UseCompressedOops -XX:MaxMetaspaceSize=512m"
$SPARK_HOME/bin/pyspark --packages databricks:spark-deep-learning:0.1.0-spark2.1-s_2.11 --driver-memory 5g

Let’s code image classification in the pyspark shell
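
As a rough illustration of what such a session can look like, here is a sketch of transfer learning with the spark-deep-learning package: a pre-trained InceptionV3 network turns each image into a feature vector, and a logistic regression is trained on those features. It assumes the shell was started with the package loaded and that two folders each hold images of one class. The function name, its parameters and the hyperparameter values are illustrative assumptions, not necessarily the settings used for this article.

```python
def train_binary_image_classifier(positive_dir, negative_dir, train_fraction=0.6):
    """Fit InceptionV3 features + logistic regression on two image folders.

    Returns (fitted pipeline model, accuracy on the held-out split).
    """
    # Imported inside the function so that defining it does not require Spark;
    # calling it needs pyspark started with the spark-deep-learning package.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.sql.functions import lit
    from sparkdl import DeepImageFeaturizer, readImages

    # Load each folder as a DataFrame of images and attach a numeric label.
    positive_df = readImages(positive_dir).withColumn("label", lit(1.0))
    negative_df = readImages(negative_dir).withColumn("label", lit(0.0))

    # Split each class separately to keep class proportions similar in both splits.
    weights = [train_fraction, 1.0 - train_fraction]
    pos_train, pos_test = positive_df.randomSplit(weights)
    neg_train, neg_test = negative_df.randomSplit(weights)
    train_df = pos_train.union(neg_train)
    test_df = pos_test.union(neg_test)

    # Stage 1 extracts features with the pre-trained network,
    # stage 2 trains a classifier on them (illustrative hyperparameters).
    featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                     modelName="InceptionV3")
    lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3,
                            labelCol="label")
    model = Pipeline(stages=[featurizer, lr]).fit(train_df)

    # Score the held-out images and compute accuracy.
    predictions = model.transform(test_df).select("prediction", "label")
    evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
    return model, evaluator.evaluate(predictions)
```

In the shell this could be called as `model, accuracy = train_binary_image_classifier("PATH/TO/class_a", "PATH/TO/class_b")`, where both paths are placeholders for two image folders. Only the small logistic regression is trained, which is why this approach can work with a modest number of images.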

Running another sample very quickly

curl -O http://download.tensorflow.org/example_images/flower_photos.tgz
tar xzf flower_photos.tgz
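
Once the archive is extracted, it helps to see which class folders exist and how many images each one holds before picking two of them to classify. The sketch below assumes the archive unpacks into a `flower_photos` folder with one subfolder per flower type; the helper name and the extension list are illustrative.

```python
import os

# File extensions treated as images (an assumption; adjust as needed).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(root_dir):
    """Map each immediate subfolder of root_dir to its number of image files."""
    counts = {}
    for entry in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip loose files such as a license text
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Running `count_images_per_class("flower_photos")` from the directory where the archive was extracted returns a dictionary of folder names and image counts, which makes it easy to choose two classes of similar size.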

Conclusion
