Published in Oracle Developers

Running Apache Zeppelin on the cloud

Introduction

OCI: get started quickly

  • Introduction
  • Signing up
  • Setup
  • Actions to get on the cloud

Zeppelin: get started quickly

Running Apache Zeppelin

Logging into the VM instance

### Oracle Linux and CentOS images, user name: opc
### the Ubuntu image, user name: ubuntu
$ ssh -i ~/.ssh/id_rsa ubuntu@132.145.60.249
or
$ ssh ubuntu@132.145.60.249
The authenticity of host '132.145.60.249 (132.145.60.249)' can't be established. 
ECDSA key fingerprint is SHA256:USafjsySmPItXTdBOsQyiYbEdiFSa7Cs1so+9EnKC4M.
Are you sure you want to continue connecting (yes/no)? yes
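The login user name depends on the image, as the comments above note. Here is a minimal sketch of that mapping as a shell helper. The function name ssh_user_for_image and the image labels are hypothetical, chosen only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: map an image family to its default SSH user name.
# The labels ("oracle-linux", "centos", "ubuntu") are illustrative only.
ssh_user_for_image() {
  case "$1" in
    oracle-linux|centos) echo "opc" ;;
    ubuntu)              echo "ubuntu" ;;
    *) echo "unknown image: $1" >&2; return 1 ;;
  esac
}

# Example: print (not run) the ssh command for the Ubuntu instance in this post.
echo "ssh -i ~/.ssh/id_rsa $(ssh_user_for_image ubuntu)@132.145.60.249"
```

The helper only prints the command, so it is safe to try without a running instance.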

Cloning the git repo

$ git clone https://github.com/neomatrix369/awesome-ai-ml-dl/
$ cd awesome-ai-ml-dl/examples/apache-zeppelin

Installing Docker

$ ./installDocker.sh
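After the install script finishes, it can help to confirm that the docker CLI is actually on the PATH before moving on. This is a small sketch, not part of the repository. The helper name require_cmd is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Report whether the docker CLI is available after installation;
# '|| true' keeps the script going even when it is missing.
require_cmd docker || true
```

Once docker is found, running docker --version prints the installed version.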

Building Apache Zeppelin Docker image (optional)

$ DOCKER_USER_NAME=<your Docker Hub username> IMAGE_VERSION=0.2 ./buildZeppelinDockerImage.sh
Sending build context to Docker daemon  34.82kB
Step 1/21 : ARG ZEPPELIN_VERSION
Step 2/21 : FROM apache/zeppelin:${ZEPPELIN_VERSION}
---> 353d7641c769
Step 3/21 : ARG SPARK_VERSION
---> Using cache
---> 2ca1b6703dd7
Step 4/21 : ENV SPARK_VERSION=${SPARK_VERSION:-2.4.3}
---> Using cache
---> f507d31d0aca
Step 5/21 : RUN echo "$LOG_TAG Download Spark binary" && wget -O /tmp/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz
---> Running in c94542e7eb00
[ZEPPELIN_0.8.1]: Download Spark binary
--2019-10-13 19:55:16-- http://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
Saving to: ‘/tmp/spark-2.4.4-bin-hadoop2.7.tgz’
[--snipped--]
213350K .......... .......... .......... .......... .......... 94% 51.4K 3m0s
213400K .......... .......... .......... .......... .......... 94% 88.1K 2m59s
213450K .......... .......... .......... .......... .......... 95% 58.7K 2m59s
213500K .......... .......... .......... .......... .......... 95% 45.5K 2m58s
213550K .......... .......... .......... .......... .......... 95% 4.40M 2m57s
213600K .......... .......... .......... .......... .......... 95% 83.8K 2m56s
213650K .......... .......... .......... .......... .......... 95% 91.9K 2m55s
213700K .......... .......... .......... .......... .......... 95% 67.2K 2m55s
213750K .......... .......... .......... .......... .......... 95% 166K 2m54s
213800K .......... .......... .......... .......... .......... 95% 79.8K 2m53s
[--snipped--]
Step 21/21 : CMD ["bin/zeppelin.sh"]
---> Running in 843684f60302
Removing intermediate container 843684f60302
---> 5833f13ff7c7
Successfully built 5833f13ff7c7
Successfully tagged neomatrix369/zeppelin:0.2

  • amendments were made to the Zeppelin-Dockerfile
  • the build and run scripts also look different (buildZeppelinDockerImage.sh and runZeppelinDockerContainer.sh)
  • and we are now using image version 0.2 (see the CLI usages in this post)
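The tag in the build log, neomatrix369/zeppelin:0.2, appears to combine the two variables passed to the build script. The sketch below shows that composition under this assumption. The helper zeppelin_image_name is hypothetical and is not taken from the repository's scripts:

```shell
#!/bin/sh
# Hypothetical helper: compose "<user>/zeppelin:<version>" from the two
# environment variables used in this post. Fails if either is missing.
zeppelin_image_name() {
  if [ -z "${DOCKER_USER_NAME:-}" ] || [ -z "${IMAGE_VERSION:-}" ]; then
    echo "DOCKER_USER_NAME and IMAGE_VERSION must both be set" >&2
    return 1
  fi
  echo "${DOCKER_USER_NAME}/zeppelin:${IMAGE_VERSION}"
}

# Example with the values from the build log; the subshell keeps the
# variables from leaking into the rest of the session.
(DOCKER_USER_NAME=neomatrix369; IMAGE_VERSION=0.2; zeppelin_image_name)
```

Checking both variables up front avoids building or pushing an image with a malformed name such as /zeppelin: when one of them is forgotten.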

Pushing the Docker Image to Docker Hub (optional)

$ DOCKER_USER_NAME=<your Docker Hub username> IMAGE_VERSION=0.2 ./push-apache-zeppelin-docker-image-to-hub.sh

Before pushing, make sure that:

  • you have an account on Docker Hub (e.g. neomatrix369; use your own account, of course)
  • you are logged into your Docker Hub account locally
  • you have set DOCKER_USER_NAME to your Docker Hub username

Running Apache Zeppelin from the Docker Image

$ docker pull neomatrix369/zeppelin:0.1
$ ./runZeppelinDockerContainer.sh
$ docker pull neomatrix369/zeppelin:0.2
$ IMAGE_VERSION=0.2 ./runZeppelinDockerContainer.sh
ubuntu@instance-20191014-0101:~/awesome-ai-ml-dl/examples/apache-zeppelin$ IMAGE_VERSION=0.2 ./runZeppelinDockerContainer.sh
Please wait till the log messages stop moving, it will be a sign that the service is ready! (about a minute or so)
Once the service is ready, go to http://localhost:8080 to open the Apache Zeppelin homepage
Pid dir doesn't exist, create /zeppelin/run
OpenJDK GraalVM CE 19.0.0 warning: ignoring option MaxPermSize=512m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/zeppelin/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/zeppelin/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[---snipped---]
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.
WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
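The run script asks you to wait until the log messages stop moving. As an alternative to watching the log, the sketch below polls the URL until it answers. It assumes curl is installed on the machine. The helper wait_for_url and its default retry values are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it responds or attempts run out.
# Arguments: URL, max attempts (default 30), seconds between tries (default 5).
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: fail on HTTP errors, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

From a second terminal on the VM, wait_for_url http://localhost:8080 would then return once the Zeppelin homepage starts responding, or fail after the attempts are used up.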

Opening the Apache Zeppelin notes in your browser

http://132.145.60.249:8080

Using Apache Zeppelin notes

Create a custom image for reuse

Power-user

Signing off

[--snipped--]
Oct 14, 2019 1:02:40 AM org.glassfish.jersey.internal.Errors logErrors
WARNING: The following warnings have been detected: WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.InterpreterRestApi.listInterpreter(java.lang.String), should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.
WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
^C

Conclusion

Apache Zeppelin:

  • offers flexibility similar to Jupyter notebooks, and its functionality can be extended via configurations and extensions
  • always displays execution progress per paragraph (cell) in real time, unlike Jupyter notebooks
  • uses lazy execution to help efficiency
  • allows round-trip navigation between table data and its visualisation within a paragraph (cell)
  • may appear a bit slower than Jupyter notebooks at times, but there are ways to speed this up (for future posts to cover)
  • all in all, is a great place for Java/JVM developers to feel at home and do ML experiments on the JVM

OCI:

  • is an easy-to-use cloud environment
  • lets us set up our environment quickly, so we can bring our apps and solutions to market fast
  • enables us to run Apache Zeppelin (natively or via a Docker image)
  • provides instances that can be shared publicly or privately, depending on your network security settings
  • provides ways to secure your infrastructure on the cloud (not covered in depth here; see the Security section of the OCI docs to learn more)

Resources

Mani Sarkar

Java/JVM dev, Software Crafter, @GraalVM, VMs, Dev. communities, DevOps, containers, JS, speaker, blogs @theNeomatrix369 https://www.linkedin.com/in/mani-sarkar