Training Julia ML model in GCP

Matthew Leung
Published in Analytics Vidhya · Jan 25, 2021

Although Google Cloud AI Platform normally expects the training code to be packaged as a Python module, you can run a Julia training module on the AI Platform by packaging it in a Docker image and submitting the image as a custom-container training job. The steps are as follows.

  1. Create the Dockerfile
FROM julia
WORKDIR /app
# Copy the training scripts and the local Julia package into the image.
COPY *.jl /app/
COPY OsRSIConv/ /app/OsRSIConv/
# Install the Julia package dependencies at build time.
RUN julia installPkg.jl
ENTRYPOINT ["julia", "run.jl"]
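The installPkg.jl script referenced above is not shown in this post; a minimal sketch might look like the following, where the package list (CSV, DataFrames) is only a hypothetical placeholder for whatever the training module actually imports:

# installPkg.jl -- install the training module's dependencies at image build time.
using Pkg

# Hypothetical dependency list; replace with the packages run.jl actually uses.
Pkg.add(["CSV", "DataFrames"])

# Precompile now so the training job does not pay the compilation cost at run time.
Pkg.precompile()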

2. Build and push the Docker image to the GCP container registry.

docker build -t gcr.io/<project>/osrsi .
docker push gcr.io/<project>/osrsi

3. Submit the Docker image as a training job to the GCP AI Platform by referencing the image in the --master-image-uri parameter.

gcloud ai-platform jobs submit training $job_id \
--region "us-central1" \
--master-image-uri=gcr.io/$PROJECT_ID/osrsi:latest

There is no need to create any long-running VM, and GCP only charges you for the compute the job actually uses.

Any output, including the model file and CSV results, can be stored in GCP Cloud Storage. Although there is a third-party Cloud Storage API for Julia, simply running the gsutil shell command with Julia's run function is much easier.

run(`google-cloud-sdk/bin/gsutil cp file.csv gs://bucket/file.csv`)
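To upload several artifacts and fail loudly when a copy does not succeed, a small hypothetical wrapper around run can be used (the file names are placeholders, and it assumes gsutil is reachable on the PATH, which the Dockerfile changes below take care of):

# Hypothetical helper: copy a local artifact to a Cloud Storage bucket via gsutil.
function upload_to_gcs(local_path::String, gcs_uri::String)
    # run() throws an exception if gsutil exits with a non-zero status.
    run(`gsutil cp $local_path $gcs_uri`)
end

upload_to_gcs("model.bson", "gs://bucket/model.bson")
upload_to_gcs("file.csv", "gs://bucket/file.csv")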

In order to run the google-cloud-sdk from the shell, the above Dockerfile should be modified to install the SDK and Python. Here are the lines to be added.

# Install the tools needed to download the SDK and Miniconda.
RUN apt-get update && apt-get install -y curl wget
# Download and unpack the Google Cloud SDK.
RUN curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-324.0.0-linux-x86_64.tar.gz
RUN gzip -cd google-cloud-sdk-324.0.0-linux-x86_64.tar.gz | tar -xvf -
# Install Miniconda so gcloud and gsutil have a Python runtime.
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
RUN bash Miniconda3-latest-Linux-x86_64.sh -b -p /app/miniconda
# Put gcloud, gsutil, and python on the PATH.
ENV PATH $PATH:/app/google-cloud-sdk/bin:/app/miniconda/bin

CI/CD pipeline

Use GCP Cloud Build to automate the Docker build and the submission of the training job.

  1. Create a build trigger for the GitHub repo (a sketch of the gcloud command is shown after this list).
  2. The trigger will refer to a custom cloudbuild.yaml to build the Docker image and submit the training job.
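The trigger in step 1 can be created in the Cloud Console or with gcloud; the following is only a sketch, where the repo owner, repo name, and branch pattern are placeholders to replace with your own:

gcloud beta builds triggers create github \
  --repo-owner=<github-user> \
  --repo-name=<repo> \
  --branch-pattern="^main$" \
  --build-config=cloudbuild.yaml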

cloudbuild.yaml

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/osrsi', '.']
  timeout: 3600s
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/osrsi']
  timeout: 3600s
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args:
  - '-eEuo'
  - 'pipefail'
  - '-c'
  - |-
    ts=`date +%Y%m%d%H%M`
    job_id="julia_osrsi_stock_training_$ts"
    gcloud ai-platform jobs submit training $job_id \
      --region "us-central1" \
      --master-image-uri=gcr.io/$PROJECT_ID/osrsi:latest
  timeout: 3600s

3. Any push to the main branch will then trigger the build and run the training job on the AI Platform automatically.

4. (Optional) Use Cloud Scheduler to run the Cloud Build trigger periodically, say daily, by calling the trigger's REST API:

https://cloudbuild.googleapis.com/v1/projects/<project>/triggers/<trigger-id>:run
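One way to schedule this is a Cloud Scheduler HTTP job that POSTs to the endpoint above with an OAuth token; this is only a sketch, and the job name, schedule, branch, and service account are placeholders:

gcloud scheduler jobs create http julia-training-daily \
  --schedule="0 6 * * *" \
  --http-method=POST \
  --uri="https://cloudbuild.googleapis.com/v1/projects/<project>/triggers/<trigger-id>:run" \
  --message-body='{"branchName": "main"}' \
  --oauth-service-account-email=<service-account>@<project>.iam.gserviceaccount.com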
