Building a *smarter* BeerBot

Josh Friedman
KI labs Engineering
10 min read · Oct 17, 2019

Here we go again — BeerBot is now much more intelligent.

BeerBot?

For those wondering “what is BeerBot?”, please check Building BeerBot.

TLDR: camera + beer fridge + slackbot + 💪 = BeerBot

Latest Improvement

BeerBot 1.0 was quite dumb

Problem

Let’s begin with a simple truth: BeerBot 1.0 was quite dumb. Its tragic flaw was its simplistic image processing. The typical output led to *gasp* incorrectly identified bottles, causing *shock* uncertainty about the current inventory of cold beer. A disastrous example is depicted.

Solution

Make BeerBot Smart ̶A̶g̶a̶i̶n̶. A logical next step is to educate BeerBot with an object detection model (as previously suggested).

Motivation

The obvious next question is “why even improve BeerBot?”. There are many reasons including the following:

(1) general dissatisfaction with the first MVP

(2) the inherent desire at KI labs for constant improvement

(3) a perfect opportunity to demonstrate kaos, our open source ML platform.

Object Detection

Introduction

Wikipedia best describes object detection as “detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.”

Object detection consists of both object localization (where objects are physically located) and object classification (which class represents the located object). See the simplistic example within the context of BeerBot.

YOLOv3

YOLOv3 was chosen for BeerBot given the following main advantages.

  • It is extremely fast.
  • It has a robust network capable of understanding generalized objects.
  • It is widely accepted with over 8000 citations and 34,000 forks.
  • It is easy to train a custom model given countless blogs.
  • It is open source.

You Only Look Once (YOLO) is one of the most popular deep learning approaches for object detection. I will not go into its architecture and/or implementation since countless other blogs (tabulated above) have already done that. Check the YOLO homepage for further information.

Implementation

Training an object detection model with kaos consists of the following steps.

overview of building a smart BeerBot

Label

Training a machine learning model requires (sufficient) training data. The YOLOv3 model requires one annotation file per input image. Each annotation file can contain multiple bounding boxes, with coordinates scaled to normalized image units. A detailed explanation is provided here.
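
To make "normalized image units" concrete, here is a minimal sketch that converts a pixel bounding box into one annotation line. It assumes the common Darknet label layout (class id, box center x, box center y, width, height, all divided by the image dimensions). The function name and the example image size are hypothetical and not taken from the BeerBot code.

```python
def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) into a normalized label line.

    Assumes the Darknet-style layout: class x_center y_center width height,
    where every coordinate is a fraction of the image width or height.
    """
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: one bottle cap in a 1000 x 800 pixel image
line = to_yolo_line(0, 100, 200, 300, 400, img_w=1000, img_h=800)
print(line)
```

An annotation file with several bottles would simply contain one such line per bounding box.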

BeerBot labels were generated for all unique* high resolution images. A grand total of 88 unique images were labeled. All labeled images are flashed below.

*unique since the fridge door triggers an image and not necessarily a change

annotated training images (including someone’s lunch)

kaos

kaos is the platform for deploying scalable reproducible machine learning workflows in your own private environment.

Development was fueled by our (KI labs) ambition to mimic natural incremental model development, simplify model reproducibility and collaboration, and automate ML infrastructure deployment in a flexible language-agnostic environment.

kaos is the backbone of a smarter BeerBot since it simplifies the ML process.

  • It mimics natural incremental model development.
  • It ensures code, environment, and data are fully reproducible.
  • It automates infrastructure deployment with Infrastructure as Code.
  • It supports any framework with language agnostic data pipelines.
  • It is open source.

Train, serve and predict in your own automatically deployed running cluster with 8 commands!

kaos in action

Deploy

All required kaos infrastructure is deployed with code, an approach known as Infrastructure as Code (IaC). It is extremely beneficial since it is somewhat self-service: DevOps knowledge is democratised, which avoids bottlenecks as the platform grows. The main benefits of deploying infrastructure with code are the following.

  • Faster and Easier. Deploy an entire infrastructure by running code.
  • Consistent. Avoid human error with standardization.
  • Immediate Documentation. Code overview detailing all components.
  • Cost Effective. Avoid wasting time on manual tasks.
  • Versioned. Save snapshots of infrastructure within VCS (i.e. GitHub).

A new private BeerBot cluster was deployed within Amazon Web Services (AWS) to speed up processing and enable collaboration. Check out the documentation for full details on deploying kaos in other environments.

$ kaos build -c AWS -e dev -v
...
Apply complete! Resources: 85 added, 0 changed, 0 destroyed.
Endpoint successfully set to http://XYZ.amazonaws.com:80/api/
Successfully built kaos [dev] environment

✅ The output from kaos build is a successfully running endpoint. Note that this endpoint can be shared for collaborative development!

Train

The kaos training pipeline consists of two stages — build and train. They are separate Pachyderm pipelines but linked together to form a cohesive process.

the kaos training pipeline

The training pipeline requires at least a valid source bundle and data bundle for initiating a training job. Note that params is optionally available when runtime arguments are desired, which supports parallelized hyperparameter optimization.

Source Bundle

The source bundle is responsible for supplying the code and environment for running a training job. Its nature should be treated as ephemeral and dynamic since versioning is handled with kaos. In other words, a user does not need to adopt chaotic naming conventions (e.g. beerbot-v1, beerbot-v1-latest, beerbot-v1-final, etc.).

kaos requires a relatively clear source folder structure that absolutely must contain a train executable for running a training job. Here is the directory structure used for running YOLOv3 within kaos.

$ tree beerbot/model-train-cpu
beerbot/model-train-cpu
└── yolo-cpu
    ├── Dockerfile
    └── model
        ├── __init__.py
        ├── build_input.py
        ├── cfg
        │   └── obj.names
        ├── create_custom_model.sh
        ├── params.json
        ├── requirements.txt
        └── train

Here is the train bash script responsible for improving BeerBot.

#!/bin/bash

# ================
# READ BASE INPUTS
# ================

PARAMS_PATH="params.json"
echo "+ loading base inputs [${PARAMS_PATH}]"
# load every key/value pair in params.json as a shell variable
eval "$(jq -r 'to_entries[] | "\(.key)=\(.value)"' "${PARAMS_PATH}")"
# print the loaded values so they show up in the training log
jq -r 'to_entries[] | "\(.key)=\(.value)"' "${PARAMS_PATH}"

# ===============================
# BUILD MODEL DEFINITION (YOLOv3)
# ===============================

echo "+ building model definition (YOLOv3)"
bash create_custom_model.sh "${NUM_CLASS}" "${MODEL_PATH}"

# =========================
# BUILD DATA INPUT (*.data)
# =========================

echo "+ building .data file"
python3 build_input.py --data_path "${DATA_PATH}" \
                       --name_path "${CLASS_PATH}" \
                       --ext "${IMG_EXT}" \
                       --pct "${SPLIT_PCT}"

# ========
# TRAINING
# ========

echo "+ starting training"
python3 src/train.py --model_def "${MODEL_PATH}" \
                     --data_config "${DATA_PATH}" \
                     --class_config "${CLASS_PATH}" \
                     --pretrained_weights "${WEIGHTS_PATH}" \
                     --img_size "${IMG_SIZE}" \
                     --epochs "${EPOCHS}" \
                     --batch_size "${BATCH_SIZE}" \
                     --checkpoint_interval "${CHECKPOINT_INTERVAL}" \
                     --evaluation_interval "${EVALUATION_INTERVAL}" \
                     --output_dir "${OUT_PATH}"
echo "+ finished training"

The required training python script src/train.py is available here. Note that it is added to the source environment when building the Dockerfile.
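
The build_input.py step called in the train script is not listed in this post. As a rough illustration only, the sketch below shows what a train/validation split driven by an image extension and a split percentage might look like. The function names, the fixed seed and the interpretation of the percentage as "share of images used for training" are all assumptions, not the actual BeerBot implementation.

```python
import random
from pathlib import Path


def split_images(image_dir, ext="png", pct=0.8, seed=0):
    """Return (train, valid) lists of image paths.

    `pct` is assumed to be the fraction of images used for training;
    the remainder is kept for validation.
    """
    images = sorted(str(p) for p in Path(image_dir).glob(f"*.{ext}"))
    # Shuffle with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(images)
    cut = int(len(images) * pct)
    return images[:cut], images[cut:]


def write_list(paths, out_file):
    """Write one image path per line to a plain text file."""
    Path(out_file).write_text("\n".join(paths) + "\n")
```

With 88 labeled images and pct=0.8, this sketch would place 70 images in the training list and 18 in the validation list.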

Data Bundle

The previously described labels were separated into the following directory structure to comply with YOLOv3 requirements.

$ tree beerbot/data
beerbot/data
└── single_class
    ├── images
    │   ├── 1553248583.png
    │   ├── ...
    │   └── 1565626432.png
    └── labels
        ├── 1553248583.txt
        ├── ...
        └── 1565626432.txt

Training a YOLOv3 model within kaos is very straightforward since it only requires a single command, kaos train deploy. Note that desired resource requirements for training can be set via command-line options (e.g. --cpu, --memory and/or --gpu).

$ kaos train deploy -s beerbot/model-train-cpu \
                    -d beerbot/data
Submitting source bundle: beerbot/model-train-cpu
Compressing source bundle: 100%|███████████████████████████|
✔ Setting source bundle: /yolo-cpu:e0222
Submitting data bundle: beerbot/data
Compressing data bundle: 100%|███████████████████████████|
✔ Setting data bundle: /single-class:8d184

CURRENT TRAINING INPUTS
+------------+---------------------+-------------+
|   Image    |        Data         | Hyperparams |
+------------+---------------------+-------------+
|            |          ✔          |      ✗      |
| <building> | /single-class:8d184 |             |
+------------+---------------------+-------------+

More info about the training pipeline is detailed in the documentation.
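
Before submitting a data bundle with the images/labels layout shown earlier, it can be worth checking that every image has a matching label file. The following is a minimal sketch of such a check. The function name is hypothetical and the check is not part of kaos.

```python
from pathlib import Path


def find_unpaired(bundle_dir, img_ext="png"):
    """Compare images/ and labels/ by file stem.

    Returns two sorted lists: image stems without a label file,
    and label stems without an image.
    """
    root = Path(bundle_dir)
    images = {p.stem for p in (root / "images").glob(f"*.{img_ext}")}
    labels = {p.stem for p in (root / "labels").glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)
```

For example, find_unpaired("beerbot/data/single_class") should return two empty lists when the bundle is consistent.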

Select

The status of the deployed YOLOv3 model can be checked with kaos train list. It contains high level information regarding its id, duration, start time and state.

$ kaos train list
+------------------------------------------------------------------+
| TRAINING |
+-----+----------+------------+----------------------+-------------+
| ind | duration | job_id | started | state |
+-----+----------+------------+----------------------+-------------+
| 0 | 2264 | f79775 | 20 Aug 2019 09:37:44 | JOB_SUCCESS |
+-----+----------+------------+----------------------+-------------+

Additional info for a particular model can be queried via kaos train info.

$ kaos train info -i 0
Job ID: f79775
Process time: 2260
State: JOB_SUCCESS
Available metrics: ['recall', 'mAP', 'precision', 'f1']
+-----------------------+---------------------------+
| Code | Data |
+-----------------------+---------------------------+
| Author: jfriedman | Author: jfriedman |
| Path: /yolo-cpu:e0222 | Path: /single-class:8d184 |
+-----------------------+---------------------------+
Page count: 1
Page ID: 0
+-----+-------------------+
| ind | Model ID |
+-----+-------------------+
| 0 | e0222_8d184:00025 |
+-----+-------------------+
| 1 | e0222_8d184:00050 |
+-----+-------------------+
| 2 | e0222_8d184:00075 |
+-----+-------------------+
| 3 | e0222_8d184:00100 |
+-----+-------------------+
| 4 | e0222_8d184:00125 |
+-----+-------------------+
| 5 | e0222_8d184:00150 |
+-----+-------------------+
| 6 | e0222_8d184:00175 |
+-----+-------------------+
| 7 | e0222_8d184:00200 |
+-----+-------------------+

Improving BeerBot requires selecting the best checkpoint (listed above by epoch number) based on one of the available metrics. The best model for BeerBot 2.0 is the one with the highest F1 score, which is the harmonic mean of precision and recall. This is a very simple operation: pass the --sort_by option to kaos train info.

$ kaos train info -i 0 -s f1
Job ID: f79775
Process time: 2260
State: JOB_SUCCESS
Available metrics: ['recall', 'mAP', 'precision', 'f1']
+-----------------------+---------------------------+
| Code | Data |
+-----------------------+---------------------------+
| Author: jfriedman | Author: jfriedman |
| Path: /yolo-cpu:e0222 | Path: /single-class:8d184 |
+-----------------------+---------------------------+
Page count: 1
Page ID: 0
+-----+-------------------+-------+
| ind | Model ID | Score |
+-----+-------------------+-------+
| 0 | e0222_8d184:00175 | 0.874 |
+-----+-------------------+-------+
| 1 | e0222_8d184:00125 | 0.871 |
+-----+-------------------+-------+
| 2 | e0222_8d184:00200 | 0.856 |
+-----+-------------------+-------+
| 3 | e0222_8d184:00150 | 0.855 |
+-----+-------------------+-------+
| 4 | e0222_8d184:00100 | 0.845 |
+-----+-------------------+-------+
| 5 | e0222_8d184:00075 | 0.806 |
+-----+-------------------+-------+
| 6 | e0222_8d184:00050 | 0.685 |
+-----+-------------------+-------+
| 7 | e0222_8d184:00025 | 0.258 |
+-----+-------------------+-------+

✅ The best model is associated with epoch 175 (i.e. e0222_8d184:00175).
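
For readers who want to reproduce the selection by hand, here is a small sketch of the F1 calculation and of picking the top checkpoint. The scores are copied from the sorted kaos output in this section. The helper names are illustrative and not part of kaos.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 scores per checkpoint, copied from the sorted `kaos train info` output
scores = {
    "e0222_8d184:00025": 0.258,
    "e0222_8d184:00050": 0.685,
    "e0222_8d184:00075": 0.806,
    "e0222_8d184:00100": 0.845,
    "e0222_8d184:00125": 0.871,
    "e0222_8d184:00150": 0.855,
    "e0222_8d184:00175": 0.874,
    "e0222_8d184:00200": 0.856,
}

# Pick the checkpoint with the highest F1 score
best_id = max(scores, key=scores.get)
# The part after the colon is the zero-padded epoch number
best_epoch = int(best_id.split(":")[1])
print(best_id, best_epoch)
```

Note that the last checkpoint (epoch 200) is not the best one here, which is why sorting by a metric is more reliable than simply taking the final epoch.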

Serve

The kaos serve pipeline consists of two stages — build and serve. Similar to the training pipeline, they are separate Pachyderm pipelines linked into a single cohesive process.

the kaos serve pipeline

The serve pipeline requires at minimum a valid source bundle for deploying an endpoint.

Source Bundle

The source bundle for serving a model is handled in the same manner described in the training pipeline. Serving also requires a relatively clear source folder structure that absolutely must contain a serve executable for running inference. Here is the directory structure used for serving a trained YOLOv3 model within kaos.

$ tree beerbot/model-serve
beerbot/model-serve
└── yolo
    ├── Dockerfile
    └── model
        ├── __init__.py
        ├── nginx.conf
        ├── predict.py
        ├── serve
        ├── web-requirements.txt
        └── wsgi.py

Deploying a running endpoint within kaos is very straightforward since it only requires a single command kaos serve deploy. The previously selected model_id is used as the best BeerBot model.

$ kaos serve deploy -s beerbot/model-serve \
-m e0222_8d184:00175
Submitting source bundle: beerbot/model-serve
Compressing source bundle: 100%|███████████████████████████|
✔ Adding trained model_id: e0222_8d184:00175
✔ Setting source bundle: /yolo:cfcc8

✅ The url of the running endpoint can be identified via kaos serve list.

$ kaos serve list
+-----------------------------------------------------------------+
| RUNNING |
+-------------------------------+-----------+---------------------+
| url | user | created_at |
+-------------------------------+-----------+---------------------+
| XXX.elb.amazonaws.com/beerbot | jfriedman | 2019-08-20 10:34:34 |
+-------------------------------+-----------+---------------------+

More info about the serve pipeline is detailed in the documentation.

Predict

The running endpoint is now ready for “real world” testing with a new raw high resolution image from BeerBot.

latest_image.png
$ curl -X POST XXX.elb.amazonaws.com/beerbot \
--data-binary @latest_image.png
[
  {
    "cls":"Beer",
    "cls_conf":0.9901855,
    "conf":0.9999855,
    "x1":1605.83,
    "x2":1691.78,
    "y1":269.31,
    "y2":380.91
  },

  ...

  {
    "cls":"Beer",
    "cls_conf":0.9839665,
    "conf":0.9985551,
    "x1":759.06,
    "x2":878.05,
    "y1":335.56,
    "y2":458.63
  }
]

Success! kaos was used to train an object detection model for BeerBot.
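
The JSON returned by the endpoint is easy to consume downstream. As a minimal sketch, the snippet below parses a response and counts confident detections. The sample contains only the two detections printed in this post (the elided ones are left out), and the function names and the 0.5 confidence threshold are assumptions rather than BeerBot's actual settings.

```python
import json

# Only the two detections shown in the post; elided entries are omitted
SAMPLE_RESPONSE = """
[
  {"cls": "Beer", "cls_conf": 0.9901855, "conf": 0.9999855,
   "x1": 1605.83, "x2": 1691.78, "y1": 269.31, "y2": 380.91},
  {"cls": "Beer", "cls_conf": 0.9839665, "conf": 0.9985551,
   "x1": 759.06, "x2": 878.05, "y1": 335.56, "y2": 458.63}
]
"""


def count_bottles(response_text, min_conf=0.5, cls="Beer"):
    """Count detections of class `cls` with confidence >= `min_conf`."""
    detections = json.loads(response_text)
    return sum(1 for d in detections if d["cls"] == cls and d["conf"] >= min_conf)


def box_center(detection):
    """Return the (x, y) pixel center of a detection's bounding box."""
    return (
        (detection["x1"] + detection["x2"]) / 2,
        (detection["y1"] + detection["y2"]) / 2,
    )


print(count_bottles(SAMPLE_RESPONSE))
```

Raising min_conf makes the count more conservative, which may be useful when a false positive (someone's lunch, for example) is worse than a missed bottle.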

Results

The improved BeerBot 2.0 engine™ (still without a trademark) consists of the following processing pipeline. The main changes with respect to the BeerBot 1.0 engine™ are the Bottle Detection and Slackbot components.

Bottle Detection

Detection is now quite robust without the previous deterministic image processing. BeerBot simply sends a high resolution “raw” image to the running kaos endpoint (previously described) and uses the returned predictions. Two forms of visualization from BeerBot 2.0 are included below. Raw object detection predictions (left) show the location and confidence from the trained YOLOv3 model. The segmented bottle caps (right) are coloured by relative coldness (i.e. age).

BeerBot 2.0 is quite smart (and quite confident)

Slackbot

A new command (/engine) was added to BeerBot 2.0 to debug raw bounding box predictions from the underlying YOLOv3 model. It aids in identifying missing or poorly localized bottles based on its normalized confidence.

The three main BeerBot commands (1) /inventory, (2) /photo and (3) /engine are animated below (left to right).

BeerBot 2.0 in action

Get Involved

Once again, please get in contact if you’re interested in BeerBot or kaos. BeerBot and kaos are both always open for any issues or pull requests. All contributions are greatly appreciated.

Finally, check out www.ki-labs.com for more information about our great products and services.
