ASL Menu Application
Introduction
This project proposes an application that takes American Sign Language (ASL) gestures as input and returns the food item corresponding to that gesture as output. One sets an initial mapping of gestures to food items, and the application is then ready to use.
Problem Definition
The basic idea behind this project is to make customer-facing services friendlier for people who are deaf or mute. Since they are taught sign language as an alternative way of interacting with others and overcoming barriers to communication, many of them face problems in day-to-day life, such as ordering food at a drive-thru or having a conversation with an attendant who is oblivious to their situation, which can be frustrating for the customer.
Proposed Solution
Upload an ASL gesture and the app will identify it and recommend the food to be ordered. For example, if a person wants a burger, they can upload an image of the 'B' gesture in ASL, and the app will show the predicted letter as well as the suggested food. This helps both the customer and the retailer engage better while having a conversation.
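Conceptually, the app keeps a configurable mapping from ASL letters to menu items. A minimal sketch of such a mapping follows; the menu items shown are placeholders for illustration, not a fixed list from this project:

```python
# Hypothetical initial mapping of ASL letters to menu items.
# The items below are placeholders and are meant to be configured per menu.
MENU_MAPPING = {
    "B": "Burger",
    "F": "Fries",
    "P": "Pizza",
    "S": "Soda",
}

def suggest_food(predicted_letter: str) -> str:
    """Return the menu item mapped to a predicted ASL letter."""
    return MENU_MAPPING.get(predicted_letter.upper(), "No suggestion available")
```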
Model
We fine-tuned VGG16 on our dataset, and we also built another CNN from scratch to use as a baseline. The fine-tuned VGG16 achieved an accuracy of 0.84 on the test data.
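A minimal sketch of what fine-tuning VGG16 for this task can look like in Keras; the image size, classification head, and number of classes are assumptions for illustration, not the exact configuration used here:

```python
# Sketch of fine-tuning VGG16 for ASL letter classification.
# Layer choices, image size, and class count are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 26  # one class per ASL letter (assumed)

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional base initially

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```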
Architecture
Technical Architecture
Implementation Details
We set up two containers:
- api-service container
- frontend-simple container
1. API Service Container:
Exposes port 9000 from the container to the outside world for the API server. It:
- Downloads the best model from the GCP bucket using the JSON key file in the secrets folder and stores it in the persistent_folder.
- Uses the best model to predict the ASL gesture in the input image (the image uploaded by the user).
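A minimal sketch of this download-and-predict flow, assuming a Keras model stored in a GCS bucket; the bucket, blob, and file names below are placeholders rather than the project's actual values:

```python
# Sketch only: bucket, blob, and path names are hypothetical placeholders.
import numpy as np
import tensorflow as tf
from google.cloud import storage
from tensorflow.keras.preprocessing import image

BUCKET_NAME = "asl-menu-models"            # placeholder bucket name
MODEL_BLOB = "best_model.h5"               # placeholder blob name
LOCAL_MODEL_PATH = "/persistent/best_model.h5"

def download_best_model():
    """Download the best model from the GCP bucket into the persistent folder."""
    client = storage.Client.from_service_account_json("secrets/bucket-reader.json")
    bucket = client.bucket(BUCKET_NAME)
    bucket.blob(MODEL_BLOB).download_to_filename(LOCAL_MODEL_PATH)

def predict_gesture(image_path: str) -> int:
    """Predict the ASL letter class index for an uploaded image."""
    model = tf.keras.models.load_model(LOCAL_MODEL_PATH)
    img = image.load_img(image_path, target_size=(224, 224))
    batch = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
    return int(np.argmax(model.predict(batch), axis=1)[0])
```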
Trying out the predict endpoint by uploading an image of the 'B' ASL gesture to get the prediction label:
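For illustration, a client request of this kind could look like the following, assuming a /predict endpoint that accepts an image upload; the endpoint path and response fields are assumptions, not the service's documented API:

```python
# Hypothetical client call to the API service; endpoint and field names
# are assumptions made for illustration.
import requests

with open("b_gesture.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:9000/predict",
        files={"file": f},
    )

print(response.json())  # e.g. {"prediction": "B", "food": "Burger"}
```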
2. Frontend App Container:
A simple frontend app built with basic HTML and JavaScript. It exposes port 8080 from the container to the outside world for the web server.
When the 'VIEW MENU' button on the predict.html page is clicked, the user is taken to the menu.html page to look at the menu.
When the 'Order Now!' button on the menu.html page is clicked, the user is taken to the predict.html page to upload an ASL gesture and get the prediction and suggested food.
Prediction Example -
Deployment
Manual deployment steps followed:
1. Built and pushed Docker images to Docker Hub.
2. Created a Compute Instance (VM) in GCP.
3. Provisioned the server (installed the required software).
4. Set up the Docker containers on the VM instance.
5. Set up a web server to expose the app to the outside world.
Automated the deployment using Ansible:
1. Set up a local container to connect to GCP (arcon-app-deployment). Created two service accounts besides the already existing bucket-reader service account:
   - deployment: has admin access to the group's GCP project.
   - gcp-service: has read access to the group's GCP project's GCR.
2. Built and pushed Docker images to GCR.
3. Created a Compute Instance (VM) in GCP.
4. Provisioned the server (installed the required software).
5. Set up the Docker containers on the VM instance.
6. Set up a web server to expose the app to the outside world.
Arcon App — Kubernetes Deployment
Created a deployment YAML file (an Ansible playbook) and used it to create and deploy the app to a Kubernetes cluster.
Future Work
We plan on adding distilled models that can be stored locally on the machine rather than in the cloud, which we hope will make the application faster. We also plan on using techniques like Federated Learning to preserve a user's privacy. In the final app we plan on having user accounts so that each distilled model is personalized to its user, with the VGG16 model stored in the cloud serving as the teacher.
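This direction is still exploratory; as a rough illustration, a knowledge-distillation training step in Keras could look like the following, where the temperature, loss weighting, and student architecture are all assumptions rather than decided design choices:

```python
# Sketch of knowledge distillation: a small student CNN learns from the
# cloud-hosted VGG16 teacher. Temperature, alpha, and the student
# architecture are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26
TEMPERATURE = 5.0
ALPHA = 0.1  # weight on the hard-label loss

# Assumes the teacher exposes logits; if it ends in a softmax layer,
# the temperature scaling on the teacher would need to be adapted.
teacher = tf.keras.models.load_model("/persistent/best_model.h5")
teacher.trainable = False

student = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_CLASSES),  # student logits
])

optimizer = tf.keras.optimizers.Adam()
kld = tf.keras.losses.KLDivergence()
ce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function
def distill_step(images, labels):
    """One training step mixing soft teacher targets with hard labels."""
    teacher_probs = tf.nn.softmax(teacher(images, training=False) / TEMPERATURE)
    with tf.GradientTape() as tape:
        student_logits = student(images, training=True)
        soft_loss = kld(teacher_probs, tf.nn.softmax(student_logits / TEMPERATURE))
        hard_loss = ce(labels, student_logits)
        loss = ALPHA * hard_loss + (1.0 - ALPHA) * soft_loss
    grads = tape.gradient(loss, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
    return loss
```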