Teachable Desktop Automation

Teach your computer to recognize gestures and trigger a set of actions whenever a certain gesture is recognized.

Vivek Amilkanthawar
The HumAIn Blog
5 min read · Apr 1, 2019


Hello World! I’m very excited to share my recent experiment with you: I taught my computer to recognize certain gestures, and whenever those gestures are recognized, certain actions are performed.

In this blog, I’ll explain everything you need to do to achieve the following:

IF: I wave to my webcam

THEN: move the mouse pointer a little to its right

I have used the power of Node.js to achieve this. The idea is to create a native desktop app that has access to the operating system, so it can perform actions like a mouse click or a keyboard button press, and to train our model and draw inferences locally inside that same app.

To make it work, I decided to use tensorflow.js and robotjs in an Electron app created using Angular.

So, are you ready? let’s get started…

# Generate the Angular App

Let’s start by creating a new Angular project from scratch using the angular-cli
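If you have the Angular CLI installed globally, something along these lines does the job (the project name here is just a placeholder):

```shell
# install the Angular CLI if you don't have it yet
npm install -g @angular/cli

# generate a fresh project; "gesture-automation" is a placeholder name
ng new gesture-automation
cd gesture-automation
```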

# Install Electron

Add Electron and its type definitions to the project as dev dependencies.
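For example (recent Electron versions ship their own type definitions, so the separate typings package may not be needed depending on your version):

```shell
npm install --save-dev electron

# only for older Electron versions that did not bundle their own typings
npm install --save-dev @types/electron
```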

# Configuring the Electron App

Create a new directory inside the project's root directory and name it "electron". We will use this folder for all Electron-related files.

Afterward, create a new file called "main.ts" inside the "electron" folder. This file will be the entry point of our Electron application.

Finally, create a new "tsconfig.json" file inside the same directory. We need this file to compile the TypeScript file into a JavaScript one.

Use the following as the content of the "tsconfig.json" file.
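A minimal configuration along these lines should work; the exact compiler options are a matter of taste, the important parts being a CommonJS module output that Electron can load and an output directory to compile into (the paths below are assumptions):

```json
{
  "compilerOptions": {
    "target": "es5",
    "module": "commonjs",
    "moduleResolution": "node",
    "sourceMap": true,
    "outDir": "../dist"
  },
  "include": [
    "main.ts"
  ]
}
```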

Now it’s time to fill "main.ts" with some code to fire up our Electron app.
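Here is a minimal sketch of what main.ts could contain: it creates a browser window and loads the compiled Angular app from the build output (window size and file paths are assumptions, adjust them to your setup):

```typescript
import { app, BrowserWindow } from 'electron';
import * as path from 'path';
import * as url from 'url';

let win: BrowserWindow | null = null;

function createWindow() {
  win = new BrowserWindow({
    width: 800,
    height: 600,
    // nodeIntegration lets the renderer use remote.require() later on
    webPreferences: { nodeIntegration: true }
  });

  // load the compiled Angular app (assumes the build lands next to the compiled main.js)
  win.loadURL(url.format({
    pathname: path.join(__dirname, 'index.html'),
    protocol: 'file:',
    slashes: true
  }));

  win.on('closed', () => (win = null));
}

app.on('ready', createWindow);

app.on('window-all-closed', () => {
  if (process.platform !== 'darwin') {
    app.quit();
  }
});
```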

Visit the Electron documentation (electronjs.org) for details.

# Make a custom build command

Create a custom build command for compiling main.ts and starting Electron. To do this, update the "package.json" in your project as shown below.
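The scripts section could look roughly like this; the output paths are assumptions and presume the Angular build is configured to emit into the same dist folder that the Electron tsconfig compiles to:

```json
{
  "scripts": {
    "ng": "ng",
    "start": "ng serve",
    "build": "ng build",
    "electron": "ng build --base-href ./ && tsc -p electron/tsconfig.json && electron dist/main.js"
  }
}
```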

We can now start our app using npm:
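Assuming the script is named "electron" as above:

```shell
npm run electron
```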

There we go… our native desktop app is up and running! However, it is not doing anything yet.

Let’s make it work and also add some intelligence to it…

# Add Robotjs to the project

In order to simulate a mouse click or a keyboard button press, we will need robotjs in our project. I installed robotjs with the following command
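```shell
npm install --save robotjs
```

Since robotjs is a native addon, it may also need to be rebuilt against Electron's bundled Node version (electron-rebuild can do that) if the plain install alone doesn't work.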

and then tried to use it in the project by referring to some examples from the official documentation. However, I struggled a lot to make robotjs work inside the Electron app. Finally, here is the workaround that I came up with:

Add ngx-electron to the project
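ngx-electron is a thin Angular wrapper around Electron's renderer APIs. Install it and register its module (the module registration below is a sketch; your app module may differ):

```shell
npm install --save ngx-electron
```

and in app.module.ts:

```typescript
import { NgModule } from '@angular/core';
import { BrowserModule } from '@angular/platform-browser';
import { NgxElectronModule } from 'ngx-electron';

import { AppComponent } from './app.component';

@NgModule({
  declarations: [AppComponent],
  imports: [BrowserModule, NgxElectronModule],
  providers: [],
  bootstrap: [AppComponent]
})
export class AppModule {}
```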

Then inject its service into the component where you want to use the robot, and use remote.require() to capture the robotjs package.
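A sketch of that workaround, assuming ngx-electron's ElectronService is injected into the AppComponent (the moveMouseRight helper is just an illustration):

```typescript
import { Component } from '@angular/core';
import { ElectronService } from 'ngx-electron';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html'
})
export class AppComponent {
  // robotjs is pulled in through Electron's remote module instead of a
  // plain import; this is the workaround that made it usable here
  private robot: any;

  constructor(private electronService: ElectronService) {
    this.robot = this.electronService.remote.require('robotjs');
  }

  // illustrative helper: nudge the mouse pointer a little to the right
  moveMouseRight(offset: number = 10) {
    const pos = this.robot.getMousePos();
    this.robot.moveMouse(pos.x + offset, pos.y);
  }
}
```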

# Add Tensorflow.js to the project

We’ll be creating a KNN classifier that can be trained live in our electron app (native desktop app) with images from the webcam.
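These pieces are available as npm packages; the pretrained MobileNet and the KNN classifier both live in the tfjs-models repository:

```shell
npm install --save @tensorflow/tfjs @tensorflow-models/mobilenet @tensorflow-models/knn-classifier
```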

# A quick reference for KNN Classifier and MobileNet package

Here is a quick reference to the methods that we’ll be using in our app. You can always refer to tfjs-models for all the implementation details.

KNN Classifier

  • knnClassifier.create(): Returns a KNNClassifier.
  • .addExample(example, classIndex): Adds an example to the specific class training set.
  • .predictClass(image): Runs the prediction on the image, and returns an object with a top class index and confidence score.

MobileNet

  • .load(): Loads and returns a model object.
  • .infer(image, endpoint): Get an intermediate activation or logit as Tensorflow.js tensors. Takes an image and the optional endpoint to predict through.
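Put together, the train-then-predict flow looks roughly like this. This is a sketch rather than the app's exact code; webcamElement is assumed to be an HTMLVideoElement that is already showing the webcam stream:

```typescript
import * as mobilenetModule from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

async function demo(webcamElement: HTMLVideoElement) {
  const classifier = knnClassifier.create();
  const net = await mobilenetModule.load();

  // training: grab an intermediate MobileNet activation for the current
  // frame and file it under class 2 (the "wave" class in this experiment)
  const activation = net.infer(webcamElement, 'conv_preds');
  classifier.addExample(activation, 2);

  // prediction: run a fresh frame through MobileNet and let the KNN
  // classifier return the winning label and its confidence
  const logits = net.infer(webcamElement, 'conv_preds');
  const result = await classifier.predictClass(logits);
  console.log(result.label, result.confidences[result.label]);
}
```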

# Finally make it work

For this blog post, I’ll set the cosmetics (the CSS, I mean) aside and concentrate only on the core functionality.

Using some boilerplate code from Teachable Machine and injecting robotjs into the app component, here is how it looks.
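The actual component in the repository is longer; a stripped-down sketch of the core wiring, under the same assumptions as the snippet above (plus a `<video #webcam autoplay></video>` element and two buttons calling addExample in the template), might look like this:

```typescript
import { AfterViewInit, Component, ElementRef, ViewChild } from '@angular/core';
import { ElectronService } from 'ngx-electron';
import * as mobilenetModule from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html'
})
export class AppComponent implements AfterViewInit {
  @ViewChild('webcam') webcamRef: ElementRef<HTMLVideoElement>;

  private robot: any;
  private net: any;
  private classifier = knnClassifier.create();

  constructor(private electronService: ElectronService) {
    // the robotjs workaround from earlier
    this.robot = this.electronService.remote.require('robotjs');
  }

  async ngAfterViewInit() {
    // start the webcam and load the pretrained MobileNet
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    this.webcamRef.nativeElement.srcObject = stream;
    this.net = await mobilenetModule.load();
    this.loop();
  }

  // bound to the "Class 1" / "Class 2" buttons, which pass '1' or '2'
  addExample(label: string) {
    const activation = this.net.infer(this.webcamRef.nativeElement, 'conv_preds');
    this.classifier.addExample(activation, label);
  }

  private async loop() {
    if (this.classifier.getNumClasses() > 0) {
      const logits = this.net.infer(this.webcamRef.nativeElement, 'conv_preds');
      const result = await this.classifier.predictClass(logits);
      // Class 2 = waving: nudge the mouse pointer a little to the right
      if (result.label === '2') {
        const pos = this.robot.getMousePos();
        this.robot.moveMouse(pos.x + 10, pos.y);
      }
    }
    requestAnimationFrame(() => this.loop());
  }
}
```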

And now, when running the command npm run electron, you see me (kidding).

# Let’s Test it

I’ll train the image classifier on me waving at the webcam (Class 2) and on me doing nothing (Class 1).

The following actions are associated with these two classes:

Class 1: Do nothing

Class 2: Move mouse pointer slightly to the right

With this, your computer can learn your gestures and can perform a whole lot of different things because you have direct access to your operating system.

The source code of this project can be found at the URL below…

I have shared my simple experiment with you. Now it’s your turn: try to build something with it, and consider sharing it with me as well :D

I have just scratched the surface; any enhancements or improvements to the project are welcome through GitHub pull requests.

Well, that’s it for this post… Thank you for reading until the end. My name is Vivek Amilkanthawar, see you soon with another one.
