TrueASLPiano — A Creative Machine Learning Project

Ali Elzalmy
Published in TrueASLPiano · Jan 13, 2019 · 10 min read

1. Creative Motivation and intended use

My final machine learning project is a classification program that recognizes one-handed ASL gestures to play piano, while the other hand adds a melody. The piano keys are ordered by the letters A, B, C, D, E, F and G, so I created an ASL recognition program where, for example, performing the gesture for the letter "A" plays the A note and gives you visual feedback.

Initially, I wanted to create a machine learning program that uses people's hand gestures to produce an output, and since I am quite interested in music, my initial idea was a piano classification program. I thought about exactly what I wanted to achieve with this program, and it came down to these objectives:

  • To use hand gestures to create a musical instrument of some sort
  • To be able to add some educational value using my program
  • To build a good gesture recognizer using a machine learning algorithm.

The software is intended to be used by anyone: it teaches basic ASL while letting you create some cool-sounding music and melodies. It can also be used as a standalone program for learning ASL, since it indicates which letter is being shown as the gesture is performed.

I wanted to build assistive technology using machine learning, as I think it can be a great tool for people who want to learn basic ASL, and my initial idea involved a musical instrument, so this project combines both. I did cycle through a few ideas, but I wanted to be creative in my approach, and this is the one I settled on.

2. Existing similar work

Because this idea was quite original and my motivation was clear, there are no directly similar works or academic sources that I drew on.

However, there is a "teaching sign language" mini project built with machine learning and a Leap Motion that I read about and was definitely impressed with.

Taylor, S. (2016). ASL Tutor: Teaching Sign Language with Leap Motion + Machine Learning. [Blog] Leap Motion Blog. Available at: http://blog.leapmotion.com/asl-tutor-teaching-sign-language-leap-motion-machine-learning/ [Accessed 4 Jan. 2019].

3. Implementation

Once I had clear objectives for the project, it was time to choose an algorithm. I knew this was a classification problem, since the user imitates specific gestures from the ASL alphabet; the other input is also a set of gestures, so classification was the obvious choice.

Image taken and edited from www.startasl.com

Since there are different types of classification, it is important to note that my input device is a Leap Motion sensor. I used the 15-input Leap Motion Processing example from the Wekinator Examples website, which lets me record 15 features per frame, and since each gesture falls within a fairly specific range of that data, it is better to use a "lazy learner" classification algorithm.

Lazy learner algorithms simply store the training data and, at prediction time, find the stored examples most similar to the new input; they do not build a model or make assumptions in advance (unlike eager learners), which makes them a good fit for this project. The one I chose is the k-nearest neighbors (k-NN) algorithm. From the feedback on my proposal, I knew the priority was feature and data accuracy rather than a large amount of data, so I had to test which value of k to use. If more data had been required, k would have needed to be higher, because a larger k smooths the decision boundary and ignores outliers more easily.
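Wekinator handles the k-NN training and prediction internally, but to illustrate what a lazy learner actually does, here is a minimal 1-nearest-neighbor sketch in Java (the structure, names and feature counts are purely illustrative, not code from my project):

```java
// Minimal 1-nearest-neighbor classifier, to illustrate the "lazy learner" idea:
// the training examples are stored as-is, and a new input is given the class of
// the closest stored example. Wekinator does this internally.
public class OneNearestNeighbor {
  float[][] trainFeatures; // each row: one recorded frame (e.g. 15 Leap Motion values)
  int[] trainLabels;       // class of each row (e.g. 1 = "A", 2 = "B", ...)

  OneNearestNeighbor(float[][] features, int[] labels) {
    trainFeatures = features;
    trainLabels = labels;
  }

  int classify(float[] input) {
    int best = -1;
    float bestDist = Float.MAX_VALUE;
    for (int i = 0; i < trainFeatures.length; i++) {
      float d = 0;
      for (int j = 0; j < input.length; j++) {
        float diff = input[j] - trainFeatures[i][j];
        d += diff * diff; // squared Euclidean distance to this stored example
      }
      if (d < bestDist) {
        bestDist = d;
        best = trainLabels[i];
      }
    }
    return best;
  }
}
```

With k = 1 the prediction is simply the label of the single closest recorded frame, which is also why outliers in the training data matter so much.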

Inputs: 2

Outputs: 2

Featurnator programs: 2

The software tools I decided to use:

  • Processing (Java)
  • Featurnator

Libraries used:

  • com.leapmotion (the Leap Motion library for Java)
  • ddf.minim (for audio)
  • oscP5 (to send OSC messages to Featurnator)

When recording my gestures, it was important to get as much accuracy as possible. One of the main data-analysis techniques I used was to look at the raw training data in table format.

An example is the letter "E": the "E" and "A" gestures are quite similar, and since I prioritized feature selection, I deleted outlier examples from inputs 1–13 (the finger X, Y, Z data). This greatly improved the program's accuracy on similar gestures.

Inputs

Input program, detecting my hands on the Leap Motion and sending this data

My input program is based on the Wekinator example that sends 15 inputs over OSC. I edited it so that it only detects one hand, using the hand.isLeft()/hand.isRight() functions to determine which hand is being recognized and to draw it. This was essential, as the program can only send 15 features from one hand at a time. I also made sure it sent over which hand it was detecting.

The reason for choosing this input was that it had exactly what I needed: it detects all the fingertip X, Y, Z movements as well as the palm and arm position.
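The exact code is in the source linked in section 5, but the core of the one-hand filtering looks roughly like this (a sketch assuming the structure of the Wekinator Leap Motion example; the ports, the /wek/inputs address and the right-hand choice are assumptions, and the real program also draws the tracked hand and sends which hand it is seeing):

```java
// Sketch of the edited input program: track a single hand and send its 15 fingertip
// features to Wekinator/Featurnator over OSC. Not the exact code from the linked zip.
import com.leapmotion.leap.*;
import oscP5.*;
import netP5.*;

Controller leap = new Controller();
OscP5 oscP5;
NetAddress dest;

void setup() {
  size(400, 400);
  oscP5 = new OscP5(this, 12000);            // port this sketch listens on
  dest = new NetAddress("127.0.0.1", 6448);  // port the model listens on for inputs
}

void draw() {
  background(0);
  Frame frame = leap.frame();

  Hand tracked = null;
  for (int i = 0; i < frame.hands().count(); i++) {
    Hand h = frame.hands().get(i);
    if (h.isRight()) tracked = h;            // this sketch only follows the right hand
  }
  if (tracked == null) return;               // nothing to draw or send this frame

  OscMessage msg = new OscMessage("/wek/inputs");
  for (int i = 0; i < tracked.fingers().count(); i++) {
    Finger f = tracked.fingers().get(i);
    msg.add(f.tipPosition().getX());         // 5 fingers x (x, y, z) = 15 features
    msg.add(f.tipPosition().getY());
    msg.add(f.tipPosition().getZ());
  }
  oscP5.send(msg, dest);
}
```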

Outputs

For the outputs, I used the Classification TriggerTEXTSimple output, which is available on the Wekinator Examples website. I changed it heavily, but it was a good starting point for what I needed: the original program generates a random hue for whichever output class you choose.

Below you can see that, instead of the randomly generated hue, I made an array of letters to display and a matching array of sound files, so when an output class is selected the corresponding letter is shown and its sound is played.

The result:

I had to import the ddf.minim library to play the audio file for each key that is recognized. The other output program works in the same way but tells you which beat is playing. Each output has its own color scheme.
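The full edited output is linked in section 5; stripped down, the idea looks roughly like this (a sketch: the file names, the port and the /wek/outputs address follow Wekinator's usual conventions and are not copied from my code):

```java
// Sketch of the output idea: receive the class from Wekinator over OSC, display the
// letter and play the matching note. File names are placeholders.
import oscP5.*;
import ddf.minim.*;

OscP5 oscP5;
Minim minim;
String[] letters = { "A", "B", "C", "D", "E", "F", "G" };
AudioPlayer[] notes = new AudioPlayer[letters.length];
int currentClass = 0;                               // 0 = nothing recognized yet

void setup() {
  size(400, 400);
  oscP5 = new OscP5(this, 12000);                   // port Wekinator sends its outputs to
  minim = new Minim(this);
  for (int i = 0; i < letters.length; i++) {
    notes[i] = minim.loadFile(letters[i] + ".wav"); // placeholder sound files
  }
}

void oscEvent(OscMessage msg) {
  if (msg.checkAddrPattern("/wek/outputs")) {
    int received = (int) msg.get(0).floatValue();   // Wekinator classes are 1, 2, 3, ...
    if (received >= 1 && received <= letters.length && received != currentClass) {
      currentClass = received;                      // only trigger when the class changes
      notes[received - 1].rewind();
      notes[received - 1].play();
    }
  }
}

void draw() {
  background(0);
  textSize(96);
  textAlign(CENTER, CENTER);
  if (currentClass > 0) text(letters[currentClass - 1], width / 2, height / 2);
}
```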

Training data

Training accuracy was key for my project. I had to be clear about which features my program needed and which I was going to use, so I first made a plan of which features I would and would not need, which is why I used Featurnator.

Feature selection on Featurnator

It was important for me to have good feature engineering rather than a large amount of data. I first manually deselected some features, such as the IQR, which did not seem relevant, even though auto-select ranked those operations above the threshold. When I combined auto-select with some manual choices, the gestures became a lot more accurate.

Because of the amount of data I was using, I was relying more on accuracy than quantity, so I decided to use the k-NN algorithm with only 1 neighbor.

K-Nearest Neighbor algorithm with only 1 neighbor.

4. Challenges and objective achievements

Did I achieve my objectives?

My initial objectives were:

  • To use hand gestures to create a musical instrument of some sort

I did achieve this initial creative aim. Although it is not perfect, I built a system that recognizes ASL gestures and plays the corresponding notes, which I am satisfied with.

  • To be able to add some educational value using my program

This is the objective I am probably most satisfied with. The program definitely adds educational value: as you play more notes, you involuntarily learn the exact gestures.

  • To build a good gesture recognizer using a machine learning algorithm.

I did use a classification algorithm to build gesture recognition software, so I am also satisfied with this achievement. It was decent at recognizing gestures, but it can struggle when certain gestures are very similar, such as "A" and "E": because the data points are so close, it is not always accurate. Also, since the input only detects one hand at a time, when two hands are in the frame it can sometimes send mixed X, Y, Z features from each hand, which distorts the accuracy.

Testing

When testing, I had 4 volunteers test out the final product and the results were:

4/4 thought it was easy to imitate the ASL gestures

2/4 agreed that playing the beat and the ASL notes at the same time made it easy to make music

4/4 learned new ASL gestures and could re-imitate at least 4 of them 30 minutes after playing

Challenges

I found this project particularly hard because I encountered a lot of problems in the implementation stage. Sometimes, even though I had saved a file, I was not able to run my program because of an error stating "Was not able to run, please retrain your models", which became quite frustrating. This was especially common after I downloaded Featurnator: I had to retrain my models every time I turned off my computer.

This started back on December 24th, when I was not able to run the program

Another major problem I faced was that my input program would only send 15 features to Wekinator from one hand at a time. This meant that if the other hand was in the frame, it would not detect the data points for my gestures correctly, and I was not able to record any new data or perform any of the gestures. Even when I made two input programs that each drew only one hand, the correct features were not sent if two hands were above the Leap Motion sensor.

My theory was that once one hand was detected, the program would send the 15 inputs for whichever hand entered the frame first. I tried to visualize what was happening:

It was sending the 15 inputs of only one hand at a time to Wekinator

This was a large challenge, as it left me with a limited amount of data to work with and was a barrier to my original creative project. The way I tackled it was by learning more about OSC messages and the Leap Motion API, so that I could control exactly when features were sent.

I eventually fixed it by adding a simple code statement as you can see below:

It only sends the features if the one hand is detected, which made my code work
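In effect, the statement is a guard of this shape (a reconstruction using the names from the input sketch in section 3, not the exact line from my program):

```java
// Only forward the 15 features when the one hand we want is actually in the frame;
// otherwise skip this frame entirely.
if (frame.hands().count() == 1) {
  oscP5.send(msg, dest);
}
```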

If I had more time, I would focus more on the output system, to create better feedback such as a visualization and more enticing images. I would also try to use bone-conduction headphones, so that people with hearing difficulties could feel the beats and piano through vibrations.

5. Sources & My code

Audio files:

A key sound: https://freesound.org/people/Goup_1/sounds/176487/

B key sound: https://freesound.org/people/Goup_1/sounds/176480/

C key sound: https://freesound.org/people/Goup_1/sounds/176449/

D key sound: https://freesound.org/people/Goup_1/sounds/176516/

E Key sound: https://freesound.org/people/Goup_1/sounds/176526/

F Key sound: https://freesound.org/people/Goup_1/sounds/176491/

G Key sound: https://freesound.org/people/Goup_1/sounds/176509/

Beat 1: https://freesound.org/people/ezwider7227/sounds/184479/ (Cut short, increased bass)

Beat 2: https://freesound.org/people/Spol/sounds/337671/ (Cut short, increased bass)

Input: LeapMotion_fingertips_15inputs : http://www.doc.gold.ac.uk/~mas01rf/WekinatorDownloads/wekinator_examples/all_source_zips/LeapMotionViaProcessing.zip

This code is mostly unchanged; the only parts I added were changing the port and host, drawing only one hand, and sending features for one hand at a time.

Output

TriggerTEXTSimple: http://www.doc.gold.ac.uk/~mas01rf/WekinatorDownloads/wekinator_examples/all_source_zips/Processing_TriggerText_1DTW.zip

This was heavily edited by me: I added the ddf.minim library for audio, and I added two arrays, one for choosing the letter to display and one for selecting the audio file.

Playing each sound when an output is detected required a somewhat involved if statement, so that the sounds are not spammed.
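As an illustration of that check, using the names from the output sketch in section 3 (and not necessarily my exact condition):

```java
// Re-trigger the sample only when the detected class changes, or when the same class
// is still held after the note has finished; otherwise every incoming OSC message
// would restart the sound and "spam" it.
if (received != currentClass || !notes[received - 1].isPlaying()) {
  currentClass = received;
  notes[received - 1].rewind();
  notes[received - 1].play();
}
```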

I also used Wekinator and Featurnator, and experimented with both:

Featurnator: https://www.doc.gold.ac.uk/~mas01rf/WekinatorDownloads/FeaturnatorNov018/Featurnator-v2.1.1.0b.exe

6. Instructions

Step 1: Make sure your Leap Motion sensor is connected and the Leap Motion service is running.

Step 2: Open all the input programs, which are stored in the folders LeapMotionASL & LeapMotionForBeats (they are Processing sketches, and each one detects one hand).

Step 3: Open all the output programs; they are in the folders "SoundOutputBeats" & "SoundOutputASL" and are also Processing sketches.

Step 4: Run both Featurnator projects, "LeapMotionForBeats.wej" & "ASLLeapMotion.wej", and make sure both are running on the right ports:

LeapMotionForBeats runs on oscP5 port 11500 and port 7448.

ASLLeapMotion runs on oscP5 port 12000 and port 6448.
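For reference, this is roughly how those port numbers would appear inside the Processing sketches (an assumption based on Wekinator's usual setup, not copied from my code): the oscP5 number is the port the sketch listens on, and the other number is the port the corresponding Featurnator project listens on.

```java
// ASL sketches: listen on 12000, send features to the model on 6448.
// Beats sketches: the same pattern with 11500 and 7448.
import oscP5.*;
import netP5.*;

OscP5 oscP5;
NetAddress featurnator;

void setup() {
  oscP5 = new OscP5(this, 12000);
  featurnator = new NetAddress("127.0.0.1", 6448);
}
```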

Step 5: Make sure both programs are running, then imitate the gestures of the ASL alphabet for ASLLeapMotion to pick them up.

Step 6: For LeapMotionForBeats, hold a normal flat hand facing the Leap Motion, then make a fist to start Beat 1 or Beat 2.

Conclusion

Overall, I am satisfied, but I could definitely use better data-analysis techniques and improve the program's accuracy so that it recognizes gestures more easily. If I had more time, I would try to make the Leap Motion input send 30 features covering both hands, so that only one input program is needed; this might be more accurate, but it would require more hours and maybe even a new way of drawing the fingers to give better visual feedback.
