Finger Tap

Pattakit Charoensedtakul
7 min read · Nov 4, 2023


Objective

Finger Tap is a project that uses artificial intelligence and machine learning to distinguish Parkinson’s patients from unaffected individuals based on a simple hand movement: repetitively pinching the fingers together. This is especially helpful as a cheap alternative to MRI scans and batteries of tests, which cost around $560 at one hospital.

Data

The data consists of videos of participants repetitively pinching their fingers together.

Image of the glove used when recording the videos.

There are three classes: Normal, Parkinson’s Disease, and Stroke, with 100, 60, and 110 videos respectively.

Data Source: https://cccnlab.co

Planning

Before starting, my plan was to use some kind of neural network to process time-series data, a model sensitive to temporal structure, such as an LSTM or RNN. Here was the plan:

  1. Extract Time-series data
  2. Calculate the Euclidean distance between fingers
  3. Train LSTM Model with distance (somehow)

Another plan was to extract a handful of features from the videos, such as the decay in tapping power over time, and then treat the result like any tabular data: passing it through simple machine learning models or a neural network. Some models I had in mind were decision trees and regression models. Here is my second plan:

  1. Extract Time-series data
  2. Extract prominent features
  3. Train Machine Learning model

Let’s go through how each plan went.

Data Extraction

To extract information from the videos, I needed to find the locations of the fingertips, which were marked red and blue on the glove. To accomplish this, my first approach went through every pixel of every frame and averaged the X and Y positions of each color.

Here is the simplified code:

for frame in video:
    for pixel in frame:
        if pixel is red:
            red_count += 1
            rX += pixel.X
            rY += pixel.Y
        elif pixel is blue:
            blue_count += 1
            bX += pixel.X
            bY += pixel.Y
        elif pixel is white:
            white_count += 1
            wX += pixel.X
            wY += pixel.Y
    # after each frame, divide each sum by its count to get the average position

This chunk of code worked, but it had one problem: it was painfully slow, taking over 10 minutes to process a single video. So I looked for another solution. After discussing the problem with others, a senior suggested using image moments.

Here is the revised code:

import cv2 as cv

for frame in video:
    # convert to HSV and isolate each marker color with a threshold mask
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    blue = cv.inRange(hsv, lower_blue, upper_blue)
    red = cv.inRange(hsv, lower_red, upper_red)
    white = cv.inRange(hsv, lower_white, upper_white)

    # image moments: m10/m00 and m01/m00 give each mask's centroid
    MR = cv.moments(red)
    MB = cv.moments(blue)
    MW = cv.moments(white)

    rX = int(MR["m10"] / MR["m00"])
    rY = int(MR["m01"] / MR["m00"])

    bX = int(MB["m10"] / MB["m00"])
    bY = int(MB["m01"] / MB["m00"])

    wX = int(MW["m10"] / MW["m00"])
    wY = int(MW["m01"] / MW["m00"])

This was much faster. The extracted coordinates were then written to CSV files for further processing.

Distance Calculation

After extracting the data, it was time to train the LSTM model. First, I calculated the Euclidean distance between the thumb and index finger for every frame. I expected a clear difference between Parkinson’s patients and normal individuals; I was very wrong.
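Per frame, that distance is just the Euclidean norm between the red and blue centroids. Here is a minimal sketch, assuming the per-frame coordinates from the extraction step were collected into lists (red_xs, red_ys, blue_xs, blue_ys are hypothetical names):

import numpy as np

# hypothetical per-frame centroid lists from the extraction step
rX, rY = np.array(red_xs), np.array(red_ys)    # thumb (red marker)
bX, bY = np.array(blue_xs), np.array(blue_ys)  # index finger (blue marker)

# Euclidean distance between thumb and index finger in each frame
distance = np.sqrt((rX - bX) ** 2 + (rY - bY) ** 2)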

Let’s inspect some distance-time graphs:

Three distance-time graphs from unaffected individuals

Looking at these three graphs from unaffected individuals, everything seemed normal. The frequency and maxima were what you would expect from someone with no trouble controlling their hands.

Let’s take a look at a Parkinson’s affected individual next:

Two distance-time graphs from Parkinson’s patients

There is a huge difference! The frequencies and maxima are all over the place. After digging deeper, however, I made a troubling discovery.

Here is another Parkinson’s affected individual’s distance-time graph:

Parkinson’s distance-time graph

I was confused. How did this make sense? The graph looks exactly like an unaffected individual’s. Despite this, I marched on.

LSTM Training

With the data in distance-versus-time format, I started training the model, but there were several restrictions. The first was the fixed input length: since the videos all had different lengths, the data had to be reshaped to a common size. Before writing any code, I plotted the distribution of sequence lengths.

The average length was about 960 frames, so I cut sequences longer than 960 and padded shorter ones with the tail of the graph up to the desired length. Looking back, this may have caused a problem: the padded values have no correlation with the real tapping pattern, which may throw off the model.
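A minimal sketch of that length normalization, assuming "filling with the tail" means repeating the final value (the exact padding scheme isn't spelled out above):

import numpy as np

TARGET_LEN = 960  # roughly the average sequence length

def fix_length(distance, target=TARGET_LEN):
    # truncate sequences that are too long
    if len(distance) >= target:
        return distance[:target]
    # pad short sequences by repeating the last value
    pad = np.full(target - len(distance), distance[-1])
    return np.concatenate([distance, pad])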

After clearing up the inconsistent data shapes, I stumbled upon another problem: how do you even begin designing an LSTM model? I asked my mentor, and he recommended some approaches; however, after some time spent fixing the program, I was stuck. This was when my mentor suggested using my second plan as a baseline. Since there was no guarantee that the LSTM would work, I should get an easier model working first before trying to push the accuracy higher.

This is why I switched to the second plan: extracting features from the data.

Preprocessing

The concept was simple: calculate the frequency of the data, the maximum distance, and the average distance. But how would I know which features were important? I went back to my mentor, and he suggested looking into the tsfresh library.

tsfresh is a library that extracts features from time-series data. Perfect! It produced over 200 features per video, turning each time series into a one-dimensional vector that standard models can easily consume.

https://tsfresh.readthedocs.io/en/latest/text/introduction.html
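For reference, here is a minimal sketch of a typical tsfresh call; the long-format layout, column names, and file name are assumptions, since the exact data frame isn't shown:

import pandas as pd
from tsfresh import extract_features

# long format: one row per (video, frame), with hypothetical columns
#   id | time | distance
df_long = pd.read_csv("distances_long.csv")  # hypothetical file name

features = extract_features(
    df_long, column_id="id", column_sort="time", column_value="distance"
)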

Here is what the data looked like:

Extracted Features from time-series data

Another necessary step was data scaling. While some features ranged from -0.82 to 0.95, others ranged from over 16,000 down to under 76. Scaling everything to between 0 and 1 equalizes the features’ influence, which matters for models that are sensitive to feature magnitude. scikit-learn’s MinMaxScaler was used to accomplish this.

https://datascience.stackexchange.com/questions/43972/when-should-i-use-standardscaler-and-when-minmaxscaler
https://ourcodingclub.github.io/tutorials/data-scaling/
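A minimal sketch of that scaling step; the X_train/X_val split is assumed from the training section below:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()  # maps each feature to the [0, 1] range

# fit on the training features only, then reuse the same min/max on the
# validation set so no information leaks between the splits
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)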

Here is the scaled data:

Scaled extracted features

Training

Many machine learning models were tested: Decision Tree, Gradient Boosting, and Logistic Regression. The most effective, however, was the Random Forest: it ran fast and was the most accurate. Right off the bat, it reached 69.2% accuracy with default hyperparameters and the entropy criterion.

https://medium.com/@mrmaster907/introduction-random-forest-classification-by-example-6983d95c7b91
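A sketch of that baseline, assuming scikit-learn’s RandomForestClassifier; the fixed random_state is my addition for reproducibility:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# default hyperparameters, except the entropy split criterion
clf = RandomForestClassifier(criterion="entropy", random_state=42)
clf.fit(X_train_scaled, y_train)

print(accuracy_score(y_val, clf.predict(X_val_scaled)))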

To improve this further, Optuna was used to search for better hyperparameters. After 1,000 trials, the best configuration had a max depth of 28, min samples split of 2, min samples leaf of 1, and the gini criterion, for an accuracy of 78.8%.
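A minimal sketch of such an Optuna search; the search ranges and the cross-validated objective are assumptions:

import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def objective(trial):
    # sample one hyperparameter configuration per trial
    clf = RandomForestClassifier(
        max_depth=trial.suggest_int("max_depth", 2, 32),
        min_samples_split=trial.suggest_int("min_samples_split", 2, 10),
        min_samples_leaf=trial.suggest_int("min_samples_leaf", 1, 10),
        criterion=trial.suggest_categorical("criterion", ["gini", "entropy"]),
    )
    return cross_val_score(clf, X, y, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=1000)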

Random Forest Confusion Matrix

There are problems with this, however. Each time the train and validation sets were recreated, the accuracy varied, since scikit-learn’s train_test_split function splits the data randomly. More data would be needed to build a stable model.

https://www.sharpsightlabs.com/blog/scikit-train_test_split/
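For reference, a typical split looks like the sketch below; stratify is my addition to keep the class proportions consistent, and fixing random_state makes one split reproducible without curing the underlying variance:

from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)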

I also tested neural networks. As expected, the unscaled dataset performed horribly, reaching 65% accuracy at most; the scaled dataset, on the other hand, was on par with the default Random Forest. Overfitting also had to be considered: with 10,000 epochs and large models, the training set reached 100% accuracy while the validation set sat around 50%. To combat this, a small network trained for fewer epochs was used. After some optimization, the highest accuracy was around 73%, with a 235–250–500–250–3 network, ReLU activations, a 0.0001 learning rate, and 145 training epochs.

https://www.geeksforgeeks.org/artificial-neural-networks-and-its-applications/
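A sketch of that final architecture; the post doesn't name a framework, so Keras is assumed here, with integer class labels:

from tensorflow import keras

# 235–250–500–250–3 network with ReLU activations (framework assumed)
model = keras.Sequential([
    keras.layers.Input(shape=(235,)),             # 235 scaled tsfresh features
    keras.layers.Dense(250, activation="relu"),
    keras.layers.Dense(500, activation="relu"),
    keras.layers.Dense(250, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),  # Normal / Parkinson's / Stroke
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(X_train_scaled, y_train, epochs=145,
          validation_data=(X_val_scaled, y_val))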

Analysis

Looking at the confusion matrix, it seems the model had a hard time classifying Parkinson’s Disease. One explanation may be the uneven class distribution: there were only 60 videos for Parkinson’s, compared with 100 for Normal and 110 for Stroke.

Although the Random Forest and the neural network were very close in accuracy, the Random Forest comes out on top in training time and prediction speed, making it the best model from these experiments.

Future Plans

Even though these experiments have a long way to go before reaching healthcare standards, it was a fun journey.

Here are some plans for the future:

  1. Data Balancing
  2. Increased hyperparameter testing
  3. Utilization of different models
  4. Optimization of Neural Network
  5. LSTM Training

Acknowledgments

Firstly, without Brain Code Camp’s program, I would never have looked into this field of study at all. I appreciate everyone behind the program for giving me the opportunity to explore computational neuroscience and for teaching me artificial intelligence and machine learning techniques.

Secondly, I appreciate P’ Mos and P’ Nut for helping me throughout the process, from choosing a project idea to finalizing this Medium post. Your mentorship has certainly been helpful, and it has been a pleasure to work under your guidance.
