Robotic hand that can see for itself

Using Deep Learning to build a cheaper, vision prosthetic arm.

There are about 2 million amputees in the U.S. alone, and that number is expected to nearly double to 3.6 million by 2050. However, current prosthetics aren’t accurate with their grasp and grip control. In cheaper prosthetics, grasp (the ability to interact with real-world objects) must be controlled manually by the user, which requires a lot of training and is still often not very accurate.

Although there are more automated ways to control grasp, these prosthetics can end up costing $100,000+, making them unaffordable for most users.

To solve this problem, I’ve been working alongside mentors on building a cheaper, more automated and more accurate grasping program for a robotic prosthetic arm using deep learning.

Robotic prosthetic arm I’ve been building!

Before I get into how I built this, let’s better understand how today’s prosthetics work and more about the problem.

How do Prosthetics Work?

A natural arm carries grasp commands from the brain to the hand through nerves and muscles. We essentially try to achieve the same thing with a prosthetic arm so that wearers can still do daily tasks (although most prosthetics do this without returning any sense of touch to the wearer). Today, when the wearer of a prosthetic arm wants to grab something, there are three main ways this signal is sent:

  1. The grip mechanism might be controlled mechanically, via a cable attached to the opposite shoulder.
  2. The signal to grip is detected using myoelectric sensors, which read muscle activity from the skin.
  3. Sensors implanted inside the muscle measure nerve signals directly.

1. Body-Powered Mechanical Control

Image for post

There are two main types of body-powered hand prostheses:

  • Voluntary Open: opens the hand when applying tension to the cable
  • Voluntary Close: closes the hand when applying tension to the cable

→ Barriers/Current Problems:

  • Rejection rate 16–58% (often uncomfortable)
  • Restrict the arm’s range of movement
  • Difficult to finely control grasp and grip

2. Myoelectric Sensors

Image for post

Myoelectric prostheses often use two electrode sites to sense muscle contractions from two major muscle groups.
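To make the two-site idea concrete, here is a minimal sketch of how two EMG envelopes could be mapped to open/close commands. The channel names, the RMS envelope, and the threshold value are all illustrative assumptions, not the firmware of any real prosthesis:

```python
import numpy as np

def rms(window: np.ndarray) -> float:
    """Root-mean-square amplitude of one EMG window."""
    return float(np.sqrt(np.mean(window ** 2)))

def decode_intent(flexor: np.ndarray, extensor: np.ndarray,
                  threshold: float = 0.1) -> str:
    """Map two-channel EMG activity to a hand command (assumed scheme):
    both channels quiet -> rest; otherwise the louder channel wins."""
    f, e = rms(flexor), rms(extensor)
    if f < threshold and e < threshold:
        return "rest"
    return "close" if f > e else "open"

# Example: a strong flexor burst vs. a quiet extensor channel.
t = np.linspace(0, 1, 200)
flexor_burst = 0.5 * np.sin(2 * np.pi * 50 * t)
quiet = np.zeros(200)
print(decode_intent(flexor_burst, quiet))  # "close"
```

Real systems add filtering, calibration per user, and proportional control, but the core decision is this kind of comparison between the two sites.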

→ Barriers/Current Problems:

  • Batteries need to be recharged often
  • Lengthy training period to adapt
  • Difficult to finely control grasp and grip

3. Implanted Myoelectric Sensors (IMES)

Image for post

The IMES system is a device used to improve the signal quality and consistency of myoelectric signals for prosthetic control.

  • The end caps serve as electrodes for picking up EMG activity during muscle contraction.
  • Reverse telemetry (via a coil around the arm) is used to transfer data from the implanted sensor, and forward telemetry is used to transmit power and configuration settings to the sensors.
  • The coil and associated electronics are housed within the frame of a prosthesis. A control system that sends data associated with muscle contraction to the motors of the prosthetic joints is housed in a belt-worn, battery-powered device. A cable attaches the control unit to the prosthetic frame.

An IMES is implanted into each targeted muscle that will be used to control a function of the prosthetic arm. Two devices are needed per degree of freedom (DOF): one device controls finger opening and another controls finger closing.

→ Barriers/Current Problems:

  • Invasive
  • Difficult to finely control grasp and grip

The Problem

  • Large training times
  • Not adaptive
  • Expensive
  • Inaccurate grasp and grip control

What if there were a way to have the full functionality of a prosthetic arm with shorter training times, at a much lower price, and with more precise grasp and grip control?

That was my goal with this project. Specifically, I mounted a USB camera onto a 3D printed prosthetic arm that, using a Convolutional Neural Network (CNN), is able to detect objects and identify ways for the arm to manipulate them. This makes for a cheaper and more effective alternative to current prosthetic arms. Specifically:

  • Expensive — solved by using a 3D printed arm, which can be made for as little as $50.
  • Inaccurate grasp and grip control — solved by using various CNN methods.
  • Large training times — solved by using imitation learning to reduce the training time needed to design customized prosthetics. (I’ll briefly mention this in today’s article and cover it in depth in another.)
  • Not adaptive — solved by using various Reinforcement Learning techniques to work with different people and in different environments. (I’ll briefly mention this in today’s article and cover it in depth in another.)

My Process:

Pick and place mini-robot arm
  • Designing and Printing 3D Parts: I used Fusion 360 to modify existing 3D printed prosthetic arm files and got them printed:
3D Printed Arm files
  • Assembling Arm: after getting the 3D printed parts, I spent some time putting together the hand and the forearm and attaching the motors to the fingers:
Image for post
  • Training the CNN on real-world objects: after building my CNN model, I trained it on Cornell’s grasping dataset to identify grasping rectangles in images of real-world objects and send instructions to the arm to manipulate the objects accordingly.
Image for post
  • Mounting Camera, Sensors and Testing: I’m currently working on attaching depth sensors, mounting the camera and running my model on a GPU to test how it performs on novel objects.

For the rest of the article, I’ll be diving deeper into how I built the CNN model for object detection and grasping:

Object Detection and Grasping

  1. The camera would send frames of images to the GPU, which would identify the type of object and its grasping rectangles.
  2. Then send this information to the prosthetic arm.
  3. The arm could then move in a way that would allow it to best interact with the identified object.
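The three steps above can be sketched as a simple control loop. Every class and method name here is an illustrative placeholder, not this project’s actual API:

```python
from dataclasses import dataclass

@dataclass
class Grasp:
    x: float      # grasp centre in image coordinates
    y: float
    theta: float  # gripper orientation (radians)
    width: float  # gripper opening distance

class GraspModel:
    """Stand-in for the CNN running on the GPU."""
    def predict(self, frame) -> Grasp:
        # Placeholder for the forward pass over one camera frame.
        return Grasp(x=112.0, y=112.0, theta=0.0, width=40.0)

class Arm:
    """Stand-in for the motor controller on the prosthetic."""
    def execute(self, grasp: Grasp) -> str:
        # Placeholder: translate the grasp into joint commands.
        return f"moving to ({grasp.x:.0f}, {grasp.y:.0f}), rotating {grasp.theta:.2f} rad"

def control_loop(frames, model: GraspModel, arm: Arm):
    # 1. camera frame -> 2. model inference -> 3. arm motion
    return [arm.execute(model.predict(f)) for f in frames]
```

In the real system, step 2 runs on the GPU and step 3 goes over a serial link to the servo motors; the loop structure stays the same.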

For the robotic arm to interact with the identified object we need a grasping implementation which has the following sub-systems:

  • Grasp detection sub-system: To detect grasp poses from images of the objects in their image plane coordinates.
  • Grasp planning sub-system: To map the detected image plane coordinates to the world coordinates.
  • Control sub-system: To determine the inverse kinematics solution of the previous sub-system.
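As one concrete piece of the grasp planning sub-system, mapping image-plane coordinates to world (camera-frame) coordinates can be done with a pinhole camera model once a depth reading is available. The intrinsics below are assumed values for illustration, not a calibration of my camera:

```python
import numpy as np

def pixel_to_camera(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Standard pinhole back-projection: pixel (u, v) at a given depth
    (metres) -> 3D point (X, Y, Z) in the camera frame.
    fx, fy: focal lengths in pixels; (cx, cy): principal point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Example: a grasp detected at pixel (320, 240) with 0.5 m depth,
# using made-up intrinsics for a 640x480 camera.
p = pixel_to_camera(320, 240, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(p)  # [0.  0.  0.5]
```

A further fixed transform (camera pose relative to the arm base) then takes this point into the arm’s coordinate frame for the inverse kinematics step.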

The architecture I used includes a grasp region proposal network for the identification of potential grasp regions. The network then partitions the grasp configuration estimation problem into regression over the bounding box parameters, and classification of the orientation angles, from RGB-D image inputs.

Grasp Configurations

We can use the 5-dimensional grasp rectangle as the grasp representation here; it describes the location, orientation, and opening distance of a parallel gripper prior to closing on an object. The 2D oriented rectangle, shown below, depicts the gripper’s location (x, y), orientation θ, and opening distance (h). An additional parameter describing the length (w) is added for the bounding box grasp configuration.
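For reference, the 5-dimensional rectangle (x, y, θ, h, w) can be expanded into its four corner points, which is useful for visualisation or for computing rectangle overlap during evaluation. This is the standard oriented-rectangle conversion, not code from this project:

```python
import numpy as np

def grasp_to_corners(x, y, theta, h, w):
    """Corner points of an oriented grasp rectangle centred at (x, y).
    theta: orientation (radians), h: opening distance, w: rectangle length."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])           # 2D rotation matrix
    half = np.array([[-w / 2, -h / 2],        # corners before rotation
                     [ w / 2, -h / 2],
                     [ w / 2,  h / 2],
                     [-w / 2,  h / 2]])
    return half @ R.T + np.array([x, y])      # rotate, then translate

corners = grasp_to_corners(100, 50, 0.0, h=20, w=60)
print(corners)
# [[ 70.  40.]
#  [130.  40.]
#  [130.  60.]
#  [ 70.  60.]]
```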

As seen in c), each element in the feature map is an anchor and corresponds to multiple candidate grasp proposal bounding boxes.

Image for post

CNN Model:

  • Grasp Proposals: This is the first stage of the deep network and it aims to generate grasp proposals across the whole image. The Grasp Proposal Network works as a mini-network over the feature map.
  • Grasp Orientation as Classification: Rather than performing regression, we formulate the input/output mapping as a classification task over discretized grasp orientations. If none of the orientation classifiers outputs a score higher than the non-grasp class, the grasp proposal is discarded.
  • Multi-Grasp Detection: This last stage identifies candidate grasp configurations. It classifies the predicted region proposals from the previous stage into grasp configuration parameters, and refines each proposal bounding box into a non-oriented grasp bounding box (x, y, w, h).
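The orientation-as-classification step can be sketched as follows: quantise θ into a fixed number of bins, reserving one extra class for "non-grasp". The bin count below is an illustrative assumption, not necessarily what the network uses:

```python
import numpy as np

R = 19  # number of orientation bins (assumed for illustration)

def theta_to_class(theta: float) -> int:
    """Map an angle in [-pi/2, pi/2) to a class index 1..R (0 = non-grasp)."""
    theta = (theta + np.pi / 2) % np.pi       # shift into [0, pi)
    return 1 + int(theta / (np.pi / R))       # class 0 is reserved for non-grasp

def class_to_theta(c: int) -> float:
    """Bin centre of orientation class c (c >= 1)."""
    return -np.pi / 2 + (c - 0.5) * (np.pi / R)

print(theta_to_class(0.0))                    # 10 (the middle bin)
print(class_to_theta(theta_to_class(0.3)))    # close to 0.3, up to quantisation
```

Training then uses an ordinary softmax cross-entropy over these R+1 classes, and at inference the predicted bin centre is read back out as the grasp angle.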
Structure of multi-object multi-grasp predictor

To Summarize:

  • Blue blocks (on the above image) indicate network layers and gray blocks indicate images and feature maps.
  • Green blocks show the two loss functions. The grasp proposal network slides across anchors of intermediate feature maps from ResNet-50, with k = 3×3 candidates predicted per anchor.


The Cornell dataset is preprocessed to fit the input format of the ResNet-50 network, which mostly consists of resizing the images (to 227×227) and substituting in the depth channel.
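A rough sketch of that preprocessing, assuming nearest-neighbour resizing and the common choice for the Cornell dataset of substituting depth into the blue channel (the exact channel and normalisation here are assumptions):

```python
import numpy as np

def preprocess(rgb: np.ndarray, depth: np.ndarray, size: int = 227) -> np.ndarray:
    """rgb: (H, W, 3) uint8, depth: (H, W) float -> (size, size, 3) float32."""
    h, w = depth.shape
    # Nearest-neighbour resize via index sampling (avoids extra dependencies).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    rgb_r = rgb[rows][:, cols].astype(np.float32) / 255.0
    d_r = depth[rows][:, cols]
    # Normalise depth to [0, 1] so it matches the colour channel range.
    d_r = (d_r - d_r.min()) / (d_r.max() - d_r.min() + 1e-8)
    out = rgb_r.copy()
    out[..., 2] = d_r          # substitute depth into the blue channel
    return out

x = preprocess(np.zeros((480, 640, 3), np.uint8), np.random.rand(480, 640))
print(x.shape)  # (227, 227, 3)
```

Keeping three channels means the pretrained ResNet-50 weights can be reused without changing the first convolution layer.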

Example Outputs

Image for post

Future Applications: Reinforcement Learning

The ultimate goals with this section are to figure out:

  1. How to automatically adapt prosthetic arm control systems to the needs of individual patients.
  2. How to improve limb control based on patient feedback over time.

Both of these would make the arm more adaptive and personalized.

Using RL, we can get an agent to learn the optimal policy for a sequential decision-making task without complete knowledge of the environment.

The agent first explores the environment by taking actions, then updates its policy according to the reward function to maximize cumulative reward. Algorithms like Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) can be used to train the agent.
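All of these algorithms share the same explore-then-update loop. As a toy illustration of that loop (not of DDPG/TRPO/PPO themselves, which handle continuous actions), here is tabular Q-learning on a tiny "reach the target" task:

```python
import random

random.seed(0)

N = 5                                 # states 0..4; state 4 is the target
Q = [[0.0, 0.0] for _ in range(N)]    # Q[state][action]; actions: 0=left, 1=right

def step(s, a):
    """Deterministic environment: reward 1 only on reaching the target."""
    s2 = min(N - 1, max(0, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

for episode in range(300):
    s, done = 0, False
    while not done:
        a = random.randrange(2)        # explore uniformly (off-policy)
        s2, r, done = step(s, a)
        # One-step Q-learning update: Q <- Q + lr * (target - Q)
        Q[s][a] += 0.5 * (r + 0.9 * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy moves right from every non-terminal state.
print([max((0, 1), key=lambda i: Q[s][i]) for s in range(N - 1)])  # [1, 1, 1, 1]
```

A prosthetic controller would replace the table with a neural network and the toy reward with patient-derived feedback, but the explore/reward/update cycle is the same.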

Image for post

We could also use imitation learning, where the learner tries to mimic an expert’s actions in order to achieve the best performance, possibly by implementing the DAgger algorithm.
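A minimal sketch of the DAgger idea: roll out the learner, query the expert for the correct action in the states the learner actually visits, aggregate those labels into the dataset, and retrain. The toy task, lookup-table policy, and expert below are all illustrative scaffolding:

```python
import random

random.seed(1)
STATES = list(range(10))

def expert(s):
    """Demonstrator policy: even states -> action 0, odd states -> action 1."""
    return s % 2

def train(dataset):
    """'Train' a lookup-table policy on (state, expert_action) pairs."""
    return {s: a for s, a in dataset}

def rollout(policy, n=5):
    """States visited under the current policy (random states in this toy)."""
    return [random.choice(STATES) for _ in range(n)]

dataset = [(s, expert(s)) for s in rollout({}, n=3)]    # initial demonstrations
policy = train(dataset)
for _ in range(10):                                     # DAgger iterations
    visited = rollout(policy)                           # 1. run the learner
    dataset += [(s, expert(s)) for s in visited]        # 2. expert labels, aggregate
    policy = train(dataset)                             # 3. retrain on all data

agreement = sum(policy.get(s) == expert(s) for s in STATES) / len(STATES)
print(agreement)
```

The key difference from plain behaviour cloning is step 2: the expert labels the learner’s own state distribution, which prevents compounding errors when the learner drifts off the demonstration trajectories.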

I know I just dropped a bunch of algorithms but if you’re interested I’ll be going deeper into this in a future article.

Next Steps:

  • I’m going to test the model on real-world objects to see how it performs.
  • I’ll research how we could feasibly apply RL to improve the design, training and adaptability of prosthetic arms.
  • If you want to stay posted on my progress, feel free to follow me or reach out on Twitter.

That’s it for now ✌


Hi. I’m Alishba.

If you want to learn more about this project or have any questions, let’s connect!




Written by

I’m a passionate 16-year-old on a journey to solve the world’s most important problems using emerging tech like Machine Learning and Blockchain.
