The Pink Stapler Test

How hidden correlations in your dataset can inject bias into your deep learning model!

Salam El Bsat
Jan 16, 2022

This is part of the [Data-Driven] series, where I try to bring you observations told by data, or by the lack of it.

Stereotypes — Photo by Christian Lue on Unsplash

This one is about an outlier in the testing data faced by a group of researchers at Google while training a deep neural network to learn “Hand-Eye Coordination for Robotic Grasping”. In other words, they trained a model to predict how likely a series of actions is to successfully pick up an object.

The series of actions is split into two main tasks: (a) how to reach the object, and (b) how to grasp it. While reaching an object is a relatively well-studied task, grasping one is still heavily under research: the shape, weight, and texture of the object all affect how the robot goes about solving the problem, from the approach angle, to the pressure needed, and multiple other factors.

To simplify the grasping approach, let’s assume that there are primarily two ways to grasp an object as shown in the figure below:

  1. [To the right] — grabbing the object from both sides, used for solid objects
  2. [To the left] — pushing one of the manipulator’s fingers through the center of the object and placing the other on one of the sides to hold it tight and avoid slipping, used for soft-textured objects
Grasps chosen for objects with different material properties. Note that the soft sponge was grasped with a very different strategy from the hard objects
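If we were writing this selection logic by hand, the two strategies above would reduce to a trivial rule keyed on texture alone. This is only an illustrative sketch (the paper's policy is learned, not rule-based), but it makes the later failure easier to see: the learned model ended up keying on color instead of texture.

```python
def choose_grasp(texture: str) -> str:
    """Toy rule: pick a grasp strategy from the object's texture.

    This is an illustration, not the learned policy from the paper.
    """
    if texture == "soft":
        # One finger through the center, the other on a side.
        return "pinch"
    # Fingers on both sides of a solid object.
    return "side-grip"

print(choose_grasp("soft"))  # pinch
print(choose_grasp("hard"))  # side-grip
```

A hard-textured stapler should always land in the `side-grip` branch here, regardless of its color — which is exactly what the trained model failed to do.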

After training the model over a portion of the data, an interesting behavior was noticed during testing. It had to do with the manipulator attempting to grasp a stapler. As you would have guessed, the stapler, being a hard-textured object, should be grasped by placing the fingers of the manipulator at both ends of the object. Yet during testing, the robot repeatedly tried to grasp the stapler by embedding one of its fingers into it, as if it were a soft-textured object.

Investigating this case further, and repeating the test with similar staplers of different colors, the robot was able to grasp them properly. Clustering the objects by shape, size, and color revealed that most of the soft-textured objects introduced to the robot at that stage were baby clothing. Baby clothing, as you may know, predominantly comes in light pastel colors. And the color of the stapler was… light pastel PINK!

Pastel Colors — Photo by Andrew Ridley on Unsplash

Clearly, the model at that point had learned a strong correlation between pastel colors and soft textures, and decided that embedding a finger into pastel-colored objects was the best approach for grasping them.

If this shows us anything, it is how important it is to diversify your dataset and check for correlations between columns — not just in their raw form, but also across different possible clusterings of the categorical data.
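One cheap way to surface this kind of hidden dependency is a contingency table plus a chi-square test over pairs of categorical columns. Here is a minimal sketch on hypothetical data, where the color family of each object almost perfectly tracks its texture label — the shape of the correlation that misled the grasping model:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical object inventory: color family vs. texture label.
# "pastel" almost always coincides with "soft"; the lone
# pastel/hard row plays the role of the pink stapler.
df = pd.DataFrame({
    "color":   ["pastel"] * 9 + ["bold"] * 9 + ["pastel"],
    "texture": ["soft"] * 9 + ["hard"] * 9 + ["hard"],
})

# Cross-tabulate the two categorical columns.
table = pd.crosstab(df["color"], df["texture"])
print(table)

# A small p-value flags a strong dependency between the columns.
chi2, p, dof, expected = chi2_contingency(table)
print(f"p-value = {p:.4f}")
```

Running a check like this for every pair of categorical columns (or derived clusters, such as binning raw colors into pastel/bold) before training would have flagged that color and texture are far from independent in this dataset.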

It is worth noting that this caveat was accounted for during the course of the experiment. For more details about the dataset, the experiment, or other study outcomes, you can check the full paper here.

As a quick update on Google’s recent efforts in this domain, X, a subsidiary of Alphabet, the parent company of Google, wrote back in November about how their robots are slowly leaving the lab to learn around campus in an attempt to come one step closer to everyday robots. You can read more about it in their recent blog post.


Salam El Bsat

Senior Manager @Big4, exploring the intersection of Data & Product Management for what makes a good product. twitter.com/SalamElBsat