How to Add Object Recognition Abilities to a Robot Arm

Simplify programming for uFactory xArm with Rebellum.

Antonio Cerruto
8 min read · May 18, 2024
A custom object recognition model trained on Rebellum recognizes an avocado.

This is a guide for using the Rebellum software application to train object recognition models and deploy them on uFactory xArm robot arms. Rebellum was built by r4robot, a robotics research studio.

Getting a robot arm to respond to camera input can be a powerful way to extend its capabilities. Rebellum offers a simple interface for doing this. Object recognition models can be trained using any webcam and deployed on the robot within the same app. You’ll need a subscription to deploy computer vision models, but training models is available in the free app.

Let’s take a look at how this is done.

Hardware Setup

When Rebellum boots up, it lands on the Robot tab for programming robot motions through waypoints.

Home screen of the Rebellum software for programming uFactory xArm robots through waypoints.

To enable the robot, select your hardware, enter your robot’s IP address, and click the Connect button. Remember to edit your computer’s network settings according to the uFactory user manuals in order to establish a connection with the robot arm.

Configure robot hardware (arm model, end effector, and IP address) in order to connect to the arm.

Within the hardware settings, in the end effector drop-down menu, you can configure new end effectors, or edit or remove existing end effectors. You can also view settings for built-in end effectors.

Create, view, edit, or remove end effector settings through the end effector drop-down menu under Hardware settings.

Programming via Waypoints

To set a new waypoint, move the robot arm to the desired location and click the Add button below the command drop-down. Clicking Add saves whichever command is currently selected in the drop-down; in this case, that command is “Go to Waypoint”. Notice that your robot’s position values are recorded and saved as a new waypoint.

The robot arm can be moved through the coordinate or joint buttons, or it can be set to Manual Mode and moved into position by hand. Fine-tune robot position using the arrows next to each position value (x, y, z, roll, pitch, yaw).

Fine-tune robot position with the arrows next to each position value (x, y, z, roll, pitch, yaw).

Along with position, each waypoint contains a few important waypoint variables: payload (in kilograms), gripper status, speed, the type of move, and the radius of the move (in millimeters) blending sequential linear moves. Each of these can be edited inline after a waypoint is recorded, or they can be set at the top menu before waypoints are recorded.

Waypoint variables of payload, grip, speed, move type, and radius can be edited inline after a waypoint is recorded.


Payload

Make sure this value reflects the weight of the object(s) carried by the arm at a particular waypoint. It should be 0 when no object is actively carried by the arm.


Grip

This is the gripper position or state value. For the xArm Gripper, position 0 corresponds to a fully closed position. For the xArm Vacuum Gripper, a state of 0 corresponds to suction off, and 1 corresponds to suction on.


Speed

The speed can be set for each waypoint individually. The default speed is 10%. Be careful when increasing speed, as higher speeds can pose safety risks.

Move Types

Moves can be ‘Linear’, ‘Circle’, or ‘Any’. ‘Linear’ moves are straight-line paths to the set waypoint. ‘Circle’ moves require three points to define a circle. The first point defining a circle can be Linear type, but the next two points must be Circle type. ‘Any’ moves let the robot choose the most efficient path to the waypoint while avoiding self-collision.
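As a concrete illustration, the three-point rule for ‘Circle’ moves can be expressed as a small validation check over a sequence of move types. This is a sketch of the rule as described above, not Rebellum’s actual validation logic:

```python
def circle_moves_valid(move_types):
    """Check the Circle-move rule: a circular arc needs three points,
    where the first may be another move type (e.g. Linear) and the
    next two must be Circle moves."""
    i = 0
    n = len(move_types)
    while i < n:
        if move_types[i] == "Circle":
            # A Circle point needs a preceding anchor point and a
            # second Circle point immediately after it.
            if i == 0 or i + 1 >= n or move_types[i + 1] != "Circle":
                return False
            i += 2  # consume the pair of Circle points
        else:
            i += 1
    return True

circle_moves_valid(["Linear", "Circle", "Circle"])  # valid arc
circle_moves_valid(["Linear", "Circle", "Linear"])  # invalid: lone Circle
```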


Radius

This is the blend radius applied to Linear moves, which offers a smooth transition between linear waypoints.

Propagating Edits

To give every waypoint the same value for a particular waypoint variable, click the arrow above that variable in the top menu to propagate its value to all waypoints at once.
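Conceptually, each waypoint bundles a position with its variables, and propagation simply copies one field across the whole list. A minimal Python sketch, using illustrative field names and the defaults described above (not Rebellum’s actual data model):

```python
from dataclasses import dataclass

@dataclass
class Waypoint:
    # Position: x, y, z in mm; roll, pitch, yaw in degrees.
    position: tuple
    payload_kg: float = 0.0    # weight carried at this waypoint
    grip: int = 0              # gripper position or state value
    speed_pct: int = 10        # default speed is 10%
    move_type: str = "Linear"  # Linear, Circle, or Any
    radius_mm: float = 0.0     # blend radius for Linear moves

def propagate(waypoints, variable, value):
    """Copy one waypoint-variable value to every waypoint at once."""
    for wp in waypoints:
        setattr(wp, variable, value)

program = [Waypoint((300, 0, 200, 180, 0, 0)),
           Waypoint((300, 100, 200, 180, 0, 0))]
propagate(program, "speed_pct", 25)  # every waypoint now moves at 25%
```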

Inline Edits

Waypoint variables other than position values can be edited inline by simply clicking on the value to be edited. Variables for all other commands can be edited this way, too.

Editing waypoint move type inline.

Building a Sequence of Commands

To build a robotic program, you will sequentially add commands through the Add button. These commands can move the robot to a waypoint, wait for a set period of time, repeat a section of previous commands, read external digital inputs and perform commands conditioned on those inputs, or set digital outputs on the robot control box. You can also program commands conditioned on objects recognized by a connected camera. To see a list of all possible command options, click on the command drop-down.

Drop-down menu showing command options.
Respond to digital and analog inputs connected to the xArm control box.
Control digital and analog outputs on the xArm control box.
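One way to picture this command model (sequential moves, waits, repeats, and input-conditioned steps) is as a small interpreter walking a command list. The command names and structure below are invented for illustration and do not reflect Rebellum’s internals:

```python
import time

def run_program(commands, read_input=lambda pin: 0, execute_move=print):
    """Walk a command list in order; a toy stand-in for a real executor."""
    i = 0
    outputs = {}
    while i < len(commands):
        cmd = commands[i]
        if cmd["type"] == "waypoint":
            execute_move(cmd["position"])
        elif cmd["type"] == "wait":
            time.sleep(cmd["seconds"])
        elif cmd["type"] == "repeat":
            # Re-run a section of previous commands a number of times.
            section = commands[cmd["start"]:i]
            for _ in range(cmd["times"]):
                run_program(section, read_input, execute_move)
        elif cmd["type"] == "if_input":
            # Skip the next command unless the digital input reads high.
            if read_input(cmd["pin"]) != 1:
                i += 1
        elif cmd["type"] == "set_output":
            outputs[cmd["pin"]] = cmd["value"]
        i += 1
    return outputs
```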

To use camera commands, you’ll first have to train your own computer vision model before deploying it on your robot. Rebellum provides a simple interface for quickly prototyping object recognition models. Let’s take a look at that next.

Connecting the Camera

Switch over to the Camera tab on the top-left corner of the Rebellum application. You’ll see that you can connect up to two cameras at a time. Click the checkbox next to one of the cameras to enable the live video feed.

Home screen of the Camera tab on Rebellum.

Camera selection

Any USB camera will do.

Camera Placement

While collecting training images, make sure the camera is positioned or installed in its intended deployment location. Training images should look as close as possible to the live camera feed during inference.

Built-in Hand Recognition Model

You can use Rebellum’s built-in hand recognition model to test computer vision commands before you train your own models. Select the Hand Recognition model from the model drop-down menu and click Run Model. Feel free to play around with this and see if you can get the model to detect the presence of a hand.

The Model Training Wizard

Finally, the fun part. Turn on at least one camera and click on the Create New Model button. This will start the step-by-step model training wizard.

Home screen of the Camera tab with one camera enabled, ready to create a new model.

Model Setup

Enter a descriptive name for your model, and at least one object to recognize. You can recognize up to five different objects with each model. In this example, we’ll train a model to recognize an avocado and a lemon. Click Next.

Set up the model by giving the model a name and entering up to five different objects to recognize.

Collect Background Images

The first step is to collect background images — these are images without any object of interest in view. The wizard will record video for 10 seconds once you hit the Record Background button. This works best if there is some activity in the video feed rather than a static image in view. Try shifting background objects or showing hands or shadows while the video records.

Prep screen before recording background video.
Video records for 10 seconds to collect data for the background and for each object.
Be sure to add some activity in the video feed while the video records for a more robust model.

Collect Object Images

Next, the wizard will record a 10-second video for each object to be recognized. It’s important to move the object into various orientations while the video records. If hands will not appear during inference, be sure to capture some frames without your hands in view. Keep objects near the center of the frame.

Place the object to be recorded in the video frame before recording the 10 second data collection video.
Place the object in various orientations and periodically remove hands from the frame, if hands will not appear during inference.
Repeat the data collection steps for all objects in the model.
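For a rough sense of how much data the wizard gathers: one 10-second background clip plus one 10-second clip per object, each sampled at the camera’s frame rate. The frame rate below is an assumption (Rebellum’s actual sampling rate isn’t documented here):

```python
def dataset_size(num_objects, fps=30, seconds_per_clip=10):
    """Estimate the labeled frames collected by the wizard: one
    background clip plus one clip per object, each seconds_per_clip
    long, at an assumed camera frame rate of fps."""
    clips = num_objects + 1  # +1 for the background clip
    return clips * fps * seconds_per_clip

# Two objects (avocado and lemon) at an assumed 30 fps:
dataset_size(2)  # -> 900 frames
```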


Once data has been collected for the background and each object, the model is ready to train. Click Train. Training will take a few minutes, with speed depending on your local hardware; expect anywhere from about one to seven minutes.

Click Train to train the model once data collection is complete.
Training time will depend on your local hardware, but it should range from 1 to 7 minutes.


Click Test New Model to see your model in action. Inference is performed on a live video feed for each enabled camera. The object recognized by each camera is displayed above the camera feed.

Inference is performed on a live video feed. The current object recognized by each camera is displayed above the camera feed. In this case, no object is recognized since there is no object in view.
A lemon is recognized by our newly trained model.
An avocado is recognized by our newly trained model, even with a hand present.
An avocado is recognized by our newly trained model.

Deploying Object Recognition Models via Waypoints

Back to the Robot tab

Back in the Robot tab, you will see your new model in the list of options when you add one of the camera commands. Select the model, the corresponding object to recognize, and the camera to use for a given camera command.

Select the model, the corresponding object to recognize, and the camera to use for a given camera command.

In this example, our robot waits for camera 0 to recognize an Avocado using our new Fruit Recognition model before moving on to the next waypoint.

The last two lines here program our robot to wait for camera 0 to recognize an Avocado using a custom Fruit Recognition model before moving on to the next waypoint.
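Outside the GUI, a wait-for-recognition command boils down to a polling loop over live inference results. Here is a hedged sketch with a stand-in recognize callback (the callback is hypothetical, not a Rebellum or xArm API):

```python
import time

def wait_for_object(recognize, target, camera=0, poll_s=0.1, timeout_s=30):
    """Block until recognize(camera) returns the target label.

    recognize: a callable returning the current label for a camera,
    standing in for the trained model's live inference."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if recognize(camera) == target:
            return True   # target object seen; proceed to next waypoint
        time.sleep(poll_s)
    return False          # timed out without seeing the target
```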

Saving Your Work

When you’re ready to save your work, use the Export button to save the program to your local machine. Use the Import button to load saved programs. Object recognition models are saved automatically and will be available for any new robot program whenever you open Rebellum.
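Conceptually, Export serializes the command list to a local file and Import reads it back. The JSON format below is invented purely for illustration; Rebellum’s actual export format may differ:

```python
import json

def export_program(commands, path):
    """Write a command list to a local file, as the Export button does."""
    with open(path, "w") as f:
        json.dump(commands, f, indent=2)

def import_program(path):
    """Load a previously saved program, as the Import button does."""
    with open(path) as f:
        return json.load(f)
```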

And there you have it. You’ve learned how to program a uFactory xArm robot arm through waypoints, how to train custom object recognition models, and how to deploy your object recognition models on the robot arm via waypoints, all on the Rebellum software application. Congrats!