MAGPIE: An Open-Source Force Control Gripper With 3D Perception

Streck Salmon
Correll Lab
May 11, 2024

There are a myriad of robotic arms, but very few choices when it comes to robotic grippers, particularly those with built-in force control and perception. This article explores the outer and inner workings of the MAGPIE gripper, an intelligent robotic object manipulator developed at the Correll Lab at the University of Colorado, Boulder. The gripper’s hardware design was created by Stephen Otto during his Master’s thesis, and the software for planning, perception (utilizing the RealSense with Open3D), and interfacing with the UR5 was developed by Dylan Kriegman as part of his senior thesis. Alongside this, James Watson also made significant contributions to perception and planning software. The original paper, published by Correll, Otto, Kriegman, and Watson, can be found here.

MAGPIE robot and its dependencies

Equipped with sensors and 3D perception capabilities, the gripper provides support for a variety of tasks that are critical for autonomy, including object detection, segmentation, and point cloud processing. Its design enables force-controlled manipulation and symbolic re-planning, allowing it to adjust its actions dynamically based on real-time feedback. The system emphasizes manufacturability, using commercial off-the-shelf parts and open-source software, making it a robust solution for various automation challenges.

This article describes the physical robot and breaks down the software components that operate it. We will walk through how these components are integrated to solve basic tasks, using a block-stacking task as a running example. Much of the software described in this article is open source and can be used in other projects.

Architecture and Design Choices

The following is an overview of the different hardware and software components which make up the MAGPIE gripper/manipulation pipeline. We will also describe the rationale behind each design choice we made.

The MAGPIE Gripper

The MAGPIE robotic gripper features a robust design built from commercial off-the-shelf components, including motors and cameras, which keeps manufacturing costs low and simplifies assembly. Weighing 450 g, the gripper can also be mounted on low-cost arms that provide only a 1 kg payload. The physical design of the gripper uses a dual-motor 4-bar linkage mechanism. The 4-bar linkage requires less space than a full-size parallel-jaw gripper such as the Franka Emika Panda gripper. The dual-motor setup enables independent operation of each finger. Finally, the motors and gripper mechanism are designed to optimize the field of view for the integrated Intel RealSense D405 camera, which is mounted in the palm. This setup minimizes occlusions and distortion, allowing the camera to maintain visibility of objects up until contact is made.

The core benefits of the gripper are discussed below:

  • Easy access to force readings: the MAGPIE robotic gripper was designed to provide precise control over its operations, especially for manipulating sensitive or delicate objects. By integrating AX-12 motors, we enable real-time adjustments to the torque settings through functions like torqueLimit and speedLimit. This capability allows us to fine-tune the force applied by the gripper, ensuring that it can handle a wide range of materials safely and efficiently. These settings are crucial for tasks requiring careful force management and are easily accessible through our motor control software, allowing for immediate adaptation to task requirements (see the sketch after this list).
  • 4-bar linkage: the choice to implement a 4-bar linkage mechanism was driven by the need to optimize the operational range and maintain an unobstructed view for the integrated Intel RealSense D405 camera. This setup enhances the gripper’s mechanical efficiency and precision, facilitating a clear visual path right up to the point of contact with an object. The linkage design not only supports accurate object manipulation but also simplifies the mechanics, making the system robust yet straightforward to maintain.
  • Low cost: one of the core objectives in developing the MAGPIE gripper was to create a highly functional tool that remains affordable. This was achieved by utilizing commercial off-the-shelf components like the AX-12 servo motors, which are both cost-effective and easy to find. This approach reduces the overall cost of the gripper and ensures that replacement parts are obtainable, enhancing the gripper’s practicality for widespread adoption in educational and research settings.
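
Because the AX-12 servos expose their control table over the Dynamixel protocol, torque limits, speeds, and present load (a rough proxy for grip force) can be read and written directly. The sketch below uses the Dynamixel SDK to cap the torque and read the load of one finger motor; the serial port, baud rate, and motor ID are assumptions that would need to match your setup, and this is an illustration of the underlying protocol rather than MAGPIE's own Motor_Code module.

from dynamixel_sdk import PortHandler, PacketHandler

# AX-12 control table addresses (Dynamixel Protocol 1.0)
ADDR_TORQUE_LIMIT = 34   # 2 bytes
ADDR_PRESENT_LOAD = 40   # 2 bytes

PORT = "/dev/ttyACM0"    # assumed serial port of the servo adapter
BAUD = 1_000_000         # AX-12 default baud rate
MOTOR_ID = 1             # assumed ID of one finger motor

port = PortHandler(PORT)
packet = PacketHandler(1.0)   # AX-12 speaks Protocol 1.0
port.openPort()
port.setBaudRate(BAUD)

# Cap the torque the servo may apply (0-1023), analogous to torqueLimit(600)
packet.write2ByteTxRx(port, MOTOR_ID, ADDR_TORQUE_LIMIT, 600)

# Read the present load; bit 10 encodes direction, the lower 10 bits the magnitude
load_raw, result, error = packet.read2ByteTxRx(port, MOTOR_ID, ADDR_PRESENT_LOAD)
magnitude = load_raw & 0x3FF
direction = -1 if load_raw & 0x400 else 1
print(f"present load: {direction * magnitude} (raw counts)")

port.closePort()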

Open-source Software

A general overview of the open-source projects used in the perception and planning pipeline:

  • Open3D — toolset for 3D data processing, visualization, and spatial analysis
  • Fast Downward — automated planner for creating action sequences in complex decision-making scenarios
  • PyTrees — framework for managing robotic AI decision-making through structured behavior trees (see the sketch after this list)
  • OWL-ViT — zero-shot object detection model integrating CLIP’s multi-modal features with a transformer architecture
  • Segment Anything (SAM) — image segmentation model capable of isolating specific parts of images
  • NumPy — numerical computing with support for multi-dimensional arrays and mathematical functions
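
To give a flavor of how behavior trees structure decision-making, here is a minimal PyTrees sketch. The behaviors are placeholder Success nodes rather than MAGPIE’s actual perception and manipulation behaviors, so treat it as an illustration of the framework, not of the project’s real tree.

import py_trees

# Placeholder leaves; a real tree would wrap perception and motion actions
detect = py_trees.behaviours.Success(name="DetectBlocks")
plan = py_trees.behaviours.Success(name="PlanStacking")
execute = py_trees.behaviours.Success(name="ExecutePlan")

# A Sequence runs its children in order and fails if any child fails
root = py_trees.composites.Sequence(name="StackBlocks", memory=True)
root.add_children([detect, plan, execute])

tree = py_trees.trees.BehaviourTree(root)
tree.tick()                                   # one pass through the tree
print(py_trees.display.ascii_tree(root))      # print the tree structure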

How Does MAGPIE Solve a Task?

In this section, we’ll focus on describing how MAGPIE solves a task, using the block task and a Universal Robots UR5 arm as an example. The block task is a simple exercise involving three uniquely colored blocks, where the robot is tasked with stacking the blocks in a pre-defined order.

Diagram describing the pipeline

The MAGPIE robot integrates several software modules to handle various aspects of robotic control, perception, and task planning:

RTDE Control and Receive (rtde_control, rtde_receive):

  • These modules interface with the UR5 robot arm. They are used for sending commands to the robot and receiving data from it, respectively.
  • RTDEControlInterface handles sending motion commands and configurations to the robot.
  • RTDEReceiveInterface is used to get the robot's state and feedback. A minimal usage sketch follows this list.
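
As a rough illustration of the ur_rtde API (not MAGPIE’s wrapper code), the sketch below connects to a UR5, reads the current tool pose, and commands a small linear move; the IP address and motion targets are assumptions.

import rtde_control
import rtde_receive

ROBOT_IP = "192.168.0.100"   # assumed IP of the UR5 controller

rtde_c = rtde_control.RTDEControlInterface(ROBOT_IP)
rtde_r = rtde_receive.RTDEReceiveInterface(ROBOT_IP)

# Current tool pose as [x, y, z, rx, ry, rz] in meters / axis-angle radians
pose = rtde_r.getActualTCPPose()
print("current TCP pose:", pose)

# Move 5 cm straight up at 0.25 m/s with 0.5 m/s^2 acceleration
target = list(pose)
target[2] += 0.05
rtde_c.moveL(target, 0.25, 0.5)

rtde_c.stopScript()   # release control of the robot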

Motor Control (Motor_Code):

  • Manages the operations of the gripper’s servo motors, including setting torque and speed limits.

UR5 Interface (UR5_Interface): A custom interface that encapsulates interactions with the UR5 robot, making it easier to control the robot arm and gripper as a unified system.

RealSense (RealSense): Manages the Intel RealSense camera for 3D perception, initializing the connection and processing the captured data to generate point clouds and RGB-D images.

Object Detection (ObjectDetection): Utilizes the RealSense camera data to detect and locate objects. This module is crucial for identifying blocks and their positions relative to the robot.

Task Planner (TaskPlanner): After objects are detected, this module plans the tasks required to manipulate the blocks according to specified goals.

The following sections will detail how the robot detects objects, step by step, utilizing the integration of the various software modules and libraries described previously.

Initial Setup and Integration

The first step in tackling any task is initializing the robot into the appropriate state.

The robot’s IP and communication ports are initialized, establishing a connection with the UR5 robot arm through the RTDE (Real-Time Data Exchange) Control and Receive interfaces.

The RTDEControlInterface is responsible for sending motion commands to the robot, while the RTDEReceiveInterface gathers state feedback and other data from the robot.

import rtde_control
import rtde_receive

robotIP = "192.168.0.100"   # example IP address of the UR5 controller

# Connect to the UR5 over RTDE: one interface sends commands, the other reads state
con = rtde_control.RTDEControlInterface(robotIP)
rec = rtde_receive.RTDEReceiveInterface(robotIP)

# Serial port of the gripper's Dynamixel servos
servoPort = "/dev/ttyACM0"

# Motors comes from MAGPIE's Motor_Code module
gripperController = Motors(servoPort)
gripperController.torqueLimit(600)   # cap the torque applied by the fingers
gripperController.speedLimit(100)    # cap finger closing speed

# UR5_Interface (imported here as ur) wraps the arm and gripper as one system
ur = ur.UR5_Interface()
ur.gripperController = gripperController
Initialization of the gripper controller and motors

Motor and Gripper Configuration: A Motors instance is created, connecting to the servo port specified. Here, torque and speed limits are set for the gripper’s servo motors to ensure safe and effective manipulation of the blocks.

UR5 Interface Setup: An instance of UR5_Interface is created and configured with the previously set up control and receive interfaces as well as the motor settings. This step ensures that all components of the robot arm and gripper work cohesively.

RealSense Camera Initialization and Data Retrieval

The Intel RealSense camera is initialized using the RealSense class. The camera is used in the pipeline to capture point clouds and RGB-D images.

# RealSense (imported here as real) is MAGPIE's wrapper around the Intel RealSense camera
real = real.RealSense()
real.initConnection()   # start streaming from the camera

The getPCD() method from the RealSense instance retrieves the point cloud data (PCD) and RGB-D images, providing the visual data necessary for identifying and locating the blocks on the task surface.
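
For readers who want to see what such a conversion looks like outside MAGPIE’s wrapper, the sketch below builds an Open3D point cloud from a color/depth pair captured with pyrealsense2. The stream resolutions and the use of the color stream’s intrinsics are assumptions, and getPCD() may do this differently internally.

import numpy as np
import open3d as o3d
import pyrealsense2 as rs

# Start the RealSense pipeline with depth and color streams, aligned to color
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.rgb8, 30)
profile = pipeline.start(config)
align = rs.align(rs.stream.color)

frames = align.process(pipeline.wait_for_frames())
depth = np.asanyarray(frames.get_depth_frame().get_data())
color = np.asanyarray(frames.get_color_frame().get_data())

# Camera intrinsics of the color stream, needed to back-project depth pixels
intr = profile.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()
o3d_intr = o3d.camera.PinholeCameraIntrinsic(intr.width, intr.height,
                                             intr.fx, intr.fy, intr.ppx, intr.ppy)

# Fuse color and depth into an RGB-D image, then into a point cloud
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    o3d.geometry.Image(color), o3d.geometry.Image(depth),
    convert_rgb_to_intensity=False)
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, o3d_intr)

pipeline.stop()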

Object Identification and Localization

The perception aspects of MAGPIE mostly take place in the ObjectDetection module. The ObjectDetection class takes the data from the RealSense camera and performs object detection using the pre-loaded OWL-ViT model weights.

# Loading the models
# OWL-ViT weights for zero-shot, text-prompted object detection
path = "google/owlvit-base-patch32"
self.label_vit = LabelOWLViT(path)
# Segment Anything (SAM) checkpoint for mask generation
ckpt = "/home/streck/work/owlvit_segment_anything/sam_vit_h_4b8939.pth"
self.mask_sam = MaskSAM(ckpt)

This module uses the camera's images to detect the blocks and ascertain their exact positions and orientations relative to the robot.

# Making a call to the ViT model
pcd, rgbdImage = detector.real.getPCD()                                  # latest point cloud and RGB-D frame
blocks = detector.getBlocksFromImages(rgbdImage, urPose, display=True)  # detect blocks and estimate their poses

The final step is to analyze the RGB-D images and segment out the colored blocks based on their unique characteristics. The robot uses this information to understand where each block is located in its operational space. In this step, we use the OWL-ViT recognition model, prompting it with a text description of the object we’re looking for.
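
LabelOWLViT and MaskSAM are thin wrappers from the owlvit_segment_anything project; the sketch below shows roughly what the detection step does underneath, using the Hugging Face transformers implementation of OWL-ViT directly. The text prompts, image file, and score threshold are assumptions for illustration.

import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open("workspace.png")   # assumed RGB capture of the workspace
prompts = [["a red block", "a yellow block", "a blue block"]]

# Encode the image and text prompts together and run zero-shot detection
inputs = processor(text=prompts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs into boxes, scores, and labels in image coordinates
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, threshold=0.1, target_sizes=target_sizes)[0]

for box, score, label in zip(results["boxes"], results["scores"], results["labels"]):
    print(prompts[0][label], [round(v, 1) for v in box.tolist()], round(score.item(), 3))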

Task Planning and Execution

Task planning and execution occur in the TaskPlanner module. With the blocks detected and their positions known, the TaskPlanner takes over. This module receives the identified blocks segmented from a point cloud as input and uses a predefined goal (e.g., stacking the blue block on the yellow block) to generate a sequence of actions using PDDL and Fast Downward in conjunction.

The plan generated is then used with the PDDL library to dictate the movements required by the UR5 arm and the gripper to rearrange the blocks according to the specified order. The code below generates PDDL files, which are then passed to a function that interprets these files as plans using Fast Downward.

# PDDL generation
initDict = self.getProblemArguments(blocks)   # current block poses -> initial state facts
problem = bp.blocksProblem()
problem.generate_domain_pddl()                # writes domain.pddl
problem.generate_problem_pddl(                # writes problem.pddl from the initial and goal states
    init={'initDict': initDict},
    goal={'goalDict': self.goalDict}
)

# Fast Downward call: A* search with the LM-Cut heuristic
os.system("/home/andreamiller/ris/downward/fast-downward.py domain.pddl problem.pddl --search 'astar(lmcut())'")
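
Fast Downward writes the resulting action sequence to a plan file (sas_plan by default). A minimal sketch of how such a plan could be read back and turned into robot actions is shown below; the pick and place helpers and the action names are hypothetical placeholders, not MAGPIE’s actual execution code.

def pick_block(block):
    print("pick", block)                  # placeholder: move above the block, close the gripper

def place_block(block, target):
    print("place", block, "on", target)   # placeholder: move above the target, open the gripper

# Read the plan produced by Fast Downward (one action per line, e.g. "(pick blue)")
with open("sas_plan") as f:
    plan = [line.strip("() \n").split() for line in f if line.startswith("(")]

# Hypothetical dispatch: map planner actions onto arm and gripper motions
for action, *args in plan:
    if action == "pick":
        pick_block(args[0])
    elif action == "place":
        place_block(*args)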

The robot executes the planned actions generated using PDDL and Fast Downward. As the robot moves, it continuously monitors its position and the position of the blocks, adjusting its actions as necessary to ensure precise placement.
