Towards fully reconfigurable robots

From programming to training machines

Víctor Mayoral Vilches
Silicon Valley Robotics
11 min read · Mar 14, 2018

This article is a writeup of “Towards self-adaptable robots: from programming to training machines”, submitted to ICLR 2018 and available at https://arxiv.org/pdf/1802.04082.pdf. Written together with Irati Zamalloa Ugarte, Alejandro Hernández, Risto Kojcev and Nora Etxezarreta.

Figure depicts the 3 Degrees-of-Freedom (DoF) modular simulated robot in a SCARA configuration.

As pointed out in a previous article, today’s robotics landscape is still dominated by vertical integration: single vendors develop the final product, leading to slow advances, application-specific robots, expensive products and customer lock-in. The traditional approach to building robots has reinforced this reality:

Figure pictures the traditional approach for building robots, including a representation of the robotics control pipeline. The critical section, highlighted in red, marks the part of the process where each individual change to any of the contained blocks demands a complete re-execution of all the steps inside it.

It consists of a cascaded process, composed of different steps that go from the purchase of robot components to the deployment of the final robot for a given task. Once executed, it produces a certain output, and any modification to the robot demands a big engineering effort to maintain the same result. The critical section of the traditional approach has been highlighted: it marks the part of the process where each individual change to any of the contained blocks demands a re-execution of all the following steps inside it. Such a limitation makes building robots time- and resource-consuming. Still, this process remains the most popular in industry and leads to robots that lack flexibility and reconfigurability.

From a roboticist’s perspective, the traditional approach is better understood as follows:

  1. Buy component: Refers to the action of acquiring those robot components required to build the robot. Typically, this includes sensors, actuators, communication devices, power mechanisms, etc.
  2. Integration of components (critical section): Many of the devices used in robotics consist of mutually incompatible electronic components with different software interfaces. The task of configuring and matching all the components in a robot is known as the ‘integration effort’. Generally composed of diverse sub-tasks and demanding multidisciplinary knowledge, the integration effort supersedes many other steps in the process of building a robot.
  3. Robot assembly (critical section): This step captures the physical construction and assembly of the robot.
  4. Programming the robot (critical section): Programming the robot is done through the well-known sense-model-plan-act framework. The Figure above depicts the robotics control pipeline, which captures this framework. In our experiment, we implement the pipeline using ROS as follows:
    - Observations: In our robot, the observations correspond to the positions, velocities and efforts of the joints. These values are fetched from the servomotor_driver ROS package.
    - ‘State estimation’ + ‘Modeling and prediction’: Given the observations from the previous step, the traditional approach describes the pose of the robot by inferring a set of characteristics such as its position, orientation or velocity. Considering a SCARA robot, this estimation is the position of the end-effector, calculated from the forward kinematics (see the forward-kinematics sketch after this list). In our experiments, we calculate the kinematic data of the robot using several ROS packages. Among them, we made active use of the urdf package (able to read a file that represents the robot model as a tree-like structure) and the kdl package (the default numerical inverse kinematics solver in ROS, which can be used to publish the joint states and to calculate the forward and inverse kinematics of the robot). Mistakes in the observations lead to errors in the state estimation and are typically handled either in the state estimation or in the modeling and prediction step.
    - Planning: Consists of determining the actions required to execute the task. It is a complex undertaking, especially from a mathematical perspective, since we need to consider joint limits, collisions, obstacles, etc. In our experiment, planning implies calculating the points in space that the end-effector needs to follow to achieve its goal (reach a given point in space). We implemented a planning technique using the moveit ROS package suite; a minimal planning sketch follows this list. In particular, we make use of the move_group package, a node that acts as an integrator of the various abstractions that represent the robot and delivers actions or services, depending on the user’s needs, to facilitate the process of planning. Specifically, in our experiments, the move_group node collects the joint states and transforms them into ROS actions or services that are used in the blocks that follow. Knowing the starting pose of the robot, the desired goal pose and the geometrical description of the robot (and the world, both obtained through the urdf package and related ones), we proceed to execute motion planning: the technique of finding an optimal path that moves the robot gradually from the start pose to the goal pose. The output of motion planning is a trajectory in joint space in which the links of the robot never collide with the environment, avoid self-collision (collision between two robot links) and do not violate the joint limits.
    - Low-level control: The final step in the pipeline consists of transforming the ‘plan’ into low-level control commands that steer the robot actuators to execute a given task. In our implementation, using the ros_control ROS package suite, the generated trajectory talks to the controllers in the robot through the ros_controllers interface. This is an action interface in which an action server runs on the robot, and the move_group node initiates an action client that talks to this server and executes the trajectory on the real robot or on its simulated version.
  5. Test and adapt (critical section): This step refers to the process of validating the programmed logic and the overall performance of the robot. If valid, the machine can be deployed. Otherwise, one goes back to the integration of components (step 2), the robot assembly (step 3) or the programming of the robot (step 4). A change in one layer demands a complete re-execution of all the layers below, which makes the process slow, expensive and time-consuming.
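As an illustration of the ‘State estimation’ step above, here is a minimal sketch of the forward kinematics of a planar 3-DoF arm. The three revolute joints and the link lengths are illustrative assumptions, not the real robot’s dimensions; in practice this computation is delegated to the urdf and kdl packages mentioned earlier.

```python
import numpy as np

def scara_fk(q, lengths=(0.3, 0.3, 0.1)):
    # End-effector (x, y) position of a planar 3-DoF arm.
    # q: joint angles (q1, q2, q3) in radians.
    # lengths: link lengths in meters (illustrative values only).
    q1, q2, q3 = q
    l1, l2, l3 = lengths
    x = l1 * np.cos(q1) + l2 * np.cos(q1 + q2) + l3 * np.cos(q1 + q2 + q3)
    y = l1 * np.sin(q1) + l2 * np.sin(q1 + q2) + l3 * np.sin(q1 + q2 + q3)
    return np.array([x, y])

# Example: end-effector position for a given joint configuration
print(scara_fk((0.5, -0.3, 0.1)))
```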

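And as a sketch of the ‘Planning’ and ‘Low-level control’ steps, the snippet below plans and executes a reaching motion through the moveit_commander Python interface. The planning group name “scara_arm” and the target coordinates are assumptions for illustration.

```python
import sys
import rospy
import moveit_commander

# Initialize moveit_commander and a ROS node
moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("scara_reach_demo")

# "scara_arm" is an assumed planning group name for illustration
group = moveit_commander.MoveGroupCommander("scara_arm")

# Goal: reach a point in the workspace (coordinates are illustrative).
# For a 3-DoF arm, a position-only target avoids over-constraining
# the goal with a full 6-DoF pose.
group.set_position_target([0.3, 0.1, 0.2])

# Plan and execute: move_group computes a collision-free trajectory
# and streams it to the ros_control action server on the robot
success = group.go(wait=True)
group.stop()
group.clear_pose_targets()
```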

Discussion

In contrast to this traditional approach, as introduced by Mayoral et al. (2017)[1], modular robots promise interoperability and ease of re-purposing. The following figure illustrates the modular approach in the center:

Depicts three different strategies for building robots: a) the traditional approach; b) a modular approach, where interoperable modules can be used seamlessly to extend the robot; and c) the Modular And Self-Adaptable (MASA) approach, where modules, besides interoperating, are detected and configured automatically in the robot for its use.

When the modular approach is followed, the integration effort is removed and the critical section is reduced significantly. However, although the process of building robots, and particularly the integration of new robot modules, is simplified, the task of programming robots remains cumbersome. New modules, although interoperable, need to be introduced into the logic of the system manually. This implies that for each module addition or modification, a complete review of the logic that governs the robot’s behavior is needed. In other words, the adaptation capabilities of these systems are still limited.

Section c) of the Figure above illustrates the Modular And Self-Adaptable (MASA) approach for building robots. This approach radically changes the robot-building process: rather than being programmed, modular robots train themselves for a pre-defined task. By continuously integrating the information from its modules, based on an information model such as the one described by Zamalloa et al. (2018)[2], a robot is able to adapt automatically when new modules are added. This reduces both the human development effort and the time required significantly. The process of building a robot is simplified to defining a task, adding robot hardware modules and letting the robot train until it accomplishes the assigned task.

Introducing the Modular And Self-Adaptable (MASA) approach for building robots:

Represents the Modular And Self-Adaptable (MASA) approach for building robots. The critical section is highlighted in red. Step 4, automatic training, is drawn with a dashed line, implying that this process executes automatically and without any human effort.

The figure above depicts the MASA approach for building robots. Its critical section, much smaller than those of the other approaches, has been highlighted in red. Similar to what Brooks (1986)[3] proposed, MASA presents a mechanism to incrementally build intelligence for a given task. In the same report, Brooks argues that roboticists typically assume static environments; however, real-world scenarios are dynamic. We argue that, within these dynamic changes, robots are subject to errors in their measurements, ultimately tied to their imperfect sensing capabilities.

The MASA strategy for building robots can be summarized as follows:

  1. Buy module: This step refers to the action of acquiring those robot modules required to build the robot. Since the devices are modules, they are assumed to be interoperable, easy to integrate and re-use.
  2. Define task (critical section): This step refers to the process of defining the goal that the robot should accomplish in a mathematical form, so that the learning algorithms can build on it. Typically, in Reinforcement Learning (RL), a set of AI techniques, this mathematical expression is captured in what is called a ‘reward function’ that steers the learning process of the robot (see the sketch after this list).
  3. Robot assembly (critical section): We capture the physical construction of the robot in this step, which is simplified since all modules interoperate.
  4. Automatic training (critical section): In this step, together with reconfiguration mechanisms such as those proposed by Mayoral et al. (2017)[1] and Zamalloa et al. (2018)[2], we implement AI techniques that allow the robot to continuously integrate the information from its modules and dynamically adapt a neuromorphic model to fit the task defined in the previous steps. This way, regardless of the physical changes that happen in the robot (such as additions or removals of modules), the robot will automatically retrain itself for the task. In our particular implementation, we extend the Deep Reinforcement Learning (DRL) techniques proposed by Kojcev et al. (2018)[4] and include the reconfiguration ideas cited previously. Specifically, we apply Proximal Policy Optimization (PPO) (Schulman et al. (2017)[5]), which alternates between sampling data through interaction with the environment and optimizing a ‘surrogate’ objective obtained by clipping the policy probability ratio (sketched after this list). Noise is introduced on each iteration of the learning algorithm.
  5. Deploy: Once trained, this approach outputs a flag that signals the success or failure of the automatic training step. In the case of failure, the user can refine the task definition (step 2) or add additional modules to the robot (step 3) and let the training process iterate again automatically (step 4) until success.
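To make steps 2 and 4 concrete, below is a minimal sketch of what a task definition and the PPO clipped surrogate objective could look like. The dense reward (negative distance to the goal) is our assumption; the paper only states that a reward function steers the learning. The clipped objective follows Schulman et al. (2017)[5].

```python
import numpy as np

def reaching_reward(ee_pos, goal_pos):
    # Step 2 (task definition): a dense reward for the reaching task,
    # here assumed to be the negative Euclidean distance between the
    # end-effector and the goal point.
    return -np.linalg.norm(np.asarray(ee_pos) - np.asarray(goal_pos))

def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    # Step 4 (automatic training): PPO's clipped surrogate objective,
    # L = E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)],
    # where r is the policy probability ratio and A the advantage.
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    return np.minimum(unclipped, clipped).mean()
```

Maximizing the clipped objective keeps each policy update close to the previous policy, which helps the unattended training loop remain stable as modules change.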
Depicts the 3 Degrees-of-Freedom (DoF) real robot in a SCARA configuration

Preliminary results

Pictures the 3 Degrees-of-Freedom (DoF) simulated robot in a SCARA configuration

We present an experiment that aims to shed some light on the relevance of this new approach for building robots meant for real-world scenarios (subject to noise and errors) and on-the-spot testing. We compare the results obtained by the traditional approach and the MASA approach on a given task: the robot has to reach a given point in its workspace. The setup consists of a robot with 3 Degrees-of-Freedom (DoF) in a SCARA configuration. The robot is built and configured following both the traditional and the MASA approaches; the configuration of each robot follows from its building process and is either programmed or trained. Simulation is used to accelerate experimentation, applying, when appropriate, faster-than-real-time techniques as introduced by Kojcev et al. (2018)[4]. In a first experiment, the robot is built and programmed using traditional control-theory mechanisms and is subject to Gaussian noise (N(0, σ)) introduced on each of its joints to simulate imperfect sensing. In a second experiment, a modular robot is trained with motor joint observations perturbed by a similar Gaussian noise process on each iteration, as sketched below.
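A minimal sketch of this noise injection (how exactly it hooks into the pipeline is our assumption for illustration):

```python
import numpy as np

def noisy_joint_observation(q, sigma):
    # Zero-mean Gaussian noise N(0, sigma) added independently
    # to each joint reading, simulating imperfect sensing
    return q + np.random.normal(0.0, sigma, size=np.shape(q))

q = np.array([0.5, -0.3, 0.1])           # joint angles in radians
q_obs = noisy_joint_observation(q, 0.1)  # sigma = 0.1 rad (~5.73 deg)
```

A summary of the results is presented in the Table below: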

Displays the standard deviation (σ) of the Gaussian noise (N(0, σ)) applied to each one of the joints in the robot and the corresponding Root Mean Square Error (RMSE) obtained while the robot executes the task when following the MASA approach and the traditional approach. Best result per noise perturbation has been highlighted in bold.

The results show that the proposed approach for building robots outperforms the traditional one in the presence of noise, by up to an order of magnitude in some cases.

Arbitrary trajectories of the outputs of MASA for the given task under different levels of noise. The values on the x and y axes are in meters (m). An interesting observation is that, in the presence of Gaussian noise, MASA is able to adapt: although the trajectory overshoots the target, it returns to the goal, improving its accuracy. For example, the yellow line presents the output of MASA for a robot where each joint has been subject to an error of zero mean and 0.1 radians (5.73 degrees; refer to Table 1 for more details) of standard deviation. Still, it manages to get within 2 centimeters of the goal.

Conclusion

Most conventional testing conditions are virtually irrelevant when it comes to practical situations. We argue that self-adaptable robots, besides modularity, should not be programmed but rather trained incrementally using AI techniques. This leads towards more adaptable machines, able to face changes and evolve in parallel with the robotics ecosystem.

This writeup and its related article tackle the problem of how to build self-adaptable robots more effectively: robots that are easy to configure and re-purpose. We present the concept of a self-adaptable robot that makes use of modularity and Artificial Intelligence (AI) techniques to reduce the effort and time required to build it. We demonstrate how, rather than programming, training produces behaviors in the robot that generalize quickly and remain robust, even in the presence of noise. We show that training the robot at faster trajectory-execution times in simulation preserves the same accuracy when transferred to the real robot. We compare the trajectories produced by the same robot built using the traditional strategy and MASA, discuss the results and show how MASA leads to self-adaptable robots that could disrupt how these machines will be built and configured in the future.

We conclude that an integrated path for AI and Robotics demands a strong consideration of the process of building and configuring robots. In particular, we claim that modularity plays a key role in the convergence of AI and Robotics.

Glossary

Definition 1. component: a part of something that is discrete and identifiable with respect to combining with other parts to produce something larger (Source: ISO (2017)).
Definition 2. module: component with special characteristics to facilitate system design, integration, interoperability and re-use.
Definition 3. configuration (noun): the arrangement of a modular robot in terms of the number and type of modules used, the connections between those modules, and the settings for those modules, in order to achieve the desired functionality of the modular robot as a whole.
Definition 4. modularity: process of designing / building modules to facilitate easy robot configurations for different applications.
Definition 5. reconfiguration: altering the configuration of a modular robot in order to achieve an intended change in the functionality.
Definition 6. adaptation: the process of change by which a modular robot becomes better suited to its environment and its goal.
Definition 7. self-configuration (automatic configuration): achieving configuration of a modular robot through an automated process without human interaction except to initiate the process, if necessary.
Definition 8. self-reconfiguration (automatic reconfiguration): achieving reconfiguration of a modular robot through an automated process without human interaction except to initiate the process, if necessary.
Definition 9. self-adaptation (automatic adaptation): achieving adaptation of a modular robot through an automated process without human interaction except to initiate the process, if necessary.

References

[1] V. Mayoral, A. Hernandez, R. Kojcev, I. Muguruza, I. Zamalloa, A. Bilbao, and L. Usategi. The shift in the robotics paradigm: the Hardware Robot Operating System (H-ROS), an infrastructure to create interoperable robot components. In 2017 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), pp. 229–236, July 2017. doi: 10.1109/AHS.2017.8046383

[2] Irati Zamalloa, Inigo Muguruza, Alejandro Hernández, Risto Kojcev, and Víctor Mayoral. An information model for modular robots: the Hardware Robot Information Model (HRIM). arXiv preprint arXiv:1802.01459, 2018.

[3] Rodney A. Brooks. Achieving artificial intelligence through building robots. Technical report, Massachusetts Institute of Technology, Artificial Intelligence Laboratory, 1986.

[4] Risto Kojcev, Nora Etxezarreta, Alejandro Hernandez, and Víctor Mayoral. Evaluation of deep reinforcement learning methods for modular robots. arXiv preprint arXiv:1802.02395, 2018.

[5] John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
