ROS! What is it good for?

4 min readDec 30, 2022

ROS is known for its steep learning curve which means that you’ll need to invest a lot of time initially to even get an overview of what it does. Here, I’ll try to speed up that process — starting with an introduction to the vocabulary/jargon of the tool — which I believe can help jump-start the understanding. This is mostly a very high-level overview with some quirky insights.

ROS (Robot Operating System) is a middleware, and middleware is? — Think of it as an interface that helps with establishing a network of programs (or nodes).

Most of the jargon here is borrowed from network programming.

Nodes? Well, they are executable codes that compute some functions. These nodes or programs can be written in cpp, python, or lisp. This is one of the key attributes of ROS — it is language agnostic — meaning — it can be written in any language.

Okay, so if there is another node that uses the output of this node as input, we’ll need to establish a connection, and to do so — we use topics.

So, what are topics? A topic can be thought of as a user-defined bus and topic names are unique. This carries certain messages which can be any of the primitive data types — string, int, or float.

The ROS master is called to keep track of all the nodes.

The node which talks or sends data is called the publisher node and the node that receives the data is known as the subscriber node.

This way of representation is called as the computation graph.

Consider the illustration below, inside the circle are the nodes that we want to use and we can establish a connection between them using topics. Now, when we have one too many nodes to call, we can use API/wrapper functions to call all the nodes with just one script. A small detour -

Illustration: Nodes, launch and configuration files

API (Application Programming Interface) acts as a kind of interface or bridge between different systems, allowing them to interact and share functionality in a structured and defined way.

A wrapper function, on the other hand, is a function that “wraps” or encloses another function, usually to extend or modify its behavior.

The key difference between them is — API works in the outermost layer or it’s a level higher in abstraction than a wrapper function.

The scripts that we use to call the nodes are called launch files and they are written in XML (eXtensible Markup Language). We also use YAML (YAML Ain’t Markup Language) to write configuration files.

But why do we use XML and YAML? Both are data-serialization languages — this means they can translate a data structure into a format that can be stored or transmitted and reconstructed later. The subtle difference between them is — XML is generally verbose and uses tags, whereas YAML has a more flexible structure and is more concise.

Levels of abstraction

The illustration above depicts the abstraction layers that we are dealing with — starting from the low-level hardware abstraction layer that is written in low-level languages — followed by — nodes which are generally written in high-level languages — one top that — we use human-readable programming languages like XML, YAML to write API which helps us communicate with the nodes — lastly — we have GUI (Graphical User Interface) this layer is the easiest to work with — where we can visualize the system using some of ROSs’ built-in plugins — but this limits the degree of freedom, as in we have limited control over the function behavior. Nevertheless, these existing plugins mainly serve as analysis tools which are quite helpful to debug and as well as understand the working.

Striking a balance!

Illustration: Balancing learning approaches

Lastly, this is my takeaway from learning ROS over the last few months and I think this can be generalized as well. So, I started reading the ROS documentation (top-down approach) and then realized it was too much information to absorb at once. So to expedite the process I started going through numerous repos to learn through pattern recognition and replicating the same (bottom-up approach). And I kept wavering between these two approaches but I think I leaned a little toward the pattern recognition side. If I had stroked a better balance between them and cycled between them more frequently, I would have learned quicker (maybe).

Alas, hindsight is 20/20!

ROS! What is it good for?

This way of representation is called as the computation graph.

Levels of abstraction

Striking a balance!

Written by Raghav Navaratna