Going to the edge of space safely, with OpenStratos and Rust

Iban Eguia Moraza
May 15, 2017


Hi there, I’m Iban, a software developer who loves to try new things, currently loving the Rust programming language. I will be publishing coding examples about how I solve certain issues in Rust. Feel free to comment or ask for help ;)

Of the huge number of projects that I would like to implement, I’m currently focusing my time on two: SUPER Android Analyzer and OpenStratos. The first is a desktop analyzer for Android applications that produces reports about potential vulnerabilities in Android apps. The second is a project to send balloons to the stratosphere, with all the code and the final hardware integration developed directly by me and some collaborators. I will talk about OpenStratos in this post.

OpenStratos has gone to the stratosphere twice, and things didn’t go really well. The first time, we lost the probe after receiving some mayday messages. We were able to recover it a month later though, thanks to my mobile phone number being attached to the capsule, and we learned tons of useful information.

The second time was even worse. After the launch confirmation message, we didn’t receive any extra communication, and never recovered the probe. This means we almost didn’t learn anything from it, since all the logs get stored in the probe itself.

After these issues, I decided to do something about it. I had been learning Rust for some time (almost two years now) and I knew the potential of that memory-safe, concurrent language. Two of the features I knew would impact OpenStratos positively were the type-level control flows checked at compilation time and the exhaustive pattern matching: no way to forget about one possible outcome.

The original OpenStratos version was written in C++, so I would need to port the whole code-base to a new language. Not only that: since in the first version we had issues with SMS messaging and information not reaching the ground, I wanted to add real-time telemetry to the probe.

Hardware overview

Hardware setup for OpenStratos Hypathia (first launch).

Before diving into the code, I would like to give a short explanation of the hardware in OpenStratos. The main computer is a Raspberry Pi. Originally it was a Raspberry Pi A+, but in the new version it will be replaced by a Raspberry Pi Zero. This platform gives us well-tested hardware with upstream Linux support. It also gives us a great camera accessory that we will use to take pictures and video with surprisingly good quality, and its weight and power consumption are really low: ideal for going to the edge of space.

Apart from the main computer and camera, the probe needs to know its current position (in 3D, to detect launch and balloon burst). For this, the probe is equipped with a stratosphere-tested GPS module. Of course, we wanted to do all the GPS frame parsing ourselves, and that will be part of another post.

Additionally, OpenStratos has one communication mechanism, a GSM module (Adafruit FONA) that has an ADC (analog-to-digital converter) connector that we use to check main battery usage. It can also report on the second (GSM) battery. Once again, the whole GSM messaging is implemented by us using the message protocol for the FONA module. This module is used to send SMSs at different milestones of the mission (such as launch preparation, launch confirmation and landing, among others). The main issue is that it must be powered off to save battery once we get over 1.5km altitude or so (there is no GSM connectivity there).

For this new version, and due to the issues we had with the GSM, we’re adding a real-time telemetry mechanism. For that, we are using some long range XBee PRO modules. These modules can connect up to 45km in direct line of sight, and since our balloon will get to about 30–35km in altitude, it should be enough for most of the time (we will follow the balloon from ground). Even if we lose some telemetry, it’s not critical as long as we can find the probe after landing and as long as we get enough information in case of a failure. GSM will be used as a backup. XBees will be preconfigured with XCTU (Digi’s software to configure XBees) to be able to use transparent serial. No need to implement all XBee logic if we have only two modules, one server (in the probe) and one client (in a laptop in the chasing car).
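As a rough sanity check on that range, the geometry can be sketched in a few lines of Rust. This is only an illustration: it assumes flat ground, a perfectly clear line of sight and the 45km figure quoted above, and the function name is made up for this example (it is not part of OpenStratos).

```rust
/// Largest horizontal distance (in km) at which a radio with `max_range_km`
/// of line-of-sight range can still reach a balloon flying at `altitude_km`.
/// Returns `None` when the balloon is higher than the radio range itself.
/// Assumption: flat ground, no terrain, no antenna pattern, no curvature.
fn max_ground_distance_km(max_range_km: f64, altitude_km: f64) -> Option<f64> {
    if altitude_km > max_range_km {
        return None;
    }
    // Pythagoras: range^2 = altitude^2 + ground_distance^2
    Some((max_range_km.powi(2) - altitude_km.powi(2)).sqrt())
}

fn main() {
    for altitude in [10.0_f64, 30.0, 35.0].iter() {
        match max_ground_distance_km(45.0, *altitude) {
            Some(d) => println!("at {} km: stay within {:.1} km", altitude, d),
            None => println!("at {} km: out of range", altitude),
        }
    }
}
```

Under these assumptions, at 35km of altitude the chasing car would have to stay within roughly 28km of the point below the balloon, which is why following it from the ground matters.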

Of course, all this needs power, and the probe uses one lightweight LiPo (lithium polymer) battery to power it for at least 6 hours (potentially even 10). This battery might need upgrading for the new XBee modules. Some extra components such as a heatsink for the computer and some connectors are used, but they are not needed to understand the software-facing hardware setup.

Software logic

OpenStratos’ balloon logic is pretty straightforward. It has a state machine that starts with an initialization state and ends with a shut down. It also has two special states: an eternal loop if there is no GPS to detect launch or landing, and a safe mode, in case something goes wrong. The original version had a recovery mode in case the safe mode went wrong, which is not expected to be needed in the new Rust version.

I don’t want to go into details for each state, but I would like to introduce a bit of each so that the final implementation can be understood.

The initialization state is in charge of checking the camera and turning on the GPS and the GSM. Once telemetry is implemented, this state will also initialize it and connect with the client. Logging will be initialized even before the state machine is in place. Once this is finished, the balloon will switch to a new state, where it will wait until the GPS gets a fix from satellites.

Once the fix is in place, a new state will send an SMS (I still have to check if this is really needed once we have telemetry) to confirm that the balloon is ready for launch. Until this SMS is sent, errors will shut down the probe with the logs being stored in a data folder. Once this SMS is sent, the balloon assumes that it could be launched anytime, so it will never shut down (the OS could restart/shut down in extreme temperatures, but not the core software, except in really critical conditions).

The balloon will then start recording video and wait for launch, detected by a change in altitude. Once in flight, it will send a launch confirmation SMS (once again, I will have to check if it’s needed), and enter a new state. Going up is a simple state. It will log the altitude milestones (5km, 10km…) once in a while in a log file, and will wait for the balloon burst. Meanwhile, another thread will be in charge of taking pictures, videos and so on. Not important right now.
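The milestone logging described above could look something like the following sketch. The 5km step comes from the text; the function name, the use of metres and the idea of comparing two consecutive altitude readings are assumptions made for this example only.

```rust
/// Step between logged milestones, in metres (5km, 10km, ...).
const MILESTONE_STEP_M: u32 = 5_000;

/// Returns the milestones (in metres) crossed while climbing from
/// `prev_m` to `current_m`. Returns an empty list when descending
/// or when no milestone lies between the two readings.
fn milestones_crossed(prev_m: u32, current_m: u32) -> Vec<u32> {
    // First milestone strictly above the previous reading.
    let first = prev_m / MILESTONE_STEP_M + 1;
    // Last milestone at or below the current reading.
    let last = current_m / MILESTONE_STEP_M;
    // An inclusive range with first > last is simply empty.
    (first..=last).map(|n| n * MILESTONE_STEP_M).collect()
}

fn main() {
    let readings = [4_200_u32, 4_900, 5_100, 9_800, 15_300];
    for pair in readings.windows(2) {
        for milestone in milestones_crossed(pair[0], pair[1]) {
            println!("reached {} m", milestone);
        }
    }
}
```

Comparing consecutive readings means a milestone is logged once, even if several GPS samples arrive while the balloon is close to it.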

The balloon burst will be detected by a sudden change in the altitude (going down) or, if that first check fails, by detecting that the balloon has gone down 1km from the maximum altitude. Once the balloon starts its descent, slowed down by the parachute, no more pictures will be taken (only video), and the new state will log the altitude milestones backwards. It will detect the landing once it is below 4km and stays at the same altitude for some time. It will try to send up to 3 SMSs when going down, just in case something goes wrong on landing (you know, RUDs, rapid unscheduled disassemblies, happen).
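A minimal sketch of the two burst checks might look like this. The 1km fallback comes from the description above; the 50m "sudden drop" threshold, the type name and the method names are hypothetical values chosen for the example, and a real implementation would also have to cope with GPS noise.

```rust
/// Drop between two consecutive samples considered "sudden" (hypothetical value).
const SUDDEN_DROP_M: f64 = 50.0;
/// Fallback from the text: 1km below the maximum altitude seen so far.
const FALLBACK_DROP_M: f64 = 1_000.0;

/// Keeps the maximum altitude and the previous sample to detect the burst.
struct BurstDetector {
    max_altitude_m: f64,
    last_altitude_m: Option<f64>,
}

impl BurstDetector {
    fn new() -> Self {
        BurstDetector {
            max_altitude_m: f64::NEG_INFINITY,
            last_altitude_m: None,
        }
    }

    /// Feeds a new altitude sample; returns `true` once a burst is detected.
    fn update(&mut self, altitude_m: f64) -> bool {
        // First check: a large drop since the previous sample.
        let sudden = match self.last_altitude_m {
            Some(last) => last - altitude_m >= SUDDEN_DROP_M,
            None => false,
        };
        self.last_altitude_m = Some(altitude_m);

        // Second check: 1km or more below the highest point reached.
        if altitude_m > self.max_altitude_m {
            self.max_altitude_m = altitude_m;
        }
        let fallback = self.max_altitude_m - altitude_m >= FALLBACK_DROP_M;

        sudden || fallback
    }
}

fn main() {
    let mut detector = BurstDetector::new();
    for altitude in [30_000.0_f64, 30_010.0, 29_900.0].iter() {
        println!("{} m -> burst: {}", altitude, detector.update(*altitude));
    }
}
```

The fallback check is what catches a slow descent, where no single pair of samples differs enough to trigger the first check.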

This will change its state to a landing state. Once landed, it will send an SMS with the landing coordinates, then wait for 10 minutes and send another SMS. The second one is just in case it landed on something that moves, such as a river or the top of a truck.

Once the second SMS is sent, the shut-down state will shut down all peripherals and shut the probe down.

Implementing a compile-time safe state machine in Rust

The whole point of this post was not only to introduce OpenStratos, but to give some insights on how I implemented its state machine in 100% safe and compile-time checked Rust. In the C++ version, the implementation was rather rudimentary. Only a simple while with an enumeration: I had to implement all the error-prone logic by hand. I knew Rust could do better.

The whole idea rests on these three traits:

/// Trait representing a state machine.
pub trait StateMachine {
    /// The logic to run after the current state.
    type Next: MainLogic;

    /// Executes this state and returns the next one.
    fn execute(self) -> Result<Self::Next>;
}

/// Trait to get the current state in the `State` enum for the
/// current state in the state machine.
pub trait GetState {
    /// Gets the state enumeration variant for the current state.
    fn get_state(&self) -> State;
}

/// Trait implementing the main logic of the program.
pub trait MainLogic: GetState {
    /// Performs the main logic of the state.
    fn main_logic(self) -> Result<()>;
}

The StateMachine trait makes sure that every state that implements it will have a next state, and that this next state will have some sort of logic implementation. Moreover, both execute() and main_logic() consume the structure they are called on, so there is no way for a state to be executed twice. Because MainLogic requires GetState, the compiler also makes sure that every state with logic has a function that returns the current state as a variant of the State enum. This enables us to add the following default implementation:

lazy_static! {
    static ref CURRENT_STATE: Mutex<State> = Mutex::new(State::Init);
}

impl<S> MainLogic for S
where
    S: StateMachine + GetState,
{
    fn main_logic(self) -> Result<()> {
        let new_state = self.execute()?;
        {
            let mut current_state = match CURRENT_STATE.lock() {
                Ok(guard) => guard,
                Err(poisoned) => poisoned.into_inner(),
            };
            *current_state = new_state.get_state();
        }
        new_state.main_logic()
    }
}

Here we implement the MainLogic trait for all state machines, and it’s as simple as executing the current state, changing the global CURRENT_STATE, and then executing the main logic of the next state. This means that we now only have to write the execute() function for each state, and define the proper traits. For the last state, we will implement MainLogic but not StateMachine, since it won’t have a next state. Here is an example state:

impl StateMachine for OpenStratos<GoingUp> {
    type Next = OpenStratos<GoingDown>;

    fn execute(self) -> Result<Self::Next> {
        unimplemented!()
    }
}

We have defined three more structures and an enumeration here:

pub struct OpenStratos<S: GetState> {
    state: S,
}

impl<S> GetState for OpenStratos<S>
where
    S: GetState,
{
    fn get_state(&self) -> State {
        self.state.get_state()
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Init,
    AcquiringFix,
    FixAcquired,
    WaitingLaunch,
    GoingUp,
    GoingDown,
    Landed,
    ShutDown,
    SafeMode,
}

pub struct GoingUp;

impl GetState for GoingUp {
    fn get_state(&self) -> State {
        State::GoingUp
    }
}

pub struct GoingDown;

impl GetState for GoingDown {
    fn get_state(&self) -> State {
        State::GoingDown
    }
}

As can be seen, apart from the State enum, none of these types (GoingDown, GoingUp or OpenStratos) occupies any memory: they are zero-sized and carry no information. This means that the transitions between states are checked entirely at compile time, with no need for extra information at runtime. Those structures exist only for type checking, a common practice in Rust. You can specify, at the type level, that the next state after going up is going down.
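The zero-size claim is easy to check with std::mem::size_of. The sketch below uses cut-down copies of the types (the GetState bound is dropped and the enum only has two variants) just to keep it self-contained.

```rust
use std::mem::size_of;

/// Cut-down copy of the state enum, only for this size check.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    GoingUp,
    GoingDown,
}

/// Marker types: unit structs have no fields, hence no size.
pub struct GoingUp;
pub struct GoingDown;

/// Wrapper whose only field is a marker type, so it is zero-sized too.
#[allow(dead_code)]
pub struct OpenStratos<S> {
    state: S,
}

fn main() {
    println!("GoingUp: {} bytes", size_of::<GoingUp>());
    println!("GoingDown: {} bytes", size_of::<GoingDown>());
    println!("OpenStratos<GoingUp>: {} bytes", size_of::<OpenStratos<GoingUp>>());
    // The enum is the only type here that needs storage at runtime.
    println!("State: {} bytes", size_of::<State>());
}
```

The three marker-based types report 0 bytes, while the State enum is the one value that actually exists at runtime.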

The only points of failure of this approach are the StateMachine implementation for each state (which has to be reviewed so that the next state is the proper one) and the GetState implementation for each state, which can be checked with simple tests to ensure it will always return the proper state enumeration variant:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn it_get_state_init() {
        let state = Init;
        assert_eq!(state.get_state(), State::Init);
    }
}
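To see the pieces of this section working together, here is a small self-contained sketch that compiles on its own. It is not the OpenStratos code: it only has three states, drops the OpenStratos wrapper, uses a plain String error as a stand-in for the crate's Result alias, and replaces lazy_static! with a plain static (Mutex::new can be used in a static since Rust 1.63). The assert_next helper is an extra idea, not something from the project: it turns the "is the next state the proper one?" review into a compile-time check.

```rust
use std::sync::Mutex;

// Stand-in for the crate's own `Result` alias (an assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State { GoingUp, GoingDown, Landed }

static CURRENT_STATE: Mutex<State> = Mutex::new(State::GoingUp);

pub trait GetState { fn get_state(&self) -> State; }
pub trait MainLogic: GetState { fn main_logic(self) -> Result<()>; }
pub trait StateMachine {
    type Next: MainLogic;
    fn execute(self) -> Result<Self::Next>;
}

// Blanket implementation: run the state, record the new one, continue.
impl<S> MainLogic for S
where
    S: StateMachine + GetState,
{
    fn main_logic(self) -> Result<()> {
        let new_state = self.execute()?;
        {
            // Recover the guard even if another thread panicked while holding it.
            let mut current = CURRENT_STATE.lock().unwrap_or_else(|p| p.into_inner());
            *current = new_state.get_state();
        }
        new_state.main_logic()
    }
}

pub struct GoingUp;
pub struct GoingDown;
pub struct Landed;

impl GetState for GoingUp { fn get_state(&self) -> State { State::GoingUp } }
impl GetState for GoingDown { fn get_state(&self) -> State { State::GoingDown } }
impl GetState for Landed { fn get_state(&self) -> State { State::Landed } }

impl StateMachine for GoingUp {
    type Next = GoingDown;
    fn execute(self) -> Result<GoingDown> { Ok(GoingDown) }
}

impl StateMachine for GoingDown {
    type Next = Landed;
    fn execute(self) -> Result<Landed> { Ok(Landed) }
}

// Last state: it implements MainLogic directly and has no next state.
impl MainLogic for Landed {
    fn main_logic(self) -> Result<()> { Ok(()) }
}

/// Hypothetical helper: compiles only if the state after `S` is exactly `N`.
fn assert_next<S, N>()
where
    S: StateMachine<Next = N>,
{
}

/// Reads the globally recorded state.
pub fn current_state() -> State {
    *CURRENT_STATE.lock().unwrap_or_else(|p| p.into_inner())
}

fn main() {
    assert_next::<GoingUp, GoingDown>();
    assert_next::<GoingDown, Landed>();
    // assert_next::<GoingUp, Landed>(); // would be rejected by the compiler

    GoingUp.main_logic().expect("state machine failed");
    println!("final state: {:?}", current_state());
}
```

Since main_logic() takes self by value, the GoingUp value is gone after the call, and any attempt to run the same state again is rejected by the compiler.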

Conclusion

Even if the current implementation is still in its early stages (the current source code can be found on GitHub), I’m already seeing the benefits of using Rust. There is much less for me to check by hand, so the code is much less error-prone.

Please feel free to comment on this code, improve it or contribute to the main repository. I will be posting more insights on the new implementation as it progresses. Thanks for reading!

OpenStratos Hypathia launch squad.
