Endangered Technique: Using Environment Variables to Find Escaped Processes

Sep 1, 2022

Today I want to mourn the loss of a feature that most developers never knew about.

This feature was so little used that Apple killed it in macOS 11 without so much as a press release. I suspect they did not bother telling anybody because they simply did not think anybody needed it. Honestly, they are probably almost right. But I use this feature, and I feel this use case should be recorded somewhere.

My Problem: Finding Escaped Processes

At my work, we run a truly massive amount of testing. A developer will often run over 8,000 hours of tests for a typical submission and even more for a risky change. To run all that efficiently, we rely on many machines, each running many tests at once. These tests do all sorts of stuff, including launching different processes that talk to each other to simulate user workflows.

Now what happens when the program crashes during a test? The developer is happy because they found a bug before submitting it. But the machine may not be. This type of crash will often leave behind child processes which will quickly gunk up the server, causing other tests to fail in unexpected ways. We need to find and terminate these stray processes or work will quickly grind to a halt.

A simple approach is to rely on the “process tree”. When each process is created, the OS records the parent process (PPID). In most cases, you can find all processes launched by some process by building a graph of PID->PPID. Killing these child and grand-child processes will eliminate most of the process cruft.
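As a minimal sketch of the process-tree approach, assume we already have a snapshot of PID->PPID pairs (how you collect it depends on the platform). The function name descendants and the sample PIDs below are made up for illustration.

```python
from collections import defaultdict


def descendants(pid_to_ppid, root):
    """Return every PID reachable from `root` by following parent links downward."""
    # Invert the PID->PPID mapping into a parent -> children index.
    children = defaultdict(list)
    for pid, ppid in pid_to_ppid.items():
        children[ppid].append(pid)

    found = []
    seen = {root}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children[parent]:
            if child not in seen:
                seen.add(child)
                found.append(child)
                stack.append(child)
    return sorted(found)


# Hypothetical snapshot: 100 launched 101 and 102, and 102 launched 103.
# PID 200 has PPID 1, so the graph cannot connect it to 100.
snapshot = {100: 1, 101: 100, 102: 100, 103: 102, 200: 1}
print(descendants(snapshot, 100))
```

Any process whose PPID has already been rewritten to 1, like 200 in this snapshot, is invisible to this walk.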

However, processes go away and this graph breaks down. When a parent process dies, the OS reparents its surviving children, typically to the magic PID 1, the “init” process. Also, some processes intentionally “daemonize” and reparent themselves to “init”. How can we find these?

Strained Analogy Time: Tagging Livestock

Imagine a rancher delivering cows to a feedlot with many other cows. How will they know which animals are theirs?

A cow with yellow identification tags on its ears. — Photo by Wouter van der Velde on Unsplash

The rancher just needs to attach tags to the cow’s ears. In our analogy, processes already have ids called their PID. These PIDs are reused eventually, but start time + PID pairs are nice and unique on a given machine. We can have a list of these pairs to track processes no problem.

But processes can clone themselves at any point (fork) and launch children that may (fork) or may not (exec) look like themselves. Even if cows could do this, the ear tag would not go along for the ride, so we need another type of tag as an analogy.

Let’s shift in scale to consider bacteria. Researchers can track descendants of a bacterium by inserting phosphorescent genes. Then the bacterium and all its descendants will glow under ultraviolet light. Unless they mutate the phosphorescence away, of course. Thus if we can create a tag that mostly gets passed to children, we can track most of the descendants.

Glowing bacteria — Photo by CDC on Unsplash

So is there anything that applies to a process and its descendants kind of like DNA? Fortunately, there is: environment variables.

How Environment Variables Propagate

Environment variables are one of the key ways that we can get information into a process when it launches. The OS supplies the process with a list of environment variable names and values. Each OS has a different way of doing so, but if you are using a high-level language you do not need to worry about it because your language will have an API that lets you get and set environment variables. Typically, the API makes it appear that your environment is mutable and then any child processes you create will get the current state of the environment.

So typically, unless you take steps to prevent it, a child process will inherit the environment from its parent. Sounds like just what we’re looking for!
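Here is a small sketch of that inheritance using Python's subprocess module. The variable name DEMO_TAG and the helper child_sees are invented for this example; the child is just another Python interpreter that reports what it sees.

```python
import os
import subprocess
import sys


def child_sees(name, env=None):
    """Launch a child Python process and return the value it sees for `name`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,  # None means "inherit the parent's current environment"
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Set the variable in our own (mutable) copy of the environment.
os.environ["DEMO_TAG"] = "glowing"
inherited = child_sees("DEMO_TAG")

# The "mutation": explicitly hand the child an environment without the tag.
stripped_env = {k: v for k, v in os.environ.items() if k != "DEMO_TAG"}
stripped = child_sees("DEMO_TAG", env=stripped_env)

del os.environ["DEMO_TAG"]
print(inherited, stripped)
```

The second call shows how the tag gets lost: any process that builds a fresh environment for its children drops it.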

Reading Environment Variables

In Linux, the environment variables a process starts with are effectively fixed when the process is created. Your programming language will maintain a mutable copy of this environment which is then used when launching child processes. Thanks to the Unix “everything is a file” mentality, you can see how this works by scrounging around in /proc:

# You would normally do this:
$ printenv | grep -o -e '^SHELL=.*'
SHELL=/bin/bash

# You can do it yourself:
$ grep -z -o -e '^SHELL=.*' /proc/$$/environ
SHELL=/bin/bash

Now here is the key: a process may read the /proc/1234/* files for any other process if it has the correct access. This means that a process might be able to see the environment of other processes. Typically, this means a process can see the environment of other processes launched by the same user.
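The environ file is a run of NAME=value entries separated by NUL bytes, which is why the grep above needs -z. A rough Python sketch of reading it for another process might look like this; parse_environ and read_environ are names chosen for this example, and the reader only works where /proc exists.

```python
import os


def parse_environ(raw):
    """Split the NUL-separated bytes of an environ file into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        name, sep, value = entry.partition(b"=")
        if sep and name:  # skip empty trailing entries and malformed ones
            env[os.fsdecode(name)] = os.fsdecode(value)
    return env


def read_environ(pid):
    """Return the environment of `pid`, or None if it cannot be read (Linux only)."""
    try:
        with open(f"/proc/{pid}/environ", "rb") as handle:
            return parse_environ(handle.read())
    except OSError:
        # The process is gone, belongs to another user, or /proc is missing.
        return None


# Made-up sample data in the same layout as an environ file.
sample = b"SHELL=/bin/bash\0HOME=/home/me\0EMPTY=\0"
print(parse_environ(sample))
```

Returning None instead of raising keeps a scan over many PIDs simple, since processes routinely vanish between listing and reading.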

On Windows and Mac, this process works slightly differently, but there are still ways to get the environment for other processes launched by the same user on the machine.

Putting it Together

So here’s the technique:

Set a unique environment variable to a unique value. Perhaps DESCENDANTS_OF_$$=1 (where $$ is our PID).
Launch the child process.
Clear the environment variable in the parent.
When the child process finishes, search for any processes owned by the current user that have the right environment variable/value combination.
Kill all the escaped processes you found.

The exact technique for reading environment variables differs on each platform. On Linux, you can just scrounge around in /proc. On Windows, you need to use the system APIs. On Mac, you can simply call ps -ewww and then parse the output.

Using a unique environment variable name is helpful because it allows nesting of the technique or running many different processes on the same machine.
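The steps above can be sketched for Linux roughly as follows. This is an illustration under assumptions, not the code we run at work: find_tagged and run_and_reap are invented names, the tag name is modeled on the one suggested above, and the scan simply returns nothing on systems without /proc. One shortcut: the tag is passed only in the child's environment, so there is nothing to clear in the parent afterwards.

```python
import os
import signal
import subprocess
import sys


def find_tagged(name, value):
    """Return PIDs (other than ours) whose readable environ contains name=value."""
    needle = f"{name}={value}".encode()
    matches = []
    try:
        entries = os.listdir("/proc")
    except OSError:
        return matches  # no /proc here, so this sketch cannot see anything
    for entry in entries:
        if not entry.isdigit() or int(entry) == os.getpid():
            continue
        try:
            with open(f"/proc/{entry}/environ", "rb") as handle:
                if needle in handle.read().split(b"\0"):
                    matches.append(int(entry))
        except OSError:
            continue  # process exited, or we lack permission to read it
    return matches


def run_and_reap(command):
    """Run `command` with a unique tag, then kill any tagged leftovers."""
    name = f"DESCENDANTS_OF_{os.getpid()}"
    env = dict(os.environ, **{name: "1"})  # tag only the child's copy
    returncode = subprocess.run(command, env=env).returncode
    escaped = find_tagged(name, "1")
    for pid in escaped:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # it exited on its own between the scan and the kill
    return returncode, escaped


# Demo: the child starts a detached sleeper and exits immediately.
escapee = (
    "import subprocess, sys; "
    "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(30)'], "
    "stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, start_new_session=True)"
)
returncode, escaped = run_and_reap([sys.executable, "-c", escapee])
print(returncode, escaped)
```

A real harness would also want to guard against PID reuse between the scan and the kill, for example by checking the start time + PID pair mentioned earlier.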

Apple Ruins The Day

All good things come to an end. As I said up front, this technique is endangered. There are not a lot of use cases for processes reading the environment of other processes, with the possible exception of debugging.

As of macOS 11, processes can no longer read the environment variables of other processes if System Integrity Protection (SIP) is enabled. You used to be able to use ps on Mac to get this environment, as documented in this Apple Stack Exchange post. With SIP enabled, the technique no longer works.

I don’t know if Windows or Linux will follow suit, but I wouldn’t be surprised.

Why Apple was Probably Right

This ability is a potential source of obscure security vulnerabilities. Perhaps a user uses an environment variable to pass highly sensitive information into a subprocess. This is more secure than passing the same information on the command line, since other users on the machine cannot access it. However, if you run untrusted code, that code may access and scrape this information.

So a feature that is almost never used for good but can be used for evil? I really cannot be mad about shutting this down even if it makes me sad.

What’s Next?

My team is starting to investigate other approaches. We cannot turn off SIP on Mac because we want to test as close to the typical user configuration as possible. We’re investigating using cgroups on Linux as a more robust approach for true process containment. Containers or virtual machines are obvious options, though far more expensive in hassle and possibly runtime.

Honestly, I don’t have a great answer. You probably have never seen this technique, but hopefully I’ve made you slightly sad it is going away.