Using the /proc fs to further your understanding of code execution

Rona Chong
Jul 27, 2017 · 15 min read

Here’s what may be a startling confession: for a little over a year, I’ve coded in C with little knowledge of how executables are run by the kernel.

If you’re like me, have no fear — it’s probably not ideal, but super possible as a beginner programmer to write functional programs in C and even know about preprocessing, compilation and linking without knowing how the programs we write get executed. After all, being able to execute our programs is often as far as our needs go. We don’t need to understand how programs are executed to see that they are executed at all.

Proceed far enough, however, and details about memory and CPU become important. As you start to think about these details, it’s almost natural to build a picture of what happens when you run your executable.

Organic as this process is, sometimes the picture-building can be a bit incomplete. I’d been picking up bits and pieces over time, but my picture was still a little… dare I say, piecemeal?

In situations like this, it helps to see the real thing (i.e., the kernel) in action. Conveniently, we have something called the /proc filesystem to help us do just that. By taking a stab at the /proc filesystem, I clarified my understanding of how code gets executed and clicked a few more pieces together. Here are a few points I’d like to share:

0. /proc fs = a glimpse into kernel land

/proc fs, or the /proc filesystem, is a directory that exists on every Linux machine (and on many other Unix-like systems). However, the files in /proc are no ordinary files.

To see what I mean, run ls -la on /proc. You will see that most of the files listed in /proc have a size of 0 — despite the fact that viewing these files will show you information.

rona@mugwort:/proc$ ls -la
total 4
dr-xr-xr-x 214 root root 0 Jul 23 12:46 ./
drwxr-xr-x 24 root root 4096 Jul 21 14:31 ../
dr-xr-xr-x 9 root root 0 Jul 23 12:46 1/
dr-xr-xr-x 9 root root 0 Jul 23 17:15 10/
dr-xr-xr-x 9 bacula tape 0 Jul 23 17:15 1002/
dr-xr-xr-x 9 colord colord 0 Jul 23 17:15 1026/
dr-xr-xr-x 9 root root 0 Jul 23 17:15 1027/
dr-xr-xr-x 9 root root 0 Jul 23 17:15 1036/
dr-xr-xr-x 9 root root 0 Jul 23 17:15 1060/
dr-xr-xr-x 9 mysql mysql 0 Jul 23 17:15 1097/
dr-xr-xr-x 9 root root 0 Jul 23 17:15 11/
[....]
-r-------- 1 root root 0 Jul 23 18:13 vmallocinfo
-r--r--r-- 1 root root 0 Jul 23 18:13 vmstat
-r--r--r-- 1 root root 0 Jul 23 18:13 zoneinfo
rona@mugwort:/proc$ head -n15 zoneinfo
Node 0, zone DMA
pages free 3952
min 33
low 41
high 49
scanned 0
spanned 4095
present 3974
managed 3952
nr_free_pages 3952
nr_alloc_batch 8
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0

For another example, go ahead and run ‘file’ or ‘stat’ on one of the files in /proc before trying to read it.

vagrant@vagrant-ubuntu-trusty-64:/proc$ file uptime 
uptime: empty
vagrant@vagrant-ubuntu-trusty-64:/proc$ stat uptime
File: ‘uptime’
Size: 0 Blocks: 0 IO Block: 1024 regular empty file
Device: 3h/3d Inode: 4026532030 Links: 1
Access: (0444/-r--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2017-07-24 19:41:00.980098758 +0000
Modify: 2017-07-24 19:41:00.980098758 +0000
Change: 2017-07-24 19:41:00.980098758 +0000
Birth: -
vagrant@vagrant-ubuntu-trusty-64:/proc$ cat uptime
20624.09 20585.73

As The Linux Filesystem Hierarchy explains, /proc fs is not so much a collection of actual files on the hard disk as a collection of “virtual” files whose data fields are pointers to kernel data structures in memory. [1] (If you’re not sure what I mean by data field, check out my other post here.) Hence sizes and blocks of 0 despite viewable content.
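One quick way to see this “virtual file” behavior for yourself is a few lines of Python (a sketch of mine, assuming a Linux machine; /proc/uptime is just a convenient world-readable example):

```python
import os

# /proc files report a size of 0 because their contents are generated
# on demand from kernel data structures rather than stored on disk.
path = "/proc/uptime"
size = os.stat(path).st_size
with open(path) as f:
    content = f.read()

print(size)           # 0, despite the readable content
print(repr(content))  # e.g. '20624.09 20585.73\n'
```

The stat call and the read disagree in exactly the way ls and cat did above: zero bytes on “disk”, live data on read.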

Being able to read (and potentially write to) data structures actively in use by the kernel is a powerful thing. So what data structures does /proc expose? As its name suggests, the data structs of processes (along with system-wide data like the zoneinfo we peeked at above)!

1. The running of an executable is a process. A process is the running of an executable.

Every time an executable is run, the kernel creates a ‘process’ to run the code. Keeping separate bookkeeping for separate execution requests allows the kernel to track what it has been asked to execute in a sane manner. Every process gets its own id (pid).
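You can confirm this from inside a process, too. Here’s a quick Python check (my own sketch, Linux-specific) that the kernel has created a /proc entry for the very process running the script:

```python
import os

pid = os.getpid()                  # the id the kernel gave this process
proc_dir = "/proc/{}".format(pid)

# The kernel exposes a directory for this process under /proc.
print(os.path.isdir(proc_dir))     # True
# /proc/self is a convenience symlink to the current process' own directory.
print(os.path.realpath("/proc/self") == proc_dir)  # True
```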

Run any long-running executable you like and you can see this for yourself. For instance, on Ubuntu 14.04, I run the executable ‘ping’ stored in /bin. But before I do so, I store the contents of /proc in the file proc1.out.

vagrant@vagrant-ubuntu-trusty-64:/$ ls /proc > ~/proc1.out & echo $!
[1] 1869
1869
vagrant@vagrant-ubuntu-trusty-64:/$ head -n 15 ~/proc1.out
1
10
1009
1012
1014
1015
1070
108
1095
11
1125
1152
1185
12
125

Now, I run ping and store the contents of /proc in a new file. This documents any changes in /proc fs that come with running ping.

vagrant@vagrant-ubuntu-trusty-64:/$ /bin/ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.013 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.022 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.025 ms
64 bytes from 127.0.0.1: icmp_seq=4 ttl=64 time=0.027 ms
^Z
[1]+ Stopped /bin/ping 127.0.0.1
vagrant@vagrant-ubuntu-trusty-64:/$ ls /proc > ~/proc2.out & echo $!
[2] 1883
1883
vagrant@vagrant-ubuntu-trusty-64:/$ head -n 15 ~/proc2.out
1
10
1009
1012
1014
1015
1070
108
1095
11
1125
1152
1185
12
125

Spoiler: /proc should contain a new directory for the ping process, named according to pid (process id). We see the new dir by running diff on the two output files. E.g.:

vagrant@vagrant-ubuntu-trusty-64:/$ diff ~/proc1.out ~/proc2.out 
31c31,32
< 1869
---
> 1882
> 1883

In this example, 1869 is the directory for the first run of ls (which, as a fleeting process, is gone from /proc by the second time we run ls). 1883 is the directory for the second run of ls. We know this due to the output of `& echo $!` attached to each ls of /proc.

This leaves 1882 as the process id for running ping. Run ps and grep to verify (note that the process for executing grep also shows up in the grep results).

vagrant@vagrant-ubuntu-trusty-64:/$ ps -aux | grep 1882
vagrant 1882 0.0 0.1 6512 628 pts/0 T 03:20 0:00 /bin/ping 127.0.0.1
vagrant 1903 0.0 0.1 10472 904 pts/0 S+ 03:28 0:00 grep --color=auto 1882

If you are unfamiliar with ps, ps is an executable which provides a snapshot of active processes. In fact, ps is one of many system utilities which reads the /proc filesystem to provide the information that it does.
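In fact, it’s easy to sketch a miniature ps yourself. Here’s a hypothetical helper of mine that parses /proc/<pid>/status (one ‘Key: value’ pair per line), which is roughly how such utilities gather their per-process data:

```python
import os

def read_status(pid):
    """Parse /proc/<pid>/status into a dict of its 'Key: value' lines."""
    fields = {}
    with open("/proc/{}/status".format(pid)) as f:
        for line in f:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields

info = read_status(os.getpid())  # inspect our own process
print(info["Name"], info["Pid"], info["State"])
```

Pass it 1882 instead of os.getpid() (with appropriate permissions) and you get the same facts ps reported for ping.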

2. Getting Loaded in Memory

To learn more about how processes work, we can take a look at the data structs exposed by /proc for each process. If I use ls on /proc/1882, for instance, I can see that there are quite a few files representing data structs for an individual process.

vagrant@vagrant-ubuntu-trusty-64:/proc/1882$ ll
ls: cannot read symbolic link cwd: Permission denied
ls: cannot read symbolic link root: Permission denied
ls: cannot read symbolic link exe: Permission denied
total 0
dr-xr-xr-x 9 vagrant vagrant 0 Jul 24 03:21 ./
dr-xr-xr-x 86 root root 0 Jul 24 02:33 ../
dr-xr-xr-x 2 vagrant vagrant 0 Jul 24 20:44 attr/
-rw-r--r-- 1 root root 0 Jul 24 20:44 autogroup
-r-------- 1 root root 0 Jul 24 20:44 auxv
-r--r--r-- 1 root root 0 Jul 24 20:44 cgroup
--w------- 1 root root 0 Jul 24 20:44 clear_refs
-r--r--r-- 1 root root 0 Jul 24 03:28 cmdline
-rw-r--r-- 1 root root 0 Jul 24 20:44 comm
-rw-r--r-- 1 root root 0 Jul 24 20:44 coredump_filter
-r--r--r-- 1 root root 0 Jul 24 20:44 cpuset
lrwxrwxrwx 1 root root 0 Jul 24 20:44 cwd
-r-------- 1 root root 0 Jul 24 20:44 environ
lrwxrwxrwx 1 root root 0 Jul 24 06:29 exe
dr-x------ 2 root root 0 Jul 24 03:28 fd/
dr-x------ 2 root root 0 Jul 24 20:44 fdinfo/
-rw-r--r-- 1 root root 0 Jul 24 20:44 gid_map
-r-------- 1 root root 0 Jul 24 20:44 io
-r--r--r-- 1 root root 0 Jul 24 20:44 latency
-r--r--r-- 1 root root 0 Jul 24 20:44 limits
-rw-r--r-- 1 root root 0 Jul 24 20:44 loginuid
dr-x------ 2 root root 0 Jul 24 20:44 map_files/
-r--r--r-- 1 root root 0 Jul 24 20:44 maps
-rw------- 1 root root 0 Jul 24 20:44 mem
-r--r--r-- 1 root root 0 Jul 24 20:44 mountinfo
-r--r--r-- 1 root root 0 Jul 24 20:44 mounts
-r-------- 1 root root 0 Jul 24 20:44 mountstats
dr-xr-xr-x 7 vagrant vagrant 0 Jul 24 20:44 net/
dr-x--x--x 2 root root 0 Jul 24 20:44 ns/
-r--r--r-- 1 root root 0 Jul 24 20:44 numa_maps
-rw-r--r-- 1 root root 0 Jul 24 20:44 oom_adj
-r--r--r-- 1 root root 0 Jul 24 20:44 oom_score
-rw-r--r-- 1 root root 0 Jul 24 20:44 oom_score_adj
-r--r--r-- 1 root root 0 Jul 24 20:44 pagemap
-r--r--r-- 1 root root 0 Jul 24 20:44 personality
-rw-r--r-- 1 root root 0 Jul 24 20:44 projid_map
lrwxrwxrwx 1 root root 0 Jul 24 20:44 root
-rw-r--r-- 1 root root 0 Jul 24 20:44 sched
-r--r--r-- 1 root root 0 Jul 24 20:44 schedstat
-r--r--r-- 1 root root 0 Jul 24 20:44 sessionid
-rw-r--r-- 1 root root 0 Jul 24 20:44 setgroups
-r--r--r-- 1 root root 0 Jul 24 20:44 smaps
-r--r--r-- 1 root root 0 Jul 24 20:44 stack
-r--r--r-- 1 root root 0 Jul 24 03:28 stat
-r--r--r-- 1 root root 0 Jul 24 20:44 statm
-r--r--r-- 1 root root 0 Jul 24 03:28 status
-r--r--r-- 1 root root 0 Jul 24 20:44 syscall
dr-xr-xr-x 3 vagrant vagrant 0 Jul 24 20:44 task/
-r--r--r-- 1 root root 0 Jul 24 20:44 timers
-rw-r--r-- 1 root root 0 Jul 24 20:44 uid_map
-r--r--r-- 1 root root 0 Jul 24 20:44 wchan

To go over just a few: exe is a symlink to the executable itself. cwd is a symlink to the current working directory for any actions related to file IO, should the executable have any. cmdline stores the command line arguments with which the executable was run, separated by NUL bytes.

vagrant@vagrant-ubuntu-trusty-64:/proc/1882$ sudo less exe 
"exe" may be a binary file. See it anyway?
vagrant@vagrant-ubuntu-trusty-64:/proc/1882$ cat cmdline
/bin/ping127.0.0.1
vagrant@vagrant-ubuntu-trusty-64:/proc/1882$ sudo ls cwd
bin dev home lib lost+found mnt proc run srv tmp vagrant vmlinuz
boot etc initrd.img lib64 media opt root sbin sys usr var
vagrant@vagrant-ubuntu-trusty-64:/proc/1882$ ls /
bin dev home lib lost+found mnt proc run srv tmp vagrant vmlinuz
boot etc initrd.img lib64 media opt root sbin sys usr var
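A side note on that cmdline output: the arguments aren’t actually mashed together. They’re separated by NUL bytes, which cat doesn’t render. A small Python sketch of mine, reading the current process’ own cmdline, makes the separators visible:

```python
# Arguments in /proc/<pid>/cmdline are NUL-separated, which is why
# `cat` appears to run them together.
with open("/proc/self/cmdline", "rb") as f:
    raw = f.read()

argv = raw.split(b"\0")[:-1]  # the record ends with a trailing NUL
print(argv)                   # e.g. [b'/usr/bin/python3', b'script.py']
```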

For more summaries of the struct-exposing files for each process, you can man proc or check out http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html.

Of particular interest (at least, to me) were the maps and mem files. The Linux Filesystem Hierarchy describes “proc/PID/maps” as ‘memory maps to executables and library files’; “proc/PID/mem” is ‘memory held by this process.’ [2]

If that description doesn’t illuminate enough for you, Linux Device Drivers explains that /proc/PID/maps more or less exposes the process’ memory map structure, mm_struct. [3] If each process gets an mm_struct (which the presence of maps in each /proc/PID dir indicates), then we can say each process gets its own memory map.

Each memory map includes several “virtual memory areas”, or vm_area_structs. A virtual memory area represents a set of virtual addresses sharing the same permissions and being used as storage for the same data entity. So when we look at the maps file for ping, we see something like:

00400000-00409000 r-xp 00000000 08:01 144                 /bin/ping
00609000-0060a000 r--p 00009000 08:01 144 /bin/ping
0060a000-0060b000 rw-p 0000a000 08:01 144 /bin/ping
0060b000-0061e000 rw-p 00000000 00:00 0
01e5e000-01e7f000 rw-p 00000000 00:00 0 [heap]
[...]
7ffe93d34000-7ffe93d55000 rw-p 00000000 00:00 0 [stack]
7ffe93d7f000-7ffe93d81000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

where each line represents one virtual memory area.

Does this mapping ring any bells? If it does, you might be familiar with the typical memory layout diagram for a C program. This map corresponds. The first line in the above maps file is the memory mapping for the read-only, executable ‘text’ segment of ping (i.e., the machine code of ping). The second maps ping’s read-only data (string constants and the like). The third line is the ‘data’ segment, where values for initialized global and static variables are stored. The fourth: ‘bss’, memory for values of uninitialized global and static vars. Then we get the heap and stack.
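For the curious, each maps line has a fixed field layout — address range, permissions, file offset, device, inode, optional pathname — so it’s straightforward to pick apart. A sketch of mine, using the ping text-segment line from above as sample input:

```python
# Field layout of a /proc/<pid>/maps line:
#   address-range  perms  offset  dev  inode  [pathname]
line = "00400000-00409000 r-xp 00000000 08:01 144 /bin/ping"

parts = line.split()
start, end = (int(x, 16) for x in parts[0].split("-"))
perms = parts[1]                  # r = read, w = write, x = execute, p = private
offset = int(parts[2], 16)        # offset into the mapped file
inode = int(parts[4])
pathname = parts[5] if len(parts) > 5 else ""  # anonymous areas have none

print(hex(start), hex(end), perms, pathname)
```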

Rearrange these lines from high to low address and we have the following!

diagram from http://www.geeksforgeeks.org/memory-layout-of-c-program/

Now, if we take the description for “proc/PID/mem” literally, mem is the process’ memory space in the form of a file. If we were to read it, we should be able to access any of the areas listed in maps for a given process: the stack, the heap, the text, and so on.

Worth testing out?

(Hint: the answer is yes)

3. Up close and personal: reading the heap

To create a simple process whose memory layout I could easily understand and parse, I wrote a quick program that prints two strings in a loop.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define STR1 "String 1"
#define STR2 "String 2"

void cpy_str(const char *str, char *dst);

int main(void)
{
	char *str;
	char *str2;

	str = malloc(strlen(STR1) + 1); /* belongs to heap */
	cpy_str(STR1, str);
	str2 = malloc(strlen(STR2) + 1); /* belongs to heap */
	cpy_str(STR2, str2);
	while (1)
	{
		printf("%s\n", str);
		printf("%s\n", str2);
		sleep(1);
	}
	free(str);
	free(str2);
	return (0);
}

/**
 * cpy_str - copies string literal to dst
 */
void cpy_str(const char *str, char *dst)
{
	int i;

	for (i = 0; str[i] != '\0'; i++)
		dst[i] = str[i];
	dst[i] = '\0';
}

Since I dynamically allocated memory for str and str2, I expected these variables to be stored in the heap. I compiled and ran the executable with the following commands:

vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ gcc -pedantic -Wall -Wextra -Werror print_str.c
vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ ./a.out
String 1
String 2
String 1
String 2
String 1
String 2
String 1
String 2
String 1
String 2
^Z
[3]+ Stopped ./a.out

ps -aux showed me that the pid for this process is 7265.

vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ ps -aux | grep ./a.out
vagrant 7265 0.0 0.0 4336 356 pts/1 T 00:51 0:00 ./a.out
vagrant 7267 0.0 0.1 10472 916 pts/1 S+ 00:51 0:00 grep --color=auto ./a.out

Next I wrote a function in Python that takes in a PID and parses the proc/[PID]/maps file to find the memory mapping for the process’ heap.

import sys

def get_heap(pid):
    """Identify start and end addresses for heap of given process."""
    with open("/proc/{pid}/maps".format(pid=pid), 'r') as maps:
        while True:
            mapping = maps.readline()
            if not mapping:
                # EOF reached without finding the heap
                sys.exit(
                    "Could not identify heap in /proc/{pid}/maps."
                    .format(pid=pid)
                )
            if mapping[-8:] == ' [heap]\n':
                mapping = mapping.split('-')
                heap_start = int(mapping[0], 16)
                heap_end = int(mapping[1].split(' ')[0], 16)
                return (heap_start, heap_end)

From there I created a script to print a specific segment of proc/[PID]/mem for the same PID.

Given that the addresses listed in maps and the offsets of the same bytes in mem are 1:1, I used the start address returned by get_heap as the offset from which to read mem, and the end address as a measure of how far to read from that offset.

#!/usr/bin/python3
import sys

def get_heap(pid):
    [...]

def main():
    """Print the heap of a given process."""
    pid = sys.argv[1]
    heap_start, heap_end = get_heap(pid)
    with open("/proc/{pid}/mem".format(pid=pid), 'r+b') as mem:
        mem.seek(heap_start)
        heap = mem.read(heap_end - heap_start).decode("ISO-8859-1")
        print(heap)

main()

The only thing to do after that was to run my script with 7265 passed as the PID. And what do you know? It spat out the chars ‘String 1’ followed by ‘String 2’.

vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ sudo ./read_heap.py 7265
!String 1!String 2Á

I subtitled this part ‘Reading the Heap’, but I really could have titled it ‘Code We Write Gets Loaded In Memory.’ That was the takeaway for me; parsing through the process’ memory and reading from its heap helped me understand and verify that each process is allocated its own memory space data struct, and that this struct consists of several vm areas, including the standard ‘segments’ of text, data, bss, heap and stack.

Feel free to give this exercise a try for yourself.

4. Execution happens from memory

Next I decided to identify a specific string in the heap and overwrite it. Iterating on my main(), I took in two more command line arguments for the search and replace terms. Then, instead of printing the heap, I identified the offset of the search term in the heap (if it was present) and overwrote it with the replacement term. So

def main():
    """Print the heap of a given process."""
    pid = sys.argv[1]
    heap_start, heap_end = get_heap(pid)
    with open("/proc/{pid}/mem".format(pid=pid), 'r+b') as mem:
        mem.seek(heap_start)
        heap = mem.read(heap_end - heap_start).decode("ISO-8859-1")
        print(heap)

became

def main():
    """Replace a string in heap of given process."""
    pid = sys.argv[1]
    search = sys.argv[2]
    replace = sys.argv[3]
    heap_start, heap_end = get_heap(pid)
    with open("/proc/{pid}/mem".format(pid=pid), 'r+b') as mem:
        mem.seek(heap_start)
        heap = mem.read(heap_end - heap_start).decode("ISO-8859-1")
        i = heap.find(search)
        if i == -1:
            sys.exit("ERROR: '{search}' not found in heap."
                     .format(search=search))
        mem.seek(heap_start + i)
        mem.write(replace.encode("ISO-8859-1"))

Resuming my paused process 7265, in another pane (thank you tmux!) I fired off this new script with the command line `sudo ./read_write_heap.py 7265 “String 1” “Replaced”`. (Conveniently, ‘Replaced’ is the same length as ‘String 1’, so the overwrite doesn’t clobber any neighboring bytes.)

As you can see, the output of the process was affected in real time:

vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ jobs
[3]+ Stopped ./a.out
vagrant@vagrant-ubuntu-trusty-64:/vagrant/holbertonschool-linux_programming/0x03-proc_filesystem$ fg
./a.out
String 1
String 2
String 1
String 2
String 1
String 2
String 1
String 2
String 1
String 2
String 1
String 2
Replaced
String 2
Replaced
String 2
Replaced
String 2
Replaced
String 2
Replaced
String 2
Replaced
String 2
^C

A simple proof of concept, but a compelling demonstration that the processor reads the heap on every iteration of the loop.

5. The system relies on virtual memory

The last thing this exercise illuminated for me is the significance of the term ‘virtual memory’. People often think of virtual memory as a type of memory: specifically, the use of disk space as memory, as opposed to ‘real’ or random access memory. Here I learned of virtual memory as a memory management strategy, aka the use of virtual (non-physical) addresses to abstract memory away from the physical layer.

As I mentioned before, every executable gets loaded into memory in the form of a memory map consisting of several ‘virtual memory areas’. From low to high, we have the text segment, the data segment, the bss segment, the heap and the stack (besides other areas containing information for the process). But in reality, the addresses in RAM are divvied up into pages, and the memory map of a process may not always fit neatly into what’s left of a page. Virtual memory areas may be split over multiple pages, and these pages may not even be contiguous. Yet, just as memory for chunks of data appears to be allocated in contiguous blocks to the programmer, to the world of processes (e.g. as in /proc fs) each process appears to have its own address space in memory. The abstraction of virtual addresses which get translated to physical addresses upon access allows for the illusion.
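The kernel even exposes a window onto this translation layer: /proc/<pid>/pagemap holds one 8-byte entry per virtual page, with bit 63 marking whether that page is currently present in RAM. A heavily hedged sketch of mine (Linux-specific; on modern kernels the physical frame number in the low bits is zeroed out for readers without CAP_SYS_ADMIN, though the present bit remains visible):

```python
import ctypes
import os
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")

# Touch some memory so the page backing it is resident.
buf = ctypes.create_string_buffer(b"touch this page")
addr = ctypes.addressof(buf)

with open("/proc/self/pagemap", "rb") as pm:
    pm.seek((addr // PAGE_SIZE) * 8)  # one 8-byte entry per virtual page
    entry, = struct.unpack("<Q", pm.read(8))

present = bool(entry & (1 << 63))     # bit 63: page present in RAM
print(present)
```

Indexing the file by virtual page number and getting a physical-residency answer back is the translation machinery made tangible.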

Conclusion

So there you have it. If you think about it, the basic nature of most of the points I’ve provided here may veer towards obvious or self-evident: we vaguely understand or intuit that programs are executed as processes, that the instructions and variables of a program get loaded into memory for execution, and that the processor runs through the instructions loaded into memory. However, it wasn’t until I took a look at the /proc filesystem that each of these ideas (and more) sank into concrete detail for me and snapped into place with each other.

Now I’ve seen for myself that the kernel creates or splits a new process to run any executable; that the executable gets loaded into memory in a series of memory spaces called the memory map; that the processor runs through the instructions loaded in the text section of the map, with jumps to the stack and heap to access values; and that virtual memory management allows the kernel to treat the memory of each process as if it were isolated.

The real beauty of this exercise for me is that it illustrates how a simple poke under the hood — a look at the kernel implementation — can leave you with a much more concrete sense of how things work. The Linux kernel is an elegantly accessible system if you enjoy this approach to learning.

The next time you find yourself unclear on low level topics, be creative! Think about how you too can investigate the kernel implementation in a way that challenges the ideas you have in your head.

Cited Sources

[1] and [2] http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html

[3] http://www.makelinux.net/ldd3/?u=chp-15-sect-2

