An Introduction to writing Kernel modules

Sorry for the huge post; I’m migrating from my previous blog and wanted to include the whole Kernel category in one place. I understand that the post is poorly structured and that a few dead links exist.

Kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. Strictly speaking, this is the definition of a ‘loadable’ kernel module: a module can be either built-in or loadable. The advantage of loadability is that there is no need to rebuild and reboot the kernel every time functionality is added. To load or remove modules, they have to be configured appropriately. In this tutorial on kernel modules we will go through this configuration process.

To write your own kernel modules, you need a basic understanding of processes in an operating system. You should have implemented basic OS-level code (such as using the fork() function or a few CPU scheduling algorithms). A firm grip on C programming is also essential.

Being a novice programmer myself, I will be learning and blogging simultaneously. Feel free to refer to “The Linux Kernel Module Programming Guide” for authoritative information.

In this tutorial, I will be discussing various algorithms and techniques for writing your own kernel modules. All the code is written in C and was run on an Ubuntu 14.04 (Debian-based) Linux system. Feel free to comment with your queries.

So, let’s get started.


Some essential commands before getting started with Kernel Development

1. lsmod - On Linux systems, `lsmod` prints the contents of the `/proc/modules` file in a readable form, showing the currently loaded kernel modules. In its output, 'Module' is the name of the loaded module and ‘Size’ is the size of the module, not the memory used by it. 'Used by' gives the number of modules referring to this one, followed by their names. Also, if the module controls its own unloading via a can_unload routine, the use count displayed by lsmod is always -1, irrespective of the real use count.

2. modinfo - This command shows the information associated with a given module. Its syntax is:

modinfo module_name

3. systool - `systool` serves various purposes. We will limit ourselves to listing the options that are set for a loaded module. This is how it's done:

systool -v -m module_name

4. modprobe - `modprobe` is a Linux program originally written by Rusty Russell and used to add a loadable kernel module (LKM) to the Linux kernel or to remove an LKM from the kernel. It can also be used to display the comprehensive configuration of all the modules:

modprobe -c | less

To display the configuration of a particular module:

modprobe -c | grep module_name

Also, it can be used to list the dependencies of a module, including the module itself:

modprobe --show-depends module_name

Try these commands once. If you can’t make sense of what is printed on the console at this point, that’s okay. We will be discussing all of them soon, so stay tuned.
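To make the lsmod columns a little more concrete, here is a rough user-space sketch (not the real lsmod source) that pulls the name, size, use count and "used by" field out of one line of `/proc/modules`. The struct and function names are my own, and I am assuming the usual layout where those four fields come first on each line.

```c
#include <stdio.h>
#include <string.h>

/* One row of /proc/modules, reduced to the columns lsmod prints. */
struct mod_entry {
    char name[64];
    unsigned long size;   /* "Size" column */
    int use_count;        /* number shown under "Used by" */
    char used_by[256];    /* comma-separated users, or "-" if there are none */
};

/* Parse one line; returns 1 on success, 0 if the line is malformed. */
int parse_module_line(const char *line, struct mod_entry *out)
{
    int n = sscanf(line, "%63s %lu %d %255s",
                   out->name, &out->size, &out->use_count, out->used_by);
    return n == 4;
}
```

In a real program you would read `/proc/modules` line by line with fgets() and hand each line to parse_module_line().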


What does modprobe do?

As we have already discussed, the modprobe command loads or removes a loadable kernel module. It was previously shipped as part of ‘modutils’ and is now included in the module-init-tools package for newer Linux kernel versions. There are similar commands such as insmod and rmmod, but modprobe is preferred over them for several reasons.

modprobe offers the following features:

  • an ability to make more intuitive decisions about which modules to load
  • an awareness of module dependencies, so that when requested to load a module, modprobe adds other required modules first
  • the resolution of recursive module dependencies as required

This makes modprobe unique and essential. The modprobe program also has more configuration features than other similar utilities. It is possible to define module aliases, which allows some modules to be loaded automatically. If you are interested in knowing more, read the man pages of module-init-tools and see for yourself what’s really going on.
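The dependency handling described above boils down to "load whatever a module needs before the module itself". Here is a minimal sketch of that idea in plain user-space C. It is not modprobe's actual code: the table layout and the names `struct mod`, `find_mod` and `load_with_deps` are hypothetical, and the real tool reads its dependency information from files generated by depmod rather than from a hard-coded table.

```c
#include <string.h>

#define MAX_MODS 8
#define MAX_DEPS 4

/* Hypothetical dependency table entry, for illustration only. */
struct mod {
    const char *name;
    const char *deps[MAX_DEPS];  /* required modules, NULL-terminated unless full */
    int loaded;
};

static struct mod *find_mod(struct mod *table, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/*
 * Append `name` and everything it needs to `order`, dependencies first.
 * Returns the new number of entries in `order`, or -1 if a module is unknown.
 */
int load_with_deps(struct mod *table, int n, const char *name,
                   const char **order, int count)
{
    struct mod *m = find_mod(table, n, name);

    if (m == NULL)
        return -1;
    if (m->loaded)
        return count;          /* already handled, nothing to do */
    m->loaded = 1;             /* mark early so a cycle cannot recurse forever */
    for (int d = 0; d < MAX_DEPS && m->deps[d] != NULL; d++) {
        count = load_with_deps(table, n, m->deps[d], order, count);
        if (count < 0)
            return -1;
    }
    order[count++] = m->name;  /* all requirements are in place by now */
    return count;
}
```

The caller passes an `order` array with room for every module in the table; after the call it holds the names in the order they would have to be inserted.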


Compiling Kernel Modules

It’s important to know how Makefiles work. Another (easier) way is to use kbuild. We will discuss it soon.

obj-m += first.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Now, this is just a sample of how the file looks. We will write our first program, first.c, in the next tutorial. After that, we use the make command to build the kernel module.

Now it is time to insert your freshly compiled module into the kernel with insmod ./first.ko. Recent kernel versions (2.6 and later) name module files with a .ko extension to differentiate them from ordinary object files. The difference is the additional .modinfo section that comes along. We will see what it is in the next tutorial.

As we already know, all loaded kernel modules are listed in /proc/modules. Once you have loaded your first module, you can cat that file and see whether it’s loaded or not. To remove the module from the kernel, use rmmod first.

So, let’s start writing our modules now.


Finally, it’s time to write our first kernel module. But before that, do you know what printk() is? It’s not an ordinary output statement: it is used to log information or give warnings from the kernel. The kernel uses the loglevel to decide whether to print the message to the console, and displays all messages with a loglevel below a specified value on the console.

You specify a loglevel like this:

printk(KERN_WARNING "This is a warning!\n");
printk(KERN_DEBUG "This is a debug notice!\n");
printk("I did not specify a loglevel!\n");

The KERN_WARNING and KERN_DEBUG strings are simple defines available through <linux/kernel.h>. They expand to a string such as “<4>” or “<7>” that is concatenated onto the front of the printk() message (newer kernels encode the level with a control character followed by the digit instead, but the idea is the same). The kernel then decides which messages to print on the console based on this specified loglevel and the current console loglevel, console_loglevel. If you do not specify a loglevel, it defaults to DEFAULT_MESSAGE_LOGLEVEL, which is currently KERN_WARNING. Because this value might change, you should always specify a loglevel for your messages.

Here’s what each loglevel means:

  • KERN_EMERG - An emergency condition; the system is probably dead
  • KERN_ALERT - A problem that requires immediate attention
  • KERN_CRIT - A critical condition
  • KERN_ERR - An error
  • KERN_WARNING - A warning
  • KERN_NOTICE - A normal, but perhaps noteworthy, condition
  • KERN_INFO - An informational message
  • KERN_DEBUG - A debug message, typically superfluous
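To see how the loglevel rule plays out, here is a small user-space sketch that mimics it. This is not kernel code: the function names and the `LVL_*` enum are my own, and it assumes the “<N>” text prefix and the KERN_WARNING default described above.

```c
/* Numeric values behind KERN_EMERG ... KERN_DEBUG, in the order listed above. */
enum loglevel {
    LVL_EMERG, LVL_ALERT, LVL_CRIT, LVL_ERR,
    LVL_WARNING, LVL_NOTICE, LVL_INFO, LVL_DEBUG
};

/* Fallback for messages without a level (assumed to be WARNING here). */
#define DEFAULT_LEVEL LVL_WARNING

/*
 * Read a "<N>" prefix. Returns the level and, through `body`,
 * a pointer to the message text after the prefix.
 */
int message_level(const char *msg, const char **body)
{
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>') {
        *body = msg + 3;
        return msg[1] - '0';
    }
    *body = msg;
    return DEFAULT_LEVEL;
}

/* A message reaches the console only if its level is below console_loglevel. */
int reaches_console(const char *msg, int console_loglevel)
{
    const char *body;

    return message_level(msg, &body) < console_loglevel;
}
```

With a console loglevel of 7, for instance, a warning (level 4) would be shown on the console while a debug message (level 7) would only end up in the kernel log.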

Okay, enough of the prerequisites. Here is our first kernel module code. Don’t compile it yet; we will improve the code. This is just the basic template.

/* 
Our first kernel module - first.c
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
int init_module(void){
	//Our initial code goes here...
	return 0;
}
void cleanup_module(void){
	//The terminating code goes here...
}

The init_module function is called after the module is loaded into the kernel and the cleanup_module function is called just before removing the module from the kernel.

Note that the purpose of the cleanup_module function is to explicitly undo the changes made by init_module and to remove the module safely from the kernel.

Also, all the source code can be downloaded here. Each program is committed according to this tutorial. So you can visit the previous commits for the appropriate one.


Now, let’s add some output statements to our program. As already mentioned, we will use printk().

/* 
Our first kernel module - first.c
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
int init_module(void){
	printk(KERN_ALERT "This is our first program.\n");
	return 0;
}
void cleanup_module(void){
	printk(KERN_ALERT "End of our first program.\n");
}

Now we are ready to test our module. After writing the proper Makefile, run the make command. If you have any issues, refer to the previous tutorial.

You should get something like this in your terminal:

make -C /lib/modules/3.13.0-43-generic/build M=/home/sricharan/kernel modules
make[1]: Entering directory `/usr/src/linux-headers-3.13.0-43-generic'
CC [M] /home/sricharan/kernel/first.o
Building modules, stage 2.
MODPOST 1 modules
CC /home/sricharan/kernel/first.mod.o
LD [M] /home/sricharan/kernel/first.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.13.0-43-generic'

Your first kernel module is ready. Load it into the kernel with the command insmod first.ko. Note that you need sufficient privileges (typically root) to do so.

Our LKM is loaded into the kernel. As already discussed in the previous tutorial, you can check out /proc/modules for our kernel module.

For now, let’s remove it from the kernel. Run the rmmod first command to do so. In the following tutorials we will dive deeper into more advanced LKM code.


The Macros

After writing our first program, we are now ready to modify it a little. We will use the module_init and module_exit macros so that we can rename the default init_module() and cleanup_module() functions. Remember that you need to define these functions before invoking the macros.

Also, we will be adding the __init and __exit macros. These macros place some parts of the code into special sections of the final binary. __init, for example, instructs the compiler to mark the function in a special way, and the linker then collects all functions with this mark together at the end (or beginning) of the binary file. If you want to know more, you can read the init.h file along with its comments. This is the advantage of working with an open source operating system!

When the LKM starts, this code runs only once (initialization). After it runs, the kernel can free this memory to reuse it.

This is how our code looked before we modified it to use the macros.

/* 
Our first kernel module - first.c
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
int init_module(void){
	printk(KERN_INFO "This is our first program.");
	return 0;
}
void cleanup_module(void){
	printk(KERN_INFO "End of our first program.");
}

We will now modify it as follows.

/* 
Edited first.c; Added macros module_init and module_exit
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
static int __init firstInit(void)
{
	printk(KERN_ALERT "This is our first program.\n");
	return 0;
}
static void __exit firstExit(void)
{
	printk(KERN_ALERT "End of our first program.\n");
}
module_init(firstInit);
module_exit(firstExit);

Now you can make it. You will see that it behaves the same way as before. The difference? The __init attribute causes the initialization function to be discarded, and its memory reclaimed, after initialization is complete. On older kernels this only worked for built-in drivers and had no effect on modules. __exit, instead, causes the marked function to be omitted when the driver is not built as a module; in modules, it has no effect.

The use of __init (and __initdata for data items) can reduce the amount of memory used by the kernel. There is no harm in marking module initialization functions with __init. On those older kernels there was no benefit either, because management of initialization sections had not been implemented for modules; later kernels also release a module’s initialization section once the init function has returned.

Next comes the MODULE_LICENSE macro. Recent kernels provide a mechanism to specify whether the module is under the GPL licence or not. If it is not specified, a warning is printed when you load the module. To avoid this, you need to declare that the kernel module you have written is open source. To know more about the GPL licence, you can visit this. Adding MODULE_LICENSE prevents all such warnings.

Here’s what is included in [linux/module.h](http://lxr.free-electrons.com/source/include/linux/module.h) about the GPL licence. Notice that MODULE_DESCRIPTION and MODULE_AUTHOR have their usual meanings and should also be included in the LKM code.

<code>
/*
* The following license idents are currently accepted as indicating free
* software modules
*
* "GPL" [GNU Public License v2 or later]
* "GPL v2" [GNU Public License v2]
* "GPL and additional rights" [GNU Public License v2 rights and more]
* "Dual BSD/GPL" [GNU Public License v2
* or BSD license choice]
* "Dual MIT/GPL" [GNU Public License v2
* or MIT license choice]
* "Dual MPL/GPL" [GNU Public License v2
* or Mozilla license choice]
*
* The following other idents are available
*
* "Proprietary" [Non free products]
*
* There are dual licensed components, but when running with Linux it is the
* GPL that is relevant so this is a non issue. Similarly LGPL linked with GPL
* is a GPL combined work.
*
* This exists for several reasons
* 1. So modinfo can show license info for users wanting to vet their setup
* is free
* 2. So the community can ignore bug reports including proprietary modules
* 3. So vendors can do likewise based on their own policies
* #define MODULE_LICENSE(_license) MODULE_INFO(license, _license)
*
* Author(s), use "Name <email>" or just "Name", for multiple
* authors use multiple MODULE_AUTHOR() statements/lines.
*
* #define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
*
* What your module does.
* #define MODULE_DESCRIPTION(_description) MODULE_INFO(description, _description)
*/
</code>
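Since MODULE_LICENSE, MODULE_AUTHOR and MODULE_DESCRIPTION all go through MODULE_INFO(tag, value), each one ends up as a "tag=value" string stored with the module, which is what modinfo later prints. As a rough user-space illustration (not the real modinfo source), here is how one could look up a tag in such a buffer. The function name `modinfo_get` is my own, and I am assuming the strings sit back to back, each terminated by a NUL byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a buffer of back-to-back NUL-terminated "key=value"
 * strings. The buffer is expected to end with a NUL byte.
 * Returns a pointer to the value, or NULL if the key is absent.
 */
const char *modinfo_get(const char *buf, size_t len, const char *key)
{
    size_t klen = strlen(key);
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        size_t slen = nul ? (size_t)(nul - p) : (size_t)(end - p);

        /* Match only a whole key that is directly followed by '='. */
        if (slen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=')
            return p + klen + 1;
        p += slen + 1;   /* skip this string and its terminating NUL */
    }
    return NULL;
}
```

For a buffer holding "license=GPL", asking for the key "license" returns a pointer to "GPL", while a key that is not present returns NULL.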

Now we will add the required macros as well as the GPL license to our code.

Here’s the final code.

<code>
/*
Added macros and Documentation to first.c
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#define DRIVER_AUTHOR "Sricharan Chiruvolu "
#define DRIVER_DESC "The First Program - Template"
static int __init first_init(void)
{
	printk(KERN_ALERT "This is our first program.\n");
	return 0;
}
static void __exit first_exit(void)
{
	printk(KERN_ALERT "End of our first program.\n");
}
module_init(first_init);
module_exit(first_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
/*  
* This module uses /dev/testdevice. The MODULE_SUPPORTED_DEVICE macro might
* be used in the future to help automatic configuration of modules, but is
* currently unused other than for documentation purposes.
*/
MODULE_SUPPORTED_DEVICE("testdevice");
</code>

Now that our first LKM carries proper license and documentation macros, we will be writing more advanced kernel modules in the following tutorials. We will be learning about the /proc file system, character device files, etc.


Multiple file modules

Before we get into the real stuff, there are a few must-knows worth discussing. Note that the purpose of this tutorial set is to write our own LKMs and rootkits for our own purposes. A few of them might have to be written in multiple files or require some arguments to be passed. We will discuss these issues now.

Command-line Arguments

Command-line arguments are declared by defining the variables that receive them as globals and registering each one with the module_param() macro. The variable declarations usually appear at the beginning of the module. insmod is then used to pass the values at load time. Arrays of integers and strings are declared using the module_param_array and module_param_string macros.

The MODULE_PARM_DESC macro is used to document the arguments that the module can take.

This is how command-line variables are declared.

<code>
int number = 100;
module_param(number, int, 0); // module_param(variable_name, variable_type, permissions);
// This is how arrays are declared
int numberArray[5];
module_param_array(numberArray, int, NULL, 0); /* not interested in count */
short numberArray2[30];
int count;
module_param_array(numberArray2, short, &count, 0); /* put count into "count" variable */
</code>

Here’s a sample program using command-line arguments.

<code>
/*
Using Command-line arguments
Author: Sricharan Chiruvolu
Date: 14 Dec 2014
*/
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/moduleparam.h>
#include <linux/stat.h>

#define DRIVER_AUTHOR "Sricharan Chiruvolu "
#define DRIVER_DESC "The First Program - Template"
//Declaring a few variables...
static short int short_number = 1;
module_param(short_number, short, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
MODULE_PARM_DESC(short_number, "A short Number");

static int number = 54;
module_param(number, int, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
MODULE_PARM_DESC(number, "A Number");

static long int long_number = 33553466;
module_param(long_number, long, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP);
MODULE_PARM_DESC(long_number, "A long Number");

static char *char_string = "Hello";
module_param(char_string, charp, 0000);
MODULE_PARM_DESC(char_string, "A character string");
static int __init first_init(void)
{
	printk(KERN_INFO "This is our first program.\n");
	printk(KERN_INFO "short_number is a short integer: %hd\n", short_number);
	printk(KERN_INFO "number is an integer: %d\n", number);
	printk(KERN_INFO "long_number is a long integer: %ld\n", long_number);
	printk(KERN_INFO "char_string is a string: %s\n", char_string);
	return 0;
}
static void __exit first_exit(void)
{
	printk(KERN_ALERT "End of our first program.\n");
}
module_init(first_init);
module_exit(first_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
/*  
* This module uses /dev/testdevice. The MODULE_SUPPORTED_DEVICE macro might
* be used in the future to help automatic configuration of modules, but is
* currently unused other than for documentation purposes.
*/
MODULE_SUPPORTED_DEVICE("testdevice");
</code>
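At load time the values are given to insmod as name=value pairs after the module file, for example `insmod ./first.ko number=7`. The kernel does the matching and conversion for you; the sketch below only imitates that step in user space for a single int parameter, so you can see what has to happen. The function `parse_int_param` is hypothetical and is not how the kernel implements it.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * If `arg` has the form "<name>=<integer>", store the integer in *out and
 * return 1. Return 0 when the name differs or the value is not a valid int;
 * *out is left untouched in that case, so the default value survives.
 */
int parse_int_param(const char *arg, const char *name, int *out)
{
    size_t nlen = strlen(name);
    const char *value;
    char *end;
    long v;

    if (strncmp(arg, name, nlen) != 0 || arg[nlen] != '=')
        return 0;
    value = arg + nlen + 1;
    errno = 0;
    v = strtol(value, &end, 0);
    if (errno != 0 || end == value || *end != '\0')
        return 0;               /* empty, trailing junk, or out of range */
    if (v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

Note how a failed parse keeps the initial value, which mirrors the way `number` stays at 54 in the module above when nothing is passed on the command line.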

Now that we know how to pass command-line arguments, we will next see how multi-file modules work.


Quite often, it is logical to split a kernel module across multiple files. Let’s write one now. We will use two files, third_one.c and third_two.c, for this.

third_one.c will look like this:

<code>
/*
Using multiple files - Part one
Author: Sricharan Chiruvolu
Date: 16 Dec 2014
*/
#include <linux/kernel.h>
#include <linux/module.h>
int init_module(void)
{
	printk(KERN_INFO "Hello, world - this is the kernel speaking\n");
	return 0;
}
</code>

and the other one, third_two.c will be:

<code>
/*
Using multiple files - Part two
Author: Sricharan Chiruvolu
Date: 16 Dec 2014
*/
#include <linux/kernel.h>
#include <linux/module.h>
void cleanup_module(void)
{
	printk(KERN_INFO "Short is the life of a kernel module\n");
}
</code>

You can see that it’s just the same program as our first one, but split across two files.

All we need to change is our Makefile, so that it builds the object files of our program and links them into a single combined module. The modified Makefile looks like this.

<code>
obj-m += combined_module.o
combined_module-objs := third_one.o third_two.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
</code>

You can now run the make command to build the combined_module.ko kernel module.

We have successfully written a basic multi-file kernel module.

P.S.: Since the post is being migrated from a markdown version, there are a few formatting errors and dead links. I will certainly fix them all when I get some time. Thanks.

Originally published at http://sricharan.xyz on December 24, 2014.