How cscope makes my life easier

Himadri Pandya
Published in Code Dementia
Jul 17, 2019

Cscope is a developer’s tool for browsing source code. It was originally developed at Bell Labs, and Vim has built-in support for it. My current project involves tracking down page-size-related dependencies in the Linux kernel’s Hyper-V code and coming up with fixes for those places. My mentor introduced me to this wonderful tool, which lets me search for all references to a symbol, global definitions, functions called by a function, functions calling a function, text strings, regular expression patterns, files, and files including a file.

This post describes how to use cscope and, in particular, how it helps me in my internship tasks.

Vim has built-in support for cscope, and Vim is also the text editor I use, so all examples in this post are shown using Vim. You don’t really need to know Vim to use cscope, but I highly recommend learning it. Check out this fun game and tutorial for that.

Okay. Coming back to cscope :)

How to install cscope?

If you are using Linux, your package manager’s standard install command with the package name ‘cscope’ should do it.

For example, on Debian/Ubuntu:

sudo apt install cscope

Using cscope

It is always a good idea to have a look at the man page before starting to use any utility. So go ahead and run

man cscope

You should see the synopsis and description of cscope, followed by a list of options you can use with the tool. I’ll mainly be using the -R option, which makes cscope recurse into subdirectories when it searches for source files.

To give an example of using cscope, I’ll describe how I came up with a fix for one of my recent internship tasks. It is a good idea to check out what I am working on before getting into the details of this particular task.

In the Linux kernel’s source code, there is a file, hv_fcopy.c, which contains the code that enables the file copy service from the Windows host to a Linux guest running on Hyper-V. My mentor has told me that this file has some page-size-related dependencies, so my task is to find them and suggest fixes.

So the first thing I want to do is to find this file. For that, I navigate to the Linux kernel’s source code directory on my computer and I run the following command.

cscope -R

Cscope will build its database. This takes a few seconds for a small code base; as the Linux kernel consists of millions of lines of code, it may take a bit longer. When it is ready, you’ll see cscope’s menu of search fields, one for each kind of search.
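The database can also be built without opening the interactive screen, which helps when scripting. Here is a minimal sketch, assuming a throwaway sample tree whose file contents are made up for illustration; the -b and -k options come from the man page.

```shell
# Throwaway sample tree: the directory layout mirrors the kernel,
# but the file contents are made up for this illustration.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/drivers/hv"
printf 'int demo_init(void)\n{\n\treturn 0;\n}\n' > "$demo_dir/drivers/hv/hv_fcopy.c"
cd "$demo_dir" || exit 1

if command -v cscope >/dev/null 2>&1; then
    # -R: recurse into subdirectories
    # -b: build the database (cscope.out) and exit without showing the menu
    # -k: kernel mode, do not index /usr/include as well
    cscope -R -b -k
    ls cscope.out
else
    echo "cscope is not installed" >&2
fi
```

Inside a real kernel tree, running make cscope is another way to build this database.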

I need to know where the file hv_fcopy.c is located, so I type its name into the “Find this file” field.

The search result shows its path inside the kernel tree: drivers/hv/hv_fcopy.c.
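The same lookup can be scripted with cscope’s line-oriented mode, which runs one query and prints the result instead of opening the menu. This is only a sketch on a made-up sample tree: -L selects that mode, -7 targets input field 7 (“Find this file”, counting from 0), and -d reuses the existing database without rebuilding it.

```shell
# Made-up sample tree that mirrors the kernel's directory layout.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/drivers/hv"
printf 'int demo_init(void)\n{\n\treturn 0;\n}\n' > "$demo_dir/drivers/hv/hv_fcopy.c"
cd "$demo_dir" || exit 1

if command -v cscope >/dev/null 2>&1; then
    cscope -R -b -k                            # build the database once
    file_hits=$(cscope -d -L -7 hv_fcopy.c)    # field 7: "Find this file"
    printf '%s\n' "$file_hits"
else
    echo "cscope is not installed" >&2
fi
```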

Now that I know the location, I open the file and start going through the code. Soon, I come across a piece of code that needs a closer look.

In the definition of the function hv_fcopy_onchannelcallback(), there is a call to vmbus_recvpacket(). A reference to the guest page size is used to provide the third argument to this function. I need to investigate this usage of the symbol PAGE_SIZE, and for that I need to understand the significance of this third argument. So I want to take a look at the definition of this function.

This can be done using cscope’s search for global definitions: I type the function name into the “Find this global definition” field.

As a result, cscope opens the file containing this definition in Vim, in the same window.

There I find the function definition of vmbus_recvpacket().
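For reference, a global definition search can also be run non-interactively. The sketch below uses a simplified stand-in for the kernel code (the signatures are shortened and are not the real source), and -1 targets input field 1, “Find this global definition”.

```shell
# Simplified stand-in for the kernel's channel code; NOT the real source.
demo_dir=$(mktemp -d)
cat > "$demo_dir/channel.c" <<'EOF'
int hv_ringbuffer_read(void *buffer, unsigned int bufferlen, int raw);

int __vmbus_recvpacket(void *buffer, unsigned int bufferlen, int raw)
{
	return hv_ringbuffer_read(buffer, bufferlen, raw);
}

int vmbus_recvpacket(void *buffer, unsigned int bufferlen)
{
	return __vmbus_recvpacket(buffer, bufferlen, 0);
}
EOF
cd "$demo_dir" || exit 1

if command -v cscope >/dev/null 2>&1; then
    cscope -R -b -k                                  # build the database
    definition=$(cscope -d -L -1 vmbus_recvpacket)   # field 1: "Find this global definition"
    printf '%s\n' "$definition"
else
    echo "cscope is not installed" >&2
fi
```

Each output line names the file, the symbol, the line number, and the matching source line, which is enough to jump straight to the definition from any editor.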

The Linux kernel uses function names starting with __ (double underscores) to denote the lower-level components of an interface, and these functions should be used with caution. They are generally called through a function with the same name minus the double underscores. The function definition above is an example of this practice. One interesting thing to notice here is that vmbus_recvpacket() always calls __vmbus_recvpacket() with false as the value of the last argument, raw. This is also a good example of how the functions without underscores make sure that the underlying __ functions get proper, required input.

Next, I check the definition of __vmbus_recvpacket() using the same global definition search.

That takes me to its function definition.

My goal is to understand the significance of the third argument of this function. The comments are really helpful here. So now I know that the third argument is supposed to define the buffer length. The particular call to vmbus_recvpacket() passes PAGE_SIZE*2 as the third argument.

The next thing is to decide whether the buffer length should be defined using the page size. After discussing this with my mentor, I learned that it should be. Now I need to know where the memory for this buffer is actually allocated. It is a little tricky, but I could work it out with the help of my mentor.

__vmbus_recvpacket() returns the result of a call to another function named hv_ringbuffer_read(). So I also need to have a look at this new function. Again, I use cscope for this.
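Cscope can also list the functions a given function calls, which is a quick way to spot a chain like this one. A small sketch, again on a simplified stand-in rather than the real kernel source; -2 targets input field 2, “Find functions called by this function”.

```shell
# Simplified stand-in for the call chain; NOT the real kernel source.
demo_dir=$(mktemp -d)
cat > "$demo_dir/channel.c" <<'EOF'
int hv_ringbuffer_read(void *buffer, unsigned int bufferlen, int raw);

int __vmbus_recvpacket(void *buffer, unsigned int bufferlen, int raw)
{
	return hv_ringbuffer_read(buffer, bufferlen, raw);
}
EOF
cd "$demo_dir" || exit 1

if command -v cscope >/dev/null 2>&1; then
    cscope -R -b -k                                 # build the database
    callees=$(cscope -d -L -2 __vmbus_recvpacket)   # field 2: functions called by this function
    printf '%s\n' "$callees"
else
    echo "cscope is not installed" >&2
fi
```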

That brings me to the definition of hv_ringbuffer_read().

This function is more about how the data is copied into the buffer, and it does not help me much in understanding where the buffer’s memory is allocated. So, continuing my quest, I do a symbol search with cscope to find all the occurrences of this buffer.

The search result lists every occurrence of the symbol.
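A symbol search works the same way from the command line. In this sketch the file is a cut-down stand-in that only keeps the indirect assignment discussed here; -0 targets input field 0, “Find this C symbol”.

```shell
# Cut-down stand-in for hv_fcopy.c; NOT the real kernel source.
demo_dir=$(mktemp -d)
cat > "$demo_dir/hv_fcopy.c" <<'EOF'
struct hv_util_service { void *recv_buffer; };
static unsigned char *recv_buffer;

int hv_fcopy_init(struct hv_util_service *srv)
{
	recv_buffer = srv->recv_buffer;
	return 0;
}
EOF
cd "$demo_dir" || exit 1

if command -v cscope >/dev/null 2>&1; then
    cscope -R -b -k                               # build the database
    occurrences=$(cscope -d -L -0 recv_buffer)    # field 0: "Find this C symbol"
    printf '%s\n' "$occurrences"
else
    echo "cscope is not installed" >&2
fi
```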

I find a relevant occurrence in the same file (hv_fcopy.c), which is an assignment to recv_buffer. But it is an indirect assignment.

Investigating further, I also find the actual assignment to srv->recv_buffer, which is later assigned to recv_buffer.

This assignment is done in a different file, hv_util.c.

There! The call to kmalloc() allocates the memory for this buffer, which means the buffer lives in the kernel space of the guest. But wait, even this kmalloc() call uses a reference to the page size, so looking into it will probably be my next task.

One more interesting thing is happening here. The buffer is allocated four pages of memory, but the buffer length is set to only two pages. So I ask my mentor about it again, and he tells me that this ensures we never end up writing into adjacent memory that is used for some other purpose.

Now that I understand the significance of this particular page size reference, it is time to make sure that this reference works correctly even if the guest page size is greater than Hyper-V’s page size (i.e., when we build the Linux kernel with a page size of 16 KB or 64 KB on ARM64 processors). To ensure this, we should use the Hyper-V specific page size instead of the guest page size here.
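To see why this matters, here is the arithmetic as a small sketch. It assumes Hyper-V’s page size is fixed at 4 KB, and the variable names are mine, chosen only for illustration.

```shell
hv_page_size=4096     # assumption: Hyper-V works with 4 KB pages

for guest_page_size in 4096 16384 65536; do
    guest_based=$((guest_page_size * 2))   # what PAGE_SIZE * 2 evaluates to in the guest
    hv_based=$((hv_page_size * 2))         # the same expression with the Hyper-V page size
    echo "guest page size $guest_page_size: guest-based length $guest_based, Hyper-V-based length $hv_based"
done
```

Only with 4 KB guest pages do the two lengths agree. With larger guest pages the guest-based length grows, while the Hyper-V-based one stays at 8192 bytes.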

So the next step is to submit a patch to the Hyper-V and Linux kernel communities proposing this change, and then look forward to their feedback.

You have probably noticed that the quest for a fix needs a lot of lookups of various symbols and definitions. Cscope makes this very convenient, as the searches and code lookups can all be done from the same terminal window. And the best part is that the tool is very easy to use. So if you are working on something that needs such frequent code lookups, give cscope a try!

Thanks for reading :).


Exploring Computer Science | Outreachy Summer Intern at the Linux Kernel | Karateka | Fitness enthusiast | Occasional guitarist