The Art of Debugging: Diagnosis Techniques (Part 3)

Sopheak Hang
9 min read · Jun 11, 2024


Posts in this Series

  1. Introduction
  2. Bug Reproduction
  3. Diagnosis Techniques (This Post)
  4. Fixing and Reflection Process
Source: Learning Resources

Now that you have learned how to reproduce errors effectively (see the previous post), you hold the key to unlocking the mystery: a reliable reproduction makes pinpointing the culprit much smoother.

It is time to turn yourself into a bug detective. We will explore the 9 most powerful diagnosis techniques. Let’s dive into each method one by one.

1. Use Logger

Suitable for: Local Debugging, Remote Debugging, Production Debugging

Logs are like a black box recorder for your software. They can be a treasure trove of information when an issue occurs, especially for backend developers. Logging is therefore essential and should be actively implemented during development.

Some popular logging libraries include Log4j for Java, and log4net and NLog for .NET. You should also be able to find a suitable logging library for other languages.

Here are some effective logging practices to follow:

>> Log Requests and Responses:

  • Log incoming API requests and their corresponding responses.
  • Log requests and responses of third-party integrations.

>> Log Exceptions:

  • Always log exceptions with their full stack traces and relevant context.

>> Log Execution Progress:

  • Log key points in your application’s workflow, such as data loading, validation, and insertion. These entries help you trace where a specific execution slows down or stops.

>> Log at the Appropriate Level:

  • Use different log levels (e.g., DEBUG, INFO, WARN, ERROR) appropriately to distinguish the severity and type of events.

>> Write Meaningful Log Messages:

  • Avoid vague messages like ‘Invalid destination country’. Instead, provide detailed information: ‘Destination country ‘KH’ is not eligible for product ‘001’’. With this detail, the user knows what the error is about, and the developer knows which specific data caused it.

>> Add Context to Your Logs:

  • Include relevant information, such as user IDs, transaction IDs, response times, and key business identifiers, so that you can trace and diagnose issues.
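To make these practices concrete, here is a minimal sketch using Python's standard logging module. The service name, the eligibility data, and the function names are assumptions made up for illustration; only the message wording comes from the example above.

```python
import logging

# Include timestamp, level and logger name in every record.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("order-service")  # hypothetical service name

ELIGIBLE_COUNTRIES = {"001": {"US", "SG"}}  # hypothetical eligibility data


def validate_destination(product_id, country, transaction_id):
    """Return True if the product can ship to the country, logging the outcome."""
    logger.debug("Validating destination (transaction_id=%s)", transaction_id)
    if country not in ELIGIBLE_COUNTRIES.get(product_id, set()):
        # Meaningful message plus context, instead of "Invalid destination country".
        logger.warning(
            "Destination country '%s' is not eligible for product '%s' (transaction_id=%s)",
            country, product_id, transaction_id,
        )
        return False
    logger.info("Destination accepted (transaction_id=%s)", transaction_id)
    return True


def process_order(product_id, country, transaction_id):
    """Validate an order; log any unexpected exception with its stack trace."""
    try:
        return validate_destination(product_id, country, transaction_id)
    except Exception:
        # logger.exception logs at ERROR level and appends the full stack trace.
        logger.exception("Order processing failed (transaction_id=%s)", transaction_id)
        raise


process_order("001", "KH", "tx-1001")
```

The transaction ID appears in every message, so all entries for one request can be found with a single search.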

2. Read Stacktrace

Suitable for: Local Debugging, Remote Debugging, Production Debugging

The stack trace contains the exception’s type, a message, and a list of all the method calls (with line numbers) that were active when it was thrown. Stack traces are constructed in a very similar way in most languages: they follow the LIFO (Last In, First Out) order of the call stack.

Example based on the Java stack trace above:

Here is how we read it. We read from bottom to top:

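1. Execution starts in the main() function.

2. main() then calls the getBookTitles() function in the Author class, which runs up to line 25.

3. getBookTitles() then calls the getTitle() function in the Book class, where the exception is thrown at line 16.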

How to spot the bug:

  1. Check the getTitle method at line 16 in Book.java
  • Verify which object in the method might be null at this point.
  • Ensure proper null checks or initialization.

  2. Examine the getBookTitles method at line 25 in Author.java
  • Look at how getTitle is called and what data is passed to it.
  • Ensure the passed parameters are not null.

  3. Examine the main method at line 14 in Bootstrap.java
  • Look at how getBookTitles is called and what data is passed to it.
  • Ensure the passed parameters are not null.
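The same reading exercise can be tried in Python. The sketch below mirrors the Author and Book classes from the example; the method bodies and the helper names call_chain and failing_call_chain are assumptions for illustration. Note one difference: Python prints the most recent call last, so the failing frame is at the bottom of the printed trace, while a Java trace shows it at the top.

```python
import traceback


class Book:
    def __init__(self, title):
        self.title = title

    def get_title(self):
        # Raises AttributeError when title is None,
        # Python's counterpart of a null pointer exception.
        return self.title.strip()


class Author:
    def __init__(self, books):
        self.books = books

    def get_book_titles(self):
        titles = []
        for book in self.books:
            titles.append(book.get_title())
        return titles


def main():
    author = Author([Book("Book A"), Book(None)])  # second title is missing
    return author.get_book_titles()


def call_chain(exc):
    """Return function names from the outermost call down to the frame that raised."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


def failing_call_chain():
    """Run main() and return the call chain of the resulting failure."""
    try:
        main()
    except AttributeError as exc:
        return call_chain(exc)


print(" -> ".join(failing_call_chain()))
```

The last name in the chain is where the error was raised, and the names before it show how execution got there, which is the order in which to check for missing values.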

3. Use Debugger Tool

Suitable for: Local Debugging

A debugger is a very useful tool for inspecting the state of the objects and variables in your code at run time. Most modern IDEs come equipped with a built-in debugger, which makes it easier than ever to leverage the power of debugging during development.

Here are the common steps to start using the debugger tool:

Starting a Debugging Session

  • Attach to Process: Start the debugger and attach it to a running process, or start the program within the debugger.
  • Set Breakpoints: Identify lines of code where you want execution to pause. This can be done by clicking in the margin next to the line number in most IDEs.

Navigating Through Code

Debugger Navigation in JetBrains IDE
  • Run: Start the program and run until it hits a breakpoint.
  • Step Over (F10 in Visual Studio and VS Code; F8 in the default IntelliJ keymap on Windows and Linux): Move to the next line of the current function, executing any function calls on the way without entering them.
  • Step Into (F11; F7 in IntelliJ): Step into the details of a function call.
  • Step Out (Shift+F11; Shift+F8 in IntelliJ): Finish the current function and return to the calling function.
  • Continue (F5; F9 in IntelliJ): Resume execution until the next breakpoint or the end of the program.

Inspecting Variables and State

Variables inspection during a debugging session
  • Watch Variables: Add specific variables to the watch list so that you can monitor how their values change as you step through the code.
  • Evaluate Expressions: You can also evaluate expressions to check their current values.

Modifying Execution

Conditional Breakpoint in JetBrains IDE
  • Change Variable Values: Alter the values of variables on the fly to test different scenarios.
  • Conditional Breakpoints: Set breakpoints that trigger only when certain conditions are met, reducing noise and focusing on specific cases.
  • Exception Handling: Configure the debugger to break on specific exceptions or errors, allowing you to inspect the state right when an issue occurs.
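If you are not in an IDE, the same ideas work from the command line. Below is a small sketch using Python's built-in breakpoint() function, which opens the pdb debugger by default. The apply_discount function and its values are hypothetical; the if statement around breakpoint() plays the role of a conditional breakpoint.

```python
def apply_discount(prices, discount_rate):
    """Apply a discount to each price, pausing in the debugger on suspicious values."""
    discounted = []
    for price in prices:
        new_price = price * (1 - discount_rate)
        if new_price < 0:
            # Hand-written conditional breakpoint: pause only for the bad case.
            # At the (Pdb) prompt:
            #   n  step over        s  step into       r  step out
            #   c  continue         p new_price        print a value
            #   !new_price = 0      change a variable on the fly
            breakpoint()
        discounted.append(new_price)
    return discounted


# A valid discount rate never reaches the breakpoint.
print(apply_discount([100.0, 50.0], 0.5))
```

Calling apply_discount([100.0], 1.5) would pause execution at the suspicious value, with price, discount_rate and new_price available for inspection.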

4. Use Profiler Tool

Suitable for: Local Debugging, Remote Debugging, Production Debugging

The profiler tool is useful for optimizing the performance of your software by identifying bottlenecks and inefficient code.

Profiler tools come in both GUI and command-line forms. For instance, you might use a command-line tool to gather profiling data on the production server and then analyze the results using a GUI tool on a development machine.

Some popular profiler tools include VisualVM (Java), Visual Studio Profiler (.NET), JetBrains dotTrace (.NET), Chrome DevTools (JavaScript), and PyCharm Profiler (Python).
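As a small illustration of what a profiler reports, here is a sketch using cProfile and pstats from the Python standard library. The two workload functions are made up for the example; profile_report is a hypothetical helper name.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately wasteful: builds a full list before summing it."""
    return sum([i * i for i in range(n)])


def handle_request():
    # Hypothetical request handler that calls the slow helper repeatedly.
    return [slow_sum_of_squares(20_000) for _ in range(20)]


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()


print(profile_report(handle_request))
```

The report lists call counts and time per function, so the helper that dominates the run stands out. For the split workflow described above, running a script with python -m cProfile -o followed by an output file name saves the data to a file that can be analyzed later on another machine.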

5. Use data visualization for abnormality detection

Suitable for: Production Debugging

In this technique, you turn raw logs and metrics into charts so that abnormal patterns become visible at a glance. The two most popular tools for log visualization are Kibana and Grafana. In my professional experience, I have used Kibana.

To build a comprehensive log monitoring system, one popular solution is the ELK stack, which combines three powerful tools:

  • Elasticsearch: A search engine that indexes and stores log data, which enables quick and efficient full-text search.
  • Logstash: An open-source, server-side data processing pipeline that allows you to collect data from various sources, transform it on the fly, and send it to your desired destination.
  • Kibana: A visualization tool that allows users to create interactive dashboards and visualize the log data stored in Elasticsearch.

Why is this important? Suppose someone reports that your API is running slowly. Checking a few logs might confirm the issue, but you won’t understand why it’s happening. This is because a single log or record often doesn’t provide useful information. However, if you aggregate the data and plot it as a time series, interesting patterns may emerge.

For instance, the above image shows a spike in response time during a specific period, which coincides with a recent deployment. This strongly suggests that the deployment caused the issue and tells you where to look first, saving you significant time.
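The aggregation idea can be sketched without any dashboard tooling. The records below are invented sample data, and the 5-minute bucket size and the "three times the median" threshold are arbitrary assumptions; a tool like Kibana does this kind of grouping for you.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# Hypothetical (timestamp, response time in ms) pairs parsed from API logs.
LOG_RECORDS = [
    ("2024-06-11T10:01:10", 120), ("2024-06-11T10:03:40", 130),
    ("2024-06-11T10:07:05", 110), ("2024-06-11T10:09:30", 125),
    ("2024-06-11T10:12:20", 900), ("2024-06-11T10:14:45", 950),
    ("2024-06-11T10:17:00", 115), ("2024-06-11T10:19:15", 135),
]


def average_by_bucket(records, bucket_minutes=5):
    """Group records into fixed time buckets and average the response time."""
    buckets = defaultdict(list)
    for timestamp, response_ms in records:
        moment = datetime.fromisoformat(timestamp)
        start_minute = moment.minute - moment.minute % bucket_minutes
        buckets[moment.replace(minute=start_minute, second=0)].append(response_ms)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def find_spikes(series, factor=3.0):
    """Return bucket starts whose average exceeds factor times the median bucket."""
    baseline = median(series.values())
    return [start for start, value in series.items() if value > factor * baseline]


series = average_by_bucket(LOG_RECORDS)
for start, value in series.items():
    # A crude text chart: one '#' per 25 ms of average response time.
    print(start.strftime("%H:%M"), "#" * int(value // 25), round(value))
print("Spikes:", [s.strftime("%H:%M") for s in find_spikes(series)])
```

No single record looks alarming on its own, but the bucket starting at 10:10 stands out once the data is grouped, and that time window can then be compared with the deployment history.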

6. Use “Cause Elimination”

Suitable for: Local Debugging, Remote Debugging

Next is the Elimination technique. This widely used method is straightforward and familiar to many developers. It doesn’t require deep thought or reading every line of code to guess which part might be causing a bug.

Instead, you can simply comment out sections of your code. However, if the module has many processes, this can become time-consuming. To address this, you can apply the binary search method you learned in school. This approach helps reduce the number of times you need to comment and test your code.

Elimination steps using binary search

For example, consider a module or API represented by a large square, as in the above image, with smaller squares inside representing its 16 subprocesses. Imagine you are unlucky and the bug is elusive. You comment out half of the subprocesses and test. If the error persists, the bug is in the uncommented half; if it disappears, the bug is in the half you commented out. You then comment out half of the suspect code and test again, repeating until a single subprocess remains. Because each test halves the search space, 4 tests (log2 of 16) are enough to isolate the bug among 16 subprocesses, plus one final run if you want to confirm the culprit, instead of up to 16 tests when checking one subprocess at a time.
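Here is a minimal sketch of that halving loop. It assumes exactly one faulty subprocess and that subprocesses can be disabled independently of each other; find_faulty_step, simulated_test and the faulty index 11 are hypothetical names and values for illustration.

```python
def find_faulty_step(step_count, fails_with):
    """Locate the single faulty step by repeatedly disabling half the candidates.

    fails_with(enabled_steps) runs the module with only those candidate steps
    enabled and returns True if the bug still appears.
    Returns (faulty_step_index, number_of_test_runs).
    """
    candidates = list(range(step_count))
    runs = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        kept, disabled = candidates[:half], candidates[half:]
        runs += 1
        # Bug still present with only `kept` enabled: it lives in `kept`.
        # Otherwise it must be in the half that was commented out.
        candidates = kept if fails_with(kept) else disabled
    return candidates[0], runs


FAULTY_STEP = 11  # hypothetical: pretend subprocess 11 contains the bug


def simulated_test(enabled_steps):
    """Stand-in for "run the module and check whether the error occurs"."""
    return FAULTY_STEP in enabled_steps


print(find_faulty_step(16, simulated_test))  # prints (11, 4)
```

With 16 subprocesses the loop narrows 16 to 8, 4, 2 and then 1 candidate, so it needs four test runs regardless of where the bug is.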

7. Review Latest Change

Suitable for: Local Debugging, Remote Debugging, Production Debugging

Another useful and effective technique applies when a bug appears right after you deploy or merge code: review the latest changes in the application.

  • Check the most recent commits to see whether any changes are related to the bug.
  • Review the deployment history to see whether a deployment matches the time the bug was first reported.
  • Use git annotate to see the latest change to each line of a file. You can see what was updated recently and who changed it, so that you can talk to that person to figure it out. I use this technique a lot. In case you don’t know how to do it, below are the images showing how to check the git annotation.
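Outside the IDE, the same checks can be done with plain git commands. The sketch below only builds and runs standard git log and git blame invocations from Python; the helper names and the file path in the usage comment are assumptions for illustration.

```python
import subprocess


def recent_commits_cmd(path, since="3 days ago"):
    """Build a git command listing recent commits that touched `path`."""
    return ["git", "log", f"--since={since}", "--date=iso",
            "--pretty=format:%h %ad %an %s", "--", path]


def blame_cmd(path, start_line, end_line):
    """Build a git command showing who last changed each line in a range."""
    return ["git", "blame", "-L", f"{start_line},{end_line}", "--", path]


def run(cmd, repo_dir="."):
    """Run a git command inside repo_dir and return its output as text."""
    result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, text=True, check=True)
    return result.stdout


# Usage inside a git repository (hypothetical file path):
#   print(run(recent_commits_cmd("src/orders/api.py")))
#   print(run(blame_cmd("src/orders/api.py", 20, 40)))
print(" ".join(blame_cmd("src/orders/api.py", 20, 40)))
```

Narrowing the log to the suspect file and a recent time window keeps the list of commits short enough to read one by one.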

8. Construct Hypotheses

Suitable for: Local Debugging, Remote Debugging, Production Debugging

Source: Gonzalo Osco Hernández

Constructing and testing hypotheses is a systematic approach to diagnosing bugs. By formulating potential causes and then performing experiments to test these theories, we can confidently identify the underlying issue.

  1. Construct Hypotheses: Begin by brainstorming all possible causes of the bug. You can consider the recent changes, previous known issues, and any error messages or logs that provide clues.
  2. Sort Hypotheses: Prioritize your hypotheses from most likely to least likely. This focuses your effort on the most probable causes first, saving time and resources.
  3. Test Hypotheses: Run an experiment for each hypothesis. The goal is to prove whether the hypothesis is right or wrong. Record the results to track which hypotheses are eliminated and which are supported.

This may sound abstract, so let’s look at a use case:

Scenario: A customer complains that the order API is performing abnormally slowly.

Here are the hypotheses I constructed, along with how to prove each of them.
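If you prefer to track this process in code or a notebook, the three steps can be sketched as a small data structure. The hypotheses, likelihood values and experiments below are invented placeholders for the slow order API scenario, not a definitive list.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    cause: str
    likelihood: float          # rough estimate between 0 and 1
    experiment: str            # how to prove or disprove it
    result: str = "untested"   # becomes "supported" or "eliminated"


def prioritize(hypotheses):
    """Sort hypotheses from most likely to least likely."""
    return sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)


def record(hypothesis, supported):
    """Record the outcome of an experiment."""
    hypothesis.result = "supported" if supported else "eliminated"
    return hypothesis


def remaining(hypotheses):
    """Return hypotheses that have not been eliminated, most likely first."""
    return [h for h in prioritize(hypotheses) if h.result != "eliminated"]


# Hypothetical hypotheses for the slow order API scenario.
backlog = [
    Hypothesis("Slow third-party call", 0.3, "Compare logged request and response times"),
    Hypothesis("Slow database query", 0.5, "Check query execution time in the logs"),
    Hypothesis("Recent deployment regression", 0.2, "Diff the latest release and profile it"),
]

for hypothesis in prioritize(backlog):
    print(f"{hypothesis.likelihood:.1f}  {hypothesis.cause}: {hypothesis.experiment}")
```

Recording each result, even a negative one, prevents the same experiment from being repeated later.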

9. Rubber Ducking

Suitable for: Local Debugging

Have you ever asked a friend to help fix a bug and then suddenly found the root cause just by having them nearby, without any actual help from them? This happens because you were using the “rubber ducking” technique.

What is Rubber Ducking?

Rubber ducking is when you explain your problem out loud to any object, like a rubber duck, to help you solve it. During the explanation, you often discover the solution on your own.

Steps to Start Rubber Ducking:

  1. Find Your Rubber Duck: If talking to an actual rubber duck feels awkward, that’s okay. You can use a colleague as your “duck.”
  2. Explain Your Code and Its Goals: Begin by describing what your code is supposed to do and the issue you’re facing.
  3. Go Through the Details: Review your code line-by-line, explaining each part thoroughly.
  4. Discover the Solution: Often, you’ll find the solution during this process of detailed explanation.

Rubber ducking helps clarify your thoughts and can be a surprisingly effective debugging tool.

There are so many techniques. Which ones should you use, and when?

I have compiled all the techniques mentioned above, mapped to each scenario, to help you choose:

With the above techniques, I trust you will be able to spot bugs quickly and effectively. In the next part, you will learn what to consider when fixing a bug.

*** Please give this post some claps if you found it useful. Thanks for spending your time. 😊


Sopheak Hang

Tech lead / Scrum Master / Freelancer / Tech Mentor / Data Science and AI enthusiast