How to debug freezing JUnit tests

Piotr Zawadzki
the-stepstone-group-tech-blog
5 min readJun 26, 2019

--

Photo by Aaron Burden on Unsplash

Recently, when writing integration tests with Robolectric on Android I came across a situation where, at some point, one of my tests would freeze for about 10 minutes when it was executed on a CI machine. This was a very strange behaviour as I didn’t change the test logic nor the class under test. To make matters worse, the test was not hanging on my local machine and I wasn’t able to reproduce it.

In this article, I’ll explain how to tackle this kind of situations when running any tests on CI machines (assuming you can connect to the machine e.g. via SSH) or on your local machines (Android Studio/IntelliJ or command line).

How to know that you actually have a problem?

Unit tests should execute swiftly. A single test should finish in a matter of milliseconds (seconds at most with Robolectric, mostly when it’s starting up) and not minutes. So when running a test in your favourite IDE we see something like this:

JUnit test taking 5 minutes to complete

We have a problem 😉

On a CI server if a test hangs for a while but resumes eventually we can inspect JUnit reports and check if individual test times are within reason. You need to set up your CI solution to gather them in build artefacts. This differs based on your solution, e.g. on Jenkins you need to follow these steps and on Bitrise this should be supported out of the box. This could produce something like this:

Duration of 30 seconds seems suspicious

When running tests, I think it’s also a good practise to print console logs and gather them on a CI server. This is so that we have more data once something goes wrong.

E.g. on Android we can print the tests in console as they are executed by adding this in our build.gradle file:

This way we should be also able to see on which test a given build is stuck — we’d see a test being started but then no new console logs being printed. This might look like this:

Logs stopped showing up after STARTED was printed

Finding the issue

Connecting to the machine

If you can reproduce test freezes on your local machine you can move on to Locating the JVM process 😉

If your tests are running on a remote machine you should connect to that machine e.g. via SSH. This is usually the case when you have a Jenkins/Bamboo server running somewhere in your company’s network. With cloud CIs it’s usually not possible. There’s a notable exception though — Bitrise.io offers remote access to their machines when a build is executed (we use Bitrise.io at StepStone and definitely recommend it as a great service for building & verifying your Android & iOS apps 🚀).

Locating the JVM process

To find the culprit, we need to identify the process that runs our tests on JVM first.

There are many ways in which we can do that. On UNIX we could use e.g. ps -fe | grep java. We could also use a jps command (built into JDK) with -lv option. E.g. when running a test from IntelliJ this could produce:

$ jps -lv...86561 com.intellij.rt.execution.junit.JUnitStarter -ea -Didea.test.cyclic.buffer.size=1048576 -javaagent:/Applications/Android Studio 3.4.app/Contents/lib/idea_rt.jar=53774:/Applications/Android Studio 3.4.app/Contents/bin -Dfile.encoding=UTF-8

Or when running from command line with Gradle:

$ jps -lv...88545 worker.org.gradle.process.internal.worker.GradleWorkerMain -Djava.awt.headless=true -Djava.security.manager=worker.org.gradle.process.internal.worker.child.BootstrapSecurityManager -Dorg.gradle.native=false -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=PL -Duser.language=en -Duser.variant -ea

The values in bold are process IDs which we need.

Make a thread dump

Once we have the JVM process, we need to inspect the threads to look for potential deadlocks. Thankfully, there’s a tool built into JDK called jstack, that allows us to make a thread dump. We need to provide it with the process ID like this:

$ jstack -l [PROCESS ID]

e.g.

$ jstack -l 86561

This will print the current status of all Java Threads used in that process. You need to go through it and look for suspicious locks/waits. Most probably, the stack-trace with the issue will reference your classes so you can look for a package name from your project.

In the DummyTest example from the top, the thread responsible was:

"main" #1 prio=5 os_prio=31 tid=0x00007fd764813000 nid=0x2503 waiting on condition [0x00007000080c8000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.dummy.ImageProcessor.processImage(ImageProcessor.kt:6)
at com.github.zawadz88.materialpopupmenu.DummyTest.This test should definitely not hang(DummyTest.kt:14)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
... at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

And if we inspected the ImageProcessor class we would see that the implementation was faulty:

class ImageProcessor {    fun processImage() : Boolean {
Thread.sleep(300_000)
return true
}
}

More real-world examples?

This was a rather simple example — the issue was obvious once we inspected the test code and for that we don’t necessarily need jstack . You might run into issues which are not that trivial though. This is especially the case with integration tests, where we have more classes “playing together” than in simple unit tests. In the Robolectric tests mentioned at the beginning I couldn’t find what was wrong by just inspecting the code. When I used jstack I found the following thread dump:

"SDK 23 Main Thread" #25 prio=5 os_prio=0 tid=0x00007f758c341800 nid=0x2d17c runnable [0x00007f76516a7000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:424)
at sun.security.provider.NativePRNG$RandomIO.implGenerateSeed(NativePRNG.java:441)
- locked <0x00000000e04a7130> (a java.lang.Object)
at sun.security.provider.NativePRNG$RandomIO.access$500(NativePRNG.java:331)
at sun.security.provider.NativePRNG$Blocking.engineGenerateSeed(NativePRNG.java:274)
at java.security.SecureRandom.generateSeed(SecureRandom.java:533)
at org.bouncycastle.jcajce.provider.drbg.DRBG$HybridSecureRandom.<init>(Unknown Source)
at org.bouncycastle.jcajce.provider.drbg.DRBG.createBaseRandom(Unknown Source)
at org.bouncycastle.jcajce.provider.drbg.DRBG.access$200(Unknown Source)
at org.bouncycastle.jcajce.provider.drbg.DRBG$Default.<clinit>(Unknown Source)
...org.mockito.internal.creation.MockSettingsImpl.build(MockSettingsImpl.java:228)
at org.mockito.internal.MockitoCore.mock(MockitoCore.java:61)
at org.mockito.Mockito.mock(Mockito.java:1908)
at org.mockito.Mockito.mock(Mockito.java:1817)
at com.stepstone.base.core.common.test.SCTestModule.createAndAddMockBindingForClass(SCTestModule.kt:33)

It turned out that the thread would get stuck when reading a file. Solving it is a different story 😉

Summary

Freezing tests can be a nuisance, especially in integration tests where you can easily find the issue. To help you with that use jps and jstack commands to figure out what’s going on in your Java Threads 🤘

Read more about the technologies we use or take an inside look at our organisation & processes. Interested in working at StepStone? Check out our careers page.

--

--

Piotr Zawadzki
the-stepstone-group-tech-blog

Principal Android Developer at Stepstone — passionate about technology, Android geek, photography enthusiast.