The Chase After a Sometimes-Bug
As if we don’t have enough regular bugs, there are — rarely, admittedly — some bugs that not only need to be fixed, but first need to be found. These are the “sometimes” bugs. And such a bug appeared one day in our Moodle site.
Sometime-Bug, (not so) Pleased to Meet You
It all started when we created in one of our Moodle sites a new type of user — a super-guest — that would be a registered-user mutation: it would be created dynamically when clicking the “Login as Guest” button, but would be deleted from the system after a limited time.
The system was already up, the super-guest type had been tested and was about to be deployed to Production, when suddenly there were reports from the QA team: sometimes, when trying to enter a course as a super-guest, they got an error message that they could not register to this course. What??
We immediately started trying to zoom in on the bug: Does it happen in all courses? Does this happen on other computers? In short, the standard preliminary tests performed with bugs.
This is where the plot started to thicken: It turned out not to be the case in all courses, nor in all computers. In fact, it was very difficult to reproduce the bug. It happened only to few, and to them too —only sometimes.
Catch it While It’s Running
Seeing that this was the case, we decided to change direction and start reviewing the code in order to hone in on the part of the code where the bug might be. One of the programmers was able to reduce the code to a specific area of the login check function, where one of the tests could result in a failure to register to the course. But that led us to a dead end because to understand how to get to the said code, we’d have to encounter the bug and follow it with debug, which brought us back to square one — trying to reproduce the bug.
We sat several programmers on the task, each of us repeatedly entering the system as a super-guest with the goal of bumping into the rogue bug.
But to no avail: not only did we hardly come across it — meaning that most of the time the super-guest was registered to the course easily and seamlessly — but even in the rare times in which we did get the error message, as soon as we started running our debug mode - the bug slipped away.
Never would I have thought I would be frustrated each time the super-guest entered the system without a hitch…
Will the Bug Forever Evade Us?
Three days after the bug was first discovered, I found myself alone in the battle — it was winter and the other programmers were absent due to illness. I did have hope, however: the previous day we had found a particular computer in which the bug happened, so that day I robbed the computer from its owner and tried to encounter the bug. But of course, when the bug realized it was being pursued, it stopped showing up on this computer too!
Still hoping to encounter the bug, I returned to the opposite tactic: diving into the depth of the code — to that section already reduced to by one of the other programmers — and trying to figure out exactly where the bug was happening. I grabbed a knife between my teeth and set off.
[Warning: The following sections have a particularly hectic dig at the code. Anyone who wants to skip to the end, click here].
Bring Your Shovel, We’re Going Digging
As mentioned before, we had already established that the bug was in a function that checks the user’s logins. This function is called
require_login, and is in the
lib/moodlelib.php file. The function goes from line 2771 to 3101 (a small little function, only 330 lines…). The error message is received when the function redirects to the
/enrol/index.php file. We saw that this redirect happens on line 3073 when the
if (!$access) condition is true. Since
$access is first initialized in line 2987, it was clear that somewhere between lines 2987 and 3073 it gets an incorrect false, which creates the problem.
I analyzed the code and saw this flow:
$accessis set in line 2987
- The first condition,
if (is_role_switched ($course->id), is irrelevant — we are not in a role switching situation (line 2989)
- The second condition,
if (is_viewing ($coursecontext, $USER)), is also irrelevant, as it is for admins, managers, inspectors, etc. (line 2993).
- We get to the
- Check the
$USER->enrol[‘enrolled’][$course-> id]variable. This variable should be the time when the user’s registration to the course ends. The first time you enter the course it is not set yet, so we do not enter this condition (lines 2998–3009).
- The following condition,
if (isset ($ USER->enrol[‘tempguest’][$course->id]))checks if the user is a guest. Again, this isn’t our case (lines 3010–3020).
- Then check
$accessin case it has become true at this point (line 3022). In our case — this is the first time — it is not true yet, so we enter the
- Here we call the
enrol_get_enrolment_endfunction, which returns the registration end time (line 3025). If it returns a truthful value (not false), it is set to
$ USER->enrol[‘enrolled’][$ course-> id], and
$accessis set to true (line 3033).
Here I realized that the
enrol_get_enrolment_end function is the key: if it returns false then we do not enter the code where
$access should get true and that would generate the Unable to Register error.
enrol_get_enrolment_end function resides in
lib/enrollib.php and extends from line 964 to 1039 (tiny functions they have there in Moodle…)
- It sends the following query to DB (line 969):
SELECT ue.* FROM mdl_user_enrolments ue
JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.courseid = :courseid)
JOIN mdl_user u ON u.id = ue.userid
WHERE ue.userid = :userid AND ue.status = 0 AND e.status = 0 AND u.deleted = 0
-- where: status = 0 /*:active*/, e.status = 0 /* :enabled */
- It then checks that the start and end dates are valid dates (lines 982–1001)
- Finally, it checks (lines 1005–1031):
- That the end-time happens after the start-time
- That the start-time has arrived
- That the end-time has not yet passed
If these three conditions are met it returns the end date, otherwise it returns false. In the case of the super-guest, the start time was defined to be the current time — the time in which the guest clicked the button and entered the system. The end time was defined to be a little while later, and therefore the condition should return the end date. But obviously something was going wrong and the function was returning a
false value that was preventing the user from registering, therefore here lies the bug.
Come on Bug: Shoot, Don’t Talk
Now that I knew exactly where to look and what to check when encountering the bug, I lay there in ambush, waiting for the bug to attack.
For the rest of the day I kept entering the system as a super-guest, waiting for the bug to occur. And then, suddenly, at the grace of heaven, I came across it! I immediately performed the tests that are performed in the code — extracting data from the DB and comparing the time.
In terms of retrieving the data everything was fine — it had data in all the relevant tables. But when I came to compare the hours, I was in for a surprise. As I previously said, the start time was supposed to be set as the time in which the user entered the system, so that she could go straight into the course. But to my amazement, the start time was defined to be two minutes after the time the user entered: I had gone in in at 11:09, but the start time was set to 11:11… I.e., the check that the start date had arrived (in section 3.2 above) is the one that failed, and that was the reason that
$access’s value was false. Apparently, the process of creating the temporary user creates a future start date, rather than a current start date!
And most importantly — why was it happening only sometimes??
I Got You, You Slippery Bug!!
I decided to keep on trying to log in as a temporary user, and continued watching the start times. Much to my surprise, I saw that the start time did not change: for about half an hour I logged in as a temporary user, and the whole time the start time was the same — 11:11.
So I went to the code that sets the start and end times of the super-guest’s registration and took an actual look at it:
$today = time();
$startdate = make_timestamp(date('Y', $today), date('m', $today), date('d', $today), date('H', $today), date('G', $today), 0);
$enddate = make_timestamp(date('Y', $today), date('m', $today), date('d', $today), date('H', $today) + 2, date('G', $today), 0);
As you can see, the code uses the
make_timestamp function which is a Moodle function. I looked at the signature of the function (which is also in
function make_timestamp($year, $month=1, $day=1, $hour=0, $minute=0, $second=0, $timezone=99, $applydst=true)
I went over the parameters we send to this function to make sure they match what the function expects to receive:
- In the first variable we send a year:
date (‘Y’, $ today)
- In the second variable we send a month:
date (‘m’, $ today)
- In the third variable we send day:
date (‘d’, $ today)
- In the fourth variable we send an hour:
date (‘H’, $ today)
- When I reached the last variable, my jaw dropped. The function expects to get minutes, but instead we send the following value:
date (‘G’, $today). The variable G is not minutes - as requested by the signature, but hours! That could explain some things …
So why did the bug only happen on some computers, and why did it only happen occasionally? Why is it that sometimes the date-time created was wrong and sometimes not?
The answer lies in what the date function’s return value for
(‘G’, $today) is. This function returns the time in a 24-hour format without leading zeros. That is, at 11 o’clock it returns 11, and at 6 pm it returns 18. And this is the parameter passed to the
make_timestamp function instead of minutes! Therefore, a user who enrolls any time between 11am and 12pm, their start time will always be 11:11, no matter how many minutes after 11 they actually sign up at. This means that if the student enters and enrolls in the first 10 minutes after 11p.m., they will not be able to enter the course because the start time is in the future. And as the day progresses, it gets worse, the worst being at 11pm, where for the first 22 minutes after 11 the user will encounter this bug! (So it’s best to register at 1a.m.…)
The only silver lining in this exhausting battle was the simplicity of the solution: We replaced G with i, and all was well with the world.
And so this elusive bug came to its solution, and the super-guest is happy :)