Horrors of Fixing Windows.
Usually I don’t want to bore the world with stories about how I fixed someone’s computer. However, I used so many rude words this time that I feel the story is worth sharing. Send your kids to bed; this horror is not for everyone.
All of a sudden, Windows 10 on my wife’s laptop broke. After boot, I could move the mouse but that was all I could do. I was not able to click anything, and keyboard input didn’t work either. I gave it 30 minutes to recover, but with no luck. It looked like the old prank where you take a screenshot of someone’s desktop, set it as the background and hide all the icons.
I knew fixing this was going to take ages unless I was super lucky and googled some magic answer. It wasn’t my lucky day. The problem is that many different Windows failures have the very same symptoms. This is the price users pay for the abstraction from low-level things that Windows gives them. Figuring out what really happened under the hood is one of the most painful things in this operating system. Yes, you can turn on some features like ntbootlog, but these usually don’t tell you anything useful. There is no alternative TTY with a “usable” CLI, like Unix systems have, so you cannot switch to it and run ps | grep to investigate while your graphical interface is dead.
Calling Microsoft support usually does not help with problems of this calibre. So what to do next? Yes, safe mode, and hope the OS works at least there. This time I was luckier: it did work. Now I knew that the issue (very likely) came from something that runs in regular mode but not in safe mode. Official troubleshooting articles always tell you to stop all non-Microsoft stuff at boot time. Well, I don’t know why, but in my cases it was always Microsoft stuff that broke things. The first thing to try was to disable regular applications in the Startup section of Task Manager. I gave it a try but of course it didn’t help.
The Eye of Mordor had to start searching elsewhere, and the next logical thing was Windows services (via the Services tab of System Configuration). So again I tried to disable all non-Microsoft services. That didn’t help, so it was time to start “playing” with the Microsoft ones. When I say play in this context, the closest definition is “let’s play a game” from the Saw movies. It is such a dangerous game because by disabling some of these services, you can completely break your login screen. And yes, that is what happened to me later on, but first things first.
To find the evil Microsoft service, I was thinking about applying the binary search algorithm. You know, disable half of the services, identify which half is the broken one, disable half of that half and repeat until a single one remains. However, as noted before, disabling Microsoft services can be dangerous, so I used a less “aggressive” approach and disabled around 10 services per try. Each try consisted of 3 simple steps:
1. Disable (next) 10 Microsoft services
2. Reboot to regular mode (and find that these didn’t cause the issue)
3. Reboot to safe mode, enable them back and go to step 1
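For the curious, the three steps above can be sketched as a small Python simulation. It is only a sketch under loud assumptions: the service names, the culprit and the `simulated_boot` function are all made up for illustration, and a real "try" is a pair of reboots, not a function call.

```python
import math


def find_bad_batch(services, boots_ok, batch_size=10):
    """Disable one batch at a time until the regular boot works again.

    Returns (batch containing the culprit or None, number of tries).
    """
    tries = 0
    for start in range(0, len(services), batch_size):
        batch = list(services[start:start + batch_size])
        tries += 1
        # Steps 1 and 2: disable this batch, then reboot to regular mode.
        if boots_ok(batch):
            return batch, tries
        # Step 3: back in safe mode, re-enable the batch and take the next one.
    return None, tries


# Hypothetical data: 200 fake service names and one fake culprit.
services = [f"service-{i:03d}" for i in range(200)]
culprit = "service-187"


def simulated_boot(disabled):
    # Assumption: the machine works only when the culprit is disabled.
    return culprit in disabled


batch, tries = find_bad_batch(services, simulated_boot)
print(tries, batch[0], batch[-1])  # 19 service-180 service-189

# Worst-case number of tries for 200 services.
batch_tries = math.ceil(len(services) / 10)          # 20
bisect_tries = math.ceil(math.log2(len(services)))   # 8
print(batch_tries, bisect_tries)  # 20 8
```

On paper, binary search would need at most 8 tries instead of 20, but only if there is a single culprit and if disabling half of the services at once still leaves a system you can log in to. That second assumption is exactly the one I did not trust.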
I don’t know what they all do, but there are over 200 Microsoft services in a standard installation, which means more than 20 tries with my cautious approach. When I was halfway through, I broke my login screen. The funny thing is that the old way of getting into safe mode, pressing F8 during boot, did not work anymore. The only explanation I found was that the boot process is so quick that it cannot capture the key press (lol). To get into safe mode, you have to get to the login screen and choose restart while holding down the SHIFT key. After the reboot, you are given a blue screen with several options, including safe mode. But what if you break your login screen?
You can imagine that the level of rage was pretty high at this point. What was I going to do if I couldn’t get into safe mode to re-enable the service needed for the login screen? A few minutes later, I had my brightest moment of the day. I remembered that the blue screen with the options is also shown when you mess something up very badly, typically when the system fails to boot a few times in a row. So I did a “hard” power off with the power button from the login screen 2–4 times in a row. Bingo, the blue screen was back and I was back in the game. Just FYI, the login-breaking service is one of these two:
- Device Association Service
- Device Install Service
I was not in the mood for finding out which one of them it was. I continued with my tries (disabling 10 services in 3 steps). I didn’t have much luck that day, as “my service” was in the last batch but one. But finally I managed to reduce the list of potentially broken services to these two:
- Windows Search
- Windows Update
If you ran a worldwide survey asking “What piece of software has generated the most suffering?”, I’m pretty sure Windows Update would achieve a devastating win. At this moment, I felt so stupid. Why didn’t I start with Windows Update if I knew it was such a …bad software? But this time, surprisingly, Windows Update was innocent. Windows Search was the culprit. With the good feeling of “it was not Update, no need to feel that stupid”, I disabled it once and forever. Now I’m going to seal the entrance to System Configuration with the most powerful runes I possess so no one can release the beast again.
Maybe I could have done something more clever. Maybe I could have used some tool I don’t know. If you have any ideas about what I could have done better, please tell me in the comments. I’d really love to learn how to fix these things better next time :-).