Recovering from a Failed I2C Bus Recovery

It started last night when I sat down to understand what was going on with my wife’s MacBook Air. It had spent the week sitting on the coffee table, blandly displaying a boot progress bar that was failing to make any progress, after I attempted to upgrade to macOS Sierra.

This was both surprising and unsurprising: we had periodically experienced the laptop failing to reboot properly. Either it was slow to boot, or it hung and required several power-cycles. But eventually it worked. This had all been mildly annoying but was not a show-stopper.

However this time it was different. No amount of power-cycling was yielding a successfully booted machine.

Stepping back in time, I had purchased the MacBook Air a year or two ago on the vague assumption that it would be a nice machine to use, with low maintenance overhead; something that was easy for Kate to work with and not cause too much fuss. My experience with maintaining Macs is minimal, but I was happy to find that my assumption was largely confirmed and I didn’t need to get too adventurous to keep it working. So, I’m conflicted on how to feel about this, because the sheer bloody-mindedness required to resolve this latest failure staggers me.

So the machine won’t successfully boot. More to the point, we fail to reach interactive userspace: something’s broken in either the firmware or kernel. A quick search revealed the handful of options available to change the boot process, which became a handy reference.

Attempting Recovery Mode (Command-R) had no impact, but I found that hammering Shift during boot to enter Safe-mode managed to make a small-but-noticeable difference in the progress bar’s crawl to the right. Odd, but not fundamentally useful, as the Air still didn’t successfully boot.

Given the laptop was “in the middle” of an OS upgrade, I figured it was time to try booting to Single-User-mode (Command-S) to see if I could have any influence on that, perhaps undoing the update configuration to see if it booted normally.

This was a fraught experience. Looking at the root filesystem for Single-User-mode testified to this being a minimal initrd-like environment, not the actual root filesystem. However, the “Install macOS Sierra.app” folder was right there in the environment’s root. Further, on entry to Single-User-mode instructions were printed for remounting the root as Read/Write from its default of Read-Only.

If only these instructions were sensible. The fsck and remount completed successfully, but were immediately followed by a flood of warnings:

disk1s3: device is write locked

No changes to the filesystem stick. It seems the message is genuine. Looking in /dev/ there were several disk and partition entries, but naively attempting to mount those was also unsuccessful for the most part (though unrelated to the write lock). I was unable to remove the update directory thanks to the write lock, and it was time to reboot out of Single-User-mode to look for other options.

The previously linked article also mentioned resetting the PRAM (parameter RAM) with Command-Option-P-R. And it turned out trying this tack yielded more progress than even the attempt at Safe-mode! The progress bar eventually proceeded all the way to the right, but remained stuck there, the machine never taking the next step to the login screen.

Given the progress bar had been sticking at various points, and now at 100%, it seemed like the right time to try the article’s next suggestion: Verbose mode via Command-V:

Right. The last output we see is from what appears to be the camera driver. Failing to boot due to camera failure modes would be disappointing, but further evidence is required before drawing that conclusion. If it is the camera that’s preventing us from successfully starting up, maybe Safe-mode will help. I’d like to boot to Safe-mode in Verbose-mode.

It turns out that this not-even-remotely-Mortal-Kombat button mash of Shift-Command-V doesn’t apply the desired settings: the system boots up in Verbose-mode but not Safe-mode. Further digging led to c|net, who had an article on What to do when a Mac won’t boot to Safe Mode, describing use of nvram and the boot-args boot variable.

Rebooting to Single-User-mode to get a shell, I executed the described `nvram boot-args="-v -x"`, only to find that setting `-x` is rejected as a System Policy SandboxViolation, and only `-v` sticks in boot-args! Further digging showed that System Integrity Protection could be disabled, but only from Recovery mode, a mode which we cannot reach as the boot process hangs in perpetuity when trying.
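For anyone wanting to check which flags actually stuck, here is a rough sketch. It assumes `nvram boot-args` prints the variable name and its value separated by a tab; `has_boot_flag` is a made-up helper name, not anything Apple ships. It only reads the variable and changes nothing.

```shell
#!/bin/sh
# has_boot_flag ARGS FLAG
# Hypothetical helper: succeeds if FLAG appears as a whole word in ARGS.
has_boot_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only meaningful on the Mac itself; skipped where nvram is not installed.
if command -v nvram >/dev/null 2>&1; then
    # Assumption: output looks like "boot-args<TAB>-v", so keep field 2 onwards.
    current=$(nvram boot-args 2>/dev/null | cut -f2-)
    for flag in -v -x; do
        if has_boot_flag "$current" "$flag"; then
            printf '%s is set\n' "$flag"
        else
            printf '%s is not set\n' "$flag"
        fi
    done
fi
```

Once the machine is healthy again, `sudo nvram -d boot-args` should clear the variable so that boots stop being verbose.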

But at least -v stuck. On a whim, given I now shouldn’t be required to hold Command-V during boot to enter verbose mode, I tried holding Shift and powering on.

Success! The laptop starts in Verbose-mode and boots through Safe-mode! Not only that, but it hits the Sierra installer which executes to completion. I rejoice and reboot.

And hit a kernel panic. Exactly adjacent to the loading of the camera driver.

At this point I started a conversation with @AppleSupport on Twitter to express my concern, and carried on trying to resolve the problem in the meantime. They were quite responsive, but kept diving down the usual support paths of checking disk integrity and asking me to do things I couldn’t do as my machine wouldn’t boot outside of Safe-mode. Further, it was around 3AM at this point, and I couldn’t very well follow their final advice to arrange a visit to the Apple Store with the laptop.

All of this involved many more reboots than I’ve described here, and along the way the evidence was growing around the camera driver being at fault. The panic output during an I2C bus recovery operation, and the continual boot stalls at the exact point in the boot process where the camera driver was probing, led to the next idea: blacklisting the camera driver.

If that was a thing.

Which it isn’t — well it is, but it appears to generally be reserved for Apple to irritate people by blacklisting their Ethernet drivers. So what do you do?

Apparently, delete the kext directory from /System/Library/Extensions. I took a rough stab at the AppleCamIn and AppleS2CamIn symbols in the kernel output correlating to /System/Library/Extensions/AppleCameraInterface.kext and applied the appropriate mv command, which obviously failed because I wasn’t root. So in the vein of getting sandwiches made, I applied sudo, which also failed — obviously: System Integrity Protection was busy protecting integrities, and we can’t disable it. Because we can’t reach Recovery mode. Because the Recovery mode boot path still loads the Camera module. So we can’t remove the camera module.

Apple — Please can I have the right to shoot myself in the foot. That would be great. Thanks.

However, given I could reliably reach Safe-mode maybe we could do the inverse — use kextload to poke the graphics and sound drivers into the kernel to improve the user experience.

Nope. Can’t do that either. Because Safe-mode System Policy. These Drivers Are Dangerous. They outright refuse to load, and there’s no -force option in sight.

Apple — Please can I have the right to shoot myself in the foot. That would be great. Thanks.

So; what now? Despair?

Maybe.

Or we just ignore OS X’s policy enforcement.

This is easily done by booting Linux off a USB stick, and moving the AppleCameraInterface.kext directory from there. SystemRescueCD has been very reliable for me in the past, so I popped that on a USB stick, threw it in a USB port, rebooted, held Option and my breath. Sure enough, the boot process switched to the device selection screen, and I had an obviously named “EFI boot” device to boot from. Naturally I booted that and the keyboard completely failed to work, and even better the OS couldn’t find its own filesystem from the initrd, after taking minutes to get out of the kernel due to USB bus probe errors.

From some forums and wikis it seemed like Ubuntu had support for MacBook Air machines. I had a server-edition ISO lying around, so I popped that on a stick, selected my obviously named “EFI boot” option, and watched the same USB errors occur for a few minutes. However, this time the keyboard worked at least, so I aborted the install process and dropped to a shell, only to find that support for HFS filesystems didn’t exist in the installer kernel and busybox environment.

So I skipped off and downloaded the Ubuntu desktop image, and went through the same process. This time Ubuntu also failed to find its root from the initrd. On a whim I unplugged and replugged in the USB stick with the live environment. For whatever reason the hotplug event sorted out the USB stack. From my emergency initrd shell I ran /init and arrived at a functional Ubuntu desktop. Not even in safe mode.

It would be nice if this was the end of the woes, but if you hadn’t gathered from the length of this epic whinge there’s still more.

For a start, Linux doesn’t support HFS+J, the journaled variant of HFS+, and won’t mount these filesystems writeable. In a burst of sanity OSX does support journaling, so journaling needs to be disabled on the filesystem from the OSX side before we hit Linux. Luckily, `diskutil disableJournal /dev/diskXsY` isn’t something that’s governed by System Integrity Protection and is a quick sudo away.

What’s not a quick sudo away is actually finding the filesystem to mount. It turns out Apple pack their HFS+ filesystems in a CoreStorage volume in a partition of the disk, so we need to determine where the end of the HFS+ filesystem lives inside the CoreStorage volume for all the filesystem metadata to be found by the driver.

Naturally this process involves tools that aren’t packaged in the stock Ubuntu image, and the MacBook Air’s Broadcom WiFi adapter firmware isn’t either. So we need to transfer `testdisk` over via USB unless we want more pain to enable WiFi. There is more than enough pain already in this story, so I won’t add more.
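To give a feel for what the sector hunting buys you, here is a minimal sketch of turning a start and end sector into a mount command. It assumes 512-byte sectors and that the sector numbers are relative to the device being mounted; the `loop`, `offset=` and `sizelimit=` mount options are standard, but `hfs_mount_cmd`, `/dev/sdXN`, `/mnt/mac` and the numbers are made up for illustration. It prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: build (but do not run) a loop-mount command for an HFS+
# filesystem that sits inside a larger partition.
SECTOR_SIZE=512   # assumption: 512-byte sectors, check your own disk

# hfs_mount_cmd DEVICE START_SECTOR END_SECTOR MOUNTPOINT
# Sector numbers are assumed to be relative to DEVICE, and END is inclusive.
hfs_mount_cmd() {
    dev=$1; start=$2; end=$3; mnt=$4
    offset=$(( start * SECTOR_SIZE ))            # bytes to skip at the front
    size=$(( (end - start + 1) * SECTOR_SIZE ))  # bytes the filesystem spans
    printf 'mount -t hfsplus -o rw,loop,offset=%s,sizelimit=%s %s %s\n' \
        "$offset" "$size" "$dev" "$mnt"
}

# Made-up numbers, purely to show the arithmetic:
hfs_mount_cmd /dev/sdXN 8 1000 /mnt/mac
```

With those example numbers it prints `mount -t hfsplus -o rw,loop,offset=4096,sizelimit=508416 /dev/sdXN /mnt/mac`. The point of `sizelimit` is that the driver sees a device that ends where the filesystem ends, rather than where the enclosing partition ends.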

So to recap thus far: We’re in Linux and have our OSX root filesystem mounted in Read/Write mode.

I move `…/System/Library/Extensions/AppleCameraInterface.kext` out of the way then unmount.
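For the record, a small sketch of that move, written so the kext is kept rather than deleted and can be put back later. The mount point and the `disabled-kexts` holding directory are hypothetical names, as is the `disable_kext` helper; substitute wherever the Mac volume is actually mounted.

```shell
#!/bin/sh
# disable_kext ROOT NAME
# Move ROOT/System/Library/Extensions/NAME.kext into a holding directory
# on the same volume instead of deleting it.
disable_kext() {
    root=$1; name=$2
    src="$root/System/Library/Extensions/$name.kext"
    dst="$root/disabled-kexts"   # hypothetical holding directory
    if [ ! -d "$src" ]; then
        echo "no such kext: $src" >&2
        return 1
    fi
    mkdir -p "$dst" && mv "$src" "$dst/"
}

# Example, as root with the Mac volume mounted read/write (not run here):
# disable_kext /mnt/mac AppleCameraInterface && umount /mnt/mac
```

Moving the directory back to its original location would undo the change.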

And reboot to OSX.

Not even in safe mode.