As was the case with my 2016 post, this isn’t intended to twist the knife into Salesforce — having worked in the IT industry for over 30 years I’m well aware that things go wrong, sometimes catastrophically, and that everybody involved will be working as hard as possible to minimise the impact and the outage. I’ve been that guy cursing the database backups for not restoring fast enough, or trying to reverse engineer what happened from the current flaming state of a system. These things happen, but how you handle the communication around them is key.
On May 17th 2019, three years and one week after the infamous NA14 meltdown, and yet again just before the Salesforce World Tour London, we had Permissiongeddon. At the time of writing (16:30 UTC) it is ongoing. Across a number of orgs, we found that users could access all data in the system, and community users could access all data in that community. If sensitive or competitive information is being stored, it is suddenly available to everyone, and the implications could be significant.
With a situation of this nature unfolding, communicating to every Salesforce customer that there is potentially a problem around the security of their data is an obvious first step. I’ve worked for companies in regulated industries where we would just shut down all access until we’d figured out what was going on, as it’s better that users see no data than data they shouldn’t be able to see.
But if you don’t know there’s a problem, you can’t do anything to protect yourself.
Rule #1 of damage control
Get everything out in the open as soon as you can.
Much like the previous outage, communication from Salesforce was woeful. Several hours after we raised a case there was still nothing on the trust site to notify other customers there was a potential problem, instead there was the comforting green tick indicating no ongoing incidents. By now it was becoming common knowledge, as there were discussions raging on Reddit and Twitter. Support were directing people reporting issues to trust, which still said there was no problem, and as we remember from last time, that’s how you lose control of the narrative.
I don’t expect trust to be updated the moment that there is a perceived problem — that would result in false positives and turn trust into the little site that cried wolf. That said, there’s a reasonable amount of time to get a statement out and in my view there should be a process to follow that ideally takes less than an hour. In this case, given that it potentially compromised data, there should have been sirens and flashing lights everywhere, but there was silence. Nothing to see here, move along.
I don’t know why this is something that Salesforce struggles with when there is a genuine emergency, given the size of company they are. Maybe the size works against them and there are too many layers that need to sign off, or maybe it’s analysis paralysis, where fear of saying the wrong thing and spooking customers takes hold, so nothing is said, which spooks everyone just as hard. Whatever the reason, it’s very disappointing. After a while it starts to look as though they are hiding the problem in the hope of fixing it before anyone notices. I’m sure that’s not the case, but not everyone has my positive outlook on these situations.
Like last time, it gives me no pleasure to write a post of this nature, and I sincerely hope that this is the last time that I have to do it. But I have to do it; I can’t let this kind of thing happen without calling it out.
As always, these are nobody’s words and views but mine.
I’m better known in the Salesforce community as Bob Buzzard — 14 x Certified, including Technical Architect, Multi-time MVP and CTO of BrightGen, a Platinum Cloud Alliance Partner in the United Kingdom.
You can find my (usually) more technical thoughts at the Bob Buzzard Blog