Email Archiving and the Pitfalls of Stubbing
Until recently, the majority of email archiving providers relied on a technical approach called stubbing to manage storage issues. But does archive stubbing help manage storage efficiently, and if it does, why have so many email archiving vendors abandoned it?
What Is Stubbing?
Organizations employ email archiving solutions primarily to ensure the long-term retention and availability of email for compliance and legal discovery. The bonus comes in the form of storage reduction. Archiving solutions can help to reduce the load on the mail server while providing unrestricted access to email. One way to achieve this? Stubbing.
Stubbing is the process in which email attachments or entire email messages are moved from a user’s mailbox to an external archive location. However, the content isn’t completely removed from the original mailbox: what’s left behind is a remnant or a “stub”, essentially a smaller version of the email which serves as a pointer or link to the original. When accessing stubbed emails, the user can retrieve the full message from the archive with no delay. In theory, this is a brilliant way of reducing storage requirements, and the strategy has been used extensively by email archiving companies.
There are two types of stubbing in email archiving — attachment stubbing and entire message stubbing. With attachment stubbing, only the email attachment data gets removed from the server and is replaced by a link. If users opt to stub the entire email, the message body will be replaced with a text excerpt which contains a link to the original message in the archive.
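The two variants can be illustrated with a minimal Python sketch. It is a toy in-memory model, not how any particular archiving product or Exchange works: the `Message` and `Archive` classes, the `archive://` link format, and the excerpt length are all hypothetical, and the choice to also stub attachments when stubbing an entire message is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict

LINK_PREFIX = "archive://"  # hypothetical link scheme, for illustration only


@dataclass
class Message:
    message_id: str
    body: str
    attachments: Dict[str, bytes] = field(default_factory=dict)

    def size(self) -> int:
        """Bytes this message occupies in the simulated mailbox."""
        body_bytes = len(self.body.encode("utf-8"))
        return body_bytes + sum(len(data) for data in self.attachments.values())


class Archive:
    """Toy external archive: stores content and hands back a link to it."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> str:
        self._store[key] = data
        return LINK_PREFIX + key

    def get(self, link: str) -> bytes:
        return self._store[link[len(LINK_PREFIX):]]


def stub_attachments(msg: Message, archive: Archive) -> None:
    """Attachment stubbing: move attachment data out, leave a link behind."""
    for name, data in list(msg.attachments.items()):
        link = archive.put(f"{msg.message_id}/{name}", data)
        msg.attachments[name] = link.encode("utf-8")


def stub_message(msg: Message, archive: Archive, excerpt_chars: int = 60) -> None:
    """Entire message stubbing: replace the body with an excerpt plus a link."""
    link = archive.put(f"{msg.message_id}/body", msg.body.encode("utf-8"))
    stub_attachments(msg, archive)  # assumption: attachments are stubbed too
    msg.body = f"{msg.body[:excerpt_chars]}... [full message: {link}]"


archive = Archive()
msg = Message(
    "m1",
    "Please find the signed contract attached. " * 20,
    {"contract.pdf": b"\x00" * 50_000},
)
size_before = msg.size()
stub_attachments(msg, archive)
size_after = msg.size()
print(size_before, size_after)
```

Note that in both cases the message object itself stays in the mailbox; only its payload shrinks.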
At first glance, the benefits are manifold. Replacing email attachments with stubs helps to remove the attachments from the Exchange server and results in considerable space savings. From the end users’ point of view, nothing has changed — they can access their emails quickly and easily in the same folder where they originally placed them. Looks like a clear win-win. Until Microsoft’s very own Perry Clarke sounded a warning.
The 5 Pitfalls of Stubbing in Email Archiving
1. Stubbing Doesn’t Improve Server Performance
According to Clarke, the General Manager for Exchange Mailbox Server at Microsoft, stubbing leaves us with “only a set of indexes and records for the message” and comes with all the drawbacks associated with tiered storage taken to an extreme. Clarke describes stubbed emails as “Frankenstein messages” that have been created with 3–5% of information in Exchange and the rest out of it. With every request, information needs to be pulled from two different places, which is why Microsoft believes stubbing is “a very convoluted, difficult to manage process” that’s ultimately problematic in email archiving.
Microsoft repeatedly pointed out that email archiving solutions that relied on stubbing did not show any considerable improvement in performance or reduction in storage workload: “Removing the message bodies and attachments from Exchange reduces the mailbox size, but it does not significantly change the server performance for users accessing Exchange.” Microsoft further clarified that the number of items, not their size, was the primary performance driver in Exchange, and confirmed that a folder containing full email messages and the same folder containing stub files would have roughly the same performance.
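The arithmetic behind this point is simple enough to sketch. The Python snippet below is only an illustration: the item sizes are made up, and the 4% stub size is an assumed value picked from the middle of the 3–5% range quoted above, not a measured figure.

```python
from typing import Dict, List


def folder_stats(item_sizes: List[int]) -> Dict[str, int]:
    """Summarise a folder by item count and total bytes."""
    return {"items": len(item_sizes), "bytes": sum(item_sizes)}


def stub_folder(item_sizes: List[int], stub_percent: int = 4) -> List[int]:
    """Replace every item with a stub of roughly stub_percent of its size.

    stub_percent=4 is an assumption for this example only.
    """
    return [max(1, size * stub_percent // 100) for size in item_sizes]


full_folder = [100_000, 250_000, 50_000]  # made-up message sizes in bytes
stubbed_folder = stub_folder(full_folder)

before = folder_stats(full_folder)
after = folder_stats(stubbed_folder)
print(before)
print(after)
```

The byte total drops sharply, but the item count is identical before and after, and the item count is the figure the article identifies as the primary performance driver.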
2. Stubs Can’t Be Recreated
Marko Dinic, Jatheon’s CEO, explained why the company decided against stubbing and opted for an alternative technology. “Stubbing still takes up resources in the Exchange server. You’re basically offloading storage, but keeping the objects. This causes tremendous issues in Exchange as the number of items grows even though they seem ‘empty’. Stubbing is not a good strategy in email archiving because of data loss. If there’s any data corruption or loss, there’s no way you can recreate stubs. There are recommendations to offload data that’s not being constantly accessed, and not all data is accessed on a regular basis in email archiving, so it would make sense to use stubs. But the arguments against are simply too strong. In Jatheon, it was a decision that was made early on: we needed to think beyond stubbing. We needed to think smarter and more long-term.”
3. The Implementation Is Risky
Stubbing entails major changes to the configuration of the Exchange server. It also requires the deployment of software below Outlook that needs to pull data from two different places with each request. This software needs to be kept up to date with any changes in Outlook or in the archiving solution.
4. It Creates Complexity for End Users
Stubbing promises simple, seamless access and a familiar experience to end users. Their mailboxes look the same — the folder structure is preserved and so is every single email. The problem becomes apparent when users perform a search in Outlook and get partial results.
The information from emails and attachments has been moved because of stubbing, so users need to search through the archive to get complete results. Moreover, stub files turned out to be incompatible with some mobile devices, a major drawback in times when mobile access to data is crucial for business continuity.
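A small Python sketch shows why a mailbox-only search comes back incomplete. The message texts, the `archive://` link format, and the function names are invented for the example; real Outlook and archive search work differently, so treat this purely as a model of the failure mode.

```python
from typing import Dict, List

# What remains in the live mailbox: one stub (excerpt plus link) and one
# ordinary message. All content here is invented for illustration.
mailbox: Dict[str, str] = {
    "m1": "Q3 budget review... [full message: archive://m1]",
    "m2": "Lunch on Friday?",
}

# What the archive holds: the full text of the stubbed message.
archive: Dict[str, str] = {
    "m1": "Q3 budget review. The indemnity clause in the vendor contract "
          "needs legal sign-off.",
}


def search(store: Dict[str, str], term: str) -> List[str]:
    """Case-insensitive substring search over one store."""
    needle = term.lower()
    return sorted(mid for mid, text in store.items() if needle in text.lower())


def search_everywhere(term: str) -> List[str]:
    """Combine mailbox and archive hits to get complete results."""
    return sorted(set(search(mailbox, term)) | set(search(archive, term)))


mailbox_hits = search(mailbox, "indemnity")
complete_hits = search_everywhere("indemnity")
print(mailbox_hits, complete_hits)
```

The term appears only in the archived body, so the mailbox search misses the message entirely, while the combined search finds it.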
5. Stubs Can Mean Legal Troubles
The eDiscovery process usually begins with an archive-wide search, the results of which are then exported for further processing. The only problem with this procedure is that it skips the email stubs which reside in the live email system and which contain metadata that can sometimes be crucial evidence.
Precisely because email stubs look exactly like regular emails to the end user, they can be manipulated in a number of ways, and this trail of actions needs to be visible. Pulling an email from an archive without recoupling it with its stub in a way that’s legally defensible might be characterized as spoliation and can’t be considered full disclosure of evidence.
Conclusion
The final verdict? Like everything else in life, if it sounds too good to be true, it probably is. Stubbing can save storage, but it never proves cost-effective in the long run. It has the potential to complicate eDiscovery and dramatically raise its costs, and to give nightmares to both Exchange admins and your own technical support team.
Jatheon is a global leader in email, social media and mobile communications archiving and eDiscovery with 15 years of experience with on-premise archiving for regulated industries and a new, next-generation cloud email archiving solution. Unlike other archiving solutions, Jatheon relies on more advanced technology than stubbing but also reduces the workload of email servers. To learn more about archiving your unstructured business data with Jatheon, contact us or schedule your personal demo.
Originally published at jatheon.com on February 18, 2019.