How Complete is “Complete” When It Comes to Digital Evidence?
Courts provide little guidance on the balance between “too much” and “not enough” digital evidence needed to prove cases. The burden falls to attorneys and investigators to make decisions based on hard-to-answer questions.
with J.D. Ronan, Robert J. Peters, Matthew Osteen, Alicia Loy, Brandon Epstein, & Joseph Pochron
“Less is more,” the saying goes, and it’s a useful adage whether you’re designing PowerPoint slides, purchasing a smaller home, or cutting down on your monthly bills.
When it comes to information, though, “less is more” is on shakier ground. Think about the “information firehose” of email, social media posts, podcasts, and the 24/7 news cycle. Do you simply want less of all that? Or do you want less, but also the right information?
That’s the dilemma at the heart of digital forensics today. On one hand, the volume and variety of digital data — along with the increasing complexity in obtaining it and piecing it together — is too much for many forensic labs. Months-long backlogs can lead to delays in getting the evidence to investigators and attorneys who need it for their cases. Cost constraints — not to mention the COVID-19 pandemic — compound the problem.
On the other hand, a minimalist “less is more” approach to evidence hamstrings justice itself. As forensic examiners know, data — or its absence — isn’t always as it appears. It needs to be authenticated and corroborated with other pieces of evidence to put a particular user behind the keyboard (or screen) of a particular device.
Technology can solve some, but not all, of these problems. For instance, artificial intelligence can be invaluable to prioritize which of the hundreds of thousands of pictures or messages should be examined more carefully. But it can’t prove who put them on the device.
Courts provide little guidance on where the balance lies, so the burden falls to attorneys and investigators to make decisions based on hard-to-answer questions including:
- What’s actually relevant to help build a case’s fact pattern?
- Can the evidence presented as “complete” be properly authenticated?
- How far back should a timeline go, and where does the privacy rights balance lie between too broad and too narrow?
- Can a jury properly weigh — and convict — based on evidence that might not be complete? (And would that conviction be upheld on appeal?)
- What does this all mean for digital forensics examiners and attorneys on a practical level?
In the United States, these questions are grounded not just in the Constitution’s Fourth Amendment, which protects individuals’ right to privacy, but also in the Federal Rules of Evidence. These cover relevance, the doctrine of completeness, the exclusion of relevant evidence, and the authentication of evidence needed to support a finding that an item introduced into evidence is what it’s claimed to be.
Balancing among these and other rules can become challenging when it comes to digital evidence in two respects:
- The extent to which screenshots or printouts can show the totality of a conversation, social media thread, or multimedia exhibit.
- The context of what one court called a “universe” of text messages — or email, or patterns of life in general.
How complete are screenshots?
In 2015, printouts were still the most common form of proffered social media evidence (1) — even though software had existed for several years purporting to preserve posts and messages in a format closer to “native.”
To be admissible, the evidence must be authenticated in two steps: first, as “accurately reflecting the content and image of a specific webpage on the computer,” and then as to its authorship. (2)
Authenticating content and image isn’t as simple as getting a “witness with knowledge,” such as an account’s owner, to testify to a post’s authenticity:
- If the account owner is the defendant, authenticating the evidence could be self-incriminatory.
- A third party with access to the account could have uploaded, changed, or deleted the content.
- The “webpage” or account could be spoofed, also by a third party who may or may not be known.
- Social media is inherently ephemeral, with some mobile apps and features designed to automatically delete posts within a certain period of time without user action.
It’s true that, even with all pains taken to preserve the evidence — putting a device into Airplane Mode, for instance, or serving a platform service provider with a preservation order — a screenshot or even a photocopy may be all that’s left to show a tweet, a snap, or an Instagram Story existed at a point in time.
However, it’s also true that screenshots and photocopies are an imperfect solution. Screenshots can be altered. Badly photocopied photographic evidence can be impossible to evaluate, and pages of printed text messages may not be relevant in their totality. (3)
Add to this: screenshots and printouts don’t offer the full extent — or context — of multimedia that might appear in the body of a tweet, post, or text message. Even before AI made “deepfake” images and videos possible, a screenshot or photocopy didn’t account for edited or otherwise manipulated multimedia evidence.
Some evidence, such as child sexual abuse material, can be authenticated by comparing its hash against others in a database such as Project VIC. However, databases of hashed illicit images don’t exist for every crime, and the method doesn’t work for newly produced material.
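As a minimal sketch of the hash comparison described above, the Python below computes a SHA-256 digest and checks it against a set of known digests. The `KNOWN_HASHES` set and the sample byte strings are hypothetical stand-ins for illustration only; real databases such as Project VIC have their own formats and hash types, which are not shown here.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for a database of hashes of previously identified files.
known_file = b"previously catalogued file contents"
KNOWN_HASHES = {sha256_hex(known_file)}


def is_known(data: bytes) -> bool:
    """True only if the data is byte-for-byte identical to a catalogued file."""
    return sha256_hex(data) in KNOWN_HASHES


exact_copy = b"previously catalogued file contents"
altered_copy = b"previously catalogued file contents."  # one extra byte
new_material = b"file never seen before"

for label, data in [("exact copy", exact_copy),
                    ("altered copy", altered_copy),
                    ("new material", new_material)]:
    print(label, is_known(data))
```

Only the exact copy matches. A single changed byte, or a file that was never catalogued, produces a digest the database has never seen, which is why this method cannot authenticate newly produced material.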
Of course, establishing the authenticity of social media posts, messages, images, and videos isn’t only about the evidence itself. It’s also about who posted it. Courts have generally rejected evidence (4) that isn’t corroborated across a range of other evidence, (5) or in other words, “taken as a whole with all of the individual particular details considered in combination.” Without those details, a reasonable jury couldn’t assess the evidence’s authorship.
Even so — and even at a time when more data than ever is available — both investigators and prosecutors often take a narrow view of which “individual particular details” could be considered corroborating.
In part that’s because evidence takes time and effort to corroborate. But digital evidence is a powerful way to fully support compelling victim statements. The failure to corroborate one piece of evidence raises the risk that a court will reject it, and overall weakens a case.
How complete is the case timeline?
Another way to scope corroborative social media evidence — and authenticate its authorship — is through the use of timelines. For example, forensic examiner Troy Schnack blogged (6): “The Holy Grail is seeing a contraband download started near a person checking their personal webmail or using an identifiable login name into social media, shopping or other web site.”
This type of pattern-of-life analysis is a concept that some digital forensics experts have embraced from the intelligence world. By establishing a victim’s or suspect’s normal patterns, it becomes easier to see deviations from that normal.
In turn, deviations offer a starting point for further investigation. They might indicate illicit activity. They might reveal the manipulation of data. However, they might also mean that someone other than its owner had access to a device or account.
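To make the idea of a baseline and a deviation concrete, here is a deliberately simple sketch. It assumes the only signal is the hour of day at which activity occurs; the function names, the threshold, and the sample timestamps are all hypothetical, and real pattern-of-life analysis would draw on far richer data.

```python
from collections import Counter
from datetime import datetime


def hourly_baseline(events: list[datetime]) -> Counter:
    """Count how many historical events fall in each hour of the day (0-23)."""
    return Counter(e.hour for e in events)


def flag_deviations(baseline: Counter, new_events: list[datetime],
                    min_count: int = 1) -> list[datetime]:
    """Return new events in hours where the baseline has fewer than min_count events."""
    return [e for e in new_events if baseline[e.hour] < min_count]


# Hypothetical history: the device is used around 08:00 and 19:00 on five days.
history = [datetime(2020, 8, day, hour, 15)
           for day in range(1, 6) for hour in (8, 19)]

# Hypothetical new activity: one event at a usual hour, one at 03:05.
new_activity = [datetime(2020, 8, 6, 8, 30), datetime(2020, 8, 6, 3, 5)]

unusual = flag_deviations(hourly_baseline(history), new_activity)
print(unusual)
```

The 03:05 event is flagged because nothing in the history falls in that hour. The flag is only a lead: it says nothing about whether the cause was illicit activity, manipulated data, or another person using the device.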
Thanks to the rise of smartphones, personal fitness devices, and home assistants among others, the opportunity to use data to ask — and in some cases, answer — these questions exists to an extent it did not in previous years.
The rub is that, to establish a pattern of normal behavior, an investigator or examiner needs a much greater quantity of data than many courts might be comfortable with. That’s the premise of the so-called “mosaic theory of the Fourth Amendment.” (7)
As analyzed by law professor Orin Kerr, mosaic theory “allows individual law enforcement steps that are not searches to become a search when collected together.” (8) In 2010 Kerr argued that the theory wasn’t “persuasive”: “…how do the police know when a mosaic has been created such that the sum of law enforcement techniques, when aggregated, amount to a search?” he questioned. (9)
However, that was before a much broader range of digital data from devices, apps, the cloud, and third-party internet and cell service providers (among others) made even more of a mosaic of people’s lives. The data can help establish the authenticity of digital evidence, but at what cost to people’s privacy rights under the Fourth Amendment?
Courts offer little guidance in this area. Some now reject search warrant applications for entire devices based on particularity, stipulating searches that target data acquisition from, say, only certain app-based conversations between certain people within a certain timeframe. (10)
Courts are well-intentioned in attempting to protect privacy interests while still allowing relevant data to be gathered up front. However, from a digital forensic as well as an investigative standpoint, this creates numerous challenges.
First, digital forensics tools aren’t generally designed to limit searches in this way. Lab-based analysis tools are built to make a forensic copy of a device’s entire file system, if not also its unused memory, where deleted data could reside.
That’s an important feature on two levels. First, it enables analysts to verify that the data exists as it appears. And second, it allows other analysts — say, those hired by the defendant — to validate law enforcement’s work.
Once again, deeply examining every device is resource-intensive. To reduce some of the burden on labs, digital forensics tool vendors designed “field extraction” or kiosk-based tools. Their function is to allow “front-line” investigators to acquire and preserve logical data: the messages and other data that haven’t been deleted.
These tools’ main function is to provide leads to begin an investigation — preserving a digital chain of custody without necessarily seizing a device. The tools do limit the scope of searches, but they are far from a complete forensic solution.
That distinction is key. The “chain of custody” message is on shaky ground when field tools’ users are capturing only portions of data, not the device’s entire database structure. When investigators release devices back to their owners without making a forensic copy — which would otherwise capture much more app data, deleted data, and metadata such as geolocation — they miss the chance to either strengthen their case or revise their hypothesis.
Having data rather than a forensic copy also makes it more difficult to validate the methods used to obtain the data. Forensic examiners can use the same or different tools to reproduce the results, but only if they have a pristine forensic copy to work with.
Additionally, a constrained timeline can keep investigators from finding exculpatory data, additional suspects, or additional victims, all of which might exist outside the scope of what a judge initially deemed appropriate.
(A similar problem exists with keyword searches. For example, restricting a search to contraband-related file names ignores the fact that suspects don’t always mark their files “Illegal Stuff Here.”)
For instance, limiting a search to “the ‘Messages’ icon and/or text messaging applications” could exclude relevant communications: many apps whose main functionality isn’t “text messaging” nonetheless include a chat function which the defendants could have used. (11)
In addition, a warrant that limits a search of images or videos to a certain time and date range may not account for the times and dates the media were actually taken. That’s a gray area, too, since other photos and videos taken prior to the specified range still exist on the device during that range.
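The gray area can be illustrated with a small sketch. It assumes each media file carries two timestamps, one for when it was captured and one for when it was saved to the device; the `MediaItem` class, field names, file names, and dates are hypothetical and not drawn from any particular tool or case.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MediaItem:
    name: str
    captured: datetime         # when the photo or video was taken
    saved_to_device: datetime  # when the file was written to this device


def in_range(ts: datetime, start: datetime, end: datetime) -> bool:
    """True if the timestamp falls inside the inclusive range."""
    return start <= ts <= end


items = [
    # Taken in January, but received on the device during the range.
    MediaItem("IMG_A.jpg", datetime(2020, 1, 10), datetime(2020, 3, 2)),
    # Taken and saved during the range.
    MediaItem("IMG_B.jpg", datetime(2020, 3, 5), datetime(2020, 3, 5)),
    # Taken and saved before the range, yet still on the device during it.
    MediaItem("IMG_C.jpg", datetime(2019, 12, 1), datetime(2019, 12, 1)),
]

start, end = datetime(2020, 3, 1), datetime(2020, 3, 31, 23, 59, 59)

by_capture = [i.name for i in items if in_range(i.captured, start, end)]
by_saved = [i.name for i in items if in_range(i.saved_to_device, start, end)]

print("filtered by capture time:", by_capture)
print("filtered by saved time:  ", by_saved)
```

The same date range returns different files depending on which timestamp the filter reads, and the third file is excluded either way even though it sat on the device for the whole period.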
So, how complete is “complete”?
Take a timeline together with the manipulability of digital data, and courts have to consider not just the breadth of a search — the timeline and associated parties — but also its depth. Thus, perhaps a better question to ask is whether a defendant has the right to demand a more complete forensic examination, or forensic copies of their devices versus reports supplied by opposing counsel.
For example, in a 2017 capital murder case, a defendant successfully argued that their defense expert should be able to create their own forensic image of their mobile device, rather than rely solely on the state’s “victim-information-redacted cell phone data report[s]” showing the defendant’s use of the phone. (The defendant claimed the existence of deleted text messages that could cast doubt on the state’s case, but ended up pleading guilty to murder.) (12)
In the civil realm, plaintiff Earthcam, Inc. argued the need for physical images from additional devices — as opposed to the logical images provided by the defendant, OxBlue Corp., only from certain devices — to determine “the existence of any deleted files, the contents, the date of last use, and whether any of the data was moved.” (The court rejected the argument.) (13)
On the other hand, spoliation of evidence in one location might lead to a need to find backup evidence in other locations. In a different civil case, Escamilla v. SMS Holdings Corp., a judge ruled that an extensive search of one defendant’s work and home computers, as well as key employees’ hard drives, deleted data from an image of file and print servers, and backup tapes, was reasonable. Not only were the materials relevant, but their destruction on the defendant’s devices had also prejudiced the plaintiff, and the materials were needed to determine the extent of prejudice. In addition, the court ruled, the defendants hadn’t proved that undue burden or cost prevented them from accessing the data. (14)
How instructive are civil cases to criminal cases? The guidebook “Criminal e-Discovery: A Pocket Guide for Judges” highlights tension, even collision, between criminal and civil electronic discovery. The two worlds have “different public policies underlying criminal and civil litigation, constitutional requirements, and special ethical obligations of prosecutors and defense counsel.” (15)
At the same time, the authors acknowledge, digital data in “even routine drug cases and bank robberies” is voluminous and complex — to the point where it “can quickly exceed document-based paper discovery in a white-collar or corporate prosecution from fifteen years ago.”
Further, while Federal Rules of Civil Procedure 26 and 34 mandate e-discovery procedures, Federal Rule of Criminal Procedure 16 does not. That disparity, the “Criminal e-Discovery” authors noted, led two federal magistrate judges to turn to civil e-discovery rules for guidance.
The guidebook’s authors described arguments in favor of bringing criminal electronic discovery more in line with civil procedure, even accounting for a prosecutor’s “unique nonadversarial discovery obligations.” With the fundamental fairness of due process, speedy trial rights, and the right to effective counsel on the table, the authors wrote:
“When the government provides e-discovery in a reasonably organized fashion, it can help the defense efficiently review discovery and can lead to more productive plea discussions, less litigation, and speedier resolution of a case.”
The practical future of digital discovery
When it comes to discovery, then, prosecutors face a dilemma. Too little information, and they risk consequences from Brady rule violations. Too much information, though, could ultimately risk the same.
In 46 states, open discovery requires prosecutors to share with defendants all the evidence against them, and defendants to do likewise. That ostensibly levels the playing field, allowing both sides to build the strongest possible cases. (16)
“All the evidence,” however, can include PDF reports generated by digital forensics tools, which can be hundreds of pages long. The process of going through them to flag evidence, the guidebook “Criminal e-Discovery” suggests, is costly and risks introducing human error by missing the proverbial “needle in a haystack.” (17)
The answer may be to look, again, to civil litigation. From that realm, some forms of AI, such as natural language processing, help to evaluate reports supplied under discovery. Another form, technology-assisted review such as predictive coding, tags documents for relevance.
Because predictive coding keeps learning from a reviewer’s relevance decisions, it can assess and tag similar documents and present them to the reviewer. Finding relevant data becomes faster and easier, and the data gets more scrutiny, because predictive coding prioritizes potentially relevant documents for review sooner rather than later.
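The prioritization idea can be sketched in a few lines. This is a toy, naive-Bayes-style word weighting swapped in for whatever models commercial review tools actually use; the function names and the sample messages are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(labeled: list[tuple[str, bool]]) -> dict[str, float]:
    """Learn a log-odds weight per word from reviewer decisions (add-one smoothing)."""
    relevant, other = Counter(), Counter()
    for text, is_relevant in labeled:
        (relevant if is_relevant else other).update(tokenize(text))
    vocab = set(relevant) | set(other)
    rel_total = sum(relevant.values()) + len(vocab)
    oth_total = sum(other.values()) + len(vocab)
    return {
        word: math.log((relevant[word] + 1) / rel_total)
        - math.log((other[word] + 1) / oth_total)
        for word in vocab
    }


def score(weights: dict[str, float], text: str) -> float:
    """Sum the learned weights of the words in a document; unseen words count as 0."""
    return sum(weights.get(word, 0.0) for word in tokenize(text))


def prioritize(weights: dict[str, float], documents: list[str]) -> list[str]:
    """Order documents so the likeliest-relevant ones are reviewed first."""
    return sorted(documents, key=lambda d: score(weights, d), reverse=True)


# Hypothetical reviewer decisions: (document text, tagged relevant?)
labeled = [
    ("meet me tonight bring the cash", True),
    ("cash ready for the pickup tonight", True),
    ("lunch menu for the office party", False),
    ("office printer is out of paper", False),
]
unreviewed = [
    "quarterly office budget meeting",
    "pickup moved bring cash",
    "happy birthday",
]

weights = train(labeled)
ranked = prioritize(weights, unreviewed)
for doc in ranked:
    print(round(score(weights, doc), 3), doc)
```

The message that shares vocabulary with documents already tagged relevant rises to the top of the queue, and the one resembling documents tagged not relevant sinks to the bottom. Each new reviewer decision can be added to `labeled` and the weights retrained, which is the "constantly learning" part.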
These methods help attorneys to sort data long after it’s been acquired and analyzed. At a case’s outset, though, no technology exists to help determine or prioritize which cases to go deep on.
It would be time- and cost-prohibitive to analyze every case for, say, “deepfakes” or patterns of life, especially when digital evidence is rarely the sole evidence against a defendant. Cases are built on other forms of corroboration. Whether to go deeper depends on whether the state is able to prove its case beyond a reasonable doubt, with or without all the possible data.
For example, in a 2018 Maryland involuntary manslaughter case, an appeals court noted that although the “universe of text messages” had not shown the full extent of conversations that preceded a drug overdose death, the prosecutors didn’t need to — they had established a timeline using text messages that clearly showed the victim still seeking drugs from the defendant mere hours before his death. (18)
However, timelines aren’t always so cut and dried. Even a case that gives that appearance can involve more than meets the eye:
- In the investigation phase itself, a suspect with access to a victim’s phone could manipulate evidence that might be used to incriminate them.
- During the forensic examination, tools may not always accurately reflect date and time stamps. (19) The problem is persistent, (20) which indicates that it isn’t one vendors can easily solve.
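One way a timestamp can be misread is illustrated below. The sketch assumes a stored value is a plain count of seconds and decodes it against two different reference dates: the Unix epoch (1 January 1970 UTC) and the 1 January 2001 UTC reference date used by Apple's absolute time format. The raw value and the `decode` helper are hypothetical; this is not a description of how any specific forensic tool behaves.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)  # Apple absolute time reference date


def decode(raw_seconds: float, epoch: datetime) -> datetime:
    """Interpret a raw count of seconds relative to the given epoch."""
    return epoch + timedelta(seconds=raw_seconds)


raw = 620_000_000  # hypothetical value pulled from an app database

as_unix = decode(raw, UNIX_EPOCH)
as_apple = decode(raw, APPLE_EPOCH)

print("read as Unix time: ", as_unix.isoformat())
print("read as Apple time:", as_apple.isoformat())
print("difference in days:", (as_apple - as_unix).days)
```

The same stored number lands in 1989 under one interpretation and 2020 under the other. An examiner who knows which format an app uses can validate the tool's output rather than take the displayed date on trust.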
It can be easy for prosecutors to develop blind spots and confirmation bias. (21) That’s likewise true of investigators and forensic examiners, who not only rely on prosecutors to make decisions, but may also harbor their own biases — which, when faced with overwhelming data volumes, can lead to cognitive shortcut-taking. (22)
Better scrutiny can ensure this doesn’t happen. However, it starts long before the discovery process. Questioning the hypotheses that underlie analysis and investigation starts at a case’s outset and continues throughout, not to undermine professional judgment, but to bolster it.
The process is outlined in some recent academic papers, which promote bringing more scientific rigor to digital forensics. These methods include structured argumentation as well as a standardized evaluative interpretation method using likelihood ratios.
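For readers unfamiliar with likelihood ratios, the general arithmetic is sketched below. It does not reproduce the methods in the papers mentioned above; it shows only the standard relationship in which a likelihood ratio compares how probable the evidence is under two competing hypotheses and then updates prior odds. The probabilities are invented for illustration.

```python
def likelihood_ratio(p_evidence_given_h1: float, p_evidence_given_h2: float) -> float:
    """How much more probable the evidence is under hypothesis 1 than under hypothesis 2."""
    if p_evidence_given_h2 <= 0:
        raise ValueError("probability under the alternative hypothesis must be positive")
    return p_evidence_given_h1 / p_evidence_given_h2


def posterior_odds(prior_odds: float, lr: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * lr


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor of a hypothesis to a probability."""
    return odds / (1 + odds)


# Hypothetical numbers:
# H1 = the device owner sent the message; H2 = someone else did.
lr = likelihood_ratio(0.5, 0.125)   # evidence is 4 times more probable under H1
prior = 0.25                        # prior odds of 1 to 4 in favor of H1
post = posterior_odds(prior, lr)

print("likelihood ratio:", lr)
print("posterior odds:", post)
print("posterior probability:", odds_to_probability(post))
```

The examiner's contribution is the likelihood ratio, a statement about the evidence under each hypothesis. The prior odds, and so the final conclusion, depend on everything else in the case and remain for the fact finder to weigh.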
These methods could go beyond helping attorneys and investigators. They could have a knock-on effect to juries, too, as they weigh digital evidence — in other words, “…whether an expert in the instant case actually applied the methodology that the judge found valid generally is a matter of weight, as is any conclusion the expert reaches that is applicable to the litigants.” (23)
Put another way, attorneys and forensic examiners (on their own “side,” not just opposing counsel’s) can serve as valuable checks and balances on each other’s thought processes. This might seem to create additional effort at a time when courts are more seriously strained by backlog than ever. That, however, is a good argument to apply more scrutiny, not less.
1. Angus-Anderson, Wendy. “Authenticity and Admissibility of Social Media Website Printouts.” Duke Law & Technology Review. Vol. 14 №1. 2015–2016. https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=1282&context=dltr accessed 26 August, 2020.
2. Angus-Anderson, “Authenticity and Admissibility of Social Media Website Printouts.”
3. Siewert, Patrick. “Screen Shots Are Not (Good) Evidence.” Pro Digital Forensic Consulting & Investigation blog. April 24, 2020. https://prodigital4n6.com/screen-shots-are-not-evidence/ accessed 26 August, 2020.
4. Angus-Anderson, “Authenticity and Admissibility of Social Media Website Printouts.”
5. Commonwealth v. Mangel 181 A.3d 1154 (2018). http://www.pacourts.us/assets/opinions/Superior/out/opinion%20%20affirmed%20%2010346700333996977.pdf accessed 27 August, 2020.
6. Schnack, Troy. “Timelines in P2P Forensic Cases.” Troy 4n6 blog. March 11, 2018. https://troy4n6.blogspot.com/2018/03/timeline-in-p2p-forensic-cases.html accessed 27 August, 2020.
7. Kerr, Orin S. “The Mosaic Theory of the Fourth Amendment,” 111 Mich. L. Rev. 311 (2012). https://repository.law.umich.edu/mlr/vol111/iss3/1 accessed 27 August, 2020.
8. Kerr, Orin S. “D.C. Circuit Introduces “Mosaic Theory” Of Fourth Amendment, Holds GPS Monitoring a Fourth Amendment Search.” The Volokh Conspiracy blog, August 6, 2010. http://volokh.com/2010/08/06/d-c-circuit-introduces-mosaic-theory-of-fourth-amendment-holds-gps-monitoring-a-fourth-amendment-search/ accessed 27 August, 2020.
9. Kerr, “D.C. Circuit Introduces “Mosaic Theory” Of Fourth Amendment, Holds GPS Monitoring a Fourth Amendment Search.”
10. State v. Robert Andrews. (A-72–18) (082209). August 10, 2020. https://epic.org/amicus/fifth-amendment/andrews/State-v-Andrews-Opinion.pdf accessed 27 August, 2020.
11. State v. Robert Andrews. (A-72–18) (082209).
12. Walthers v. Hon. Astrowsky State. №1 CA-SA 17–0106. Petition for Special Action from the Superior Court in Maricopa County No. CR2014–108856–001. Filed May 18, 2017. https://law.justia.com/cases/arizona/court-of-appeals-division-one-unpublished/2017/1-ca-sa-17-0106.html accessed 27 August, 2020.
13. Earthcam, Inc. v. OxBlue Corp. 49 F.Supp.3d 1210 (N.D.Ga 2014). https://www.leagle.com/decision/infdco20140925774 accessed 27 August, 2020.
14. Escamilla v. SMS Holdings Corp., No. CV 09–2120 (ADM/JSM), 2011 WL 13243580, at *39 (D. Minn. June 28, 2011) https://law.justia.com/cases/federal/district-courts/minnesota/mndce/0:2009cv02120/108121/454/ accessed 27 August, 2020.
15. Broderick, Sean et al. “Criminal e-Discovery: A Pocket Guide for Judges.” Federal Judicial Center. Third Printing, 2019. https://www.fjc.gov/sites/default/files/materials/06/Criminal%20e-Discovery_First%20Edition_Third%20Printing_2019.pdf accessed 27 August, 2020.
16. Lewis, Rebecca. “What to know about New York’s new discovery laws.” City & State NY, February 10, 2020. https://www.cityandstateny.com/articles/policy/criminal-justice/what-know-about-new-yorks-new-discovery-laws.html accessed 28 August, 2020.
17. Broderick, “Criminal e-Discovery.”
18. Johnson v. State. 225 A.3d 769 (2020). https://www.leagle.com/decision/inmdco20200131306 accessed 28 August, 2020.
19. Schnack, “Timelines in P2P Forensic Cases.”
20. Brignoni, Alexis. “Trust but verify: Formats, timestamps, and validation.” Initialization Vectors blog, March 17, 2020. https://abrignoni.blogspot.com/2020/03/trust-but-verify-formats-timestamps-and.html accessed 28 August, 2020.
21. Zeidenberg, Peter. “A hard lesson for prosecutors.” Politico, April 1, 2020. https://www.politico.com/story/2012/04/a-hard-lesson-for-prosecutors-on-public-integrity-074715 accessed 28 August, 2020.
22. Cherry, Kendra. “What Is Cognitive Bias?” VeryWellMind.com, July 19, 2020. https://www.verywellmind.com/what-is-a-cognitive-bias-2794963 accessed 28 August, 2020.
23. Faigman, David. “Evidence: Admissibility vs. Weight in Scientific Testimony.” The Judges’ Book: Vol. 1, Article 11. 2017. http://repository.uchastings.edu/judgesbook/vol1/iss1 accessed 28 August, 2020.
Christa Miller is a journalist and co-founder of Forensic Horizons.
J.D. Ronan is a United States-based attorney with both prosecution and defense experience, who specializes in digital evidence and high tech crimes.
Robert J. Peters is Senior Attorney at the Zero Abuse Project, where he develops and delivers training and technical assistance to prosecutors and child abuse multidisciplinary team members on crimes against children.
Matthew Osteen, a subject matter expert on topics related to digital evidence, financial crime, intellectual property theft, and third-party data, is General Counsel and Cyber and Economic Crime Attorney for the National White Collar Crime Center (NW3C).
Alicia D. Loy, J.D., is a Law Clerk at NW3C, where she assists in the development of curriculum for NW3C’s Judges and Prosecutors courses, provides support for NW3C’s prosecutorial technical assistance program, and develops content for webinars and podcasts on aspects of internet-facilitated crime.
Brandon Epstein is a Certified Forensic Video Examiner (CFVE) and Certified Forensic Video Analyst (CFVA), who has been qualified as an expert witness over a dozen times in the past two years.
Joseph Pochron currently serves as a Senior Manager for EY in their Forensics, Digital Investigations & Privacy practice in San Francisco, CA.