The Cost of Inaccurate Documents & Identity

A handshake sealed a deal in the 19th century and pen and paper did in the 20th, but in the 21st century business is increasingly conducted digitally. As companies and users adopt a more digital mindset, efficient and accurate electronic data collection and identity verification have become integral. There is continued opportunity to automate these processes, and in many cases to migrate data collection, analysis, and authentication to mobile applications, improving the overall user experience by reducing human error and fraud while increasing convenience, efficiency, and safety:


  • Human Error — Average typing accuracy is only 92%
  • Fraud — 3% to 5% of documents submitted in the digital channel are still fraudulent


  • Convenience — A 2016 study “found that 56% of all bills are paid online today and that most consumers paying online do so via billers’ websites rather than via financial institution bill pay or third-party bill pay sites [such as Mint Bills or Moven].”
  • Efficiency — Fiserv developed a proprietary Scan to Pay application for one financial institution, and found in a 2014 survey “that 37 percent of consumers indicated technology like Scan to Pay would motivate them to begin paying bills — or pay bills more frequently — via smartphone. The same survey showed that among 65 million Internet-using, smartphone-owning U.S. households, 27 million had paid bills using those phones.”
  • Safety — For example, Airbnb has over 4 million listings and 150 million users, with only peer-to-peer identity verification in place for users and hosts. Personal and material safety are of paramount importance to both parties and would be enhanced by additional layers of ID authentication.

So how do we fix this problem? We think strategic innovation in document translation and authentication along the digital document pipeline would increase convenience and safety and decrease error and propensity for fraud.

The first step is accurate and efficient translation of documents to verifiable data

For documents that didn’t originate in an electronic format, accurate and efficient digitization is the first step.

The Smithsonian Institution, for example, is crowdsourcing the transcription of its handwritten archives, with teams of dozens of volunteers manually typing and verifying each set of documents, a project that would take decades. Because the materials are handwritten, current technology cannot automate the work, and because human transcribers make errors, every document has to be checked redundantly.

Optical Character Recognition (OCR) technology has existed for years and has improved greatly over time, though it is still plagued by inaccuracies. Most banks today offer remote capture services for check deposits, a fairly straightforward application of OCR given checks’ consistent formatting. With consumers embracing this automation, we see great potential for services that expand upon these technologies. Mobile capture of data for populating form fields can reduce abandonment and increase accuracy over manual form fill, doubling the number of forms filled to completion on mobile devices.

Financial document digitization software needs to:

  1. Identify the correct data fields in a document, as their layouts vary widely
  2. Have a contextual understanding of what the data means, e.g. a price could be a credit or a debit, the address could be the return address or mailing address, etc.
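The two requirements above can be illustrated with a minimal sketch. It assumes OCR has already produced plain text, finds dollar amounts, and uses nearby wording to guess whether each amount is a credit or a debit. The keyword lists and the names `extract_amounts` and `AmountField` are hypothetical, chosen only for illustration; a production system would more likely learn layouts and context from labeled documents.

```python
import re
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical context hints; real systems would learn these from labeled bills.
CREDIT_HINTS = ("payment received", "credit", "refund")
DEBIT_HINTS = ("amount due", "balance due", "total due", "charge")

# A dollar sign followed by digits, optional thousands separators, optional cents.
AMOUNT_RE = re.compile(r"\$\s*(\d+(?:,\d{3})*(?:\.\d{2})?)")


@dataclass
class AmountField:
    value: Decimal
    kind: str  # "credit", "debit", or "unknown"
    line: str  # the source line, kept for review


def classify_line(line: str) -> str:
    """Guess what an amount means from the words on the same line."""
    lowered = line.lower()
    if any(hint in lowered for hint in CREDIT_HINTS):
        return "credit"
    if any(hint in lowered for hint in DEBIT_HINTS):
        return "debit"
    return "unknown"


def extract_amounts(ocr_text: str) -> list[AmountField]:
    """Find every dollar amount in the text and attach a context label."""
    fields = []
    for line in ocr_text.splitlines():
        for match in AMOUNT_RE.finditer(line):
            value = Decimal(match.group(1).replace(",", ""))
            fields.append(AmountField(value, classify_line(line), line.strip()))
    return fields
```

Calling `extract_amounts` on the text of a scanned bill returns one `AmountField` per amount found. Anything labeled "unknown" is a natural candidate for human review rather than automatic payment.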

We recently invested in Papaya Payments, a service that allows users to take a picture of any bill and pay it immediately from their mobile device. Their technology intrigued us because:

  1. It is now building a database to be able to transcribe any bill instantaneously
  2. It understands who needs this information and can transmit accurate information to these different parties (financial institution, payee, customer)

Now that you’ve collected the data, what do you do?

When receiving third-party electronic files, businesses need to verify that the data collected is authentic, accurate, and unadulterated. Graphics editing programs are readily available and easy for laypersons to use, but editing generally leaves a trail that can be traced to disprove authenticity. This digital forensics work is currently conducted manually by companies like Alȳn Digital Forensic Services, though developments in neural networks and machine learning are helping to automate file authentication.

We are fascinated by companies that leverage pixel and file compression anomalies to detect whether documents have been tampered with. The Defense Advanced Research Projects Agency (DARPA) is also researching ways to use machine learning to identify digital forgeries not only in documents but in all media files.
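To make the idea of a pixel anomaly concrete, here is a toy sketch, not the method of any company or agency mentioned here. It assumes a grayscale image supplied as a list of rows of integers, splits it into tiles, and flags tiles whose pixel variance is an outlier relative to the rest of the image, using a median absolute deviation (modified z-score) test. The function names, the 8-pixel tile size, and the 3.5 threshold are illustrative assumptions; real forensic tools rely on much richer signals, such as compression artifacts.

```python
from statistics import median, pvariance


def block_variances(pixels, block=8):
    """Map (block_row, block_col) to the pixel variance of each full tile."""
    height, width = len(pixels), len(pixels[0])
    result = {}
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            tile = [
                pixels[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            ]
            result[(top // block, left // block)] = pvariance(tile)
    return result


def suspicious_blocks(pixels, block=8, threshold=3.5):
    """Return tiles whose variance is an outlier compared with the whole image."""
    variances = block_variances(pixels, block)
    values = list(variances.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Most tiles are identical in variance: flag any tile that differs.
        return sorted(pos for pos, v in variances.items() if v != med)
    # Modified z-score based on the median absolute deviation.
    return sorted(
        pos
        for pos, v in variances.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )
```

A region pasted in from another source often has a different noise level from its surroundings, which is why an unusually smooth or unusually noisy tile is worth a closer look. A flagged tile is only a prompt for further inspection, not proof of tampering.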

Digital file authentication has a wide range of applications:

  1. In the financial sector: detecting fake insurance claims and paystubs that have been altered to overstate income and qualify for larger loans.
  2. In the legal sector: verifying contracts and investigating intellectual property theft, falsified wills, custody disputes, etc.
  3. In the retail/peer-to-peer sector: allowing individuals to feel confident in document validity.

We foresee a future that could make automatic document verification commonplace and even output a digital accuracy score for each file.

You’ve verified that the document is accurate; now you have to verify the identity of the end user

Forged or stolen identities are a huge problem. “The Federal Trade Commission (FTC) estimates that as many as 9 million Americans have had their identities stolen each year.” Identity forgery isn’t always as grandiose as stolen social security numbers and emptied bank accounts. I had a personal experience with forgery that left me shaking my head. During my home remodel, I discovered that the deputy inspector we hired, though licensed, had falsified the report and signed off using the name of another inspector, who knew nothing about my property. It didn’t even cross my mind to validate their signatures, nor would I necessarily have known how to do so.

Companies like DocuSign offer paperless document signing for real estate contracts, human resources, etc. However, handwritten signatures are a weak form of authentication: users do not reproduce them identically at each transaction, and the people accepting them are not trained to identify forgeries.

Instead, in the digital age, a variety of alternative electronic “signatures” are becoming more commonplace. Identity can be authenticated using three factors:

  • Something you know (such as a password or PIN)
  • Something you have (such as a smart card)
  • Something you are (such as a fingerprint or other biometric method)
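As a rough sketch of how a platform might reason about these three factors, the snippet below groups verified credentials by factor category and checks whether at least two distinct categories are present. The credential names, the `CREDENTIAL_FACTORS` mapping, and the default of two required factors are assumptions made for illustration, not a description of any particular vendor's product.

```python
from enum import Enum


class Factor(Enum):
    KNOWLEDGE = "something you know"
    POSSESSION = "something you have"
    INHERENCE = "something you are"


# Hypothetical mapping from credential type to the factor it demonstrates.
CREDENTIAL_FACTORS = {
    "password": Factor.KNOWLEDGE,
    "pin": Factor.KNOWLEDGE,
    "smart_card": Factor.POSSESSION,
    "government_id_scan": Factor.POSSESSION,
    "fingerprint": Factor.INHERENCE,
}


def distinct_factors(verified_credentials):
    """Return the set of factor categories covered by the verified credentials."""
    return {
        CREDENTIAL_FACTORS[name]
        for name in verified_credentials
        if name in CREDENTIAL_FACTORS
    }


def is_multi_factor(verified_credentials, required=2):
    """True when the credentials span at least `required` distinct categories."""
    return len(distinct_factors(verified_credentials)) >= required
```

Under this sketch a password plus a PIN does not qualify, because both are things the user knows, whereas a password plus a scanned ID does.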

We at Fika Ventures are looking for platforms that can complement companies like Checkr, which is used by many peer-to-peer companies such as Uber, Lyft, GrubHub, and Zillow to verify an individual’s ID. Perhaps for a company like Wag!, a dog walker would need to log in (using something they know) and scan their ID (something they have) so the platform can verify who they are. Adding this layer of authentication could also reduce the number of complaints and lawsuits lodged by Airbnb hosts and guests, because “Bad actors follow the path of least resistance,” and a verified identity adds accountability. We’ve spent time with companies like Berbix and are excited by how their impressive ID verification products might integrate further into the electronic document funnel.

Once we have accurate, authentic documents that are tied to the correct identity, peer-to-peer marketplaces and financial institutions will be able to do business more efficiently, safely, and conveniently. With advances in machine learning and strong partnerships in place, these processes can have applications in the legal, educational, medical, and corporate industries as well. Fika Ventures eagerly anticipates continuing to take an active role in this innovation.