The One Reason Why AI/ML for AML/KYC has failed (so far)

Ben Charoenwong
5 min read · Sep 16, 2019


Source: https://www.pexels.com/photo/full-frame-shot-of-eye-251287/

Given the rise in computational power and data availability, we are often led to believe that the banking industry will be disrupted. Fintech companies are popping up left and right, trying to solve a myriad of questions and challenges. One of those challenges is the Anti-Money Laundering (AML) and Know-Your-Client (KYC) compliance process imposed by regulators. But despite the technical chops and new methodologies (or old ones, like neural networks and deep learning, applied to new data), it won’t work.

Why? Let’s look at the existing practice of data science, and at why artificial intelligence has made a resurgence since its beginnings in the 1950s. The rise of machine learning and artificial intelligence as stand-alone fields, despite their being simply applied statistics and computer science, is due to the unexpected increase in data availability. Even the “Perceptron” of Frank Rosenblatt, who coined the term and built the first set of neural network models in 1958, was thought to be doomed to failure due to the shortcomings of linear models, which were themselves due primarily to data constraints. (When there is not enough data, we must typically assume a more parsimonious model, a “first-order approximation” if you will.)

Why AI/ML?

The rise of big data and computational power is what really spurred the success of what we now call “deep learning” models. These new models are souped-up versions of the underlying neural network and follow the same structure, though perhaps with different optimization procedures. The founders of modern artificial intelligence did not foresee having billions of observations over which to train a model with tens of millions of parameters, roughly the order of magnitude at which a deep learning model starts to perform well.

So let’s look at the banks. Did they suddenly benefit from a windfall of new data with which to enhance their models? Perhaps they gained more information about users from their metadata or phone usage. But even before that, they had a much better data pipeline: the transactions of each individual client! Detecting anomalies in a series of transactions has always been part of forensic accounting. Yet banks are frequently caught facilitating money laundering and managing money for shady characters linked to terrorism and crime.

Effectively, the big banks failed in using that existing rich dataset to improve the AML/KYC process.

There is only one reason for this failure: the AML/KYC process is not a matter of data science. It is a matter of incentives. Fundamentally, the problem is trivial. But banks make too much money violating these processes.

Consider the well-known case of HSBC, which settled for US$1.9 billion after laundering US$881 million for known Mexican drug criminals. Even before that case, prosecutors in the U.S. and U.K. had known about HSBC’s money laundering activities for Saudi and Bangladeshi clients tied to terrorist activities. Even though their respective governments had blacklisted these clients, the bank simply bypassed the flagging system by replacing spaces in the clients’ names with dots! Surely any computer scientist worth a dime would have included condition checks on the string, normalizing names so that they are comparable and robust to small perturbations. For example, there is the Levenshtein distance (an edit distance that counts the single-character insertions, deletions, and substitutions separating two strings, and so is robust to typos), the Soundex algorithm (designed to be robust to phonetic differences), and the Jaro-Winkler similarity (which scores shared characters and transpositions and gives extra weight to a common prefix), among countless newer developments.
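To make the point concrete, here is a minimal sketch of the kind of string check described above. It is an illustration only, not a description of any bank's actual screening system: the client name is invented, and `normalize_name` and `levenshtein` are hypothetical helper names written for this example.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and turn any run of punctuation or spaces into one space."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical blacklist entry, and the same name with dots instead of spaces.
listed = "Acme Trading Company"
submitted = "Acme.Trading.Company"

exact_match = listed == submitted
normalized_match = normalize_name(listed) == normalize_name(submitted)
typo_distance = levenshtein(normalize_name(listed), normalize_name("Acme Tradng Company"))

print(exact_match, normalized_match, typo_distance)  # False True 1
```

A naive equality check misses the dotted name, while a few lines of normalization catch it, and a small edit distance also catches a dropped letter.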

Even without a hard match, the algorithm should produce more useful flags (e.g., red for most suspicious, orange for slightly suspicious, and green for not suspicious). The algorithm should have used a combination of the client name, the receiving bank name, and the address of the beneficiary to raise those flags.
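A tiered flag of this sort could be sketched as follows. Everything specific here is an assumption made for illustration: the field names, the weights, the thresholds, and the use of Python's standard `difflib.SequenceMatcher` as the similarity measure (a production system might prefer Jaro-Winkler or a phonetic match, and would calibrate the cut-offs on real cases).

```python
from difflib import SequenceMatcher

# Hypothetical weights and thresholds, chosen only for illustration.
WEIGHTS = {"client_name": 0.6, "receiving_bank": 0.2, "beneficiary_address": 0.2}
RED_THRESHOLD = 0.85
ORANGE_THRESHOLD = 0.50


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] after lowercasing and dropping non-alphanumeric characters."""
    def clean(text: str) -> str:
        return "".join(ch for ch in text.lower() if ch.isalnum())
    return SequenceMatcher(None, clean(a), clean(b)).ratio()


def flag(payment: dict, watchlist_entry: dict) -> str:
    """Combine per-field similarities into a red / orange / green flag."""
    score = sum(
        weight * similarity(payment[field], watchlist_entry[field])
        for field, weight in WEIGHTS.items()
    )
    if score >= RED_THRESHOLD:
        return "red"
    if score >= ORANGE_THRESHOLD:
        return "orange"
    return "green"


# Invented watchlist entry and an incoming payment that disguises the name with dots.
entry = {
    "client_name": "Acme Trading Company",
    "receiving_bank": "Sample Bank",
    "beneficiary_address": "1 Harbour Road",
}
payment = {
    "client_name": "Acme.Trading.Company",
    "receiving_bank": "Sample Bank",
    "beneficiary_address": "1 Harbour Rd.",
}

print(flag(payment, entry))  # red
```

Weighting the client name most heavily means a name match alone is enough for an orange flag, while agreement on the bank and address pushes the payment to red.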

But despite that, “no one” picked up on this (or some people did and were subsequently silenced). It seems… almost intentional (indeed, the prosecutors were able to prove in court beyond a reasonable doubt that this was the case).

Given how much money banks make from clients with shady histories, it isn’t clear that they want to improve their AML/KYC. The HSBC case above is not isolated. Many large banks have had issues like this before. As just one example, based on a Drug Enforcement Administration investigation in 2005, Wachovia was alleged to have facilitated over US$420 billion in transactions with no known verifiable Mexican source, at least US$110 million of which was directly linked to drug profits. It paid US$160 million in fines.

Until the big banks and bulge brackets decide to solve the laundering and AML problem, there will not be much progress.

So where are we with the promises of AI/ML for AML/KYC? Surely, lawyers and administrative staff who charge by the hour do not have an incentive to make the process quicker. In a sea of ever-growing regulatory complexity, the lawyers are the consistent winners. Progress depends on the willpower of those in charge of conducting compliance, and big banks will not exhibit that willpower.

Nuanced Promises

The adoption of technology is an equilibrium game. It is a simple game-theoretic exercise to show that, to the extent possible, the dominant strategy of the big banks is not to implement state-of-the-art software to improve the quality of AML/KYC practices. Why? Suppose a bank deploys it and uncovers that a handful of its clients fail the KYC process. If the competitor banks don’t implement it, those clients will leave the bank and flee to the competitors.

On the other hand, if the competitor banks deploy it and the bank does not, the clients flee the competitors and come to the bank. In both scenarios, the large bank is better off not adopting intensive AI/ML for AML/KYC. In this exercise, small banks are irrelevant since they do not have the servicing capacity for such complex clients in the first place.
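The argument can be written down as a small payoff table. The numbers below are purely hypothetical and exist only to make the logic checkable: each bank is assumed to earn 10 units from its own flagged clients, those clients move to whichever bank does not screen them out, and fines are ignored.

```python
# Hypothetical payoffs to the large bank, in arbitrary profit units.
ACTIONS = ("adopt", "skip")

# Keys are (large_bank_action, competitor_action).
payoff = {
    ("adopt", "adopt"): 0,   # flagged clients are turned away everywhere
    ("adopt", "skip"): 0,    # flagged clients leave for the competitors
    ("skip", "adopt"): 20,   # keeps its own clients and gains the competitors'
    ("skip", "skip"): 10,    # keeps its own clients
}


def is_strictly_dominant(action: str) -> bool:
    """True if `action` beats every alternative whatever the competitors do."""
    return all(
        payoff[(action, rival)] > payoff[(other, rival)]
        for rival in ACTIONS
        for other in ACTIONS
        if other != action
    )


dominant = [action for action in ACTIONS if is_strictly_dominant(action)]
print(dominant)  # ['skip']
```

Under these assumed payoffs, skipping adoption is the better choice in both columns. The conclusion would only change if expected fines outweighed the profit from the flagged clients.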

The name of the game for the AI/ML promise is lower compliance costs. It may turn an AML/KYC due diligence process that takes 5 hours into one that takes seconds.

Although big banks may not want to deploy such rigorous AML/KYC processes, machine learning has the potential to help smaller players in the investment advisory space by decreasing the cost of acquiring customers. These smaller players cannot compete with the big banks, and the wealth management market may remain fragmented between upper-middle-income and ultra-high-net-worth clients. However, on the extensive margin, potential investors who have had trouble accessing the market may benefit from the lower cost of entry.

All through this process, large banks will likely not respond apart from adopting cosmetic procedures to claim they are on the cutting edge. So expect lots more marketing material on cutting-edge data science, but don’t expect actual usage. The only large potential disruption would come if banks feel competitive pressure (either from regulators or among themselves) to reach further into the investor pool, in which case AI/ML may be deployed for smaller investors on the extensive margin as well. But we should not expect shady activity among big banks for complex customers to end any time soon.


Written by Ben Charoenwong

Assistant Professor of Finance at the National University of Singapore. Michigan and Chicago alum. I write random musings and complain about business media.