RegTech: From KYC to KYD to … KYDL
I recently had the pleasure of being part of a panel discussion organized by A-TeamInsight in Singapore as part of their day-long summit on RegTech. We discussed the various challenges FIs face, some of the upcoming changes in the space, and how firms can be better prepared. At the summit, many speakers stressed how Data Quality forms a crucial part of RegTech. One of the key points I mentioned was how KYC (Know Your Customer) had already given way to KYD (Know Your Data). It is probably time now to take this a notch higher with KYDL: Know Your Data Lineage.
KYDL is an acronym I am coining to illustrate the importance of Data Lineage, and why I feel it is a step in the right direction towards the maturity of the RegTech industry.
Background — RegTech
The term RegTech (short for Regulatory Technology) dates back to 2015, when the FCA introduced the word to “identify ways to support the adoption of new technologies to facilitate the delivery of regulatory requirements”.
Since then, RegTech has grown from being just a slice of the FinTech pie to encompass any technology solution that helps businesses with regulatory compliance in their respective sector, whether it be the pharmaceutical industry, cybersecurity, or the recent buzzword, ESG (Environmental, Social and Corporate Governance). Spending on RegTech has similarly increased over the years, as can be seen in the chart below.
It is thus understandable that firms spending millions of dollars each year on RegTech want good “bang for their buck”, and do not want to keep reinventing the wheel for every new regulation that comes up. There are five key pillars of RegTech (Reference #9), as seen in Fig 2 below.
Of these, KYC addresses the Identity Management pillar, while KYD also helps firms with some of the other pillars, such as Transaction Monitoring, Regulatory Reporting and Compliance. Let’s explore this further below.
KYC — to KYD (and KYT)
In an article (Reference #3) published on The Global Treasurer, author Selwyn Parker highlights how the recent sanctions on Russia have made FIs realize the importance of Know Your Data (KYD) as opposed to Know Your Customer (KYC).
In the article, Selwyn quotes Max Heywood, head of public sector at Elucidate as follows: “Banks will need to have sophisticated detection processes that go beyond manual monitoring of transactions, towards a more comprehensive approach that can analyse what in some cases will involve millions of data points”.
But KYD is not something new, especially in the areas of RegTech that rely on AI/ML, which in turn depend on data. Ask any data scientist how much of their time goes into data preparation, and the typical answer would be 60%-80% (Reference #10).
Now, if only there were a way to present cleaner data, or to surface the story behind the dirty data right away. That is what KYD is targeting.
Some thought leaders go a step further and say that Know Your Transactions (KYT) is also important, which, as one might guess, helps in transaction monitoring. In the financial world, it is not sufficient to know individual data elements; one also needs to know what they mean when strung together. For example, in capital markets, a cleared transaction without a clearing timestamp is something which KYC or KYD might not have flagged. But KYT, as the next level of ‘intelligence’, can come in handy here and help identify bad data quality at a transaction level.
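To make the clearing-timestamp example concrete, here is a minimal sketch of what a transaction-level (KYT) check could look like. The `Transaction` class, its field names and the `kyt_issues` function are hypothetical names chosen for illustration, not part of any specific RegTech product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Transaction:
    """Hypothetical, heavily simplified capital-markets transaction."""
    trade_id: str
    cleared: bool
    clearing_timestamp: Optional[datetime] = None


def kyt_issues(txn: Transaction) -> List[str]:
    """Return inconsistencies that only show up when fields are read together."""
    issues = []
    # Each field may be valid on its own; the problem is the combination.
    if txn.cleared and txn.clearing_timestamp is None:
        issues.append("cleared transaction has no clearing timestamp")
    if not txn.cleared and txn.clearing_timestamp is not None:
        issues.append("uncleared transaction carries a clearing timestamp")
    return issues


good = Transaction("T1", cleared=True,
                   clearing_timestamp=datetime(2023, 1, 5, 10, 30))
bad = Transaction("T2", cleared=True)

print(kyt_issues(good))
print(kyt_issues(bad))
```

A field-level check would accept both records, since `cleared=True` and an empty timestamp are each plausible in isolation. Only the cross-field rule flags the second one.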
Regulator’s focus: Switch to CDEs
There was a time when regulators were pushing financial institutions for transparency (or completeness of regulatory reporting). When transparency became mainstream, the focus naturally shifted to accuracy, which meant that the quality of the data became important. And now, even as we speak, regulators feel they need additional information: a view on additional Critical Data Elements (CDEs). While there is no single definition of a CDE, even amongst regulators, it is agreed that critical data influences a company’s management decisions and performance, and that the criticality of data elements can vary from company to company.
In the space of transaction reporting for OTC (Over The Counter) derivatives, most of the major regulators are changing their legislation over the course of 2022–2024, starting with the CFTC (a US-based regulator). As part of these changes, they are asking for more and more data elements to be reported.
Now, in order to report CDEs, they need to be sourced, and sourcing CDEs can pose three problems:
- The CDE is not captured in the System of Origin (SOO) at all, or not captured in the fashion expected by the regulator*
- The CDE is not accurate in the System of Record (SOR)
- The CDE may not have a well-defined SOR
*The System of Origin is where data would typically originate, while the System of Record is the authoritative source for the data (a data lake, for example) after the data has been remediated and validated. (Reference #11)
Just imagine how useful it would be if the regulator asked for a particular data field in their next incremental change to legislation, and you knew exactly the following:
- Where did the data field originate
- Which systems did it feed down to
- What are the transformations which happened as it fed down each level and finally into the SOR, and
- How is the data attribute in the SOR different from what is required in the RegTech application. For example, a customer type can be captured as an “Individual” in the SOR while the RegTech application might need “Natural Person”.
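The four questions above can be sketched as a small lineage record for a single field. This is only an illustration under stated assumptions: the system names, the `Hop` class and the `trace` function are hypothetical, and the only detail taken from the text is the “Individual” to “Natural Person” mapping.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Hop:
    """One step in the journey of a data field (hypothetical structure)."""
    system: str
    description: str
    transform: Callable[[str], str]


# Assumed lineage for a "customer type" field; system names are illustrative.
CUSTOMER_TYPE_LINEAGE = [
    Hop("CRM (SOO)", "captured as entered", lambda v: v),
    Hop("Data lake (SOR)", "trimmed and title-cased",
        lambda v: v.strip().title()),
    Hop("RegTech application", "mapped to the regulator's vocabulary",
        lambda v: {"Individual": "Natural Person"}.get(v, v)),
]


def trace(value: str, lineage: List[Hop]) -> List[Tuple[str, str]]:
    """Replay a value through every hop and record what each system holds."""
    steps = []
    for hop in lineage:
        value = hop.transform(value)
        steps.append((hop.system, value))
    return steps


for system, value in trace(" individual ", CUSTOMER_TYPE_LINEAGE):
    print(f"{system}: {value!r}")
```

Each hop answers one of the questions: the first entry is where the field originates, the order of the list shows which systems it feeds, the `description` and `transform` record what changed, and the last hop captures the gap between the SOR value and what the RegTech application needs.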
The answer to almost all of these questions is Data Lineage, which takes us to the next point: what is Data Lineage, and why is it important?
KYDL — why is it important
What is Data Lineage: Data Lineage is the front-to-back ‘mind-map’ of data as it flows through various channels in an organization, which also captures the transformations applied to it en route.
Data Lineage does have its fair share of promoters and detractors. According to McKinsey research on Data Lineage (Reference #1), one of the arguments used by detractors is the lack of guidance on how far “upstream” a firm should go, or how detailed the documentation should be for each hop of the data. Detractors sometimes also argue that analysing the Data Lineage might lead to re-architecture of the various applications in order to simplify the ecosystem through which data flows.
However, having robust data lineage has the following advantages:
- Easier impact analysis for a data field: Whether requested externally by a regulator or internally, impact analysis would be significantly faster and more accurate
- Better confidence in data: Having transparency on the source of the data (SOO), the transformations applied to it and how it is finally captured in the System of Record (SOR) provides more confidence in the data itself
- Better Regulatory Compliance: Data traceability is much faster if the Data Lineage is available
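As a rough sketch of the impact-analysis point, lineage can be held as a simple "feeds" graph and walked downstream from the system where a field changes. The system names in `FEEDS` and the `downstream` function are hypothetical and only illustrate the idea.

```python
from collections import deque
from typing import Dict, List

# Assumed lineage graph: each system maps to the systems it feeds directly.
FEEDS: Dict[str, List[str]] = {
    "trade_capture": ["risk_engine", "data_lake"],
    "data_lake": ["regulatory_reporting", "finance_ledger"],
    "risk_engine": ["regulatory_reporting"],
}


def downstream(system: str, feeds: Dict[str, List[str]]) -> List[str]:
    """List every system reachable from `system`, nearest consumers first."""
    seen = {system}          # guards against cycles and duplicate visits
    order: List[str] = []
    queue = deque([system])
    while queue:
        current = queue.popleft()
        for target in feeds.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


# Which systems are affected if a field changes in trade capture?
print(downstream("trade_capture", FEEDS))
```

With the lineage recorded this way, the question "what is impacted if this field changes?" becomes a lookup rather than a round of interviews with each application team.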
Summary
The current regulatory landscape is ever-changing. Existing regulations are being rewritten, new regulators are realizing the importance of well-defined regulations, and regulatory inspections happen periodically. Overall, there is increased pressure on firms for regulatory compliance, and, as it so happens, regulatory compliance is not optional.
In such a dynamic landscape, I feel that KYDL becomes quite critical and will help organizations keep pace with the changes they face. I also believe that for a company that embraces the KYDL paradigm, RegTech implementations will take less time and cost less. This can also ensure better Data Quality, and hence better regulatory compliance.
References:
1. https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights/optimizing-data-controls-in-banking
2. https://www.acaglobal.com/insights/six-ways-financial-organizations-stumble-over-kyd-financial-crime-technology
3. https://www.theglobaltreasurer.com/2022/03/30/in-the-race-to-comply-with-sanctions-against-russia-treasurers-must-shift-from-kyc-to-kyd/
4. https://nomadit.co.uk/conference/rai2022/paper/64250#:~:text=The%20RegTech%20(regulatory%20technologies)%20industry,risk%20mitigation%20in%20supply%20chains.
5. https://acams.digitellinc.com/acams/sessions/3345/view
6. https://www.linkedin.com/pulse/difference-between-system-record-source-truth-santosh-kudva/
7. https://www.eqs.com/compliance-blog/what-is-regtech/
8. https://www.kbvresearch.com/regtech-market/
9. https://www2.deloitte.com/lu/en/pages/technology/articles/regtech-companies-compliance.html
10. https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/?sh=403721e56f63
11. https://cdn.ymaws.com/edmcouncil.org/resource/resmgr/featured_documents/BP_Data_Man_Gloss_v0.2.1.pdf