FinML — The Promise of Synthetic Data in the Fight against Financial Crime

Hendrik
Tide Engineering Team
2 min read · Mar 8, 2021

In this episode of FinML, we had a panel discussion with Edgar Lopez Rojas, Graham Barrow, Annette Fong and Hendrik Brackmann on the merits of using synthetic data and the challenges that compliance organisations face when trying to work with synthetic data, moderated by Paul Starrett!

Key takeaways from the talk:

  • Compliance departments struggle to fight financial crime while also satisfying privacy regulations. In the past, data protection and privacy were often treated as the more pressing concern because of higher regulatory fines and stronger concerns from customers. Firms can no longer avoid balancing the effective use of information to expose wrongdoing against the responsible use of personal data.
  • Synthetic data has emerged as one way to make private datasets usable and to share financial-crime data across companies. A synthetic data generator learns the distributions of a real dataset and creates a synthetic one that resembles the original but does not contain private information.
  • Since every company is slightly different, a dataset shared across many companies can be adapted to the needs of a specific company, similar to the idea of transfer learning.
  • New threat angles, such as specific fraud topologies that are emerging, can be modelled explicitly using synthetic data and then used to test existing systems against circumstances that have not previously been observed.
  • The downsides of synthetic data are that a synthetic dataset can’t be correlated with information outside the synthesised dataset, and that producing a good synthesised dataset is very challenging.
  • Explainability and fairness are likely to become more important in the regulator’s eyes. However, there do not yet seem to be techniques for removing bias from a synthetic dataset.
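
To make the "learn a distribution, then sample from it" idea concrete, here is a minimal sketch in Python. It is not the method any panellist described. It assumes a toy dataset with two numeric columns and models it as a correlated Gaussian, which is far simpler than a production generator and by itself gives no formal privacy guarantee. The function names `fit_gaussian_2d` and `sample_synthetic`, the column meanings and all numbers are made up for illustration.

```python
import math
import random
import statistics


def fit_gaussian_2d(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance, using the same n - 1 denominator as statistics.stdev.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return {
        "mean": (mean_x, mean_y),
        "std": (std_x, std_y),
        "corr": cov / (std_x * std_y),
    }


def sample_synthetic(params, n, seed=0):
    """Draw n new rows from the fitted distribution; no original row is copied."""
    rng = random.Random(seed)
    (mean_x, mean_y) = params["mean"]
    (std_x, std_y) = params["std"]
    rho = params["corr"]
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + std_x * z1
        # Mix z1 into the second column so the fitted correlation is preserved.
        y = mean_y + std_y * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


# Toy stand-in for a private dataset with two related columns.
# These values are randomly generated for illustration, not real data.
_rng = random.Random(1)
real_rows = []
for _ in range(2000):
    a = _rng.gauss(10, 2)
    real_rows.append((a, 2 * a + _rng.gauss(0, 1)))

params = fit_gaussian_2d(real_rows)
synthetic_rows = sample_synthetic(params, n=5000, seed=42)
refit = fit_gaussian_2d(synthetic_rows)
print(params["corr"], refit["corr"])  # the two correlations should be close
```

Only the fitted summary statistics leave the "private" side in this sketch. Real generators have to handle categorical fields, time dependence and rare fraud patterns, which is part of why the panel called producing a good synthetic dataset very challenging.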

About FinML

FinML is a meetup group dedicated to applications of Machine Learning in finance. We discuss the economic and statistical concepts behind running Machine Learning in the real world. We strongly believe in discourse, which is why our sessions consist of a 30-minute presentation followed by 30 minutes of open discussion. Sign up here to be invited to all of our meetups, and contact me if you are interested in speaking at an event!
