Uncovering the Truth Using Comprehensive Data Analysis

on Overseas Company Investment in the London Property Market

Daniella Tsar
Apr 26, 2017

At Thomson Reuters Labs, we drive innovation through data science and visualisation, using novel approaches to create solutions that help our customers while harnessing the power of linked and shareable data. We regularly deliver projects tackling corruption and risk.

Last year, Transparency International UK approached us to analyse the London property market and highlight the power of data in exposing money laundering risks. By combining our own data with data from the ICIJ and OpenCorporates, and drawing on the expertise of Transparency International, we were able to demonstrate that London land and property are a particular target for those looking to launder the proceeds of corruption. We were also able to show that data remains one of the most effective weapons in the ongoing fight against corruption.

The objective of the report was to identify land and property in London owned by Politically Exposed Persons (PEPs).

From a data science perspective, the main result is that we found no information on almost half of the companies we investigated, despite using 4 different datasets. This means that even the 4% of land titles we found to be linked to PEPs is likely to be just a small proportion of the real number.

Intelligent use of the data that is available is a highly effective tool in the fight against global corruption. Our research has shown the great power that exists in bringing different sources of data together to deliver a holistic picture of risk.

Data and Methodology

We found 44k land titles (covering both land and property) in London owned by 24k overseas companies. Our task was to find as many further connections to these companies as possible, to get a holistic view of company structures.


Routes to Uncovering Further Connections to Companies

To match company names from the Land Registry to other records, we used the following datasets:

  • The ICIJ’s database: Contains over 500,000 offshore companies, foundations and trusts that were found in the Panama Papers, the Offshore Leaks and the Bahamas Leaks investigations.
  • Thomson Reuters PermID: Open, permanent, and universal identifiers. Used for matching and cross-referencing with Thomson Reuters datasets for additional links.
  • OpenCorporates: The largest open database of companies in the world, containing over 120 million corporate entities from more than 115 jurisdictions.


Both PermID matching and OpenCorporates reconciliation (through OpenRefine, a tool recommended for matching a large number of companies) work through APIs whose algorithms return an accuracy/confidence score based on how similar the input string is to the candidate match. Because they are two separate systems, the underlying details differ, but essentially both use similarity metrics to match a company name to an actual entity within their databases. For both of these databases, just as for ICIJ’s Offshore Leaks, we used country information in the matching to increase confidence in the match.
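To make the idea of similarity-based matching concrete, here is a minimal sketch in Python. It is not the algorithm used by PermID or OpenCorporates, whose internals are not described here; it simply uses the standard library's difflib ratio as a stand-in similarity metric. The suffix list, the 0.85 threshold, the record layout and the function names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes, used only to normalise names in this sketch.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "corp", "sa", "llc"}


def normalise(name: str) -> str:
    """Lower-case the name, replace punctuation with spaces, drop legal suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(tok for tok in cleaned.split() if tok not in LEGAL_SUFFIXES)


def best_match(name, country, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar same-country record, or None.

    `candidates` is assumed to be a list of dicts with "name" and "country" keys.
    """
    target = normalise(name)
    best = None
    for cand in candidates:
        # Country information narrows the candidates and raises confidence.
        if cand["country"] != country:
            continue
        score = SequenceMatcher(None, target, normalise(cand["name"])).ratio()
        if score >= threshold and (best is None or score > best[1]):
            best = (cand, score)
    return best
```

Requiring the country to agree before scoring means two identically named companies registered in different jurisdictions are never treated as the same entity, at the cost of missing records whose country field is wrong or absent.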

We ran every company through all three databases, to make sure that connections in any of the three datasets were not ignored. For example, had we stopped after finding two officers of company ‘X’ in ICIJ’s Offshore Leaks, we would have missed the ten other connections that the OpenCorporates matching might have uncovered subsequently, thus greatly weakening our analysis.
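The exhaustive approach described above can be sketched as follows. This is an illustrative outline only: the `sources` mapping and its lookup callables are hypothetical stand-ins for the three matching routes, not real client libraries.

```python
from collections import defaultdict


def collect_connections(companies, sources):
    """Query every source for every company and union the results.

    `sources` is assumed to map a source name to a callable that takes a
    company name and returns an iterable of connected people or entities.
    """
    connections = defaultdict(set)
    for company in companies:
        # Deliberately no early exit: a hit in one source does not stop
        # the remaining sources from being queried.
        for source_name, lookup in sources.items():
            for person in lookup(company):
                connections[company].add((source_name, person))
    return dict(connections)
```

Keeping the source name alongside each connection makes it possible to see afterwards which dataset contributed which link.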

The map below shows the local authority breakdown of unknown companies, that is, those we were unable to match to any record using any of the three methods. The postcodes from the Land Registry were geo-referenced against the postcode directory from the Office for National Statistics to geospatially locate records within the Greater London area.

[Map: companies that could not be matched in any dataset, by London local authority]
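The postcode geo-referencing step behind the map can be illustrated with a small sketch. It assumes the postcode directory has already been reduced to a plain dict from postcode to local authority name; the real Office for National Statistics file has its own layout, which is not reproduced here, and the function names are hypothetical.

```python
from collections import Counter


def normalise_postcode(postcode: str) -> str:
    """Remove all whitespace and upper-case, so 'sw1a 1aa' equals 'SW1A1AA'."""
    return "".join(postcode.split()).upper()


def count_by_local_authority(title_postcodes, postcode_directory):
    """Count land titles per local authority using a postcode lookup table.

    `postcode_directory` is assumed to be a dict: postcode -> local authority.
    Postcodes missing from the directory are counted under "UNKNOWN".
    """
    lookup = {normalise_postcode(pc): la for pc, la in postcode_directory.items()}
    counts = Counter()
    for pc in title_postcodes:
        authority = lookup.get(normalise_postcode(pc))
        counts[authority if authority is not None else "UNKNOWN"] += 1
    return counts
```

Normalising both sides before the lookup matters because postcodes are often recorded with inconsistent spacing and capitalisation, the same kind of identifier inconsistency that limits the company matching.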

In the report, we show that about 4% of land titles whose owning companies we managed to identify are connected to PEPs. This still leaves us with almost half of the companies on which we were unable to find any information. It is also worth noting that, of these, approximately 1,000 land titles are located in high-value areas of London, notably the City of Westminster, the City of London, and Kensington and Chelsea, and fewer than 6% of these had a monetary value associated with them. This missing data makes it difficult to identify and follow suspected illicit wealth, which is likely to be significant.

The UK has committed to introducing transparency measures in the form of a beneficial ownership registry for overseas companies owning land and property in the UK. This will not reach its true potential in preventing money laundering if the data is not complete and standardised. The lack of unique and consistent identifiers can be just as limiting as the lack of data in any analysis. Only good-quality, complete data can deliver a holistic picture of risk.

Read the full report here: http://www.transparency.org.uk/publications/london-property-tr-ti-uk/

