Scanning CompanyHouse.gov.uk with Neo4j

Data Reply
May 2, 2018

It is now almost a year since I worked on this very interesting graph analytics project, and even though I wanted to blog about it sooner, I never got the chance. I finally did at the recent GraphConnect conference, which takes place annually in London at the QEII Centre. Allow me a short parenthesis here: Neo Technology, as always, put a lot of effort into making sure that the conference is a memorable experience for all attendees. Neo4j 3.2.0 was released with a variety of new features and, finally, the company reports that scalability issues are being addressed (or so the product engineers claim; it remains to be tested!). The atmosphere was more than inspiring, and so, between following some interesting talks and having stimulating discussions with other attendees during breaks, I was able to squeeze in some time to share my experience.

So what is it all about?

CompanyHouse is the United Kingdom’s registrar of companies. All forms of UK companies must be incorporated and registered with CompanyHouse and must file specific details, as required by the Companies Act 2006. These details are digitally recorded.

CompanyHouse is a member of the Public Data Group, which was formed in 2011 to increase the amount and improve the quality of publicly released data, with the objective of increasing economic activity. Thus, CompanyHouse data is publicly available through their website and, as of 2016, also through a RESTful API.

Obviously, we are talking about a very rich dataset that can be analysed in a multitude of ways. One of those ways is graph analytics, but how can one construct a graph of companies from this data?
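To make the question concrete, here is a minimal sketch of one possible graph shape: companies and officers as nodes, with an edge wherever an officer is appointed to a company. This is an illustration only, not necessarily the model used in the project. The sample records and the field names (`company_number`, `company_name`, `officer`) are hypothetical and do not reflect the actual CompanyHouse schema.

```python
from collections import defaultdict

# Hypothetical sample records. The field names are assumptions made for
# illustration, not the real CompanyHouse data format.
appointments = [
    {"company_number": "00000001", "company_name": "Alpha Ltd", "officer": "Jane Doe"},
    {"company_number": "00000002", "company_name": "Beta Ltd", "officer": "Jane Doe"},
    {"company_number": "00000002", "company_name": "Beta Ltd", "officer": "John Roe"},
    {"company_number": "00000003", "company_name": "Gamma Ltd", "officer": "John Roe"},
]


def build_graph(records):
    """Return two adjacency maps: officer -> companies and company -> officers."""
    officer_to_companies = defaultdict(set)
    company_to_officers = defaultdict(set)
    for rec in records:
        officer_to_companies[rec["officer"]].add(rec["company_number"])
        company_to_officers[rec["company_number"]].add(rec["officer"])
    return officer_to_companies, company_to_officers


def linked_companies(company_number, officer_to_companies, company_to_officers):
    """Return the companies that share at least one officer with the given company."""
    linked = set()
    for officer in company_to_officers.get(company_number, ()):
        linked |= officer_to_companies[officer]
    linked.discard(company_number)  # a company is not linked to itself
    return linked


officer_to_companies, company_to_officers = build_graph(appointments)
print(sorted(linked_companies("00000002", officer_to_companies, company_to_officers)))
```

In a graph database such as Neo4j, the same shape would presumably map to company and officer nodes joined by an appointment relationship, so that "companies sharing an officer" becomes a two-hop traversal rather than a pair of dictionary lookups.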

Originally published at www.datareply.co.uk.
