Reflecting on Opta

Sam Gregory
8 min read · Jun 9, 2018


One World Cup sim using Opta’s predictive model

Yesterday brought an end to two years working at Opta. It’s been a really cool first job and I think I’ve progressed a lot as an analyst and data scientist but also just as someone learning how to work and operate in a team. I genuinely have only good things to say about the data science team at Opta and if you are interested in joining they have an opening! Feel free to DM me if you have any questions about the role.

The nature of the role has meant I’ve been much less active in the analytics community than I had been previously, so I thought answering some questions would be a good way to reflect on my time and shine a bit of light on what I’ve actually been doing for the past two years. So here goes!

Fair warning: I’m going to answer all of them…

Going to choose to answer this one seriously. It’s really cool that I’ve been able to work with Tom for about 4 years now in different roles, from the podcast to consulting with Analytics FC and most recently at Opta. Most of the work that we’ve generated in this time has been stuff we’ve worked on collaboratively, which makes it sort of hard to see from the outside that we actually have quite different skill sets. Finding people who have different but complementary skill sets to yours is helpful both in learning and in producing quality work. We’ve both also improved so much over this period and I think a big part of that has been our ability to work well together and push each other to learn new things. Hopefully not too corny an answer, but I will miss working with Tom.

Good question and one I don’t really know the answer to yet.

Imminently what’s next for me is a flight to Lithuania in a few hours as I make the slow journey to Russia for the World Cup which I’m pretty excited about. After that I’m off travelling for a while before heading back to Canada. The decision to leave Opta was a personal one, namely that I’ve decided to move back closer to home.

I love London and after living here three years have a great group of friends that I will be sad to leave behind, but I just felt like the time was right to move back. For those who have the opportunity to move abroad — I highly recommend it, all the cliches about meeting new people from different backgrounds and getting new perspectives on things are true… and London is such a fucking cool city.

One of my housemates introduced me to Dulwich Hamlet but then complained that we didn’t have xG numbers for the Isthmian League, so I would say a net neutral.

But seriously, if you are reading this, are in London at any point and have never been to Dulwich Hamlet, please go. It’s one of the coolest and most unique football clubs in the country.

I am still embarrassed by how underdressed I am in this photo…

Favourite experience is a tough one. Maybe not what you are looking for but seeing xG numbers on Match of the Day for the first time this season and knowing that it was a model I had a hand in building was a pretty surreal experience.

No, but I’m jealous I never got to use football data for my P.E. homework. Also what kind of P.E. class has homework?

Choosing between carnage and scenes is a false dichotomy. The answer is always carnage scenes.

I was in a meeting about a year ago where I had to explain xG to Joey Barton. Sitting across from a man I once started a fake transfer rumour about and talking about my work was a very out-of-body experience.

For what it’s worth he was quite receptive and seemed interested in what I was saying. To demonstrate the concept I used the last goal he’d scored which was about a 0.12 xG chance if I remember correctly and before I even mentioned xG he said “The keeper should have never let that in from there,” which made the sell that much easier. Just goes to show that even football people who haven’t heard of things like xG are already thinking in these terms.
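For anyone new to the idea, here is a minimal sketch of what a number like 0.12 xG means. It treats each shot's xG as the probability of that shot being scored and assumes shots are independent of each other, which is a simplification. The function names are made up for illustration and this is not Opta's model, which estimates the per-shot values in the first place.

```python
def expected_goals(shot_xg_values):
    """Total xG is just the sum of the per-shot scoring probabilities."""
    return sum(shot_xg_values)


def prob_at_least_one_goal(shot_xg_values):
    """Chance that at least one shot goes in, assuming independent shots."""
    prob_all_missed = 1.0
    for xg in shot_xg_values:
        prob_all_missed *= 1.0 - xg
    return 1.0 - prob_all_missed


# Ten attempts of the same quality as a 0.12 xG chance.
ten_similar_shots = [0.12] * 10
print(round(expected_goals(ten_similar_shots), 2))          # 1.2
print(round(prob_at_least_one_goal(ten_similar_shots), 2))  # roughly 0.72
```

So a 0.12 chance is one you would expect to go in a little more than once in every ten attempts, which is more or less what "the keeper should never have let that in from there" is saying.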

The more reflective answer to this question is that it wasn’t any one big mistake really but hundreds of tiny ones. I think I make a mistake or do something incorrectly just about every day. The key is not to get frustrated with yourself but to learn from these mistakes and get incrementally better after each one, which is something I still need to improve on.

The real answer though is that really early on in the job I had read-write access to a database for some reason and accidentally changed Idrissa Gueye’s player_id to 0 across the entire Opta database.

After this I learned I only need read access.

Hopefully further ahead than they are now!

In all honesty I’m a little bit disappointed with how analytics has progressed in the club space. I know these things move slowly but if you asked me this question 5 years ago I think I would have given a more optimistic answer than what the reality is today. So my hope is that we start to see more data scientists or advanced analysts hired by clubs and that they start to be more involved in decision making. I know that’s a boring answer but I really don’t know.

This is a tough question. I think in terms of what I wanted out of this job — I would say it gave me everything I thought it would, but obviously things are never as straightforward as that. There were pleasant surprises and of course less-than-pleasant surprises so it wasn’t exactly “what I thought it would be” but I think the job spec is pretty clear in terms of what you actually do and in terms of personal development I am really pleased about what I got out of this job. Plus of course I enjoyed it!

I wrote one!

https://medium.com/@GregorydSam/getting-into-sports-analytics-ddf0e90c4cce

I use Python most often but that’s just what I’m most comfortable with. I think anyone who says x language is the best for data science is wrong. Find what works for you and inevitably you will run into some problems or tasks that force you to switch around. If you are looking to learn a new language I would start with one of Python or R, but again that’s just because those are the two I’m most familiar with.

I’m bad at reading full textbooks and working through them. It’s something I think I’d like to take more time to do in the future.

This changed over time. My number one goal in my role has always been creating new “advanced analytics” metrics to sell to clients in the Pro and media spaces. The other big part of my job is doing ad hoc data work for other internal customers: the data editorial staff, OptaPro staff working with clubs on consulting projects, etc. As time went on, I think the higher-ups in the company saw more value in the data science team producing new metrics, given the success of things like xG in the media, so we moved more towards working on these longer-term projects. It’s hard to put a % figure on it because it depends on what I’m doing that particular week or sprint and what the ad hoc request workload is.

I’m going to choose to take this as favourite OptaJoe style statistic:

More players get sent off on nights when there is a full moon. I’m still kind of in awe of this one (and no don’t ask me if it’s statistically significant).

When the coffee machine works, 6.5/10 (a standard Nespresso machine). The problem is how often it breaks.

I feel like most things I learn are fairly intuitive.

One piece of advice I read in a book called Clean Code, which I found sort of counterintuitive at first, was that comments in code are a crutch and should be avoided because your code should be clear enough to read on its own. The reason I found this counterintuitive is just that I’d always considered comments an important part of writing clear code.

My code isn’t nearly good enough yet to exist without comments, but it’s something I think about when I am coding now, wondering how it would look without the comments, and I hope it helps make my code more readable. Writing cleaner and easier to understand code is one of the areas where I think I have, let’s say, the most room to improve.
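To make the idea concrete, here is a small made-up example of the same function written twice. The first version needs a comment to be understood, while the second moves that information into the names. The 0.3 threshold and all of the names are hypothetical and chosen only for illustration.

```python
# Version 1: short names, so a comment has to do the explaining.
def f(s):
    # keep only the shots with an xG of at least 0.3
    return [x for x in s if x >= 0.3]


# Version 2: the names say what the comment used to say.
HIGH_QUALITY_XG_THRESHOLD = 0.3  # hypothetical cut-off


def high_quality_shots(shot_xg_values):
    return [xg for xg in shot_xg_values if xg >= HIGH_QUALITY_XG_THRESHOLD]


shots = [0.05, 0.3, 0.12, 0.62]
print(f(shots) == high_quality_shots(shots))  # True: same behaviour, clearer intent
```

Both versions do exactly the same thing. The difference is that the second one still reads sensibly if the comment is deleted or drifts out of date.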

Structuring questions in a way that they can be effectively answered using data. Tons of coaches, journalists etc. have good questions, but they aren’t structured in a way that can be answered well with data. I think this is the most important skill for an analyst: taking some football question asked by a non-technical person and structuring it in a way that can be approached using data. It’s clear why this is the most important skill: without it you can’t answer the questions that people who can pay you are asking.

As for how to get data there are a few ways: apply to an event like the OptaPro Forum or the US Soccer hackathon, make a request for academic use if you are writing a thesis or paper, or better yet apply for the data science vacancy!

Also — not Opta data — but I’d feel remiss if I didn’t leave this link here.

I see no reason why your Sunday league side can’t benefit from at the very least understanding some analytics concepts like xG, but I assume you mean in a more structured fashion. The answer is probably “as far as the money goes”. If there is an edge to be gained and enough money to pay someone to do analysis then there will probably be a space for analytics work. Charlie Reeves was doing analytics work at Forest Green FC when they were in the National League before he left for Everton, so things are clearly already happening lower down the pyramid.

Again — I wrote one!

https://medium.com/@GregorydSam/getting-into-sports-analytics-ddf0e90c4cce

I don’t really want to talk too much about my salary but obviously it was good enough that I took the job.

As for promotion and progression, the football analytics scene is interesting because it’s so new. I don’t really know what the “next steps” are in terms of the pathway for these kinds of careers and to be honest I find that really exciting, but that might not be for everyone. Were I to stay at Opta and Perform (the company which owns Opta) I think there definitely would have been room for promotion either within the role or onto a new one, but again it’s a new career path and everyone is kind of figuring it out at the same time.

I didn’t do much cricket work at Opta but I had the opportunity to do bits and pieces. Learning a sport through data provides a very different way of viewing the sport — I came at cricket without any of the preconceptions or myths that data are often used to challenge. So yeah I will genuinely miss cricket!

Thanks for all the questions, it was fun answering them. On a parting note, I’m not sure where I’ll pop up next but I’d like to be a bit more active in the community in the meantime and I think there are lots of opportunities right now to get more involved. It’s been very cool watching so many people get jobs in the industry who all started out like me, blogging and playing around with data. I’m excited to see where things go next, both for me personally and for football analytics as a whole. But first, off to the World Cup…


Sam Gregory

⚽️📊📉📈 | Data Science + Sports Analytics | Grad Student @iHealthSportVU