Behind the Algorithm: Abhinav

David Ardagh
Jul 23, 2019 · 6 min read

Introduction

In these problems, competitors had a fantastic opportunity to experience the work of a quant trader first hand. You guys were given 85,000 masked features and managed to produce signals for the series of target variables we set.

The three main stages people had to go through were:

  1. Data cleaning
  2. Feature engineering
  3. Finding and optimising the best model

This was a tough challenge and there were lots of good attempts made. Here today we have the overall winner, Abhinav, talking about the way he approaches these types of problems.

He has some really good advice about building a base case model and submitting it early, so we would recommend giving it a read below!

The Interview

Firstly, I’m not a student; I graduated from IIT Roorkee in 2016. I studied engineering but didn’t like motors and that stuff, and became interested in applied statistics. Whilst at university I used to participate and compete in the WorldQuant program, which is an offline trading and research program where you get a stipend based on how well you do.

Then post-college I took up a job with a data science healthcare consultancy for a year and a half. After that, I was part of a startup for 6 months where we worked on an IPL-based idea that helped the teams in auctions. We met two of the big teams and tried to help them. We realised in the end that the teams weren’t willing to spend any money, so it became more of a hobby thing. I think it would have made money in the UK, where there are betting companies.

Abhinav Unnam

Since then I’ve been working on ideas in statistical arbitrage around improving trading signals. The competition is very similar to what we actually do in real life. Currently, I’m working for a firm in Mumbai, having just moved there from Bangalore.

At college, I studied a combined 5-year BSc and MSc, so I told myself that each year I would try something new (from my second year onwards). I had 2 semesters each year and 8 things to learn. For example, I originally tried computer vision before deep learning became popular, back when Fourier transforms were big.

Eventually, I found that stats are actually really useful; it’s not just MBAs making stuff up. On top of this, I did an internship where I was working in electrical engineering, which I realised wasn’t something I could do for 20 years. So after that, I decided to become a fully-fledged developer and realised that stats could be applied in lots of different domains, which would keep things interesting.

I got started by doing courses on edX. I slowly did more and more until I paid more attention to it than my own degree. Then I did the WorldQuant stuff and by the time I graduated it was becoming really popular, but I managed to win a couple of competitions before that. I’ve won an iPhone and iPad in the past.

It was mainly a learning experience for me, as I was getting back into competitions. What I liked, though, was that a lot of competitions are far-fetched and a long way from real-life problems, but this competition was very similar to what I do in real life.

After a while, it stops making sense to take part in competitions just for vanity reasons, which is why I haven’t done any in a long time. Here there were a lot of features, and that meant there must be an interesting approach. I had a hunch and I wanted to try it out.

I think the biggest thing is that there were a lot of features and the data was super noisy. This means many approaches like deep learning will overfit to the data and just won’t work straight away. So you have to build a model that is much more robust to noise. On the other hand, there were so many features that something like linear regression would have problems with multicollinearity, so that was the other side of the challenge.

The way I overcame this was to reduce the variance without adding bias, using a random forest style approach. You take lots of trees and average them out; this reduces the variance and keeps the bias the same. In this case, you can use the same approach but with a series of linear regressions instead of decision trees. The only thing you need to make sure of is that the models are all independent!
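
To make the idea above concrete, here is a minimal sketch of averaging many small linear regressions, each fitted on a bootstrap sample of rows and a random subset of features. This is not Abhinav's actual competition code: the function names, the number of models, the subset size and the synthetic data are all assumptions chosen purely for illustration.

```python
import numpy as np


def fit_linear_ensemble(X, y, n_models=50, n_features=5, seed=0):
    """Fit many small least-squares models (hypothetical helper).

    Each model sees a bootstrap sample of the rows and a random
    subset of the columns, so the models differ from one another.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    models = []
    for _ in range(n_models):
        rows = rng.integers(0, n_rows, size=n_rows)                 # bootstrap rows
        cols = rng.choice(n_cols, size=n_features, replace=False)   # feature subset
        A = np.column_stack([np.ones(n_rows), X[rows][:, cols]])    # intercept + features
        coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
        models.append((cols, coef))
    return models


def predict_linear_ensemble(models, X):
    """Average the predictions of all the small models."""
    preds = [coef[0] + X[:, cols] @ coef[1:] for cols, coef in models]
    return np.mean(preds, axis=0)


# Synthetic, noisy data for illustration only (not the competition data).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 40))
y = 0.1 * X[:, :10].sum(axis=1) + rng.normal(scale=1.0, size=500)

models = fit_linear_ensemble(X[:400], y[:400])
y_hat = predict_linear_ensemble(models, X[400:])
```

Giving each regression a different random slice of rows and columns is one common way to make the individual models less correlated with each other, which is what lets the averaging step reduce variance. Keeping each subset small may also ease the multicollinearity problem, since each regression only sees a handful of features at a time.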

For me, there’s no direct, off-the-shelf approach that will always work. Instead, you have to approach it from a purely statistical point of view. This competition made me scratch that part of my brain and be creative. It was fun actually; I really enjoyed doing this competition.

I am one of those people who try out everything.

I think part of the reason I did this competition was that it was super quick to get something that you could test. It took me like 45 minutes from end to end. You didn’t have to use the toolbox, and you [Auquan] provide a ready-made template so you only have to work on the core part of it.

What I’ve found about myself is that if I get the first iteration done quickly then I get hooked. If it takes time then I often get distracted by something and won’t come back. Whether it works or not, it will get my attention and I’ll sit down and figure things out.

In this case, by the first iteration, I mean seeing your name on the leaderboard. Whether you do well or not, you see yourself up there and you want to do better. I think it’s really important to get a base case that you can build from. Plus, the first iteration has the most inertia; if you get over that then you want to improve.

I think you should be able to get something down in 15 minutes, so you have a base case submission you can compare with. In this competition, it just took me 15 minutes. Then it actually only took me 45–50 minutes to get to my final submission, as the script was very small so it could be run really quickly.
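
As a rough illustration of what a quick base case can look like, the sketch below scores a "predict the training mean" baseline and then a single-feature regression against it. The data, the split sizes, the choice of mean squared error and the `mse` helper are all assumptions for the example; the competition's own template and scoring are not shown in this interview.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = 0.2 * X[:, 0] + rng.normal(size=300)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# Base case: always predict the mean of the training target.
baseline_pred = np.full(len(y_test), y_train.mean())
baseline_mse = mse(y_test, baseline_pred)

# First candidate: least squares on a single feature plus an intercept.
A = np.column_stack([np.ones(len(X_train)), X_train[:, 0]])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
candidate_pred = coef[0] + coef[1] * X_test[:, 0]
candidate_mse = mse(y_test, candidate_pred)

print(f"baseline MSE:  {baseline_mse:.3f}")
print(f"candidate MSE: {candidate_mse:.3f}")
```

Once the baseline number exists, every later idea has something to be measured against, which is the point of getting a first submission in early.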

Being honest, though, I only got it that quickly because I have the books An Introduction to Statistical Learning and The Elements of Statistical Learning. I’d read and understood this independent models approach, which just clicked for this problem. The implementation never takes that long, so the fact that I had a hunch about the noisy data meant it was really quick.

For me, it is just making sure you have this base case quickly. If you don’t, you end up going down… It’s like in cricket, right: you don’t want to come out to the crease and start hitting sixes straight away, you want to get off strike and build on that.

Yeah, yeah, it was a crazy match. Seriously. There will never be a match like it; getting that close and then tying in the super over was insane.

Thanks to Auquan

David Ardagh

Written by

Cornish born and working in a Fintech in London (how original). I try to make big things simple.

auquan

Auquan aims to engage people from diverse backgrounds to apply the skills from their respective fields to develop high quality trading strategies. We believe that extremely talented people equipped with the right knowledge and attitude can design successful trading algorithms.
