Why ArcticDB Works So Well for Quant Research and Data Science
Introduction
If you are reading this, you are probably expecting an article that explains how the performance, feature set and cost of ArcticDB make it a compelling choice for you. Whilst that is all true, there is a much more important reason to use ArcticDB for your research and data science. Read on to find out more.
How I Started Using ArcticDB
It was 2016 and I was working as a risk quant. My team mostly created models to risk-manage the complex products we traded. This was interesting and challenging work: the models mostly calculated the price and risk of complex securities given a set of market data. Portfolio risk was measured by applying lots of market scenarios on top of the pricing models.
At around this time, the risk team and portfolio managers had started asking for statistical models. We were good at measuring the result when the market does something specific, but people also wanted to know what the most likely scenarios are, based on the recent history.
We started to build statistical models based on various analyses of historic market data. We quickly realized that:
- Testing statistical models is very data intensive.
- Model strengths and flaws are only revealed by thorough testing.
- Addressing the flaws requires changing the models and re-testing.
- Soon a whole family of different models emerged.
- A few of those models became useful but there was no clear winner.
- So we needed to run a collection of different models on an ongoing basis.
At this point, the process that had worked well for the pricing models was struggling to cope with the statistical models. The problems were:
- Using files for the data had become chaotic and error-prone. The multiple model versions made this worse.
- The target production database was SQL. I was starting to question that choice for models that would continue to evolve over time, because experience had taught me that schema changes in SQL databases were difficult and expensive.
We needed a new approach.
Fortunately, I knew that the testing process was similar to the one used for quant trading models, so I talked to researchers working in that area about their workflow. They gave me lots of great advice, and the common theme that emerged was that they were all using Arctic (the predecessor of ArcticDB) and they loved it.
Research in ArcticDB
Moving my data into Arctic was very straightforward — there is very little to learn if you already know Pandas and Python. The performance was good (now great in ArcticDB) and the API was easy to learn and use. I immediately became a huge fan.
Once the refactor to Arctic was complete and I continued model development, I noticed some other fantastic benefits:
- When I needed to change the layout of my data as models and requirements changed over time, I could do so easily and without needing help from DBAs (Database Administrators).
- A development queue of new models was easy to support in ArcticDB using different environments. New models could be thoroughly tested for an extended time period in a test database, then promoted painlessly to production.
- Building data pipelines around DataFrames as the unit of storage is a great template for research.
Fast Forward to the Present
The reason why I joined the ArcticDB team and recommend it for research is simple — it hugely improved my productivity for building, testing and deploying models and I wanted to help other researchers and data scientists improve their productivity.
I had learned to use Arctic in a fraction of the time it had taken me to learn SQL. And I still needed help from experts with SQL when things got complicated. We also found that others in the team were able to learn and become productive using Arctic in almost no time. In fact, we recently collected feedback from our ArcticDB users and by far the most common theme was that ArcticDB had an almost zero learning curve for new joiners — they were able to be productive from day one.
In one sentence: ArcticDB gets out of your way so that you can focus on your data and models — your effort and thinking are concentrated on where they most count.
Thank you for reading my blog post. If my story has struck a chord with you, you can try ArcticDB for free and with no setup right now. Click the notebook link below to have ArcticDB running in seconds. If ArcticDB helped my research productivity, maybe it could help yours too?
Notebooks: https://docs.arcticdb.io/latest/notebooks/ArcticDB_demo_lmdb/
Docs: https://docs.arcticdb.io/latest/
Website: https://arcticdb.io/