Why Traders and Finance Professionals Need to Learn Python

Subramanya Nagabhushanaradhya
Analytics Vidhya
Published in
7 min readSep 27, 2020

Jupyter is the new Excel

If you are a trader or you are working in the financial services, Excel is your bread and butter; you can analyze prices and other tick data, evaluate your trading portfolio, calculate VaR, perform backtesting and the list goes on and on. You are an expert in pivot tables, formulae, charts or even VBA and PowerQuery. You can crunch a solution quickly but before you know it, your spreadsheet expands in tens of tabs, thousands of rows and spaghetti VBA — so its readability and maintenance go right out the windows. And when you dare to press F9 to refresh the results, you can grab a cuppa and still wait a bit more (if your computer has not crashed in between)!

👉🏻 Sounds familiar? You are probably nodding fervently.

As useful as it can be, for delivering real insight from data, spreadsheets simply won’t provide all of the answers you seek. Let’s see what makes Excel popular and where it falls through the cracks, but most importantly why YOU should start using Jupyter Notebooks as an alternative powerful analytical tool.

Excel Heaven

A few factors have made Excel so valuable, most noteworthy being its simple yet powerful reactive computational model with built-in functions. A spreadsheet is like a blank canvas where an “amateur programmer” can write code using Excel’s proprietary language (i.e. formulae and VBA).

Blanks deploy a wide range of IT systems to run business-critical processes, however, there has been an explosion of Shadow IT in the form of spreadsheets, which complement these core systems and serve as the default tools for new business initiatives. Due to the enterprise systems not being as flexible to accommodate the quick time-to-market that is required in a dynamic financial environment, Excel comes in and fills this gap!

Handy as it may be, is this the right approach?

Excel’s role in the enterprise needs to be redefined!

Excel Hell

Trying to use spreadsheets for advanced, responsive analytics over a large volume data, is using the wrong tools for the job.

Here are some of the issues Excel power users face:

  • Cascading Errors: Excel is notorious for errors being propagated down a column and then across the whole spreadsheet creating a snowball that turns into an avalanche of trouble. Scarily, some of them go unseen until it is too late. A few cases have been publicized where things have gone awry with huge cost implications.
  • Scalability: Excel has a limit on the number of rows and columns it can hold, but with datasets increasing at exponential speed, spreadsheets soon run out of memory or utilize most of the CPU. When this happens, the risk of data corruption goes way up.
  • Performance: In complex workbooks, changing one number can affect hundreds of reiterative calculations and Excel takes time to work each one out. When a certain size is reached, it does not only slow down the spreadsheet itself but also affects the performance of any other applications that require memory space.
  • Testing: It is almost impossible to test the correctness of a spreadsheet and then prove the subsequent changes have not broken anything else.
  • Traceability/Debugging: Often an even small change can significantly corrupt your sophisticated formula making it extremely hard to identify and fix.
  • All-Inclusive: The data and calculations are all contained within the Excel file and run a local computer. This means that collaboration is constrained, making version control a nightmare. Also, the data is static based on the latest refresh, rather than being updated in real-time as conditions change.

All these problems have already been solved for traditional software decades ago.

And last but not least:

  • Operational Risk: All spreadsheets start as small/quick-fix calculations but some turn into permanent enterprise-grade solutions. They feed a number of business processes, but due to a lack of visibility of the entire landscape lineage, the integrity of many financial, operational and regulatory processes is threatened.

★ If you would like to read more about End User Computing (EUC) risks, this is a really good article.

Hello Coders; Bye-bye Traders

Traditionally, Banks split quantitative trading teams into quant traders and quant developers. The former come up with trading ideas and the latter translate those ideas into executable code. When as a trader you are also a proficient coder, this segregation is no longer required, which gives you a competitive advantage: not only Banks look to optimize their costs and hence you can survive a redundancy, but also you can put your own trading models to action and use this time-advantage to your benefit. Being able to automate mundane tasks is so liberating and coding, in fact, is so creative!

It is pretty much a generational thing too, as more junior traders come already equipped with coding skills from uni, so if you want to maintain your employability you need to get involved.

How to Excel with Python and Jupyter Notebooks

So what are the solution to those traders and financial professionals who find Excel limiting (if not outdated)?

Learn Python and use Jupyter Notebooks as a container!

Python

Python is reasonably easy to learn and very versatile and hence there is an increased uptake within the financial community. It is now a prerequisite for many quantitative roles, alongside with Excel. It is less elaborate than C++(or Java), meaning that: ❶ the learning curve is not as steep, and also ❷ the amount of code required to complete a task is substantially smaller by a factor of 5x or 10x.

Python’s growing popularity is evident in the vast number of libraries that support pretty much anything you will need as a trader:

  • reading, writing, cleaning, massaging, slicing/dicing data
  • maths, stats, and time-series
  • financial analysis: trading and quantitative finance, market data analysis, stock/derivative market analysis, Bloomberg data access, execution engines, backtesting, risk analytics, etc
  • machine-learning pipeline (e.g. predicting market prices)
  • plotting and beautiful/interactive visualizations
  • SQL support
  • send emails
  • web scrapping (e.g. market prices online)
  • task automation/scheduling
  • Excel integration (if you really like Excel so much)

❗️ Find a concise summary of libraries here (not affiliated).

Jupyter

Along the same lines, a Jupyter Notebook is a web-based computing environment that enables you to author documents that include: live code, graphs, widgets, rich narrative text (including links, equations, etc), pictures, etc. More specifically, you can:

  • Edit code in the browser, with automatic syntax highlighting, indentation, and tab-completion/introspection.
  • Run code from the browser, with the results of computations attached to the code which generated them.

They provide an all-inclusive, self-contained record of the computation and as such, they can be used in lieu of Excel as a real-time data analysis platform.

Courtesy: Python for Finance Tutorial

The sweet spot

So you might wonder:

👉🏻 Shall I still be using Excel? Of course…

Spreadsheets are still your best bet in the following use cases:

  • Correctness and accuracy is not a priority
  • Data is not too big (i.e. no need for scalability)
  • No need for real-time updates
  • Using Excel as a scratchpad to quickly put a prototype together
  • No need for long term maintenance.

What you need is something allowing for rapid development, validation of correctness and extensibility, while keeping the same reactive model as Excel. And this is exactly what the Jupyter Notebooks provide!

No, Excel is not obsolete, but Jupyter Notebooks are better data analysis tools!

Although I have explicitly mentioned some substantial advantages of using the Python/Jupyter combo over Excel or implied the as counterexamples in the “Excel Hell” section, I have collected the top 10 of them here:

✔️ Powerful data manipulation — It is the data scientist’s toolbox after all! ✔️Advanced visualization capabilities — No more boring charts!

✔️ Better user experience

✔️ Big dataset processing — No more crashing!

✔️ Performance management — Making advantage of multiprocessing

✔️ Test-Driven Development — Think quality!

✔️ Open source accessibility — “There is a lib for it!”

✔️ Error traceability

✔️ Easier automation

✔️ Self-documenting — Don’t underestimate the value of this!

Where to start

I realize that coming new into this can be intimidating, but programming is not about learning the language’s syntax but wires your brain for problem-solving using the right languages and tools: in our case, Python and Jupyter:

Here are two great resources to get you started (not affiliated):

The key point is:

Stay relevant!

It is hard to imagine the corporate world without spreadsheets!

Banks start discovering Python as an alternative to Excel for their critical workloads. Barclays just recently made headlines with their effort to teach their traders to code. I think you should follow this paradigm too and you will not be disappointed!

Let Excel do what Excel is good at and move the analysis and automation into Python and Jupyter Notebooks.

There is a happy world of dynamic and interactive analytics waiting for you!

Thanks for reading!

--

--