When Python is Faster Than Julia
Julia’s JIT compiler produces code that runs much faster than Python, but that compilation also takes time, which can result in Python programs being faster overall.
There’s a good quote on the Julia language web site: it says that Julia walks like Python but runs like C. Meaning that Julia is as easy to work with as Python — it has an easy syntax that can be learned quickly — but Julia programs run at a speed that rivals compiled languages like C. Python, on the other hand, is an interpreted language that is quite slow.
But for simple scripts like the one I use below to generate up-to-date graphs for the Covid-19 pandemic, Python might be a better bet.
Compilers v interpreters
Compilation means that the code that you write is converted into machine code, or something close to it, before the program is run. Machine code is the language that the processor in your computer executes directly, so compiled programs run fast. C is the archetypal compiled language and it is very fast. The C compiler produces an executable file (an .exe on Windows) and that is the program that is executed. If you want to change your program, you must edit the source code and re-compile it.
Interpreted programs are not converted to machine code ahead of time. The code that you write is ‘interpreted’ at run time: the Python interpreter works through the code that you wrote, step by step, and executes the appropriate machine code depending on what it reads. This is inherently slower.
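As a small aside on how interpretation works in practice: the standard Python interpreter (CPython) first translates source code into bytecode, and it is that bytecode which gets interpreted. A minimal sketch using the standard-library dis module shows this; the function add_one is a made-up example, not part of the scripts in this post.

```python
import dis

def add_one(x):
    # A made-up function, used only to have something to inspect
    return x + 1

# By the time the function exists, its body is already a bytecode object.
# dis.get_instructions lists the bytecode operations the interpreter will step through.
bytecode_ops = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(bytecode_ops)
```

The exact operation names vary between Python versions, so no particular output is shown here. The point is only that the interpreter executes these small operations one at a time rather than running native machine code for the function.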
Julia’s JIT compiler
Julia is not a conventionally compiled language; it has a JIT (just-in-time) compiler. This means that there is no separate compilation step before run time. When the program is run, the Julia compiler compiles the code on the fly, just before it needs to be executed.
Maybe this doesn’t sound much different to the operation of an interpreter but it has a major advantage. Once the code is compiled it can be stored in memory and the next time it needs to run, no conversion is necessary because the compiled version is already there.
But sometimes the compilation time slows the execution of the program so much that it ends up not being particularly fast at all.
Python v Julia
Please understand that this is not a scientific exercise but just a simple example to illustrate why, for all Julia’s advantages, you might be better off choosing Python instead, for some tasks.
I’ve written a couple of simple scripts that I run as Jupyter notebooks. One is in Python, the other in Julia. All they do is load a CSV file and plot a graph of the variables.
The CSV file in question contains stats for the Covid-19 pandemic and is updated daily. Each script downloads the latest stats and draws a graph. This is the Julia one:
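using HTTP, CSV, DataFrames, Plots
file = HTTP.get("https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv")
stats = CSV.read(IOBuffer(file.body), DataFrame)
stats2 = groupby(stats, :Country)
# Pick out the United Kingdom group (this step was missing from the listing)
ukstats = stats2[("United Kingdom",)]
plot(ukstats.Date, [ukstats.Confirmed, ukstats.Deaths, ukstats.Recovered], labels=["Confirmed" "Deaths" "Recovered"])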
And this is the Python one:
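import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Download the latest aggregated statistics
stats = pd.read_csv('https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv')
# Keep only the rows for the United Kingdom
stats2 = stats.query("Country == 'United Kingdom'")
# Plot the three series against the date (this step was missing from the listing)
stats2.plot(x='Date', y=['Confirmed', 'Deaths', 'Recovered'])
plt.show()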
They are functionally much the same but here is how they differ when you run them.
Precompilation takes time but only once
When you run the Julia program for the very first time it precompiles the libraries, that is Plots, CSV, HTTP and DataFrames. This is a fairly lengthy process because Plots is big and, on my PC, it took about ten minutes.
That is a very long time but it only happens the first time that you use the library, so it’s not fair to criticize Julia on that count because on subsequent runs that will not happen.
Running the programs
In normal running, the Julia program takes longer the first time that it is run, presumably because of the JIT compilation. It takes about two minutes on my machine. Subsequent runs are very fast: less than a second.
The Python program also takes slightly longer the first time it is run. I imagine that this is because it is importing the libraries. First runs take about four seconds and subsequent runs take less than two.
So, if you only want to run the program once, just to generate a single graph, then the Python script is much quicker. However, if for some reason you wanted to run the program more than once (maybe to change the country), then the Julia program wins. But only just: the difference between two seconds and less than a second is hardly worth worrying about.
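To put a rough number on "more than once", here is a back-of-the-envelope calculation using the timings quoted above. It rests on some assumptions of mine: "less than a second" is rounded up to 1 second and "less than two" to 2 seconds, the one-off precompilation is ignored, and all runs happen in the same session so the JIT cost is only paid once. The function names are made up for this sketch.

```python
def total_seconds(first_run, later_run, runs):
    """Total time for `runs` runs: one slow first run, then faster repeats."""
    return first_run + later_run * (runs - 1)

# Rounded timings from this post (assumed upper bounds, in seconds)
JULIA_FIRST, JULIA_LATER = 120.0, 1.0
PYTHON_FIRST, PYTHON_LATER = 4.0, 2.0

def break_even_runs():
    """Smallest number of runs at which Julia's total time drops below Python's."""
    runs = 1
    while total_seconds(JULIA_FIRST, JULIA_LATER, runs) >= total_seconds(PYTHON_FIRST, PYTHON_LATER, runs):
        runs += 1
    return runs

print(break_even_runs())  # prints: 118
```

With these rounded figures, Julia only pulls ahead after well over a hundred runs in a single session, which is consistent with Python being the practical choice for a graph that is drawn once.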
Who is the winner?
Much as I like Julia, for this type of simple script Python wins because I only want to run the code once and two minutes is just too long to wait for a graph.
But is this unfair on Julia? This is a specific task that I wanted to perform and it turns out that Python was a better choice. But other, more processing-intensive types of program will surely be more suited to the Julia approach.
Originally published at https://projectcodeed.blogspot.com on July 10, 2020.