Intro to Julia as a Programming Language: An Analyst’s POV

Nagapavan Nukala
10 min read · Sep 4, 2020

A bird’s eye view of Julia as a programming language, its capabilities and shortcomings

Julia, which has been in the limelight in recent years, is often described as the programming language of the future. Although we already have excellent languages and tools such as R, Spark and C++, Julia has still managed to make an impact. There are a bunch of reasons for this, and we’ll discuss a few of the main ones.

About me … What can you expect from this article?

I’m an analyst at a start-up, where we use statistical and machine learning techniques to improve our clients’ performance by supporting their decision making. Since the start of my journey in analytics, I have been curious about the concepts behind statistics and machine learning algorithms and about implementing them in platforms such as R and Python. My team works extensively in Python to build our base ML framework. After considerable exposure to Python, I was amazed by the ease of coding and the numerous libraries available. But when we wanted to run large-scale jobs, we had to turn to Spark. While exploring alternatives, we found options such as parallel processing and using GPUs for better performance. Interestingly, we learned that this situation is called the “two-language problem” and that there is a language, Julia, designed to solve this very problem. I’ll explain the two-language problem in more detail in a later section.

I started exploring Julia out of curiosity and found fascinating similarities between Julia and Python. By the end of this article, you’ll know Julia’s virtues, its capabilities in multiple domains and its shortcomings. Most of the information here is available on the official Julia website. At the time of writing, the latest stable release is 1.5.1. Expect a bird’s eye view of Julia, its purpose, perks and shortcomings, along with a few interesting facts that may prompt you to start programming in Julia.

What’s Julia?

Julia is an open-source, high-level, dynamic programming language in which you can write and deploy code easily. Development started in 2009, the language was first released publicly in 2012, and version 1.0 followed in 2018.

The Purpose:

Back in 2012, the common perception was that a programming language could be either fast or productive, but not both. As a result, we often have to use more than one language because a single one lacks some important capability: for example, writing the original code in Python and later rewriting or scaling it on a big data platform like Spark. In the programming world this is called the “two-language problem”. Julia was designed to overcome it by offering a single language that is both productive to write and fast enough to scale to large, high-dimensional data.

A popular one-liner sums up Julia’s virtues crisply:

Looks like Python, feels like Lisp, runs like C (Fortran)

It implies that Julia brings together the best of several worlds. Julia’s creators, Jeff Bezanson, Stefan Karpinski, Viral B. Shah and Alan Edelman, had experience with MATLAB, Lisp, Python, Ruby and Perl, and tried to build a language that combines the best features of these platforms while addressing their shortcomings. They were working in areas such as scientific computing, machine learning, data mining, large-scale linear algebra, and distributed and parallel computing.

Julia is well suited for high-performance numerical analysis and computational science. We’ll discuss its capabilities further in later sections.

Why Julia?

When the creators were asked why they built Julia, they gave an interesting reply. You can find the full reply in their blog; a sneak peek is given below:

In short, because we are greedy.

We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

This answer sums up many of the reasons why Julia is awesome.

Julia in Multiple Domains:

As discussed above, Julia is well suited for high-performance numerical analysis and computational science. This section covers some of the domains where Julia is being used; package developers have made it suitable for a variety of ecosystems. The domains are as follows:

1. Data Science

Packages used in Julia: “DataFrames”

Data analysts often rely on a handy tool for data munging operations such as cleaning junk values, imputing missing values, treating outliers, stacking, unstacking and aggregating. These operations constitute roughly 70% of their work. Julia’s “DataFrames” package is very helpful for performing such data wrangling on high-dimensional data.

2. Machine Learning

Packages used in Julia: “MLDataUtils”, “Flux”, “Lathe”

There’s a very thin line between this domain and the previous one. While “Data Science” focuses on data wrangling and generating data-driven insights, this domain focuses on building models that predict outcomes and can become part of a solution. If you’re familiar with Python, the “sklearn” package is the closest equivalent. In Julia, the “MLDataUtils” package covers the data-preparation side of that workflow: operations such as label encoding, splitting data into train and test sets (the splitobs function), stratified sampling and k-fold repartitioning are implemented efficiently.

Packages like “Flux” and “Lathe” provide model implementations to go with these utilities.

3. Parallel Computing

Julia has built-in support for parallelism through the “Distributed” standard library. With it, one can make use of the full resources of a given machine, especially all of its cores. This helps a great deal in mathematical and scientific computing.

Mature languages like Python can also run operations in parallel, but with caveats. Python’s GIL (Global Interpreter Lock) allows only one thread to execute Python code at a time, even on a multi-core machine, which makes multi-threading a poor choice for CPU-bound work. For this reason, people opt for multi-processing when multi-threading doesn’t solve their problem. You can understand this better through this article: Multi-threading vs Multi-processing. However, because processes do not share memory, Python’s multiprocessing module has to serialize and deserialize data as it moves between processes, which costs time. Julia, on the other hand, handles this in a more refined way. Further, Julia’s parallelization syntax is less top-heavy than Python’s, lowering the threshold to its use.
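To make the serialization cost concrete: Python’s multiprocessing sends arguments and results between processes by pickling them. Below is a minimal sketch of that round trip, done by hand with the standard pickle module. The roundtrip helper and the sample rows are made up for illustration; a real multiprocessing job does this behind the scenes for every task.

```python
import pickle


def roundtrip(obj):
    """Serialize obj to bytes and rebuild it, as happens when data crosses a process boundary."""
    payload = pickle.dumps(obj)   # serialization: object -> bytes
    copy = pickle.loads(payload)  # deserialization: bytes -> brand-new object
    return copy, len(payload)


# Hypothetical sample data standing in for a chunk of a larger dataset.
rows = [{"id": i, "value": i * 0.5} for i in range(1000)]
copy, n_bytes = roundtrip(rows)

print(copy == rows)  # True: the content survives the trip
print(copy is rows)  # False: it is a separate copy, not shared memory
print(n_bytes)       # number of bytes that had to be produced and parsed
```

The bigger the data handed to each worker, the more of the run time goes into this copying instead of the actual computation.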

The Julia compiler can also generate native code for various hardware accelerators, such as GPUs and Xeon Phis. Packages such as “DistributedArrays.jl” and “Dagger.jl” provide higher levels of abstraction for parallelism.

Application:

The Celeste.jl project achieved 1.5 PetaFLOP/s on the Cori supercomputer at NERSC using 650,000 cores.

This was a project to catalogue the visible universe, which had to optimise the parameters of 188M stars and galaxies, loading and processing 178 TB of data across 8192 nodes. To achieve this, Celeste exploits parallelism at multiple levels (cluster, node and thread), which let the job complete in 14.6 minutes.

4. Scientific Computing

Julia offers simulation-specific packages in domains such as differential equations (DifferentialEquations.jl), optimization (JuMP.jl, Optim.jl), a general-purpose quantum simulation framework (Yao.jl), and many more.

Julia also has domain-specific package ecosystems such as:

BioJulia — Biology

JuliaOpt — Operations Research

JuliaImages — Image Processing

QuantumBFS, QuantumOptics — Quantum Physics

JuliaDynamics — Non-linear Dynamics

QuantEcon — Quantitative Economics

JuliaAstro — Astronomy

EcoJulia — Ecology

Where is Julia used Currently?

Apart from “Celeste.jl”, mentioned in the “Parallel Computing” section, many tech giants have recognised Julia’s capability. This article covers various applications of Julia and the impact it can have in start-ups, especially those working on machine learning problems. I recommend having a look at it.

“Amazon, Apple, Disney, Facebook, Ford, Google, Grindr, IBM, Microsoft, NASA, Oracle and Uber are other Julia users, partners and organizations hiring Julia programmers”
-Shah, CEO of Julia Computing.

Advantages of Julia over other Programming Languages:

Since Julia’s creators set out to include the best features of many languages, Julia naturally has many advantages.

In this blog, most of my comparisons are with Python. Some readers may feel it’s unfair to compare Julia, which is at such an early stage, with Python, which has evolved over the last few decades. I fully support this view. Python has been around since the 90s and has matured a lot, especially in the data science and ML domain. But it’s important to acknowledge the areas that have scope for improvement. Only then can the developer and user communities bring out a better version of the language. So let’s see the areas where Julia can outperform Python and where Julia itself still has room to grow.

1. Speed!!!

Out of the box, Julia is much faster than Python and comes close to the speed of C. For loop-heavy numerical code, native Julia can beat native Python by orders of magnitude. The reasons are type inference, optional type declarations and JIT (Just-In-Time) compilation through the LLVM compiler framework. In other words, although Julia is a compiled language, its code is compiled at run time, just before it first executes. Julia also precompiles packages ahead of time to reduce that start-up cost.

You can refer to this website to compare the speed of various languages by the execution time of different operations. You can also check the benchmarking done by the Julia team themselves.

2. Easy Package Installation

Installing packages in Julia is very easy. Julia ships with a modern package manager that fetches packages, most of which are hosted on GitHub, directly from the REPL. Python package installation, on the other hand, can sometimes be complicated, especially because of version differences (2.x, 3.x).

3. Multiple Dispatch

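This is also known as multimethods. It’s the ability of a programming language to choose which implementation of a function to run based on the run-time types of all of its arguments, not just the first one. Python’s standard library offers only single dispatch (functools.singledispatch), which looks at the first argument alone; multiple dispatch there needs a third-party library or hand-written code.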
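For readers coming from Python, here is a rough sketch of the idea, written as a tiny hand-rolled registry that picks an implementation from the types of all arguments. This is only an illustration in Python, not how Julia implements dispatch: the names dispatch and combine are hypothetical, and the lookup matches exact types only, whereas Julia also takes the type hierarchy into account.

```python
# Hypothetical registry: (function name, argument types) -> implementation.
_registry = {}


def dispatch(*types):
    """Register an implementation for one exact combination of argument types."""
    def decorator(func):
        _registry[(func.__name__, types)] = func

        def wrapper(*args):
            # Look at the run-time type of every argument, not just the first.
            key = (func.__name__, tuple(type(a) for a in args))
            if key not in _registry:
                raise TypeError(f"no method {func.__name__} for {key[1]}")
            return _registry[key](*args)

        return wrapper
    return decorator


@dispatch(int, int)
def combine(a, b):
    return a + b          # two numbers: add them


@dispatch(str, str)
def combine(a, b):
    return a + " " + b    # two strings: join them


@dispatch(int, str)
def combine(a, b):
    return b * a          # a count and a string: repeat the string


print(combine(1, 2))       # 3
print(combine("a", "b"))   # a b
print(combine(2, "ab"))    # abab
```

In Julia, the same three methods would simply be three definitions of one function with different argument type annotations, and the language does the lookup for you.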

4. Easy Code Conversion

It is fairly easy to port code from Python or C to Julia. Going the other way is not as easy, and converting code between Python and C in either direction is very difficult. Julia can also call external libraries written in C and Fortran very easily, and data can be shared with Python using the “PyCall” package.

5. Easy to Use

Coding in Julia is very easy; it’s easy to understand and use. If you come from a Python background like me, you won’t find it difficult, because Julia almost gives you the feeling that you’re coding in Python.

6. Metaprogramming

Julia supports metaprogramming. Julia programs can generate other Julia programs, and even modify their own code, in a way that is reminiscent of languages like Lisp.
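If you have only seen Python, the closest everyday analogue is treating code as data through the standard ast module. The sketch below builds a function from a piece of source text, compiles it and runs it. It is a Python illustration of the general idea of programs generating programs, not a model of Julia’s macros; the make_function helper and its arguments are hypothetical.

```python
import ast


def make_function(name, expression):
    """Build a one-argument function name(x) from an expression given as source text."""
    # Parse the generated source into a syntax tree: the program as data.
    tree = ast.parse(f"def {name}(x):\n    return {expression}")
    namespace = {}
    # Compile the tree and execute the definition inside a private namespace.
    exec(compile(tree, filename="<generated>", mode="exec"), namespace)
    return namespace[name]


square_plus_one = make_function("square_plus_one", "x * x + 1")
print(square_plus_one(3))  # 10
```

Julia goes further than this by representing its own code as ordinary Julia objects that macros can inspect and rewrite before the code is compiled.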

7. Automatic Memory Management

Like Python, Julia has automatic memory management. Julia doesn’t burden the user with the details of allocating and freeing memory. The idea is that if you switch to Julia, you don’t lose one of Python’s common conveniences.

Disadvantages of Julia compared to other Programming Languages:

1. Smaller Community

Being a new language, Julia has so far reached a smaller group of people. The community is therefore small, which makes it a little harder to get doubts clarified.

2. Not many Libraries

Julia has fewer libraries than more mature languages like Python. But this should improve with time as the community grows.

3. Arrays are 1-Indexed

Unlike most general-purpose programming languages, Julia is 1-indexed, which can be awkward for programmers with a C or Python background. There have been many discussions on the “0 vs 1 indexing” topic. Julia’s creators felt that this wasn’t a major issue and that switching to 0-based indexing would defeat the idea of making Julia code read like mathematics. They also mentioned that implementing arbitrary indexing can slow things down.
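When porting between the two languages, the off-by-one is easy to get wrong, so here is a small sketch of the translation from the Python side. The helper names are hypothetical. The one rule to remember is that a Julia range a[start:stop] includes both ends, while a Python slice leaves out the stop.

```python
def julia_to_python_index(i):
    """Convert a 1-based (Julia-style) position to a 0-based (Python-style) one."""
    if i < 1:
        raise IndexError("1-based positions start at 1")
    return i - 1


def julia_range_to_python_slice(start, stop):
    """Julia's a[start:stop] includes both ends; a Python slice excludes the stop."""
    return slice(start - 1, stop)


values = [10, 20, 30, 40, 50]
print(values[julia_to_python_index(1)])           # 10, the first element
print(values[julia_range_to_python_slice(2, 4)])  # [20, 30, 40], 2nd through 4th
```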

To summarize, a comparison of various features across programming languages is given below:

Figure: Feature-wise comparison of Julia with other languages

These comparisons could go on. But I believe this blog has given enough reasons to start for those who haven’t begun programming in Julia yet. For an analyst, it’s very important to know when to use a language and when not to. That can only be answered once you start working with the language and form your own perspective on it.

Future Work:

Currently, I’m working on my next blog: “Comparison of an End-to-End Basic Linear Regression Implementation in Python & Julia”. The intent is to provide a guide for other Julia newbies like me by drawing comparisons between Julia and Python. Please share your feedback and suggest more topics that you’d like me to explore.
