Defining Custom Units in Julia and Python

I’ve been writing code to deal with temperature in different units Celsius, Kelvin and Fahrenheit, because I do calculations involving the gass laws to figure out how big aerostat habitats have to be on Venus. For those who follow me on medium you might know of my interest in space exploration and colonization of Venus.

I will show you how we can represent different units for temperature in Julia and Python. This is an interesting case for Julia because it shows quite clearly the advantages of using language supporting multiple-dispatch , in comparison to a more traditional object oriented language such as Python, which relies on single dispatch.

Python Solution

I had a number of false starts to get down a good Python solution, and not being as good with Python as with Julia, this might not be the most optimal way of solving the problem. I am happy to hear alternative solutions to this problem.

Below is a Python class defining the Celsius temperature unit. Notice we need to define conversion functions for all the other temperatures and define the add and subtract operators to operate on a common unit.

class Celsius(object):
def __init__(self, temp):
super(Celsius, self).__init__()
if type(temp) == float or type(temp) == int:
self.value = temp
else:
self.value = temp.to_celsius().value

def __repr__(self):
return "Celsius({0})".format(self.value)

def to_celsius(self):
return self

def to_kelvin(self):
return Kelvin(self.value + 273.15)

def to_fahrenheit(self):
return Fahrenheit(self.value*9/5 + 32)

def __add__(self, temp):
return Celsius(self.value + temp.to_celsius().value)

def __sub__(self, temp):
return Celsius(self.value - temp.to_celsius().value)

Then we have to create a similar class for each of the possible temperature units. For Kelvin we start like this:

class Kelvin(object):
def __init__(self, temp):
super(Kelvin, self).__init__()
if type(temp) == float or type(temp) == int:
self.value = temp
else:
self.value = temp.to_kelvin().value

The conversion functions will be different for each temperature class. This is for Kelvin:

def to_kelvin(self):
return self

def to_celsius(self):
return Celsius(self.value - 273.15)

def to_fahrenheit(self):
return self.to_celsius().to_fahrenheit()

All arithmetic operations such as adding or subtracting temperatures has to be reimplemented for each class.

def __add__(self, temp):
return Kelvin(self.value + temp.to_kelvin().value)

def __sub__(self, temp):
return Kelvin(self.value - temp.to_kelvin().value)

The problem with this approach is that whenever you add another temperature unit such as say Fahrenheit, you have to add a to_fahrenheit() to every single temperature class. Which means if you downloaded a temperature library in Python, there is no obvious way of expanding that library with another class.

A second problem is that the way we have defined __add__() and __sub__() they are completely unsymmetrical. The order of the arguments determines which type the result will be converted into. This is stupid. Imagine if we added an integer and a floating point number that the type of the result would depend on whether the float or int came first?

Solving this problem would require elaborate if-else statements for each arithmetic function, which would be even more impractical.

Unlike OOP Julia uses multiple dispatch, for matching a function call to a specific function implementation. With multiple dispatch the Julia JIT will look at the type of every function argument to pick the correct specialized function (called a method in Julia).

Julia Solution

In Julia we can explicitly define what happens in each type case. We can e.g. make sure that whenever we deal with Celsius and Fahrenheit the result is always in Celsius regardless of which argument comes first.

+(x::Fahrenheit, y::Fahrenheit) = Fahrenheit(x.value + y.value)
+(x::Fahrenheit, y::Celsius) = Celsius((x.value - 32)*5/9 + y.value)
+(x::Celsius, y::Fahrenheit) = y + x

Of course writing function for all of the possible combinations is a bit tedious, so the Julia standard library has defined a number of practical functions for handling conversion and promotion to common types. First we define our temperature types:

abstract type Temperature end

struct Celsius <: Temperature
value::Float64
end

struct Kelvin <: Temperature
value::Float64
end

struct Fahrenheit <: Temperature
value::Float64
end

Defining Promotion Rules

Then we use the promote_rule() function to register which type two types should be promoted to when involved in an arithmetic expression.

promote_rule(::Type{Kelvin}, ::Type{Celsius})     = Kelvin
promote_rule(::Type{Fahrenheit}, ::Type{Celsius}) = Celsius
promote_rule(::Type{Fahrenheit}, ::Type{Kelvin}) = Kelvin

The Type{Kelvin} looks strange. It is simply the type of the Kelvin type. So if I made this call promote_rule(Kelvin, Celsius), it would return the type object Kelvin.

Defining Conversions

Next is to define how you convert from one particular temperature to another.

convert(::Type{Kelvin},  t::Celsius)     = Kelvin(t.value + 273.15)
convert(::Type{Kelvin}, t::Fahrenheit) = Kelvin(Celsius(t))
convert(::Type{Celsius}, t::Kelvin) = Celsius(t.value - 273.15)
convert(::Type{Celsius}, t::Fahrenheit) = Celsius((t.value - 32)*5/9)
convert(::Type{Fahrenheit}, t::Celsius) = Fahrenheit(t.value*9/5 + 32)
convert(::Type{Fahrenheit}, t::Kelvin) = Fahrenheit(Celius(t))

So the way the conversion function works is given that you have some temperature t of unknown type, and you want to convert to Celsius, you write:

convert(Celsius, t)

The benefit of doing conversion this way over the Python approach of adding a to_celsius() method, is that it does not require modifying existing classes. That means you can use an existing Temperature library without having to change the source code.

As with the Python constructor, we can define constructors for the different temperature types which takes other temperatures as arguments.

Kelvin(celsius::Celsius) = convert(Kelvin, celsius)
Celsius(kelvin::Kelvin) = convert(Celsius, kelvin)
Celsius(fahrenheit::Fahrenheit) = convert(Celsius, fahrenheit)
Fahrenheit(celsius::Celsius) = convert(Fahrenheit, celsius)

Defining Arithmetic Operations

Lets define how temperatures of the same type are added or subtracted:

+(x::Celsius, y::Celsius) = Celsius(x.value + y.value)
-(x::Celsius, y::Celsius) = Celsius(x.value - y.value)

+(x::Kelvin, y::Kelvin) = Kelvin(x.value + y.value)
-(x::Kelvin, y::Kelvin) = Kelvin(x.value - y.value)

+(x::Fahrenheit, y::Fahrenheit) = Fahrenheit(x.value + y.value)
-(x::Fahrenheit, y::Fahrenheit) = Fahrenheit(x.value - y.value)

To deal with combination of temperatures of different types, we utilize the Julia promotion machinery, which relies on convert() and promote_rule(). Julia standard library has a function promote(x,y) which will make x and y into the same type according to the promote rules. So we define addition and subtraction between two abstract temperatures:

+(x::Temperature, y::Temperature) = +(promote(x,y)...)
-(x::Temperature, y::Temperature) = -(promote(x,y)...)

Usage

We can now try this out at the REPL. Using bpython REPL to try to the python solution, we get this:

>>> Celsius(1) + Celsius(3)
Celsius(4)
>>> Fahrenheit(3) + Celsius(4) + Kelvin(4)
Fahrenheit(-410.47)

Here you can see the advantage of the Julia approach using promotion rules.

julia> Celsius(1) + Celsius(3)
Celsius(4.0)

julia> Fahrenheit(3) + Celsius(4) + Kelvin(4)
Kelvin(265.0388888888889)

You see even though we start with Fahrenheit, the result still gets converted to our defined common denominator Kelvin.

Another advantage I’ve seen while doing this is that you don’t have to define a way of displaying the Julia types. In Python for each type we need to define a method like this:

def __repr__(self):
return "Fahrenheit({0})".format(self.value)

If we had not done that for say Kelvin our REPL would give us a result like this:

>>> Kelvin(3)
<__console__.Kelvin object at 0x107a29e10>

In Julia there is a handy default representation equal to the syntax for creating an object of that value in Julia. And should we desire another representation we can define a show() function:

julia> function show(io::IO, k::Kelvin)
print(io, "$(k.value)°K")
end
show (generic function with 239 methods)

julia> Kelvin(4)
4.0°K

Again the advantage of this approach is that we don’t need to modify an existing class.

Reducing Boilerplate

My python solution is 60 lines of source code while the Julia version is almost half at 32 lines, despite offering more functionality.

But it is possible to reduce boilerplate and line count further by utilizing Julia macros. Each temperature type and the definition of the arithmetic operations are quite similar.

Lets define a list of the types we want to create. Identifiers for functions and types use symbols, which we write with a semi-colon prefix.

types = [:Celsius, :Kelvin, :Fahrenheit]

We can then iterate over each of these types and define the type, add and subtract functions. Everywhere where it says $T will be interpolated with symbol for one of our temperature types.

for T in types
@eval begin
immutable $T <: Temperature
value::Float64
end

+(x::$T, y::$T) = $T(x.value + y.value)
-(x::$T, y::$T) = $T(x.value - y.value)
end
end

We can also define constructors taking other temperature types as arguments. Here I am using the Julia ability to take iterate over a cartesian product in a for-loop:

for T in types, S in types
if S != T
@eval $T(temp::$S) = convert($T, temp)
end
end

It is easy to think that using macros will be very difficult to maintain, but in Julia you can always easily jump back to the exact line where a function was generated. In this example I put the code in the file codegen_temp.jl. We can use the which macro to check where the + operator for the Kelvin type was defined:

julia> @which Kelvin(3) + Kelvin(4)
+(x::Kelvin, y::Kelvin) in Main at codegen_temp.jl:12

Or I could use the @edit macro to open my preferred editor at the exact line where the + operator was defined.

Performance

While it is practical to wrap up floating point numbers in particular types to avoid unit mismatch, you might assume we’ll end up with a performance penalty because of this. Should we not get extra indirection and extra function calls?

In Python that will certainly be the case. We can use the dis package to look at the Python code generated when adding up different temperatures.

>>> def calc():
... return Fahrenheit(2) + Celsius(3) + Kelvin(4)
...
>>> import dis
>>> dis.dis(calc)
4 0 LOAD_GLOBAL 0 (Fahrenheit)
3 LOAD_CONST 1 (2)
6 CALL_FUNCTION 1
9 LOAD_GLOBAL 1 (Celsius)
12 LOAD_CONST 2 (3)
15 CALL_FUNCTION 1
18 BINARY_ADD
19 LOAD_GLOBAL 2 (Kelvin)
22 LOAD_CONST 3 (4)
25 CALL_FUNCTION 1
28 BINARY_ADD
29 RETURN_VALUE

You can see with the CALL_FUNCTION that we are calling the individual functions for creating temperature objects and addign them.

Just the initializer for Fahrenheit looks like this:

>>> dis.dis(Fahrenheit.__init__)
62 0 LOAD_GLOBAL 0 (super)
3 LOAD_GLOBAL 1 (Fahrenheit)
6 LOAD_FAST 0 (self)
9 CALL_FUNCTION 2
12 LOAD_ATTR 2 (__init__)
15 CALL_FUNCTION 0
18 POP_TOP

63 19 LOAD_GLOBAL 3 (type)
22 LOAD_FAST 1 (temp)
25 CALL_FUNCTION 1
28 LOAD_GLOBAL 4 (float)
31 COMPARE_OP 2 (==)
34 POP_JUMP_IF_TRUE 55
37 LOAD_GLOBAL 3 (type)
40 LOAD_FAST 1 (temp)
43 CALL_FUNCTION 1
46 LOAD_GLOBAL 5 (int)
49 COMPARE_OP 2 (==)
52 POP_JUMP_IF_FALSE 67

64 >> 55 LOAD_FAST 1 (temp)
58 LOAD_FAST 0 (self)
61 STORE_ATTR 6 (value)
64 JUMP_FORWARD 18 (to 85)

66 >> 67 LOAD_FAST 1 (temp)
70 LOAD_ATTR 7 (to_fahrenheit)
73 CALL_FUNCTION 0
76 LOAD_ATTR 6 (value)
79 LOAD_FAST 0 (self)
82 STORE_ATTR 6 (value)
>> 85 LOAD_CONST 0 (None)
88 RETURN_VALUE
>>>

Python byte code as you can see is quite abstract. It does not refer to registers or types so it is far away from actual assembly code.

Julia JIT

Julia utilizes a JIT to produce LLVM (low level virtual machine) byte code which is then turned into machine specific assembly code. Lets see what Julia will do in the same case. First we define our calc() function

julia> calc() = Fahrenheit(2) + Celsius(3) + Kelvin(4)
calc (generic function with 1 method)

Then we’ll look at the LLVM assembly code generated

julia> @code_llvm calc()

define %Kelvin @julia_calc_60794() #0 !dbg !5 {
top:
ret %Kelvin { double 0x407077BBBBBBBBBB }
}

Hmm this looks weird. There is actually no code there. It just returns a value. To more easily see what is going on lets, define a simpler version of calc()

julia> calc() = Celsius(3) + Celsius(4)
calc (generic function with 1 method)

julia> @code_llvm calc()

define %Celsius @julia_calc_60795() #0 !dbg !5 {
top:
ret %Celsius { double 7.000000e+00 }
}

As you can see, the Julia JIT actually just figures out that the value will be the same every time calc() is calculated and simply calculates that result, and let calc() return this. So the intel assembly code for this will be:

julia> @code_native calc()
.section __TEXT,__text,regular,pure_instructions
pushq %rbp
movq %rsp, %rbp
movabsq $4833046704, %rax ## imm = 0x1201270B0
movsd (%rax), %xmm0 ## xmm0 = mem[0],zero
popq %rbp
retq
nopw %cs:(%rax,%rax)

This all seems a bit unfair towards Python, so we’ll have to make a bit more elaborate example and see what happens.

julia> calc(a, b, c) = Fahrenheit(a) + Celsius(b) + Kelvin(c)
calc (generic function with 2 methods)

julia> @code_llvm calc(2.0, 3.0, 4.0)

define %Kelvin @julia_calc_60806(double, double, double) #0 !dbg !5 {
top:
%3 = fadd double %0, -3.200000e+01
%4 = fmul double %3, 5.000000e+00
%5 = fdiv double %4, 9.000000e+00
%6 = fadd double %5, %1
%7 = fadd double %6, 2.731500e+02
%8 = fadd double %7, %2
%9 = insertvalue %Kelvin undef, double %8, 0
ret %Kelvin %9
}

Even in this case it didn’t turn into very much code, despite the fact that we didn’t specify the types of the argument to calc(). Julia doesn't really require you to specify types. That is only use for specialization. You specify types in Julia to distinguish between different function implementations not for performance reasons. The Julia JIT will infer type information anyway.

The LLVM byte code gives a pretty accurate picture of the amount of native code this will produce.

julia> @code_native calc(2.0, 3.0, 4.0)
.section __TEXT,__text,regular,pure_instructions
pushq %rbp
movq %rsp, %rbp
movabsq $4833048464, %rax ## imm = 0x120127790
addsd (%rax), %xmm0
movabsq $4833048472, %rax ## imm = 0x120127798
mulsd (%rax), %xmm0
movabsq $4833048480, %rax ## imm = 0x1201277A0
divsd (%rax), %xmm0
addsd %xmm1, %xmm0
movabsq $4833048488, %rax ## imm = 0x1201277A8
addsd (%rax), %xmm0
addsd %xmm2, %xmm0
popq %rbp
retq
nopw %cs:(%rax,%rax)

So it is interesting to note that despite all the elaborate machinery in Julia for conversion and promotion, this is all done away with by the JIT and you get optimal machine code, no different from if we used floating point values directly for our calculations.

Improving Notation

So far when we are adding to temperatures together we write something like Celsius(3) + Celsius(4). But that is not how a mathematician would like to write it. They would rather write 3°C + 4°K. Fortunately you can actually accomplish this in Julia. In fact this is already common to do.

We simply add another arithmetic operation to our temperatures, which is to multiply a temperature with a regular number. With macros we can add this to every temperature type with a single line.

*(x::Number, y::$T) = $T(x*y.value)

Where $T will be a temperature type interpolated into the code. Then all we need to do is to add a constant representing a single degree.

const °C = Celsius(1)
const °F = Fahrenheit(1)
const K = Kelvin(1)

Here is an example from the REPL how we can use this:

julia> 3°C
Celsius(3.0)

julia> 3°C + 4°C
Celsius(7.0)
julia> 3°C + 4K
Kelvin(280.15)

You can see the full code, including the Python implementation on my Github.

Reduce Boilerplate With Type Parameterization

There is in fact quite a number of ways in which we can reduce boilerplate code in Julia, as I’ve been reminded of by DNF in the comment field. Here is an alternative to macros utilizing type parameterization. Below is a full definition of all the artimetic operations for all temperature types.

+(x::Temperature, y::Temperature) = +(promote(x,y)...)
-(x::Temperature, y::Temperature) = -(promote(x,y)...)

+(x::T, y::T) where {T <: Temperature} = T(x.value + y.value)
-(x::T, y::T) where {T <: Temperature} = T(x.value - y.value)

*(x::Number, y::T) where {T <: Temperature} = T(x * y.value)

The last part can be done in a number of ways. This is an alternative from DNF

*(x::Number, y::Temperature) = typeof(y)(x * y.value)

And just for kicks I’ll add my own little twist utilizing union types:

function *(x::Number, y::Union{Kelvin, Celsius, Fahrenheit})
typeof(y)(x * y.value)
end

This means y could be any of the three temperature types. However in this case it is not a great choice, as it would not handle the cases where a user of this package added their own temperature type.

However in some cases you have similar code for types which don’t have an inheritance relationship. For instance a 2D point, 2D vector or 2D unit vector will have a lot in common but they are not interchangable.

Further Exploration

If you want to explore units with Julia further, you should check out Unitful.jl. I’ve just done a quick test after Chris Rackauckas recommended it. With it you can import the units you want to use like this:

julia> import Unitful: ms, s, minute, hr, rad, °, mm, cm, m, km

It will then handle units for you in a sensible manner.

julia> 3ms + 4ms
7 ms
julia> 3ms + 4
ERROR: DimensionError: 3 ms and 4 are not dimensionally compatible.
julia> 8m/4m
2.0