Rust type language

George Shuklin
Published in journey to rust
Jun 11, 2019

Disclaimer: If you are learning Rust, take this with a grain of salt. I’m learning Rust too and I may be utterly wrong in my guesses.

The more I read and write Rust, the more I realize that Rust consists of two (three, if macros count) languages. One is an average programming language which an experienced programmer can learn in a week, plus a month or two for the standard library and staple crates. It describes nice things like loops, pattern matching, ranges, conditionals, functions, etc.

The second one is super hard and intimidating. It describes traits, types and lifetimes, and, amazingly, lifetimes are not the hardest part. Traits themselves go into the first category: we have a type and we have associated functions, a slightly unusual way to make classes. But as soon as we add a ‘type parameter’ to the trait, or an associated type, brains start to melt down.

… Not really, but you can deal with them only if you acknowledge that they are a hard language to learn. It’s not impossible, but you (I) need to focus on it. Learning Rust grinds to a halt when someone tries to learn Rust (type I) and treats those things as a peripheral nuisance to skip over. It’s like walking to the edge of a skyscraper roof and trying to ‘skip over’ to the next roof, ignoring the annoying little gap in between.

So, let’s stare at Rust (type II). I’m a beginner in Rust, so I will try to explain things I don’t fully understand. Hold your smile.

Easy Typing

Let’s start from the basics. A type defines three things:

  1. How much memory we need to hold the data.
  2. What kind of operations we can perform on the data.
  3. The compiler’s way to initialize the data (“string”, 0, 0x0000_0001, etc).

Once we have a type, we can make variables of this type. We can create functions which require this type and return this type. If we pass a variable with the wrong type, the compiler can find this early (at compile time) and scold us.
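A small sketch of these three points, using only the standard library. The function name takes_i32 is made up for illustration.

```rust
use std::mem::size_of;

// Hypothetical function that requires an i32 and returns an i32.
fn takes_i32(x: i32) -> i32 {
    x + 1 // addition is one of the operations i32 allows
}

fn main() {
    // 1. The type tells the compiler how much memory a value needs.
    println!(
        "u8: {} byte, i32: {} bytes, u64: {} bytes",
        size_of::<u8>(),
        size_of::<i32>(),
        size_of::<u64>()
    );

    // 3. Literals are the compiler's way to create values of a type.
    let n: i32 = 0x0000_0001;
    let s: &str = "string";

    // 2. The type decides which operations are allowed.
    println!("{} {}", takes_i32(n), s.len());

    // Passing the wrong type is caught at compile time:
    // takes_i32(s); // error[E0308]: mismatched types
}
```

Uncommenting the last line should stop the build before any binary is produced, which is exactly the scolding described above.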

I don’t want to go into the pointer quagmire here, so, end of the easy story.

Simple Generics

If we look at type declarations and usage, they look like some kind of simple expression. For example, a=1 contains a simple expression (1), which evaluates to… em, 1. Let’s try to explore what kind of expressions we can build with types.

For example, if we say that a function takes any type and returns it unchanged, it’s a simple expression:

fn func (a: any type) -> the same type as a

This is generics in their simplest form. The Rust notation for it is:

fn func <T> (a: T) -> T

The interesting part here is ‘<T>’, which is the declaration of a ‘type variable’. This variable can hold a type; in this case, any type. I want to repeat this: this variable holds a type, not a value of some type. And as a type variable, it can be used in place of any ‘type expression’. In this case it’s used in two places: a: T and -> T.
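Here is a minimal runnable version of that signature. The body (just returning a) is my assumption, since it is the only thing such a function can do; the type variable T takes a different value at each call.

```rust
// The generic function from the text, with the simplest possible body.
fn func<T>(a: T) -> T {
    a
}

fn main() {
    let n: i32 = func(5i32); // here T is i32
    let s: &str = func("hello"); // here T is &str
    let t = func((1u8, 'x')); // here T is the tuple (u8, char)
    println!("{} {} {:?}", n, s, t);
}
```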

From types to type equations

This is the super mega giga hard part. Pay close attention.

What is ‘a: T’? Does it say ‘let T = typeof(a)’? NO. This is the hard part of understanding Rust (type II).

The language of types is not imperative. It’s declarative.

That means that there is no T=typeof(a), and -> T does not come after the a: T statement.

There is no runtime here. It’s all done at compile time, and it’s calculated with no notion of sequence. If you know linear algebra, it’s the same as a system of equations:

x + 2y = 3
4x + 5y = 6

You can’t say that the upper equation is calculated earlier than the lower one. They are solved together.

The same goes for Rust type statements (earlier I called them expressions; no, they are statements, or equations). During compilation the compiler solves those equations together. If there is an answer, you get your binary, and if there is no solution, you get a compiler error.

So, our simple generic is actually a set of equations:

a=T
return_value_of_func=T

Can we solve it? Of course, we can. The answer is ‘any’.

But wait, there is one more missing piece of code, the third equation, which gives us less ‘any’ and more ‘something’. It’s the code where we use the function.

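If you write (in Rust) let b = func(1), our equations alone don’t give a unique answer: they are satisfied by ‘any’, and the literal 1 fits many types. Rust can’t allow multiple answers to the set of type equations in the final binary. Recall, we need to know how big this type is. When we use the function we produce a second set of equations and they are joined together.

func:
a=T
return_value=T
b=func(typeof(1))

This set has a unique solution only if we know the type of 1. But Rust has 1: i32, 1: u32, 1: u8, etc. Which one are we talking about? The equations don’t say. For integer literals the compiler breaks the tie with a built-in rule (an otherwise unconstrained integer literal falls back to i32), so this particular line does compile. Where no such fallback exists, for example let b = func(None), the answer stays ambiguous and we get a compiler error asking for type annotations.

The following code needs no fallback at all: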

let c: i32 = 1;
let b = func(c);

Why? Because we have a concrete type for func. Our set of equations becomes this:

func:
a=T
return_value=T
c=i32
b=func(c)

If we solve this thing, we get b=i32, the same type as c. And this solution is unique, so we can compile this. Moreover, when Rust uses our function, it knows that the size of T is the same as the size of i32.
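To see the solved equations from the outside, here is a small sketch. The helper type_of is hypothetical; it relies on std::any::type_name, whose exact output the standard library does not promise to keep stable, so treat the printed names as a diagnostic only.

```rust
use std::any::type_name;
use std::mem::size_of_val;

fn func<T>(a: T) -> T {
    a
}

// Hypothetical helper: reports the type the compiler settled on for a value.
fn type_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // c = i32, so T = i32, so b = i32.
    let c: i32 = 1;
    let b = func(c);
    println!("b: {} ({} bytes)", type_of(&b), size_of_val(&b));

    // The equations are solved together, not top to bottom:
    // an annotation on the result fixes the type of the argument too.
    let d: u8 = func(7);
    println!("d: {} ({} bytes)", type_of(&d), size_of_val(&d));
}
```

The second call is the declarative part in action: the annotation sits on the result, yet it decides the type of the literal 7 passed as the argument.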

If we use func with some other type, we’ll have another two sets of equations, which would resolve to that ‘other type’.

We can expand our understanding of generics without stepping to the next level. Of course, we can have more than one type parameter.

fn func2<T,G> (a: T, b:G, c:T) -> G

It takes three parameters: a and c have the same type, and b has another type, which is the same as the return type of func2. If we use it in this code:

let x: i32 = func2(1 as u32, 42, 2);

Rust will be able to calculate the missing types. If you don’t see the missing types, they are:

42: is it a u32 or i32? Or is it i8?

2: the same problem.

But if we write down the type equations, everything becomes unambiguous:

func2:
a=T
b=G
c=T
return_value=G
x=i32
func2(u32, (42?), (2?))
=>func2(u32, ?, ?) -> i32

As you can see, we can calculate which types are passed into func2 based on the function signature. The second argument has the same type as the return value (and we know the type of the return value because it’s assigned to ‘x’, which is known to be i32), and the third argument has the same type as the first (which is u32).
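A sketch to check this solution. The body of func2 (returning b) is an assumption, and func2_traced is a hypothetical variant I added only to report which types T and G ended up as; the names come from std::any::type_name and are meant as a diagnostic.

```rust
use std::any::type_name;

// Same signature as func2 in the text; the body just hands `b` back.
fn func2<T, G>(_a: T, b: G, _c: T) -> G {
    b
}

// Hypothetical variant: same arguments, plus the names of the inferred types.
fn func2_traced<T, G>(_a: T, b: G, _c: T) -> (G, &'static str, &'static str) {
    (b, type_name::<T>(), type_name::<G>())
}

fn main() {
    // 42 must be i32 (same as x), 2 must be u32 (same as the first argument).
    let x: i32 = func2(1 as u32, 42, 2);
    println!("x = {}", x);

    let (y, t, g): (i32, _, _) = func2_traced(1 as u32, 42, 2);
    println!("y = {}, T = {}, G = {}", y, t, g);
}
```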

Solved!

No type no way

Unfortunately, we have one more bad problem to deal with: the body of the function. Let’s try to upgrade our function func2 to do something.

fn func2<T, G>(a: T, b: G, c: T) -> G {
    b
}

What a boring function. It has only one bright side: it compiles.

If we try to do something, we’re gonna have a problem. Let’s say we want to return b if a == c, and return 0 if a != c. What a failure awaits us…

  1. We can’t compare a and c.
  2. We can’t return 0 instead of b.

… Because we don’t know what we are allowed to do with our types. Remember that types say what kind of operations we can perform on their values.

And if the type is ‘anything’, then you can’t do anything with it, except pass it along unchanged (pass it into another function which agrees to take it, or return it back to the caller).

So, here we need to put some restrictions on function arguments for the caller, and, at the same time, give ourselves some permissions to do something with the function arguments inside the function’s body.

As for now, we know one way to do this: specify a concrete type. On one hand it gives us a lot of freedom in the function’s body (we know the exact type and we can do anything we want with it); on the other hand it gives us an over-restrictive function signature, which forces the function’s user to provide precise types.
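The trade-off can be sketched side by side. The name func2_concrete is made up for this example, and picking u32 and i32 as the concrete types is just an assumption that matches the earlier call.

```rust
// Fully generic: the body may only move the values around.
fn func2<T, G>(_a: T, b: G, _c: T) -> G {
    // if _a == _c { b } else { 0 }
    // ^ rejected: nothing says a T can be compared (error[E0369]),
    //   and 0 is an integer, not a G (error[E0308]).
    b
}

// Concrete types: the body can compare and can return 0,
// but callers must pass exactly u32, i32, u32.
fn func2_concrete(a: u32, b: i32, c: u32) -> i32 {
    if a == c {
        b
    } else {
        0
    }
}

fn main() {
    println!("{}", func2("any", 42, "thing")); // accepts anything, does nothing
    println!("{}", func2_concrete(1, 42, 1)); // a == c, returns b
    println!("{}", func2_concrete(1, 42, 2)); // a != c, returns 0
}
```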

Type parameters

To revamp type power we need to take one more construction into consideration: composite types. Composite types are types which use other types as their building blocks. Lists, tuples, arrays, vectors, structures: all of them are examples of composite types.

Let’s start from a concrete composite type:

struct Comp {
    field1: u32,
    field2: i32
}

A rather boring creature it is. Let’s look at what’s gonna change in the Rust (type II) language when we use Comp as a concrete and as a generic type for a function.

Concrete version:

fn func(a: Comp) -> Comp ...
let b = func(Comp { field1: 1, field2: 2 });

The type relations are:

Comp consists of u32, i32
func:
a = Comp
return_value = Comp
b = return_type_of func (Comp)
func is called with Comp of 1:t1 and 2:t2

As you can see, it’s pretty trivial to deduce what type b has (it’s the return value of func, Comp). Moreover, it’s trivial to see that ‘1’ has type u32 and 2 has type i32.
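A runnable sketch of the concrete version. The text elides the body of func, so returning the argument unchanged is my assumption.

```rust
struct Comp {
    field1: u32,
    field2: i32,
}

// Assumed body: hand the value straight back.
fn func(a: Comp) -> Comp {
    a
}

fn main() {
    // No annotations on the literals: the struct definition
    // already says that 1 is a u32 and 2 is an i32.
    let b = func(Comp { field1: 1, field2: 2 });

    // These annotations only compile if the deduction above is right.
    let first: u32 = b.field1;
    let second: i32 = b.field2;
    println!("{} {}", first, second);
}
```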

Let’s throw in some generics. We can pass type as an argument (parameter) to the structure, the same way we do with functions:

struct Comp<T1, T2> {
    field1: T1,
    field2: T2
}

This gives us two things: the same ‘monomorphization’ as before (converting from generic to concrete types by substituting type variables with the values they hold), but, more importantly, it gives us the idea of a type parameter.

Every time we use this structure, it brings an equation with it. We can’t just use this type, we need to add this equation to the resulting set. Moreover, we need to pass some types into this equation. I’m not sure it’s an equation anymore; it looks more like a function on types. We pass two types and get a third one, consisting of the ‘struct type’ with our types inside.
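One way to look at this ‘function on types’, as a rough sketch: feed Comp different type arguments and ask the compiler how big each resulting type is. Exact sizes depend on the layout the compiler picks, so I only print them here.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Comp<T1, T2> {
    field1: T1,
    field2: T2,
}

fn main() {
    // Each pair of type arguments produces a different concrete type.
    println!("Comp<u8, u8>:   {} bytes", size_of::<Comp<u8, u8>>());
    println!("Comp<u32, i32>: {} bytes", size_of::<Comp<u32, i32>>());
    println!("Comp<u64, u64>: {} bytes", size_of::<Comp<u64, u64>>());

    let small: Comp<u8, u8> = Comp { field1: 1, field2: 2 };
    let big: Comp<u64, u64> = Comp { field1: 1, field2: 2 };
    println!("{} {}", small.field1, big.field2);

    // The results are distinct types and can't stand in for each other:
    // let oops: Comp<u8, u8> = big; // error[E0308]: mismatched types
}
```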

So, let’s look at code with generics in structures. I’ll skip the intermediate steps and jump directly to the full-fledged case.

struct Comp<T1, T2> {
    field1: T1,
    field2: T2
}
fn func<T, U>(a: Comp<T, T>, b: Comp<U, U>) -> Comp<T, U> {
    Comp { field1: a.field1, field2: b.field2 }
}
fn main() {
    let c = func(
        Comp { field1: 1 as u32, field2: 3 },
        Comp { field1: 2 as i32, field2: 4 }
    );
}

Side note: I feel like I’m riding a wave of understanding. A few typos aside, this code actually compiled on my first attempt. And I’m glad I’ve kept lifetimes aside for now.

So, let’s look at our Rust (type II) stuff:

# Comp is a function of two types (T1 and T2), which creates a type of those two combined into a third type.
Comp(T1, T2) = |T1, T2| new_type_of_T1_T2 # pseudo lambda notation here
func:
a = Comp(T, T)
b = Comp(U, U)
return_value = Comp(T, U)
func_body:
return_value = Comp(T, U)
main_body:
c = return_type_of_func
a: Comp(u32, ?1)
b: Comp(i32, ?2)

As you can see, we have all we need to solve this:

?1 is u32, ?2 is i32, c is Comp(u32, i32)

Interesting thing here: it’s not a completely generic function. We have a very concrete Comp type there, and this is the reason why we can construct a new return value. The compiler is given enough information to know at compile time what it can do with the Comp type (how to create a value of that type).
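The solution can be double-checked by writing it into the code as explicit annotations, a sketch that reuses the definitions above: if c were anything other than Comp<u32, i32>, these lines would not compile.

```rust
struct Comp<T1, T2> {
    field1: T1,
    field2: T2,
}

fn func<T, U>(a: Comp<T, T>, b: Comp<U, U>) -> Comp<T, U> {
    Comp { field1: a.field1, field2: b.field2 }
}

fn main() {
    // The annotation states the solved type of c explicitly.
    let c: Comp<u32, i32> = func(
        Comp { field1: 1 as u32, field2: 3 }, // ?1 is u32
        Comp { field1: 2 as i32, field2: 4 }, // ?2 is i32
    );

    let first: u32 = c.field1; // taken from a.field1
    let second: i32 = c.field2; // taken from b.field2
    println!("{} {}", first, second);
}
```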

Intermediate conclusion

The chapter on pure generics became more complex than I expected. Even without references, traits and lifetimes, generics themselves are a rather complex mechanism for expressing relations between types. Rust (type II) is hard, partially obscured by Rust (type I), and requires a special way to reason about it.

I’ll continue to dig into the Rust (type II) language in the following chapters.

I work at Servers.com, most of my stories are about Ansible, Ceph, Python, Openstack and Linux. My hobby is Rust.