Data Types, Expressions and Abstraction in Erlang

Published in Getting Started with Erlang · Feb 25, 2019

Data types

Like most programming languages, Erlang provides a wide range of data types. The data types, their usage and their significance are listed below:

Terminology: A piece of data of any data type is called a term in Erlang.

Numeric data types: Erlang has two numeric data types: integers and floats.

Atom: An atom is a literal, i.e. a constant with a name, much like named constants in C. An unquoted atom must begin with a lowercase letter and can contain only alphanumeric characters, underscores (_) and @. If it is enclosed in single quotes, it is exempt from these rules.

Bit Strings and Binaries: A bit string stores an area of untyped memory. A binary is a bit string whose length in bits is exactly divisible by 8.

Reference: A reference is a term that is unique within an Erlang runtime system. It is created by calling make_ref/0.

Fun: A fun is a functional object. Funs make it possible to create an anonymous function and pass it as an argument to other functions.

Port Identifier: This identifies an Erlang port. Ports are specific to Erlang, and more information on their usage can be found at the following link: http://erlang.org/doc/reference_manual/ports.html

Pid: A process identifier, as the name suggests, identifies a process. The ‘spawn’ functions create a process and return its pid.

Tuple: A tuple is a compound data type that contains a fixed number of Erlang terms. Each term in a tuple is called an element, and the number of elements is the size of the tuple. There are multiple ways to manipulate tuples in Erlang.

Map: A map stores a variable number of key-value associations. It is similar to the dictionary types found in many languages. Each key-value association is called an association pair, and the number of association pairs is the size of the map.

List: A list is a compound data type that can hold a variable number of terms.

String: A string is a sequence of characters enclosed in double quotes. However, string is not a separate data type in Erlang; a string is simply a list of character codes. For example, "hello" and [$h,$e,$l,$l,$o] denote the same term.
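To make the point about strings concrete, here is a minimal sketch. The module name string_demo and the function same_string/0 are made up for illustration; they are not from the original article.

```erlang
-module(string_demo).
-export([same_string/0]).

%% A double-quoted string is just a list of character codes,
%% so the three spellings below denote the same term.
same_string() ->
    Quoted = "hello",
    Chars  = [$h, $e, $l, $l, $o],
    Codes  = [104, 101, 108, 108, 111],
    (Quoted =:= Chars) andalso (Chars =:= Codes).
```

After c(string_demo)., calling string_demo:same_string(). in the shell should return true.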

Record: A record is a data structure for storing a fixed number of elements. It has named fields, and is similar to the ‘struct’ construct in C.

Boolean: Erlang has no separate Boolean data type. Instead, the atoms ‘true’ and ‘false’ are used to denote Boolean values.

Type Conversions: Erlang provides a number of built-in functions (BIFs) for converting between types.
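A few of the standard conversion BIFs are sketched below. The module name convert_demo is hypothetical; the BIFs themselves are part of the erlang module and are auto-imported.

```erlang
-module(convert_demo).
-export([examples/0]).

%% Returns the results of several common type conversions.
examples() ->
    [ list_to_integer("42"),       % string  -> integer
      integer_to_list(42),         % integer -> string
      atom_to_list(hello),         % atom    -> string
      list_to_atom("world"),       % string  -> atom
      tuple_to_list({a, b, c}),    % tuple   -> list
      list_to_tuple([1, 2]),       % list    -> tuple
      float(3),                    % integer -> float
      trunc(3.9),                  % float   -> integer (drops the fraction)
      binary_to_list(<<"ab">>) ].  % binary  -> string
```

Calling convert_demo:examples(). should give [42,"42","hello",world,[a,b,c],{1,2},3.0,3,"ab"].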

Expressions

Here is a list of the valid expressions:

1. Expression Evaluation:

All sub-expressions are evaluated before an expression itself is evaluated, unless explicitly stated otherwise. For example, in Expr1 + Expr2, Expr1 and Expr2 are each evaluated first, and then the addition is performed.

2. Terms:

A term is the simplest form of expression: an integer, float, atom, string, list, map or tuple. The return value of a term is the term itself.

3. Variables:

If a variable is bound to a value, the return value is this value. Unbound variables are only allowed in patterns. Variables start with an uppercase letter or underscore (_). Variables can contain alphanumeric characters, underscore and @.

Variables are bound to values using pattern matching. Erlang uses single assignment, that is, a variable can only be bound once. The anonymous variable is denoted by underscore (_) and can be used when a variable is required but its value can be ignored.

Variables starting with underscore (_), for example, _Height, are normal variables, not anonymous. They are however ignored by the compiler in the sense that they do not generate any warnings for unused variables.

The scope of a variable is its function clause. Variables bound in a branch of an if, case, or receive expression must be bound in all branches to have a value outside the expression. Otherwise they are regarded as ‘unsafe’ outside the expression. For the try expression, variable scoping is limited so that variables bound in the expression are always ‘unsafe’ outside the expression.
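The binding and scoping rules above can be sketched as follows. The module var_demo and its functions are hypothetical names chosen for this illustration.

```erlang
-module(var_demo).
-export([bind_once/0, scope/1]).

bind_once() ->
    X = 1,                        % X is now bound to 1
    _Unused = 99,                 % a normal variable, but no "unused" warning
    {_, Second} = {ignored, 2},   % _ is the anonymous variable
    %% Writing X = 2 here would fail with a badmatch error,
    %% because X is already bound to 1.
    X + Second.

%% Sign is bound in every branch of the case,
%% so it is safe to use after the expression.
scope(N) ->
    case N > 0 of
        true  -> Sign = positive;
        false -> Sign = non_positive
    end,
    Sign.
```

Here var_demo:bind_once() should return 3, and var_demo:scope(5) should return positive.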

4. Patterns:

A pattern has the same structure as a term, but it can include unbound variables. For example: Name1, [H|T], {error, Reason}

Patterns are allowed in clause heads, in case and receive expressions, and in match (‘=’) expressions.

The following is a valid pattern when we try to match strings:

f("prefix" ++ Str) -> …

It is equivalent to the following, which is harder to read:

f([$p,$r,$e,$f,$i,$x | Str]) -> …

An arithmetic expression can be used inside a pattern if the following two conditions are met:

It uses only numeric or bitwise operators.
Its value can be evaluated to a constant when compiled.

Example:

case {Value, Result} of
{?THRESHOLD+1, ok} -> …

5. Match:

The following matches Expr1, a pattern, against Expr2: Expr1 = Expr2

If the matching succeeds, any unbound variable in the pattern becomes bound and the return value is the value of Expr2. If the matching fails, a badmatch run-time error occurs.
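A small sketch of both outcomes of a match follows. The module match_demo and the helper returns_error/0 are hypothetical and exist only to trigger the failing case.

```erlang
-module(match_demo).
-export([ok_match/0, bad_match/0]).

%% Successful match: Value becomes bound to 42.
ok_match() ->
    {ok, Value} = {ok, 42},
    Value.

%% Failed match: the badmatch error carries the value that did not match.
bad_match() ->
    try
        {ok, _} = returns_error(),
        no_error
    catch
        error:{badmatch, Term} -> {caught_badmatch, Term}
    end.

returns_error() -> {error, enoent}.
```

Under these assumptions, match_demo:ok_match() returns 42 and match_demo:bad_match() returns {caught_badmatch,{error,enoent}}.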

6. Function calls:

Line 1 : ExprF(Expr1,…,ExprN)

Line 2: ExprM:ExprF(Expr1,…,ExprN)

In the second line, ExprM and ExprF must each be an atom or an expression that evaluates to an atom. The function is then said to be called by using the fully qualified function name, which is often referred to as a remote or external function call.

In the first line, ExprF must be an atom or evaluate to a fun. If ExprF is an atom, the function is said to be called by using the implicitly qualified function name. If a function named ExprF is defined locally, it is called. Alternatively, if ExprF has been explicitly imported from a module M, M:ExprF(Expr1,…,ExprN) is called. If ExprF is neither defined locally nor explicitly imported, it must be the name of an automatically imported BIF.
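The different call forms can be compared side by side in a short sketch. The module call_demo and the function double/1 are hypothetical; lists:reverse/1 and lists:max/1 are standard library functions.

```erlang
-module(call_demo).
-export([run/0, double/1]).

double(X) -> 2 * X.

run() ->
    Local   = double(4),                 % local call: ExprF(Expr1)
    Remote  = lists:reverse([1, 2, 3]),  % remote call: ExprM:ExprF(Expr1)
    Mod = lists,
    Fun = max,
    Dynamic = Mod:Fun([3, 9, 2]),        % ExprM and ExprF evaluate to atoms
    F = fun double/1,
    ViaFun  = F(5),                      % ExprF evaluates to a fun
    {Local, Remote, Dynamic, ViaFun}.
```

Calling call_demo:run(). should return {8,[3,2,1],9,10}.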

7. If:

An if expression has the form if GuardSeq1 -> Body1; …; GuardSeqN -> BodyN end. Its branches are scanned sequentially until a guard sequence GuardSeq that evaluates to true is found. Then the corresponding Body (a sequence of expressions separated by ‘,’) is evaluated.

The return value of the if expression is the return value of Body.

If none of the guard sequences evaluates to true, an if_clause run-time error occurs.
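As a sketch of these rules, the hypothetical module if_demo below shows an if expression with a catch-all branch, and one without a catch-all that raises if_clause.

```erlang
-module(if_demo).
-export([classify/1, no_fallback/1]).

%% The guard 'true' acts as a catch-all branch.
classify(N) ->
    if
        N < 0   -> negative;
        N =:= 0 -> zero;
        true    -> positive
    end.

%% With no catch-all branch, a non-positive N raises if_clause.
no_fallback(N) ->
    try
        if N > 0 -> positive end
    catch
        error:if_clause -> no_branch_matched
    end.
```

For example, if_demo:classify(-3) returns negative, and if_demo:no_fallback(0) returns no_branch_matched.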

8. Case:

A case expression has the form case Expr of Pattern1 [when GuardSeq1] -> Body1; …; PatternN [when GuardSeqN] -> BodyN end. First, the expression Expr is evaluated. Then the patterns Pattern1 … PatternN are matched sequentially against the result. If a match succeeds and the optional guard sequence GuardSeq is true, the corresponding Body is evaluated.

The return value of the case expression is the return value of Body.

If there is no matching pattern with a true guard sequence, a case_clause run-time error occurs.
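A minimal sketch of a case expression with patterns and an optional guard is shown below. The module case_demo and the shapes of the tuples are illustrative assumptions.

```erlang
-module(case_demo).
-export([describe/1]).

describe(Result) ->
    case Result of
        {ok, N} when is_integer(N), N > 0 -> {positive, N};
        {ok, N}                           -> {other, N};
        {error, Reason}                   -> {failed, Reason}
        %% Any other term matches no clause and raises case_clause.
    end.
```

The first clause is chosen only when both the pattern matches and the guard is true; otherwise matching continues with the next clause.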

9. Term Comparisons:

Expr1 op Expr2

The arguments can be of different data types. The following order is defined: number < atom < reference < fun < port < pid < tuple < map < nil < list < bit string

Maps are ordered by size. Two maps of the same size are compared by keys in ascending term order, and then by values in key order. In maps, integer keys are considered less than float keys. Lists are compared element by element. Tuples are also ordered by size, and two tuples of the same size are compared element by element. When an integer is compared to a float, the term with the lesser precision is converted to the type of the other term, unless the operator is =:= or =/=.

The return value of a term comparison is the Boolean value of the expression, true or false.
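The ordering rules can be checked with a few literal comparisons. In this sketch the module name compare_demo is hypothetical, and every element of the returned list is expected to be true.

```erlang
-module(compare_demo).
-export([checks/0]).

checks() ->
    [ 1 < a,              % number < atom
      a < {a},            % atom < tuple
      {a} < #{},          % tuple < map
      #{} < [],           % map < nil
      [] < [1],           % nil < list
      [1] < <<1>>,        % list < bit string
      1 == 1.0,           % == converts between integer and float
      not (1 =:= 1.0),    % =:= does not convert
      {z} < {a, b},       % tuples are ordered by size first
      [1, 2] < [1, 3] ].  % lists are compared element by element
```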

10. Arithmetic Expressions:

Line 1: op Expr1

Line 2: Expr1 op Expr2

11. Boolean Expressions:

Line 1: op Expr1

Line 2: Expr1 op Expr2

12. Short-circuit Expressions:

Line 1: Expr1 orelse Expr2

Line 2: Expr1 andalso Expr2

Expr2 is not always evaluated. It is evaluated only if Expr1 evaluates to false in an orelse expression, or Expr1 evaluates to true in an andalso expression. The expression returns either the value of Expr1 (that is, true or false) or the value of Expr2, if Expr2 is evaluated.
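Short-circuit evaluation is useful when the second operand is only safe to evaluate under some condition. The following is a sketch with hypothetical function names.

```erlang
-module(short_circuit_demo).
-export([ratio_above_one/2, empty_or_positive_head/1]).

%% The division is only evaluated when B is non-zero.
ratio_above_one(A, B) ->
    (B =/= 0) andalso (A / B > 1).

%% hd/1 is only evaluated when the list is non-empty.
empty_or_positive_head(L) ->
    (L =:= []) orelse (hd(L) > 0).
```

With these definitions, short_circuit_demo:ratio_above_one(4, 0) returns false instead of raising a badarith error.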

13. List Operations:

Line 1: Expr1 ++ Expr2

Line 2: Expr1 -- Expr2

In the first line, the list concatenation operator ++ appends its second argument to its first and returns the resulting list. In the second line, the list subtraction operator -- produces a list that is a copy of the first argument, except that for each element in the second argument, the first occurrence of that element (if any) is removed.
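Both operators are sketched below in a hypothetical module list_ops_demo.

```erlang
-module(list_ops_demo).
-export([append/0, subtract/0]).

%% ++ appends the second list to the first.
append() ->
    [1, 2, 3] ++ [4, 5].

%% -- removes, for each element of the second list,
%% its first occurrence in the first list.
subtract() ->
    [1, 2, 3, 2, 1, 2] -- [2, 1, 2].
```

Here append() returns [1,2,3,4,5] and subtract() returns [3,1,2].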

14. Parenthesized Expressions:

(Expr)

Parenthesized expressions are useful for overriding operator precedence.

15. Block Expressions:

Block expressions, written as begin Expr1, …, ExprN end, provide a way to group a sequence of expressions, similar to a clause body. The return value of a block expression is the value of its last expression, ExprN.
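Parenthesized and block expressions can be illustrated together. The module block_demo and its function names are assumptions made for this sketch.

```erlang
-module(block_demo).
-export([precedence/0, next_odd/1]).

%% Parentheses override the usual precedence of * over +.
precedence() ->
    {1 + 2 * 3, (1 + 2) * 3}.

%% The value of a begin ... end block is the value of its last expression.
next_odd(N) ->
    Result = begin
                 Double = 2 * N,
                 Double + 1
             end,
    Result.
```

Here precedence() returns {7,9}, and next_odd(3) returns 7.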

Abstraction

“Abstract” simply means to hide something. In general, programming languages offer two classes of abstraction mechanisms: Data Abstraction and Control Abstraction.

Data Abstraction allows the definition and use of sophisticated data types without referring to how such types are implemented.

Control Abstraction provides the programmer with the ability to hide procedural data.

Data Abstraction

Data abstraction is one of the most essential features of programming. Data abstraction means providing only the essential information about data to the outside world. We provide a separate interface through which the data is used and manipulated, but with limited visibility into its structure. While the language itself does not enforce data abstraction, it can be checked using Dialyzer.

The way to achieve data abstraction in Erlang is to declare a data type as opaque. Dialyzer then reports a warning if code outside the defining module digs into the structure. Erlang is a functional language, and is thus more data-flow oriented than control-flow oriented. While it does not have direct data abstraction, it does have process abstraction. This means that the internal state of a separate process cannot be viewed from the outside; the process can only receive and send messages. Because of this, an entire process can be substituted for another without the knowledge of the caller. This principle is key to abstraction in Erlang, and it complements the opaque types that Dialyzer checks.

Dialyzer: This is a static analysis tool that ships with Erlang/OTP. It identifies software discrepancies such as type errors and dead code.
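As a sketch of an opaque type, consider a hypothetical stack module. The module name, type name and functions below are assumptions made for illustration; the -opaque and -spec attributes are standard Erlang.

```erlang
-module(stack).
-export([new/0, push/2, pop/1, is_empty/1]).
-export_type([stack/0]).

%% Outside this module, stack() should be treated as a black box.
-opaque stack() :: [term()].

-spec new() -> stack().
new() -> [].

-spec push(term(), stack()) -> stack().
push(X, S) -> [X | S].

-spec pop(stack()) -> {term(), stack()} | empty.
pop([])      -> empty;
pop([X | S]) -> {X, S}.

-spec is_empty(stack()) -> boolean().
is_empty([]) -> true;
is_empty(_)  -> false.
```

Nothing stops other modules from pattern matching on the underlying list at run time, but Dialyzer is expected to flag such code as a violation of the opaque type.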

Control Abstraction

Erlang is a functional programming language, which means that functions can be passed as arguments to functions and that functions can return functions. Functions that manipulate functions are called higher-order functions. The data type that represents a function in Erlang is called a fun.

Functional programs can manipulate not only regular data structures but also the functions that transform data.

Funs can be used in different ways, as mentioned below:

1. Performing the same operation on every element of a list: To do this we pass funs as arguments to functions, which is extremely common.

2. Creating our own control abstractions: This technique is very useful. Erlang has no for loop, but it gives us the flexibility to create our own. The advantage of creating our own control abstractions is that we can make them do exactly what we want, rather than relying on a predefined set of control abstractions that might not behave the way we want.

3. Implementing things like parser combinators or lazy evaluators: We can write functions that return funs. This is a very powerful technique, but such programs can become difficult to debug.
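The first and third uses can be sketched in a few lines. The module fun_demo and its functions are hypothetical; lists:map/2 is a standard library function.

```erlang
-module(fun_demo).
-export([doubled/1, adder/1]).

%% Passing a fun as an argument: apply it to every element of a list.
doubled(L) ->
    lists:map(fun(X) -> 2 * X end, L).

%% Returning a fun: the result remembers the value of N.
adder(N) ->
    fun(X) -> X + N end.
```

For example, fun_demo:doubled([1,2,3]) returns [2,4,6], and after Add5 = fun_demo:adder(5), the call Add5(3) returns 8.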

Defining our own Control Abstractions

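As mentioned above, Erlang has no built-in for or while loops, and its if is a guard-based expression rather than a general-purpose statement. We can build such constructs ourselves, as shown below: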

NOTE: All the programs should be saved with .erl extension.

For Loop:

The declaration -export([for/3]) means that the function for/3 can be called from outside the module for_loop.

Here we have defined a simple for loop. We can use it to make a list of integers or to find the squares of a list of numbers and so on.
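The original listing is not reproduced here. A minimal sketch of what such a for_loop module might look like is given below; the argument order (start, end, fun) is inferred from the calls used in this article, and the guard is an added assumption.

```erlang
-module(for_loop).
-export([for/3]).

%% for(I, Max, F) applies F to every integer from I to Max
%% and collects the results in a list.
for(Max, Max, F) ->
    [F(Max)];
for(I, Max, F) when I < Max ->
    [F(I) | for(I + 1, Max, F)].
```

The guard I < Max makes a call with a start value greater than the end value fail with a function_clause error instead of recursing forever.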

To print a list of integers:

Compile the above for_loop.erl file using command c(for_loop).

And then run the program to print a list of numbers from 1 to 20 using for_loop:for(1,20,fun(I) -> I end).

Output of a program with for loop implementation

To compute the squares of the integers from 1 to 20:

Compile the above for_loop.erl file using the command c(for_loop).

And then run the program to print the squares of the numbers from 1 to 20 using for_loop:for(1,20,fun(I) -> I * I end).

Output of a program with for loop implementation

While Loop:

Here we have defined a simple while loop which starts printing from 2. To compile and run the program, execute the commands as below:
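The original listing and commands are not reproduced here. The following is only a sketch of one way to write such a loop: the module name while_loop, the function while/2 and the upper bound of 5 are all assumptions.

```erlang
-module(while_loop).
-export([while/2, start/0]).

%% Prints N, N+1, ... for as long as the condition N =< Max holds.
while(N, Max) when N =< Max ->
    io:format("~w~n", [N]),
    while(N + 1, Max);
while(_, _) ->
    ok.

%% Starts printing from 2; the upper bound 5 is arbitrary.
start() ->
    while(2, 5).
```

Under these assumptions, the module would be compiled with c(while_loop). and run with while_loop:start().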

Output of a program with while loop implementation

Branching

We can use pattern matching for branching. The example below shows how:

NOTE: 1. io:format formats its output by replacing tokens in a string.

2. The character used to denote a token is the tilde (~). ~n denotes a line break, ~s denotes a string, and ~w writes any Erlang term (an integer, for example) in its standard syntax.

Branching happens based on the value of the first parameter, and the appropriate output is displayed. Compile and run the program as follows:
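The original listing is not reproduced here. A hypothetical module in the same spirit is sketched below; the module name branch, the function greet/2 and the message texts are assumptions.

```erlang
-module(branch).
-export([greet/2]).

%% The clause is selected by pattern matching on the first argument.
greet(male, Name) ->
    io:format("Hello, Mr. ~s!~n", [Name]);
greet(female, Name) ->
    io:format("Hello, Mrs. ~s!~n", [Name]);
greet(_, Name) ->
    io:format("Hello, ~s!~n", [Name]).
```

With this sketch, the program would be compiled with c(branch). and run with, for example, branch:greet(male, "Smith").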

Output of a program for implementation of branching

Defining a function:

As already mentioned, Erlang is a functional programming language and hence we would expect to see a lot of emphasis on functions and how they work.

Syntax of a function:

FunctionName(Pattern1, …, PatternN) ->

Body.

A function can have several such clauses. The clauses are separated by semicolons, and the last one ends with a period.

Function to multiply two numbers:

Note: 1. The mul function takes two arguments.

2. Both mul and start are listed in the -export attribute so that they can be called from outside the module.

3. The start function calls the mul function in this program.

4. We can call the mul function directly and supply the arguments at run time, or call the start function to print the output for an already defined input.
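The original listing is not reproduced here. A sketch consistent with the notes above might look as follows; the module name multiply and the inputs 5 and 6 are assumptions.

```erlang
-module(multiply).
-export([mul/2, start/0]).

%% Multiplies its two arguments.
mul(X, Y) ->
    X * Y.

%% Calls mul/2 with a fixed input and prints the result.
start() ->
    Result = mul(5, 6),
    io:format("~w~n", [Result]).
```

After c(multiply)., either multiply:start(). or a direct call such as multiply:mul(3, 4). can be used.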

Output:

Funs

Funs are used to define anonymous functions in Erlang.

Anonymous Functions

Erlang lets us define anonymous functions. An anonymous function is a function that has no name associated with it. It is defined with the fun keyword and closed with end.
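A minimal sketch of defining and calling an anonymous function is shown below. The module name anon_demo is hypothetical.

```erlang
-module(anon_demo).
-export([run/0]).

run() ->
    %% Bind an anonymous function to a variable, then call it.
    Square = fun(X) -> X * X end,
    Square(7).
```

Calling anon_demo:run(). returns 49. The same fun expression can also be typed directly into the Erlang shell.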