4. Comparing, Judging, and Timing

Published in

How can I use modelling as an empirical scientist?

7 min readMay 2, 2016

At some point you’ll find some things need to be done on a case-by-case basis, much like in real life.

In biological systems, there are many ways this can come up. Here are a few examples:

Firstly, explicitly dealing with chance and probabilities in simulations. Plants have a chance of dying before reproduction, the chance of finding a mate with a given distance traveled; Secondly, what happens next IF the plant dies before reproduction — what if it doesn’t? What happens IF the animal finds a mate? Thirdly, some matrix-based models require making specific matrices from data — using conditional and relational operators (if, else, < , > ..etc) can help slot each element into place within these matrices.

Conditional operators are functions that filter out things (while and if). Relational operators are functions that compare things (greater than, equal to). These two are often used in conjunction with each other.

Using conditional and relational operators

We can use an IF-statement to simulate chance (e.g. rolling a 6-sided die)

a = 3;
roll = randi([1 6], 1);
if roll > a
disp(‘you rolled more than’)
disp(a)
end

IF-statements can be used anywhere (e.g. inside a for-loop or another IF-statement).

Alternatively you may use a switch-statement for cases where you want to say “if a = roll” (if a is EXACTLY the same as your roll):

a = 3;
roll = randi([1 6], 1);
switch roll
case 3
disp(‘roll is equal to 3’)
otherwise
disp(‘roll is something else’)
end

Switch-statements can be used if a is a string variable(e.g. “a = ‘apple’). They may look overall nicer and cleaner than if-statements and appear to be more efficient with a faster execution time. For now, it probably doesn’t matter. Here’s how the above looks as an if-statement:

a = 3;
roll =randi([1 6], 1);
if a ==roll % use “==” to indicate ‘equals to’, “~=” for ‘not equals’
disp(‘roll is equal to ’)
disp(a)
else
disp(‘roll is somthing else’)
end

A while-loop is similar to an if-statement, but the ‘stuff’ inside the loop repeats until the condition (that a > 10) becomes false.

e.g.

a = 1;
count = 0;
while a < 10
a = a + 0.1;
count = count + 1;
end

Sometimes, one method is better or more efficient than another. Using a loop to generate a (5,5) matrix of random values (e.g. using loops and “rand(1,1)” to fill a matrix column by column, row by row — I was guilty of this early on) when simply using rand(5,5) is much better.

In Matlab, loops are quite slow compared to the alternatives and should be avoided if at all possible, especially in big programs.

Another useful tool which can be used is rem(a,2), which gives the remainder after dividing a by 2, and may be used to filter out odd and even numbers.
Different type of loops can be used in conjunction and within each other to suit your needs.

Note: tic and toc may be used as a stopwatch to time the run-time.

Initiating an array-based model

In the previous section we used some relational operators (<, ==, >=, etc). Rows of matrices can be used to store data, very much like in excel. Corresponding rows of different matrices and vectors can be used to store different data regarding something. For example, rows 1–5 of matrix V can represent field sites 1–5 for numerous site variables (the columns). Elements 1–5 of vector w can represent some other variable of field sites 1–5 . We explore this in the following sub-section:

Pond model

Making a model is like building a Lego model — a castle, a battleship! Each component Lego block plays its own role in the overall model — a model is composed of many functions (Lego blocks).

Imagine a habitat with ponds of different sizes, we will track the volume of water in them over discrete time steps (e.g. years).

A matrix Flow may contain the inflow into each ponds through time (rows correspond to specific ponds 1 : n and columns correspond to time (year) 1 : t).
A matrix Outflow contains the rate of drainage or evaporation from each pond, through time (of course this is dependent on the rainfall, but we won’t add this bit of detail :3).
Another matrix Volume may contain volume — how much water in there of different ponds through time (rows and columns arranged as in Flow). Volume is calculated from inflow and outflow in the previous year.

And finally,

A matrix Capacity may contain the maximum capacity possible for each pond (ponds correspond to rows).

For the 4 matrices, each row corresponds to a particular pond and each column corresponds to a particular year (Capacity is assumed the same each year).

How do we distinguish overflowing ponds and dried out ponds for each year? We’ll need all four arrays in order to answer this question.

Let’s do this.

Let’s say there are 5 ponds and we estimate how they go (in volume) for 10 years, based on flow and outflow. To begin with, for each year we create random values for each of the ponds (i.e. Flow(:,1)):

years = 10; ponds = 5;
Flow = randi([0 100],ponds,years);
Outflow = randi([0 100],ponds,years);

generate random values for capacity, and then random values for volume, based on capacities. After which we’ll need to repeat the capacity vector over 10 years. There are a number of ways to do this — we could use repmat.

Volume = randi([100 500],ponds,1);
Capacity = Volume + randi([-50 50],ponds,1);
Capacity = repmat(Capacity,1,years); % repeats the array over years

^Note we should never write “number = [1:5]” — that is what’s known as HARD CODING, which is bad. Life will be tough if you ever wanted to change the number of ponds to something else.

Now we calculate the volume at the next time step.

for t = 1:size(Flow,2)
Volume = Volume + (Inflow + Outflow);
end

This calculates the volume table, year by year (column by column) .

Logical indexing and limits

If volume is greater than capacity, it is overflowing — if volume is less than capacity, it is dry. We will identify these ponds using logical operators. Why “logical”? Because it is TRUE, otherwise it is FALSE. In Matlab, True is represented by 1, and False by 0.

You can “mark” arrays of equal sizes in the following way. Try this in the Command Window:

testA = randi([1 10],1,5)
testB = randi([1 10],1,5)
compare = (testA>testB)

which returns a 0 for false and 1 for true (if testA > testB element-wise). In our case we create the following matrices for use in indexing:

Overflow = (Capacity > Volume);
Dry = (Capacity == 0);

^This “marks” the ponds which are dry or overflowing for each year, which will be used for indexing below. Note that the find(x) function does a similar job.

Now, we want to cap capacity. For example, if capacity of a pond is 100, it would not make sense to volume over 100. Same goes for dry ponds, we can’t have negative volume. We can do both of these using logical operators:

Volume(Overflow) = Capacity(Overflow);
Volume(Dry) = 0;

However, first problem — this needs to be inside the Volume loop, otherwise we are simply changing the values to Volume AFTER all the calculations are done (e.g. A pond with a capacity of 500 shouldn’t have been calculated with a volume of 650). We need to take over-capacities and 0's into account when calculating volume for the next year.

Second problem, if we put these inside the loop, Capacity is too large a matrix— and Volume builds up bit by bit. The first line below assumes equal array size between Capacity and Volume:

Overflow = (Capacity > Volume);
Dry = (Capacity == 0);
Volume(Overflow) = Capacity(Overflow);
Volume(Dry) = 0;

To get around this problem, we can’t use repmat for Capacity. Capacity has to change in size with Volume, as Volume is calculated/expanded year-by-year (column-by-column). Like this:

for t = 1:(size(Flow,2)-1)
Volume(:,t+1) = Volume(:,t) + Flow(:,t) — Outflow(:,t);
Capacity = [Capacity Capacity(:,1)];
Overflow = (Capacity > Volume);
Dry = (Capacity == 0);
Volume(Overflow) = Capacity(Overflow);
Volume(Dry) = 0;
end

In this way, Volume values are limited between 0 and Capacity. Note that there probably are other ways to do this.

Overall the model might look like this. Of course, there are other ways to achieve the same thing — feel free to write this as a function:

^ Don’t worry about the red squiggle. The size of Capacity is meant to change with every loop.

—

The door is unlocked. You should now have the skills required for working with and understanding many models in ecology and evolution.
A foot into the world of programming/coding as by-catch ain’t bad at all!

If -else statement

a = 3
roll = sample(6)[1] or floor(6*runif(1)) + 1 or sample.int(6)[1]

if (roll > a) {
print(‘you rolled more than’, quote = FALSE)
print(a)
}
else {
print(‘you rolled less than’, quote = FALSE)
print(a)
}

print: setting quote = FALSE hides the double quotation marks.

Logical comparisons
>, <, <=, == same as Matlab
except for ‘not equal’, ‘not greater than’. e.g:
!= ( ‘~=’ in Matlab)
Pond model
Differences with the Matlab code highlighted below:

Line 3 & 4: sample(0:100) of a matrix that size will generate a warning message, but it’ll be fine.

Line 7: Volume was originally a vector. However, here we want a column vector. Transposing t(a) once will convert it into a single-row matrix. Transposing twice t(t(a)) will achieve a column vector.

Line 10: Unlike Matlab, Volume must be initialised before the loop. Volume has to have a size that doesn’t change with every loop. In this case we represent “unfilled” columns with 0’s, and fill it up column by column with the loop. You may want to use NA instead of 0’s in certain cases:
Volume = cbind(Volume, matrix(NA, nrow = pond, ncol = years-1)

Line 14: Volume[ ,(t+1)] We must include brackets around a formula even when indexing. In MATLAB Volume(:, t+1) is fine.

4. Comparing, Judging, and Timing

Using conditional and relational operators

Initiating an array-based model

Pond model

Logical indexing and limits

Published in How can I use modelling as an empirical scientist?

Written by Justin Wan

No responses yet