Experimental O time

Artsiom Miksiuk
10 min read · Sep 13, 2023


Have you ever wondered what theoretical O execution time looks like for a real algorithm implementation? A couple of weeks ago I got curious when I realized that I had never seen any experimental data collected for big O execution times. Either it’s obvious (which I think it is) or not needed, since every deterministic algorithm we have can be analyzed purely theoretically by counting loop iterations. In the end, I thought it would be interesting to collect some data and see the results myself.

I chose Rust and Go for the implementation. These two because I was also curious whether there is any significant difference between them. I could have picked C++ and JS, but I wanted to practice Rust a bit more, and Go was more convenient. Other than memory management differences, I didn’t have any other significant selection criteria.

Attention: This is not a Rust vs. Go benchmark comparison. It is a comparison of theoretical execution time vs. experimental execution time.

I chose one simple and well-known algorithm from each complexity group:

  • Sequential Search — O(n)
  • Fast power — O(log n)
  • Quick Sort — O(n log n)
  • Bubble Sort — O(n²)

Approach

Since I ran the experiments in both languages, the measurement and testing techniques had to be the same. This required writing identical algorithm implementations for both languages.

Note: You can find the full code for this article in this repository: https://github.com/QuestionAndAnswer/bigotest

To avoid overloading this article with boilerplate testing code, it is described here in pseudocode.

struct Measurement:
    N: integer
    Time: time.Duration

function RunN[T](
    targetFn: function,
    dataBakeFn: function,
    maxN: integer,
    points: integer
) -> list of Measurement:
    ns := linspace(1, maxN, points)

    res := empty list of Measurement

    for n in ns:
        data := dataBakeFn(n)

        start := current time
        targetFn(data)
        elapsed := current time - start

        res.append(Measurement(N: n, Time: elapsed))

    return res

function RunOTest(
    name: string,
    fn: function,
    preheat: integer,
    repeats: integer
):
    for i in range(preheat):
        fn()

    RemoveReports(name)
    for i in range(repeats):
        measurements := fn()
        WriteReport(name + "_" + i, measurements)

There are two main functions here:

  • RunN - Measures the execution time of a target function for varying input data sizes. The linspace function has the same semantics as Python’s numpy.linspace, but for integers: it divides the range [start, end] into equally spaced points, which are then used as the input sizes n for the target function measurements.
  • RunOTest - Repeats the complete experiment cycle several times and records the results. Before the actual experiment there is a preheating loop, which ensures that the function’s execution is stable and aligned with the system’s state (process memory allocation, CPU branch predictor “training”). This has been shown to give less noisy results.

Instead of repeating measurements for the same n several times in a row and treating that as a single record, I decided to collect several full runs of an algorithm separately. While this approach is a bit harder to stabilize, it ensures that I can process and compare each full measurement run separately.

Because the algorithms’ execution times scale differently, I used different N values for each algorithm. It wouldn’t be reasonable to wait several seconds for an O(n) algorithm to complete and several hours for an O(n²) one when giving each n=10⁸.

For each test, 200 points are collected; one such collection is what we will call a repeat. Since we are testing O notation, which is an upper bound on function growth, I test worst-case scenarios, preparing the data so that the algorithm has to go through all of its loops and branches.

For the plots, I used Python and Matplotlib in Jupyter notebooks. A few symbols used on the plots need explanation.

R̄
Mean of repeats

Calculated as the mean of all repeats’ values for each point n. Essentially, it produces another, averaged repeat and represents a unified mean version of the algorithm’s runtime.

σ_R
Measurement standard deviation

Calculated from the values of all repeats for each point n. It shows how noisy the measurement was for a specific point n.

σ̄_R
Mean of the measurement standard deviations

Calculated from the per-point standard deviations; it shows how noisy the whole data set was.

t′
Normalized unitless time

O notation doesn’t have any measurement units. In order to compare real execution times with ideal complexity growth, all the execution time values are min-max normalized into the [0, 1] range and are therefore unitless.

There is also the MAE (Mean Absolute Error) between the mean of repeats and the ideal O function in each case. We won’t use it much during the results review, but it is useful for seeing how much the results differ from the target O function.
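The min-max normalization and MAE described above can be sketched in a few lines of Go. These are hypothetical helpers illustrating the metrics, not the article’s actual analysis code (which uses Python).

```go
package main

import "fmt"

// minMaxNormalize rescales measured times into the unitless [0, 1] range.
func minMaxNormalize(xs []float64) []float64 {
	min, max := xs[0], xs[0]
	for _, x := range xs {
		if x < min {
			min = x
		}
		if x > max {
			max = x
		}
	}
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = (x - min) / (max - min)
	}
	return out
}

// meanAbsError compares the normalized mean run against the ideal O curve.
func meanAbsError(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		if d < 0 {
			d = -d
		}
		sum += d
	}
	return sum / float64(len(a))
}

func main() {
	times := []float64{3, 5, 7, 11}        // raw measurements (made up)
	ideal := []float64{0, 0.33, 0.66, 1.0} // ideal normalized curve
	norm := minMaxNormalize(times)
	fmt.Println(norm) // [0 0.25 0.5 1]
	fmt.Printf("MAE = %.4f\n", meanAbsError(norm, ideal))
}
```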

Sequential Search — O(n)

The data is a linear array populated with zeros, while the algorithm has to look for the number 100, which guarantees a full scan. Inside the alg function I’ve added an extra iteration loop to stabilize the reported values across runs: for smaller n values execution is so fast that a sudden CPU interruption, such as a GC pause, adds too much delay into execution and thus noise into the data. This approach is used later as well.
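A plausible sketch of the bakeArray helper referenced below: an array of zeros of size n, so a search for 100 never succeeds and always traverses every element. The name comes from the article’s code; the exact implementation in the repository may differ.

```go
package main

import "fmt"

// bakeArray prepares worst-case input for sequential search.
// Go zero-initializes slices, so every element is 0 and a search
// for 100 must scan the whole array.
func bakeArray(n int) []int {
	return make([]int, n)
}

func main() {
	data := bakeArray(5)
	fmt.Println(len(data), data) // 5 [0 0 0 0 0]
}
```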

otest.RunOTest("seq_search", func() []otest.Measurement {
    search := func(data []int, target int) (int, bool) {
        for _, v := range data {
            if v == target {
                return v, true
            }
        }

        return -1, false
    }

    alg := func(data []int) {
        for i := 0; i < 10; i++ {
            search(data, 100)
        }
    }

    return otest.RunN(alg, bakeArray, 10000000, POINTS)
}, PREHEATES, REPEATES)

Go produced a saw-like, yet still legitimately linear, execution time growth with a decent amount of noise. My assumption is that GC intervention during execution causes this shape.

Note: I’ve tried switching Go to manual GC collect calls before and after each run, with sufficient preallocated memory, but this produced even noisier results.

Plot showing Go linear normalized execution time growth for sequential search. The collected data and calculated mean line have a saw-like shape with a linear growth trend. MAE = 0.0171, Mean Standard Deviation = 0.0085

The Rust implementation has no differences apart from language specifics.

otest::runner::run_o_test(
    "seq_search",
    || {
        fn search(data: &mut Vec<i32>, target: i32) {
            for d in data {
                if *d == target {
                    *d = 0;
                    break;
                }
            }
        }

        fn alg(data: &mut Vec<i32>) {
            for _ in 0..10 {
                search(data, 100)
            }
        }

        return otest::runner::run_n(alg, bake_array, 10000000, POINTS);
    },
    PREHEAT,
    REPATES,
);

The linear trend holds for Rust as well, but it also features a much cleaner and more stable result.

Plot showing Rust linear normalized execution time growth for sequential search. Almost perfect line (low noise, low deformations). MAE = 0.0029, Mean Standard Deviation = 0.0024

In this example, Rust produced about 3.5 times less noise in the results than Go (0.0024 vs. 0.0085).

Fast power — O(log n)

Because the fast power algorithm is, surprise, fast and its runtime doesn’t grow much with n, I use the same approach as before and artificially increase its execution time with an additional loop that executes fast power for the same n ten million times.

otest.RunOTest("fast_power", func() []otest.Measurement {
    power := func(N int, M int) int {
        power := N
        sum := 1

        for M > 0 {
            if (M & 1) == 1 {
                sum *= power
            }
            power = power * power
            M = M >> 1
        }
        return sum
    }

    alg := func(n int) {
        for i := 0; i < 10000000; i++ {
            power(1, n)
        }
    }

    return otest.RunN(alg, bakeN, 10000000, POINTS)
}, PREHEATES, REPEATES)

This produced quite stable and recognizable logarithmic growth.

Plot showing Go logarithmic normalized execution time growth for the fast power algorithm. There are quite noticeable “ladder”-like steps as the logarithmic trend grows. MAE = 0.0213, Mean Standard Deviation = 0.0031

For the Rust implementation, I added std::hint::black_box wrappers around the fast power call and its arguments, because the compiler optimized away parts of the algorithm and produced constant-time execution results.

otest::runner::run_o_test(
    "fast_power",
    || {
        fn power(n: usize, mut m: usize) -> usize {
            let mut power = n;
            let mut sum = 1;

            while m > 0 {
                if (m & 1) == 1 {
                    sum *= power;
                }
                power = power * power;
                m = m >> 1;
            }

            return sum;
        }

        fn alg(n: &mut usize) {
            for _ in 0..10000000 {
                std::hint::black_box(power(std::hint::black_box(1), std::hint::black_box(*n)));
            }
        }

        return otest::runner::run_n(alg, bake_n, 10000000, POINTS);
    },
    PREHEAT,
    REPATES,
);

Rust results don’t differ much from Go’s, and produce the same logarithmic plot.

Plot showing Rust logarithmic normalized execution time growth for the fast power algorithm. There are quite noticeable “ladder”-like steps as the logarithmic trend grows. MAE = 0.0254, Mean Standard Deviation = 0.0023

In this case, Go and Rust noise levels are close, differing only by a factor of about 1.4.

While the algorithms produced the expected logarithmic growth, there are quite noticeable “ladder” steps. The algorithm itself causes this stepping: the pair of the “while m > 0” loop and “m = m >> 1” produces it. “m = m >> 1” is effectively a division by 2, and bigger numbers need to be divided by two more times than smaller ones, so more iterations, and therefore more time in total, are required. This can be demonstrated by adding a loop iteration counter to the algorithm for each n.
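A minimal instrumented sketch of that counter: the loop runs once per halving of m, i.e. the bit length of m, which stays flat between consecutive powers of two and produces the stepping.

```go
package main

import "fmt"

// countPowerIterations counts how many times fast power's "for m > 0"
// loop runs for a given exponent m: one iteration per halving,
// i.e. the bit length of m.
func countPowerIterations(m int) int {
	count := 0
	for m > 0 {
		count++
		m >>= 1
	}
	return count
}

func main() {
	// The count is constant within [2^k, 2^(k+1)-1], and each such
	// step covers twice as many values of m as the previous one.
	for _, m := range []int{1, 2, 3, 4, 7, 8, 15, 16} {
		fmt.Printf("m=%2d iterations=%d\n", m, countPowerIterations(m))
	}
}
```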

Plot showing iterations number done to input size length n — ideal logarithmic stepping growth. This confirms stepping nature of plots for Go and Rust data.

Also, notice that each step is twice as long as the previous one. Again, this is because doubling or halving a bigger number covers a bigger range of n than it does for smaller numbers.

Quick Sort — O(n log n)

I used a recursive version of the quick sort algorithm, with the same few-iteration stabilization loop as before. Since sorting mutates its input, I allocated an additional copy of the array and sorted the copy on each cycle instead of the original prepared data.

otest.RunOTest("qsort", func() []otest.Measurement {
    partition := func(arr []int, low, high int) int {
        pivot := arr[high]
        i := low
        for j := low; j < high; j++ {
            if arr[j] < pivot {
                arr[i], arr[j] = arr[j], arr[i]
                i++
            }
        }
        arr[i], arr[high] = arr[high], arr[i]
        return i
    }

    var sort func(arr []int, low, high int)
    sort = func(arr []int, low, high int) {
        if low < high {
            p := partition(arr, low, high)
            sort(arr, low, p-1)
            sort(arr, p+1, high)
        }
    }

    alg := func(data []int) {
        dataCopy := make([]int, len(data))
        for i := 0; i < 4; i++ {
            copy(dataCopy, data)

            sort(dataCopy, 0, len(data)-1)
        }
    }

    return otest.RunN(alg, bakeRandomArray, 1000000, POINTS)
}, PREHEATES, REPEATES)

Go’s results match the predicted n log n line, and running copy on each stabilization loop iteration didn’t create any issues for the end results.

Plot showing Go’s n log n execution time growth for the quick sort algorithm. There is little noise increase closer to the end of an n axis. MAE = 0.0053, Mean Standard Deviation = 0.0029

Note: The ideal line might look linear, but pay attention to where it intersects the grid corners. The diagonal of a square (which this plot is) would cross every corner of the inner grid’s cells, but our curve has a slight bend toward the bottom, so it doesn’t. You can also compare this with the linear plots above to see the difference.

The Rust version doesn’t differ from the Go implementation.

otest::runner::run_o_test(
    "qsort",
    || {
        fn partition(arr: &mut Vec<i32>, low: usize, high: usize) -> usize {
            let pivot = arr[high];
            let mut i = low;
            for j in low..high {
                if arr[j] < pivot {
                    arr.swap(i, j);
                    i += 1;
                }
            }
            arr.swap(i, high);

            return i;
        }

        fn sort(arr: &mut Vec<i32>, low: isize, high: isize) {
            if low < high {
                let p = partition(arr, low as usize, high as usize) as isize;
                sort(arr, low, p - 1);
                sort(arr, p + 1, high);
            }
        }

        fn alg(data: &mut Vec<i32>) {
            let mut data_copy = vec![0; data.len()];

            for _ in 0..4 {
                data_copy.copy_from_slice(data);

                sort(&mut data_copy, 0, (data.len() - 1).try_into().unwrap());
            }
        }

        return otest::runner::run_n(alg, bake_random_array, 1000000, POINTS);
    },
    PREHEAT,
    REPATES,
);

It also produces results matching the ideal curve.

Plot showing Rust’s n log n execution time growth for the quick sort algorithm. There is little noise increase closer to the end of an n axis. MAE = 0.0053, Mean Standard Deviation = 0.0029

Both plots show a slight increase in standard deviation for bigger n. My assumption is that the algorithm accesses its contiguous memory in a non-sequential way, causing larger memory jumps for each recursion branch. Nevertheless, both implementations produced identical MAE and mean standard deviation.

Bubble Sort — O(n²)

The most loved and hated of sorting algorithms could not be left off this list. All the same stabilization techniques are used as in the Quick Sort test.

otest.RunOTest("bubble_sort", func() []otest.Measurement {
    sort := func(data []int) {
        for i := 0; i < len(data)-1; i++ {
            for j := 0; j < len(data)-1; j++ {
                if data[j] > data[j+1] {
                    data[j], data[j+1] = data[j+1], data[j]
                }
            }
        }
    }

    alg := func(data []int) {
        dataCopy := make([]int, len(data))
        for i := 0; i < 4; i++ {
            copy(dataCopy, data)

            // sort the copy, so every iteration starts from the same unsorted data
            sort(dataCopy)
        }
    }

    return otest.RunN(alg, bakeRandomArray, 10000, POINTS)
}, PREHEATES, REPEATES)

Go produced a good-looking quadratic curve with an acceptable amount of noise.

Plot showing Go’s n square execution time growth for the bubble sort algorithm. There is a slight bend down in the middle of a plot. MAE = 0.0263, Mean Standard Deviation = 0.0033

The Rust implementation has no differences from the Go version.

otest::runner::run_o_test(
    "bubble_sort",
    || {
        fn sort(data: &mut Vec<i32>) {
            for _ in 0..data.len() {
                for j in 0..data.len() - 1 {
                    if data[j] > data[j + 1] {
                        data.swap(j, j + 1)
                    }
                }
            }
        }

        fn alg(data: &mut Vec<i32>) {
            let mut data_copy = vec![0; data.len()];

            for _ in 0..4 {
                data_copy.copy_from_slice(data);

                // sort the copy, so every iteration starts from the same unsorted data
                sort(&mut data_copy);
            }
        }

        return otest::runner::run_n(alg, bake_random_array, 10000, POINTS);
    },
    PREHEAT,
    REPATES,
);

And it produced less noisy results.

Plot showing Rust’s n square execution time growth for the bubble sort algorithm. There is a slight bend down in the middle of a plot. MAE = 0.0148 Mean Standard Deviation = 0.0009

The slight downward bend in both cases is caused by min-max normalization. At the ends (n → 0 and n → 10K) both lines match, because the minimum and maximum of the experimental data lie at the ends of the lines. The most noticeable deviation is in the middle, where the experimental values diverge most from the ideal.

That’s it. The predictions hold (why shouldn’t they), the world keeps spinning, math works, and so does basic computer science. I hope you’ve enjoyed these plots and the bit of analysis on top!


Artsiom Miksiuk

I’m a software engineer with a passion for psychology and all kinds of engineering: electrical, software, mechanical, hardware, etc.