Understanding Time Complexity: Big O Notation & Software Performance

Abasifreke promise
10 min read · Jul 4, 2024

In today’s world of complex software development, there is an inevitable need to optimize software performance to enhance user experience and to ensure reliability and compliance with industry standards. Good performance can be attributed to appropriate data structures, algorithm optimization, and many other factors.

Time complexity measures the efficiency of an algorithm by estimating how long it takes to run based on the size of the input it processes. It analyzes how code execution time grows with data size, for example when searching a database or matching a string. Let’s get started!

Big O Notation

Time complexity is expressed with Big O notation, a mathematical concept used to describe the efficiency of an algorithm as the input size increases. It categorizes algorithms based on how their running time grows as the task they handle grows.

For example:

  • Constant time complexity (O(1)): This is similar to finding something in a small drawer next to you: it’s quick, and the time doesn’t change no matter how many more drawers you add. The algorithm’s execution time does not depend on the size of the data; it executes in a constant amount of time, making it very efficient regardless of input size.
  • Linear time complexity (O(n)): This is like flipping through books one by one; the more books there are, the longer it takes. If the input size doubles, the execution time also doubles.
  • Logarithmic time complexity (O(log n)): This is like using a structured index or catalog to narrow your search in a large library; it’s fast because you focus on smaller sections at a time. As the input size increases, the runtime grows logarithmically, making it more efficient than linear algorithms for large inputs. Binary search is a classic example.
  • Quadratic time complexity (O(n²)): This is like comparing every book to every other book; the time grows rapidly as the number of books increases. As the input size increases, the runtime increases quadratically, usually because of nested iteration, making it less efficient than linear or logarithmic algorithms for larger datasets. The sketch below shows how these growth rates diverge.
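
To make these growth rates concrete, here is a minimal sketch (my addition, not from the original article) that prints roughly how many steps each complexity class performs as the input size n grows:

import math

# Approximate step counts for each complexity class at several input sizes.
for n in (10, 100, 1000, 10000):
    log_steps = math.ceil(math.log2(n))
    print(f"n={n}: O(1)=1, O(log n)={log_steps}, O(n)={n}, O(n^2)={n * n}")

Notice how the O(n²) column explodes while the O(log n) column barely moves.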

Understanding Big O notation is important for choosing the best and most efficient algorithm that performs well with increasing input.

Analyzing Time Complexities

Having understood the concept of Big O notation, it is important to know how to analyze the time complexity of Python code. When analyzing code that runs multiple functions, such as sorting functions, search functions, mathematical functions, and string manipulations, consider the following:

  • Identifying the Critical Operations: Some operations execute repeatedly compared to others, for example those inside loops; these dominate the time complexity.
  • Analyzing Operations Within Loops: Consider the operations within loops, focusing on the number of times a loop iterates relative to the input size (n); nested loops multiply the complexity.
  • Considering Data Structures: Some data structures take more time for the same operation than others; for example, hash-table lookups and array index access are average O(1), while tree-based structures typically take O(log n).
  • Breaking Down Complex Code: Code with multiple sections should be broken into parts, the time complexity of each part analyzed independently, and the results combined to determine the overall value (a worked example follows this list).
  • Big O Notation: Once the time complexity of a piece of code is clear, it is good practice to express it in Big O notation for clarity and future comparative analysis.
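
As a quick worked example of this breakdown (a hypothetical function of my own, not from the original article), consider code with two independent sections; each section is analyzed on its own, and the dominant term determines the overall complexity:

def analyze_me(arr):
    # Section 1: one pass over n elements -> O(n)
    total = 0
    for x in arr:
        total += x

    # Section 2: nested loops over every pair -> O(n^2)
    pairs = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] + arr[j] == total:
                pairs += 1

    # Overall: O(n) + O(n^2) = O(n^2), since the quadratic section dominates.
    return pairs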

Examples of Time Complexities in Python List Operations

  • Linear Time Complexity (O(n)):
def find_number(arr, target):
    for element in arr:
        if element == target:
            return True
    return False

arr = [1, 2, 3, 4, 5]
target = 3
print(find_number(arr, target))
# Output: True
The find_number function in this example has O(n) time complexity. It iterates through each list element until it finds the target, so as the list grows, the number of operations increases linearly with the size of arr. If the list grows to 10 elements, the search may take up to 10 checks. This is linear complexity: perfectly adequate for small datasets but slower for large ones.
  • Constant Time Complexity (O(1)):
def get_first_element(arr):
    return arr[0]

# Usage
arr = [1, 2, 3, 4, 5]
print(get_first_element(arr))
# Output: 1

The get_first_element function returns the first element of the arr list by its index. Getting the first element of a list with 100 elements takes the same run time as with 5, because the operation is a single step. Constant time complexity is the most efficient kind: the operation takes a fixed amount of time no matter how large the dataset or input size is.
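
Beyond list indexing, other common constant-time operations (an illustrative note of my own, not from the original article) include dictionary key lookups and set membership checks, both average O(1):

user = {"id": 42, "name": "Ada"}
print(user["name"])  # dictionary key lookup: average O(1)
# Output: Ada

ids = {1, 2, 3}
print(2 in ids)      # set membership check: average O(1)
# Output: True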

  • Quadratic Time Complexity (O(n²)):

def has_duplicates(arr):
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j]:
                return True
    return False

arr = [1, 2, 3, 4, 5, 1]
print(has_duplicates(arr))
# Output: True

In the has_duplicates function, the outer loop iterates over each element in the list arr using index i, and the inner loop scans the rest of the list starting from i + 1 to avoid comparing the same pair twice. The if statement checks whether arr[i] equals arr[j] and returns True when a match is found; if the loops complete without finding a duplicate, the function returns False.
In a nested loop like this, the algorithm’s runtime grows quadratically with the number of elements: the more elements, the more comparisons.
With 5 elements, there can be up to 5 * 4 / 2 = 10 comparisons, and with 10 elements up to 10 * 9 / 2 = 45; in general, n elements can require up to n(n - 1)/2 comparisons. Quadratic time complexity is inefficient for processing large inputs.
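
To sanity-check those counts, here is a small instrumented sketch (my addition, not from the original article) that counts how many comparisons the nested loops actually perform:

def count_comparisons(n):
    # Mirror the loop structure of has_duplicates and count iterations.
    comparisons = 0
    for i in range(n):
        for j in range(i + 1, n):
            comparisons += 1
    return comparisons

print(count_comparisons(5))   # Output: 10, matching 5 * 4 / 2
print(count_comparisons(10))  # Output: 45, matching 10 * 9 / 2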

  • Logarithmic Time Complexity (O(log n))

def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
target = 7
print(binary_search(arr, target))
# Output: 6

The binary_search function repeatedly narrows a search window over the sorted arr list (binary search requires sorted input). left starts at the first index of the list (0), and right starts at the last index (len(arr) - 1 = 10 - 1 = 9).

  • First iteration: left = 0, right = 9, so mid = (0 + 9) // 2 = 4. arr[4] = 5, which is less than 7, so update left = 5.
  • Second iteration: left = 5, right = 9, so mid = (5 + 9) // 2 = 7. arr[7] = 8, which is greater than 7, so update right = 6.
  • Third iteration: left = 5, right = 6, so mid = (5 + 6) // 2 = 5. arr[5] = 6, which is less than 7, so update left = 6.
  • Fourth iteration: left = 6, right = 6, so mid = (6 + 6) // 2 = 6. arr[6] = 7, which equals the target, so return 6.
The function finds the target 7 at index 6; if the target were not present, it would return -1. With 10 elements, binary search takes at most 4 comparisons (log2(10) ≈ 3.32, rounded up to 4), and with 100 elements at most 7 (log2(100) ≈ 6.64, rounded up to 7). The list is repeatedly divided in half, and each comparison eliminates half of the remaining elements, so as the input size increases, the runtime grows much more slowly. This makes logarithmic complexity efficient for large datasets compared to linear (O(n)) or quadratic (O(n²)) time complexities.
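
To see this difference in practice, here is a minimal timing sketch (my addition, not part of the original article) that compares a simple linear scan against the binary_search function defined above, using Python’s built-in timeit module. Exact timings vary by machine, but binary search should finish far faster:

import timeit

def linear_search(arr, target):
    # O(n): check each element in turn until the target is found.
    for i, element in enumerate(arr):
        if element == target:
            return i
    return -1

arr = list(range(1, 100_001))  # a large sorted list
target = 99_999                # near the end: close to the worst case

# binary_search is the function defined in the example above.
print("linear:", timeit.timeit(lambda: linear_search(arr, target), number=100))
print("binary:", timeit.timeit(lambda: binary_search(arr, target), number=100))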

Optimizing Code for Better Performance

  • Avoiding Nested Loops:
    Functions that iterate through data structures with nested loops often have quadratic time complexity (O(n²)), which is inefficient for large datasets. Nested loops can often be rewritten as single-pass algorithms, for example:
#function 1
def count_pairs(numbers, target):
    count = 0
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if numbers[i] + numbers[j] == target:
                count += 1
    return count

numbers = [1, 2, 3, 4, 5]
target = 5
print(count_pairs(numbers, target))
# Output: 2 (pairs: (1, 4), (2, 3))

#function 2
#optimizing count_pairs function
def count_pairs_optimized(numbers, target):
    complements = set()
    count = 0
    for number in numbers:
        if target - number in complements:
            count += 1
        complements.add(number)
    return count

numbers = [1, 2, 3, 4, 5]
target = 5
print(count_pairs_optimized(numbers, target))
# Output: 2

The first function, count_pairs, checks every possible pair in the list to see whether numbers[i] + numbers[j] adds up to the target. The double loop goes through the list repeatedly, resulting in a time complexity of O(n²).
The count_pairs_optimized function iterates over each number in the numbers list once and checks whether the complement of the current number (i.e., target - number) already exists in the complements set. If it does, a pair summing to the target has been found, and count is increased by 1. The current number is then added to the complements set so that later elements can find it.
The pairs that sum up to 5 are (1, 4) and (2, 3), so the function returns 2, which is printed as the output. Using a set to remember the numbers already seen avoids the second pass over the list and brings the complexity down to O(n).

  • Using Efficient Data Structures:
    Choosing the right data structure can significantly impact the performance of your code. For instance, using dictionaries (hash maps) can provide average-case constant time complexity (O(1)) for lookups, inserts, and deletions, which is more efficient than using lists for these operations. Here's an example:
# Inefficient
def find_duplicate_1(numbers):
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if numbers[i] == numbers[j]:
                return numbers[i]
    return None

numbers = [1, 3, 4, 2, 5, 3]
print(find_duplicate_1(numbers))
# Output: 3

# Optimized
def find_duplicate_2(numbers):
    seen = {}
    for number in numbers:
        if number in seen:
            return number
        seen[number] = True
    return None

numbers = [1, 3, 4, 2, 5, 3]
print(find_duplicate_2(numbers))
# Output: 3

The find_duplicate_1 function takes an argument numbers, expected to be a list of integers. An outer loop iterates over each index i from 0 to len(numbers) - 1, and an inner loop compares numbers[i] with every later element numbers[j]. If a duplicate is found (i.e., numbers[i] == numbers[j]), the function returns it and exits. The nested loops result in O(n²) time complexity because each element is compared with every other element in the list, producing a quadratic number of comparisons.
The second function, find_duplicate_2, also takes a list of integers and initializes an empty dictionary named seen to keep track of numbers already encountered. The for loop visits each element in the numbers list once: if the number is already in seen, it is a duplicate and is returned immediately; otherwise, it is added to the dictionary with a True value. If the loop completes without finding a duplicate, the function returns None. This optimized approach reduces the time complexity from O(n²) to O(n), because dictionary membership checks and insertions take constant time on average.
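
As a design note (my addition, not from the original article), a Python set expresses the same idea even more directly, since only membership matters and the True values stored in the dictionary are never used:

def find_duplicate_3(numbers):
    seen = set()  # set membership checks are average O(1), like dict keys
    for number in numbers:
        if number in seen:
            return number
        seen.add(number)
    return None

numbers = [1, 3, 4, 2, 5, 3]
print(find_duplicate_3(numbers))
# Output: 3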

  • Using Built-in Functions:
    Leveraging Python’s built-in functions and libraries can optimize code performance.
# Inefficient
def concatenate_strings_naive(strings):
    result = ""
    for string in strings:
        result += string
    return result

strings = ["Hello", "World", "from", "Python"]
print(concatenate_strings_naive(strings))
# Output: "HelloWorldfromPython"

# Optimized
def concatenate_strings_optimized(strings):
    return "".join(strings)

strings = ["Hello", "World", "from", "Python"]
print(concatenate_strings_optimized(strings))
# Output: "HelloWorldfromPython"

Here, each concatenation operation creates a new string, which involves copying the entire result built so far. On average, concatenating the i-th string in the loop takes O(i) time, so the total cost is the sum of the series O(1) + O(2) + O(3) + ... + O(n), which is O(n²).
The join() method first iterates over strings to calculate the total length of the final string, then builds that string in a single pass. Although the list is traversed twice, each traversal is linear in the number of elements and their total length, so the overall time complexity is O(n).

The optimized method, using Python’s built-in join() method, significantly improves performance by reducing the time complexity from O(n²) to O(n). This demonstrates how using built-in functions designed for specific tasks can lead to more efficient code.
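
A related idiom worth knowing (my addition, not from the original article) is to collect pieces in a list while looping and join them once at the end, which keeps incremental string building at O(n):

parts = []
for i in range(5):
    parts.append(str(i))  # appending to a list is amortized O(1)
result = ",".join(parts)  # a single O(n) join at the end
print(result)
# Output: 0,1,2,3,4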

Impact of Big O Notation in Software Development

As a software developer, it’s possible to run into lagging after deploying your application; this can have many causes, and poorly optimized code is one of them. Big O notation plays an important role in software development: it provides a standard way to measure the efficiency of an algorithm as data sizes increase, grouping algorithms by how their execution time grows relative to the input size. This helps developers predict and manage performance.

The concept of Big O notation is vital in software engineering because it helps developers make informed decisions, such as selecting effective algorithms and data structures. For example, algorithms with lower complexities like O(1) or O(log n) are preferred for tasks requiring rapid response times, while those with higher complexities like O(n²) should be used carefully to avoid performance issues.

Applying Big O notation effectively optimizes software performance, enhances user satisfaction, and ensures applications operate smoothly across various computing environments.

Conclusion

Optimizing code performance is essential in modern software development to enhance user experience and ensure reliability. By choosing appropriate data structures, optimizing algorithms, and using built-in functions effectively, developers can significantly reduce time complexity. Understanding and applying Big O notation helps in creating efficient code that scales well with larger datasets, leading to improved application performance.
