Python Data Science: How To Write Algorithms That Solve Problems Faster

Riade
Published in Codinoon
Nov 25, 2020

Write Python code that solves problems faster and more effectively, without buying more RAM or a gaming PC

Today, data science is one of the most popular fields among programmers and non-programmers alike. Everyone wants to learn it, whether to land a dream job, to invent something, or for personal reasons.

But what beginner and intermediate data scientists eventually learn is that you can't just open an editor and start coding. Yes, your code will run and work, but the downside is that it may be very slow. So how can you optimize your code to process millions of records without adding RAM or buying a gaming PC?

Time Complexity or Big O Notation:

If you have never heard of this, let me explain: Big O notation describes how the running time of your code grows as its input grows. Let's imagine a scenario where you are searching for an item in a list:

    def contains(items, target):
        # Check each element in turn until a match is found
        for x in range(len(items)):
            if items[x] == target:
                return True
        return False

Very simple code, and it will execute fast on a small list. But what if your range is 1 million? In the worst case the code checks every item before it can report that the value is not there, and only then returns False! This is called O(n), which means the running time grows in proportion to the size of the input. Or let's imagine an even worse scenario:

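    def contains_slow(items, target):
        # The inner loop repeats for every x, so the body runs n * n times
        for x in range(len(items)):
            for y in range(len(items)):
                if items[x] == target:
                    return True
        return False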

This code takes far too long to be practical on large inputs. Its time complexity is O(n^2), so 1 million items means about 10^12 iterations in the worst case (see the graph below).
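To make the difference concrete, here is a minimal sketch that counts loop iterations instead of measuring time. The function names `count_linear_steps` and `count_nested_steps` are made up for this illustration and are not part of any library.

```python
def count_linear_steps(n):
    """Count the iterations of a single loop over n items: O(n)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_nested_steps(n):
    """Count the iterations of two nested loops over n items: O(n^2)."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


# Each time n grows 10x, the single loop grows 10x but the nested loops grow 100x
for n in (10, 100, 1000):
    print(n, count_linear_steps(n), count_nested_steps(n))
```

With n = 1000 the single loop runs 1,000 times while the nested loops run 1,000,000 times, which is why the gap becomes painful so quickly on real data.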

This is why you need to pay attention to the complexity of your code. As a reference point, here are the worst-case complexities of some common sorting and searching algorithms:

Selection sort [The Worst Case: O(N^2)]
Merge sort [The Worst Case: O(N log N)]
Linear search [The Worst Case: O(N)]
Binary search [The Worst Case: O(log N)]

The worst case is the largest number of steps the algorithm can take on an input of size N, for example when the item you are looking for is not in the list at all. As you can see, binary search is the fastest search algorithm in this list, but it only works on data that is already sorted. So if you search the same sorted data many times, try binary search instead of a plain for loop and see the difference.
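Here is a small sketch of the two search strategies side by side. The function names and the sample data are assumptions chosen for illustration, and the binary search assumes the list is already sorted in ascending order.

```python
def linear_search(items, target):
    """Check every element in turn: O(n) in the worst case."""
    for item in items:
        if item == target:
            return True
    return False


def binary_search(sorted_items, target):
    """Halve the search range at each step: O(log n) in the worst case.

    Assumes sorted_items is sorted in ascending order.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return True
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return False


data = list(range(0, 1_000_000, 2))  # sample data: sorted even numbers

print(linear_search(data, 999_999))  # False, after scanning all 500,000 items
print(binary_search(data, 999_999))  # False, after roughly 20 comparisons
print(binary_search(data, 123_456))  # True
```

In everyday code you would probably reach for the standard library's `bisect` module rather than writing the loop by hand, but the hand-written version shows where the O(log N) comes from.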

Figure: Big O Notation Graph

Data Structures:

Figure: Binary Tree Data Structure (Credit: DataCamp)

Another part is how you choose to store data. Yes, there are lists, tuples, and dictionaries. But you may not know that there are other data structures you can use to store your data. Why bother? Because some data structures handle certain operations faster and store data more efficiently than a standard list or tuple. These data structures include: linked lists, graphs, trees (tries, binary trees), stacks, queues, and heaps.

These data structures make specific operations faster, such as searching, insertion, and deletion. For example, a stack and a queue have insertion and deletion time complexity of O(1), whereas inserting or deleting at the front of a standard array has time complexity of O(n), because every other element has to shift. The reason is that a stack follows the LIFO (Last In, First Out) style and a queue follows the FIFO (First In, First Out) style, so the computer only ever touches the ends of the structure.
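A minimal sketch of both structures in Python, assuming a plain list for the stack and `collections.deque` from the standard library for the queue. The variable names are illustrative only.

```python
from collections import deque

# Stack: last in, first out (LIFO). A plain list works well here,
# because append() and pop() both act on the end of the list.
stack = []
stack.append("a")
stack.append("b")
stack.append("c")
last_in = stack.pop()
print(last_in)  # c

# Queue: first in, first out (FIFO). deque removes from the left in O(1),
# whereas list.pop(0) has to shift every remaining element, which is O(n).
queue = deque()
queue.append("a")
queue.append("b")
queue.append("c")
first_in = queue.popleft()
print(first_in)  # a
```

The design choice is about which end you remove from: a list is cheap at the end only, while a deque is cheap at both ends.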

Every data structure has its own strengths: it is faster in some situations and slower in others. This opens up a variety of choices in your development process: instead of building your code around the standard types by default, you can choose the structure that makes your code run faster and more effectively.
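As one example of such a trade-off, here is a short sketch using a heap through the standard library's `heapq` module. The `scores` list is made-up sample data.

```python
import heapq

scores = [42, 7, 19, 3, 25]  # sample data

# Plain list: finding the smallest value scans every element, O(n)
smallest_from_list = min(scores)

# Heap: heapify costs O(n) once; afterwards the smallest value sits at index 0
heap = list(scores)
heapq.heapify(heap)
smallest_from_heap = heap[0]

# Pushing and popping keep the heap property in O(log n) each
heapq.heappush(heap, 1)
popped = heapq.heappop(heap)
next_smallest = heap[0]

print(smallest_from_list, smallest_from_heap, popped, next_smallest)

# The trade-off: a heap is not fully sorted, so looking for an
# arbitrary value in it still means scanning, which is O(n).
```

So a heap is a good fit when you repeatedly need the smallest item, and a poor fit when you need to look up arbitrary items.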

Yes, there is more to add to this list and more detail to include, but I want this article to be a friendly introduction. I will write more articles explaining each part in detail in the coming days, so make sure to stay updated and follow me on Twitter @riade_b

If you want to contact me please visit my portfolio HERE

Hi! My name is Riade, and I am a Python programmer, full-stack web developer, and intermediate data scientist.