Published in Analytics Vidhya

How to organize a python function?

Photo by Bernard Hermant on Unsplash

What is a function in Python? It is a block of code that may be a few lines or many lines long. But what makes it different from any other block of code?

It’s RE-USABILITY

FUNCTIONS

A function is a custom or predefined entity which takes arguments from us, performs operations and returns output. This is the main function of a function.

But contrarily, a function sometimes doesn’t take any argument from us. Sometimes, it doesn’t even return an output.

For example, let us take a look at a simple function.

def simple_function():
    pass

I admit it is very simple, but there is a lot to observe and learn from it.

Any user defined function starts with the keyword def.

It is followed by the function name, followed by a pair of parentheses.

The parentheses are followed by a colon.

If the function body has nothing to do yet, we can use the keyword pass as a placeholder. A function that has no return statement returns None.

If nothing is given in the function body, Python raises a syntax error.
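To make these points concrete, here is a minimal sketch. The variable name result is only illustrative.

```python
def simple_function():
    pass  # placeholder: the body does nothing


# Calling a function that has no return statement gives back None.
result = simple_function()
print(result)  # prints None
```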

The motive of the function is “Write once; use multiple times”. It reduces or eliminates unnecessary work of writing the same code numerous times. The initial code may take a lot of time and effort to do correctly. But once done, it can be reused with no or minimal effort according to the problems or cases.

Should the functions be created only by us?

Definitely NOT.

A lot of predefined functions are already available in Python itself and in its various libraries. In my view, the simplest of all is print(). It is a built-in function that can be used by anyone who has Python installed on their system.

Most of us don’t know how print() is written. But most of us know what it does. This is the significance and use of functions. Instead of hard coding for the same purpose multiple times, we create a template so that anyone with access to it can use it any number of times.

CUSTOM FUNCTIONS

The function name should be self-explanatory and related to the purpose it serves. This is not a written rule, but it helps when you work in a large team, because you need not explain the function to each team member. It also helps when you reuse the function in a different project: you can tune it to your needs only if you can identify what it does.

The function names could be anything except the reserved words of Python. Reserved words are those words which already have a role in Python’s built-in syntax. Some examples are pass, def and return. Built-in names such as list are not reserved, but using one as a function name hides the built-in, so it is best to avoid them too. Look at the following code.

def addition(a, b):
    return a + b

A typical function looks somewhat like this. As I mentioned, the name of the function describes what the function does. The function then returns an output.
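Returning to the naming rule, one way to check whether a name is reserved is the standard library's keyword module. The sketch below is only an illustration; the names tested are arbitrary picks.

```python
import keyword

print(keyword.iskeyword("pass"))  # prints True
print(keyword.iskeyword("def"))   # prints True
print(keyword.iskeyword("list"))  # prints False: list is a built-in name, not a keyword

# keyword.kwlist holds every reserved word for the running Python version.
reserved_words = keyword.kwlist
print(reserved_words)
```

A name such as list can legally be reused for a function, but doing so hides the built-in, so it is safer to pick something else.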

If you are not ready to write the function body, you can use the pass statement as mentioned above. You may wonder why someone would use pass. It is useful when you only specify a function and want a team member to complete it, or when you are setting an examination and want the students to complete it.

What if our function returns more than one output? Just take a look at the example below.

def basic_math(a, b):
    add = a + b
    sub = a - b
    return add, sub

The output of the above program will be a tuple. If you want to get better access to them, you can store them in multiple variables. Take a look at the below example.

x, y = basic_math(4, 2)

This call assigns the sum to the x variable and the difference to the y variable.
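The difference between the two ways of receiving the output can be seen in a short sketch. The variable name both is only illustrative.

```python
def basic_math(a, b):
    add = a + b
    sub = a - b
    return add, sub


# Without unpacking, the two results arrive together as one tuple.
both = basic_math(4, 2)
print(both)        # prints (6, 2)
print(type(both))  # prints <class 'tuple'>

# With unpacking, each result gets its own variable.
x, y = basic_math(4, 2)
print(x, y)        # prints 6 2
```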

DEFAULT ARGUMENT

In certain cases, we need to specify some default arguments to the functions. If the user while calling the function doesn’t mention value for that argument, it takes the default value. Let us take a look at the following example.

def area_circle(radius, pi=3.14):
    area = pi * radius**2
    return area

In the above case, the value of pi is given as a default argument. Even if the caller does not pass a value for pi, the function does not raise an error and returns the output. The default can also be overridden whenever necessary. To find the default arguments of built-in or library functions, please look at their signatures in the documentation.
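As a small sketch of both behaviours, the calls below use the default once and override it once. Passing math.pi is just one possible override, chosen here for illustration.

```python
import math


def area_circle(radius, pi=3.14):
    area = pi * radius**2
    return area


# pi is not passed, so it falls back to its default of 3.14.
default_area = area_circle(2)

# The default is overridden by keyword with the more precise math.pi.
precise_area = area_circle(2, pi=math.pi)

print(default_area)
print(precise_area)
```

Passing the override by keyword (pi=...) keeps the call readable, although area_circle(2, math.pi) would work as well.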

DOCSTRINGS

Docstrings are used to explain what a specific block of code such as a function, class or method does. They are an alternative to comments that editors and tools such as help() can read and display. They often help the developer too. A docstring is simply a description of the function, class or method. Let us take a look at how docstrings are defined.

def multiply(a, b):
    """
    This function takes two integers as input, multiplies them and stores the product in another variable. Then the function returns the new variable.
    """
    c = a * b
    return c

They are usually enclosed in triple quotes. In the above case, the docstring clearly explains the purpose of the function. Don’t curse me for documenting something this simple. It will come in handy when you work with huge modular programs.

Now, you have defined the function with docstrings somewhere in your program. But you remember only the function name. How can you read that again? Here comes the solution. The following code can be used to print the docstrings we have defined.

print(multiply.__doc__)

This comes in handy when you work in large teams with a large number of functions and modules.
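Besides reading the attribute directly, the standard library offers inspect.getdoc(), and help() shows the same text interactively. The sketch below uses a shortened one-line docstring for multiply, which is an illustrative choice rather than the docstring shown earlier.

```python
import inspect


def multiply(a, b):
    """Multiply two numbers and return the product."""
    return a * b


# The __doc__ attribute holds the raw docstring exactly as written.
print(multiply.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up,
# which matters for multi-line docstrings.
doc = inspect.getdoc(multiply)
print(doc)

# In an interactive session, help(multiply) prints the signature and docstring.
```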

SPECIAL TOOLBOX

So far, we have looked at predefined and custom functions. Now, we will look at some of the tools which are closely associated with these functions.

Photo by Jo Szczepanska on Unsplash

LAMBDA

Lambda is a type of function that comes into the picture when the operation we want to perform is simple enough to fit in a single expression. All the example functions I have shown so far are simple enough to be written as lambda functions. A lambda function is also known as an anonymous function, because it does not need a name and is a lightweight version of a function. Let us take a look at an example below.

lambda x: x**2

The above function returns the square of the value fed into it. The names before the colon are the arguments we are going to operate on. The expression after the colon is the operation to be performed, and its value is what the lambda returns.

These lambda functions can be applied to a single element, a list of elements, a dataframe etc. They are powerful and simple.

These lambda functions can also be assigned to variables as shown below.

add = lambda x, y: x + y

In the above example, the lambda takes two arguments, x and y, and is assigned to a variable called add. Notice that add is a variable name, not a name created with def. This is similar to assigning any other value to a variable, and add(2, 3) can then be called like a normal function. Avoid calling the variable sum, because that would hide Python’s built-in sum() function.
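A lambda is most useful when it is passed straight to another function. The sketch below first shows that a lambda and a def can do the same job, then uses a lambda as a sort key. The list of (name, age) pairs is made-up data for illustration.

```python
# A named function and a lambda that do the same job.
def square(x):
    return x**2


square_lambda = lambda x: x**2

print(square(5), square_lambda(5))  # prints 25 25

# Here a lambda is the sort key: each pair is ordered by its second item.
people = [("Asha", 31), ("Ben", 25), ("Chen", 42)]
by_age = sorted(people, key=lambda person: person[1])
print(by_age)  # prints [('Ben', 25), ('Asha', 31), ('Chen', 42)]
```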

MAP

A map function is used when we want to apply a function to every element without writing an explicit loop. Explicit Python loops can be computationally costly and time consuming on large data. If you have taken the Deep Learning Specialization on Coursera, you would have noticed Andrew Ng mentioning numerous ways to remove explicit loops while building neural networks.

Similarly, we can use map to keep such computations compact. It is very useful in data manipulation, mainly for row-wise computations on dataframes, and it reduces the number of lines of code. Consider the following example of converting the speed of a car from km/hr to m/s.

def speed_conv(data):
    data_m = (data * 5) / 18
    return data_m

data_km = [10, 20, 30, 40]

A map call needs a minimum of two arguments. The first argument is the name of the function without the parentheses. Then there must be at least one iterable holding the values on which the operation has to be performed.

list(map(speed_conv, data_km))

This converts all the elements in the list to m/s. It scales to lists of any size, and we avoid writing the loop ourselves. We wrap the call in list() because map returns a lazy map object; list() consumes it and gives the output in the form of a list.

Note that while giving the function argument, we can also give a lambda function to perform the operations.
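As a sketch of that last point, the same km/hr to m/s conversion can be written with a lambda, and a list comprehension gives the same result. The names kmph, data_ms and data_ms_comp are illustrative.

```python
data_km = [10, 20, 30, 40]

# map with a lambda in place of a named function.
data_ms = list(map(lambda kmph: kmph * 5 / 18, data_km))

# The same conversion written as a list comprehension.
data_ms_comp = [kmph * 5 / 18 for kmph in data_km]

print(data_ms)
print(data_ms == data_ms_comp)  # prints True
```

Which form to use is mostly a matter of readability; both visit every element exactly once.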

REDUCE

The reduce function sweeps the given function across a row. It is used for aggregations: it applies the function sequentially, combining the running result with the next element, until a single value remains.

Unlike map, reduce is not a built-in function in Python 3. You have to import it from the functools module as shown below.

from functools import reduce

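Please note that map applies the function to every element and returns as many results as there are inputs. On the other hand, reduce combines all the elements of one row into a single result.

One important thing to notice is that the function passed to reduce must take exactly two arguments: the running result and the next element. A function with any other number of arguments leads to an error. reduce itself also accepts an optional third argument, an initial value.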

from functools import reduce
num = [10, 20, 30, 40]
reduce(lambda a, b: a + b, num)

The above code sequentially sums consecutive elements of the list and returns the total of all elements. Thus reduce always returns a single value.

Here, while giving a function as an argument, you can use either a lambda function or a regular named function.
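The sketch below passes a regular named function to reduce and also shows the optional initial value. Note that the two-argument requirement applies to the function being passed in; reduce itself accepts an optional third argument. The names add, total_from_100, product and empty_total are illustrative.

```python
from functools import reduce
import operator


def add(a, b):
    return a + b


num = [10, 20, 30, 40]

# ((10 + 20) + 30) + 40
total = reduce(add, num)

# The third argument starts the running result at 100.
total_from_100 = reduce(add, num, 100)

# Any two-argument function works, here multiplication from the operator module.
product = reduce(operator.mul, num)

# With an initial value, reducing an empty list returns that value instead of raising.
empty_total = reduce(add, [], 0)

print(total, total_from_100, product, empty_total)  # prints 100 200 240000 0
```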

FILTER

A filter function provides us a simple way to keep only the rows that satisfy a given condition.

num = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
list(filter(lambda x: x % 4 == 0, num))

The above function returns only those elements divisible by 4. The lambda function returns True only for 4, 8, 12, 16 and 20. They are then filtered by the filter function.

The filter function also takes two arguments. The first argument is the function and the second is the iterable to filter, such as a list or a data series.
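As with map, the function given to filter can be a regular named function, and a list comprehension with a condition gives the same result. The name divisible_by_4 is illustrative.

```python
def divisible_by_4(x):
    return x % 4 == 0


num = list(range(1, 21))  # the numbers 1 to 20

# filter keeps the elements for which the function returns True.
filtered = list(filter(divisible_by_4, num))

# The same selection written as a list comprehension.
filtered_comp = [x for x in num if x % 4 == 0]

print(filtered)  # prints [4, 8, 12, 16, 20]
```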

Let me show a real-world application of a function. Shown below is part of a function I used for data cleaning and binning in a recent hackathon. Take a look at it. I have changed the names of the variables a little.

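# Assumption: scale() comes from scikit-learn; the original snippet does not show its import.
from sklearn.preprocessing import scale

def data_clean(data):
    # Fill missing values.
    data["engagement_rating"] = data["engagement_rating"].fillna(1.0)
    data["age"] = data["age"].fillna(data["age"].mean())

    # Drop identifier columns that carry no signal.
    data.drop(["id", "test_id", "trainee_id"], axis=1, inplace=True)

    # Bin the ages. The lower bounds use >= so that ages of exactly
    # 25, 35 and 50 land in a bin instead of keeping the default of 0.
    # Ages of 63 and above keep the default bin 0, as in the original.
    data["age_bin"] = 0
    data.loc[data["age"] < 25, "age_bin"] = 0
    data.loc[(data["age"] < 35) & (data["age"] >= 25), "age_bin"] = 1
    data.loc[(data["age"] < 50) & (data["age"] >= 35), "age_bin"] = 2
    data.loc[(data["age"] < 63) & (data["age"] >= 50), "age_bin"] = 3

    # Standardize the numeric columns.
    data["programme_duration"] = scale(data["programme_duration"])
    data["age"] = scale(data["age"])

    return data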

This is only a small portion of the function. After creating such a generic function, I am free to pass any number of dataframes through it, which saves time. Now I can clean both my train and test data with this function. I can even reuse it for other similar applications with some modifications.

CONCLUSION

Thus, by using all these tools, we can write efficient code. A programmer’s skill and experience are reflected in the way they write code. So write it efficiently by developing your skills.

Now go ahead, take a real-world dataset from Kaggle and start your practice. Only practice makes you a better programmer.

All of these help us save time and spend it productively. After all, time is the most valuable thing that exists in the universe.

I wish you all Good Luck🤗

Murugesh Manthiramoorthi


Data Scientist at Murmuration (FlockEO.com) | TBS Education | Baja Bhais
