Fundamentals Of Python In Data Science

Dec 21, 2020 · 7 min read

Python is a widely used, user-friendly, general-purpose language that is designed to be highly readable. Python is an interpreted language: unlike C and its variants, a Python program does not need a separate compilation step before it is run. Other interpreted languages include PHP and Ruby.

Popularity of Each Programming Language.

The reason for this growth is that Python is user-friendly and easy to debug. It has an extensive collection of libraries that lets us do far more than basic scripting. For Data Science in particular, Python shines because its numerous libraries and built-in features make it easy to tackle common needs.

Before getting into the Python fundamentals, one should know how to set up a Python environment and how to use a text editor. Now let us focus on the fundamentals of Python and learn a few things about it.

The basic Python syllabus can be broken down into the following topics:

· Data types: Integer (int), Float (float), String (str)

· Data structures: Lists, Tuples, Dictionaries and Sets

· Loops and Functions

· Libraries

Everything in Python is an object: data types are classes, and variables refer to instances of these classes.

Before diving in, one should know what a variable is. Variables are used everywhere in Python and are generally named after what they represent, which makes code easier to interpret. Basically, a variable is a name that refers to a location in memory where some value or data is stored.

x = 100   #Here x is a variable and 100 is assigned to variable 'x'

The first step towards understanding Python is knowing how it interprets data. Every value in Python has a type. Some widely used data types are Integer (int), String (str), Float (float) and Boolean (bool).

a = 5            #data type is an integer
print(type(a))
<class 'int'>

a = 'Vihaan'     #data type is a string
print(type(a))
<class 'str'>

a = 12.23        #data type is float. It is mainly a decimal number.
print(type(a))
<class 'float'>

Typecasting is one of the most important fundamental topics in Python. It means converting a value of one type into another type, if possible.

a = '4432'       #digits written inside '' form a string
print(type(a))
<class 'str'>
a = int(a)       #int() converts the numeric string into an integer
print(type(a))
<class 'int'>

Sometimes type conversion is not possible. For example, a string that does not represent a number, such as 'Vihaan', cannot be converted to an integer. In that case Python raises an error (a ValueError).
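A failed conversion can be sketched as follows. The value 'Vihaan' is only a sample string, and the try/except wrapper is added here so the snippet runs to the end instead of stopping at the error.

```python
# Hypothetical example value: a string that does not look like a number.
name = 'Vihaan'

try:
    number = int(name)          # int() cannot read letters as digits
except ValueError as error:
    number = None               # the conversion did not happen
    message = str(error)
    print(message)              # invalid literal for int() with base 10: 'Vihaan'
```

Without the try/except, Python would stop and print the same message as part of a traceback.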

One very elegant thing about Python is how it reports errors. The message makes it easy to understand what went wrong and where it happened. This is one of the reasons why Python is getting popular.

Strings: They are widely used. A string is textual data, and string operations are very useful. Some of them are:

· Splitting and joining strings using the split() and join() methods

· Changing the case of a string with the lower() and upper() methods

· Concatenating strings with ‘+’
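The string operations listed above can be tried out with a short sketch like this one. The sample sentence and the variable names are made up for illustration.

```python
sentence = 'Python is easy to read'    # hypothetical sample text

words = sentence.split()               # split on whitespace into a list of words
joined = '-'.join(words)               # join the words back together with '-'
lowered = sentence.lower()             # all lowercase
uppered = sentence.upper()             # all uppercase
greeting = 'Hello, ' + 'Vihaan'        # '+' concatenates two strings

print(words)      # ['Python', 'is', 'easy', 'to', 'read']
print(joined)     # Python-is-easy-to-read
print(greeting)   # Hello, Vihaan
```

Note that join() is called on the separator string, not on the list.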

For more details, refer to this.


A list is an ordered sequence of items. It is one of the most used data types in Python and is very flexible: the items in a list do not need to be of the same type. A list is written with square brackets []. It is one of the sequence data structures.

Lists are indexable, which means each element has an index: its position in the list. Indexing starts from 0, so the first element has index 0. Here, for example, we asked for the 2nd value of the list (index 1) and the output is 3. Lists are mutable, which means their contents can be changed after creation.
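A small sketch of indexing and mutability follows. The list contents are an assumption, chosen so that the second element is 3.

```python
numbers = [1, 3, 5, 7]     # hypothetical list; square brackets create it

first = numbers[0]         # index 0 is the first element
second = numbers[1]        # index 1 is the second element
print(second)              # 3

numbers[0] = 100           # lists are mutable, so an element can be replaced
print(numbers)             # [100, 3, 5, 7]
```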

There can be a list inside a list, which is called a nested list.
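A nested list might look like this; the values are only illustrative. Each extra pair of square brackets goes one level deeper.

```python
nested = [1, 2, ['a', 'b', 'c'], 3]   # the third item is itself a list

inner = nested[2]          # the inner list
letter = nested[2][1]      # second item of the inner list

print(inner)               # ['a', 'b', 'c']
print(letter)              # b
```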


There are some basic methods and operations we can apply to lists: appending, extending, sorting, reversing, deleting, etc.
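The list operations mentioned above can be sketched in one go. The scores list is a made-up example, and the comments show the list after each step.

```python
scores = [40, 10, 30]        # hypothetical starting list

scores.append(20)            # add one item at the end   -> [40, 10, 30, 20]
scores.extend([60, 50])      # add several items at once -> [40, 10, 30, 20, 60, 50]
scores.sort()                # sort in place             -> [10, 20, 30, 40, 50, 60]
scores.reverse()             # reverse in place          -> [60, 50, 40, 30, 20, 10]
del scores[0]                # delete by index           -> [50, 40, 30, 20, 10]
scores.remove(30)            # delete by value           -> [50, 40, 20, 10]

print(scores)                # [50, 40, 20, 10]
```

All of these change the list itself instead of returning a new one.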


A tuple is an ordered sequence of items, the same as a list. The main difference is that tuples are immutable: once created, a tuple cannot be modified. Tuples are enclosed in parentheses ().

Python will raise an error if we try to change an existing tuple.
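As a minimal sketch, trying to assign to an item of a tuple fails with a TypeError. The tuple values are hypothetical, and the try/except is only there so the snippet keeps running.

```python
point = (10, 20)             # hypothetical tuple

try:
    point[0] = 99            # tuples do not allow item assignment
except TypeError as error:
    message = str(error)
    print(message)           # 'tuple' object does not support item assignment

print(point)                 # (10, 20), unchanged
```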

Again, the error is so nicely explained.

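Tuples have index values like lists. There can be a tuple inside a tuple, which is referred to as a nested tuple. Tuples have only two methods, count() and index(). Because tuples are immutable, they have no sort() method; the built-in sorted() returns a new sorted list instead, and del can only delete a whole tuple, not a single item.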
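The following sketch shows indexing, nesting and the two tuple methods, plus sorted() as the usual way to get sorted data out of a tuple. All values are made up for illustration.

```python
colors = ('red', 'green', 'red', ('light', 'dark'))   # last item is a nested tuple

first = colors[0]                 # indexing works as with lists
shade = colors[3][1]              # second item of the nested tuple
reds = colors.count('red')        # how many times 'red' appears
position = colors.index('green')  # index of the first 'green'
ordered = sorted((3, 1, 2))       # sorted() returns a new list, not a tuple

print(shade)      # dark
print(reds)       # 2
print(ordered)    # [1, 2, 3]
```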


A set is an unordered collection of unique items. A set is defined by values separated by commas inside braces ‘{ }’. Sets can be used for math operations like union, intersection and symmetric difference.

One unique thing about sets is they do not allow duplicate elements.
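Here is a short sketch of how duplicates disappear and how the math operations look in code. The two sets are hypothetical examples.

```python
a = {1, 2, 2, 3, 3, 3}       # duplicates are dropped, leaving {1, 2, 3}
b = {3, 4, 5}

union = a | b                # items in either set
intersection = a & b         # items in both sets
sym_difference = a ^ b       # items in exactly one of the sets

print(len(a))                # 3
print(intersection)          # {3}
```

The same operations are also available as methods: a.union(b), a.intersection(b) and a.symmetric_difference(b).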

Some important methods used in sets are:

· add(), adds an element to the set.

· pop(), removes an arbitrary element from the set and returns it.

· remove(), removes the given element from the set and raises a KeyError if it is not present.
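These three methods can be sketched as below. The fruit names are only sample data, and because sets are unordered, which element pop() returns is not something to rely on.

```python
fruits = {'apple', 'banana'}   # hypothetical set

fruits.add('cherry')           # add a new element
fruits.add('apple')            # adding an existing element changes nothing
fruits.remove('banana')        # remove a specific element

taken = fruits.pop()           # removes and returns an arbitrary element

try:
    fruits.remove('mango')     # 'mango' was never in the set
except KeyError:
    missing = True             # remove() raises KeyError for absent items

print(len(fruits))             # 1
```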


In Python, dictionaries are defined within braces {} with each item being a pair in the form key: value. Values can be of any type, while keys must be hashable (for example strings, numbers or tuples). A dictionary is a collection of key-value pairs; since Python 3.7 it keeps its keys in insertion order. We can also create a dictionary from a list of (key, value) tuples using dict().

For more details regarding dictionaries, refer to this.
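A minimal sketch of both ways to create a dictionary, plus reading and adding items. The keys and values are hypothetical.

```python
person = {'name': 'Vihaan', 'age': 25}       # braces with key: value pairs

pairs = [('x', 1), ('y', 2)]                 # a list of (key, value) tuples
coords = dict(pairs)                         # dict() turns it into a dictionary

age = person['age']                          # look up a value by its key
person['city'] = 'Pune'                      # add a new key-value pair
country = person.get('country', 'unknown')   # get() returns a default if the key is missing

print(coords)     # {'x': 1, 'y': 2}
print(country)    # unknown
```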


As the name suggests, a loop repeats a block of code over and over until its condition becomes false. The image gives a clear understanding of how loops work.

There are mainly two types of loops one should focus on:

· While loop

· For loop

While loop:

A while loop is used to repeat a block of code as long as its condition is true.

Syntax of While loop.

In loops, indentation matters a lot. Python uses indentation as its method of grouping statements. As long as the condition is true, the loop runs all of the indented code on every iteration.

Here is an example of how a while loop works.
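As a sketch, this while loop adds up the numbers 1 to 5. The variable names are made up, and the indented lines form the body that repeats.

```python
count = 1
total = 0

while count <= 5:        # condition is checked before every iteration
    total += count       # indented lines belong to the loop body
    count += 1           # without this line the loop would never end

print(total)             # 15
```

The print() call is not indented, so it runs once, after the loop has finished.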

For loop:

The for loop in Python is used to iterate over a sequence (list, tuple, string) or other iterable objects. Iterating over a sequence is called traversal.

Syntax of For loop.
How For loop works.
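A for loop over a list and over a string might look like this. The data is hypothetical; the loop variable takes each item in turn.

```python
squares = []
for n in [1, 2, 3, 4]:       # traverse a list
    squares.append(n * n)

letters = []
for ch in 'abc':             # a string is a sequence of characters
    letters.append(ch)

print(squares)               # [1, 4, 9, 16]
print(letters)               # ['a', 'b', 'c']
```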


A function is a group of related statements that performs a specific task.

Functions help break our program into smaller and modular chunks. As our program grows larger and larger, functions make it more organized and manageable.

The DRY principle, that is ‘Don't Repeat Yourself’, encourages writing code that is more reusable and less repetitive.

The image above shows the flow of a function and when it is called or used. Functions make the code more readable, reduce redundancy, make the code reusable, and save time.

Example of how functions work.

We can call a function by writing its name with the proper arguments inside the parentheses. Python raises an error if the number of arguments passed does not match the number of parameters the function requires. For example, here we have to pass 2 parameters in order to call the function.
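A minimal sketch of defining and calling a function with two parameters follows. The function name add and its body are assumptions made for illustration, not the function from the original screenshot.

```python
def add(a, b):
    """Return the sum of two numbers (hypothetical example function)."""
    return a + b

result = add(2, 3)           # both required arguments are passed
print(result)                # 5

try:
    add(2)                   # only one argument: Python refuses the call
except TypeError as error:
    message = str(error)
    print(message)           # add() missing 1 required positional argument: 'b'
```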

In Data Science, we use a lot of libraries. To name a few, we often use pandas, Matplotlib, NumPy, etc. One should never be afraid of, or hold back from, learning new libraries, because new libraries will keep being developed as long as programming is alive.

Updating yourself is the mantra in Programming and in Data science.

How to import libraries into your workspace?

The math module is one of the libraries that ships with Python.

To learn more about math or NumPy, please refer to the links.
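The common import styles can be sketched with the standard-library math module, so nothing needs to be installed. The alias m is only an example name.

```python
import math                  # import the whole module
from math import pi          # import a single name from the module
import math as m             # import the module under a shorter alias

root = math.sqrt(16)         # call a function through the module name
area = pi * 2 ** 2           # use the imported constant directly
floored = m.floor(3.7)       # same module, reached through the alias

print(root)                  # 4.0
print(floored)               # 3
```

Third-party libraries such as NumPy or pandas are imported the same way once they are installed.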

These libraries and modules have defined classes, attributes, and methods that we can use to accomplish our tasks. The most important thing is to learn how to read the documentation of the libraries.

This covers the fundamentals of programming in Python. Not everything is covered here, but as we keep going and work on different projects, we will become more familiar with and accustomed to it.

That concludes our tutorial of the basics of Python programming!

Thank you.

Please do give feedback and ask your questions. I would be happy to help.

Analytics Vidhya
