Introductory note on Data Science using the Python programming language (Article 1: Lists, Strings, and Dictionaries)

Ayush Singh
Published in Analytics Vidhya
6 min read · Apr 26, 2020

Before starting this series, ‘Introduction to Data Science using the Python programming language’, it is important to know what exactly Data Science is.
Data science is a blend of data inference, algorithm development, and technology used to solve complex problems. It is therefore a field of study that requires programming skills along with mathematical and statistical knowledge. Put more simply, Data Science deals with processing raw data or information and extracting meaningful conclusions, or inferences, from it, which makes the data easier for us humans to understand and study. A Data Scientist, therefore, processes the raw data, extracts meaningful inferences, and applies machine learning algorithms to that data, which may be images, videos, text, etc., to build an Artificial Intelligence model. These models then perform the specific tasks they were trained for, benefiting mankind in various ways.
You might wonder why I chose Python for coding. There are a lot of reasons why Data Scientists love Python. You might think Ruby stands out brilliantly for tasks such as data cleaning and data munging, but Python offers many more machine learning libraries, which is one major reason to choose it. From a syntax perspective, I find Python pretty simple and easy to understand and learn, though this differs from person to person. There are pros and cons to everything; one must just know where one fits best.

After this short note on Data Science, we are all set to start the series. If you are not familiar with Python at all, I strongly recommend that you first learn its basic syntax, along with an elementary idea of data types and Python types & sequences, as you need that prior knowledge in order to continue.
With the prerequisites checked, let us begin. First, we will work on Lists.

Lists

A list is a collection that is ordered and changeable or mutable. In Python, lists are written with square brackets [ ].
In the code snippet below I have declared a list ‘x’ and shown the two basic operations you can perform on lists, ‘append()’ and ‘remove()’. Why do I call these the ‘basic operations’ on lists? Because lists are mutable, which means you can alter the contents of a list by adding or deleting elements. Tuples are immutable, so these operations are not available on them. So you see, Python provides us with Lists, whose contents you can change, and Tuples, which you can use to store constants or any data that should not be altered in any way. Here is the code snippet for you:
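A minimal sketch of such a snippet, reconstructed from the description in the text. The starting list and the two append() calls follow the article; the element passed to remove() is an assumption made for illustration.

```python
# Declare a list with mixed element types
x = [1, 2, 'c']

# append() adds a single element to the end of the list
x.append(99)       # an integer
x.append('abc')    # a string works just as well
print(x)           # [1, 2, 'c', 99, 'abc']

# remove() deletes the first occurrence of the given value
# (removing 'c' is an assumption made for this sketch)
x.remove('c')
print(x)           # [1, 2, 99, 'abc']
```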

So, initially, the list ‘x’ had the elements [1, 2, ‘c’], but by using the append() method we added the integer 99 to the list. To show that append() works for all kinds of data types, the code also has append(‘abc’), which is a string and can be added without any trouble. Then we have remove(), used in the code snippet, which removes the first occurrence of the item/element given in the parentheses. So this was all about how you can add or delete elements in a list.
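For contrast with the tuples mentioned earlier, the following sketch shows what happens when the same kind of change is attempted on a tuple. The tuple `t` and the names `append_error` and `assign_error` are hypothetical, introduced only for this illustration.

```python
t = (1, 2, 'c')  # hypothetical tuple for illustration

# Tuples have no append() method, so this raises AttributeError
try:
    t.append(99)
except AttributeError as err:
    append_error = type(err).__name__
    print('append failed:', append_error)

# Item assignment is rejected as well, with TypeError
try:
    t[0] = 100
except TypeError as err:
    assign_error = type(err).__name__
    print('assignment failed:', assign_error)

print(t)  # the tuple is unchanged: (1, 2, 'c')
```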
One would definitely want to iterate or access the data individually in a list, so let us do that. The code snippet below shows us two ways on how we can iterate through the items in a list, using the loop method, and index operator method :
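A minimal sketch of the two approaches, assuming a small sample list. The list contents and the choice of a while loop for the index-based version are assumptions, not necessarily what the original snippet used.

```python
x = [1, 2, 99, 'abc']  # sample list, assumed for this sketch

# Method 1: loop directly over the items
for item in x:
    print(item, end=" ")   # end=" " keeps the output on one line
print()                    # move to a new line

# Method 2: use the index operator together with len()
i = 0
while i < len(x):
    print(x[i], end=" ")
    i += 1
print()
```

Both loops print the same elements in the same order; the first form is usually preferred when the index itself is not needed.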

We also see the len() function here, which simply returns the length of the list, and end=" ", which makes print() put the elements on the same line.

Next, we will find out what more can we do with lists.
1. List concatenation
Let us use ‘+’ to concatenate lists.

print([1,2] + [3,4])
#prints ->[1, 2, 3, 4]

2. Elements repetition
Repeating lists using ‘*’

print([1,2,3]*3)
#prints ->[1, 2, 3, 1, 2, 3, 1, 2, 3]

3. Find a specific element
Let us use the ‘in’ operator

x1 = [1, 2, 3]  # example list (assumed here; x1 is not defined earlier in the article)
x3 = 1 in x1
print(x3)
#prints -> True
# As the 'in' operator checked for us whether 1 is in the list x1 or not

One additional thing I would love to highlight, as I find it pretty cool, is unpacking a sequence. You can try it by running the code given below.

'''Let us now see Sequence unpacking '''
x6 = ('Ayush', 'Singh', 'singh@x.com')
first, last, email = x6
print(first)
#prints ->Ayush
print(last)
#prints -> Singh
print(email)
#prints -> singh@x.com
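As a small extension of the same idea, the sketch below shows that unpacking also works with lists, that a starred name can collect leftover items, and what happens when the number of names does not match. All variable names here are hypothetical.

```python
# Unpacking works for lists as well as tuples
first, last = ['Ayush', 'Singh']
print(first, last)

# A starred name collects any leftover items into a list
head, *rest = [1, 2, 3, 4]
print(head)  # 1
print(rest)  # [2, 3, 4]

# A mismatch between names and values raises ValueError
unpack_failed = False
try:
    a, b = (1, 2, 3)
except ValueError as err:
    unpack_failed = True
    print('unpacking failed:', err)
```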

Well, that’s all I wanted to show about Lists. There are many more operations you can perform on lists, and playing around with them is, I think, the best way to learn.

Strings

Moving on, let us work on strings in Python. If you are familiar with data types or have a fairly solid coding background, then you surely know this data type inside out. In Python, there are a lot of operations one can perform on strings, and in the world of Data Science, operations on Strings do play a major role.
We will first see how we can slice a string to get a substring out of it. String slicing is pretty simple: as you might have already guessed, it extracts a part of the string, and the result is itself a string. Have a look at the code snippets below.

x4 = 'Dr. Abhishek Singh'
print(x4[0])
#first character, prints -> D

This helps us print the first character of the string ‘x4’.

Now, have a look at this code:

print(x4[0:1])
#prints ->D

This prints the same output as the code above it, but here we have explicitly set the end index of the slice: x4[0:1] starts at index 0 and stops just before index 1.
So, it’s that simple to slice a String. Have a look at the code snippet below, where I have performed a few operations; the comments explain what each operation does, and the outputs shown should boost your understanding.
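A minimal sketch of a few such operations on the string ‘x4’ defined earlier. The particular slices chosen here are assumptions for illustration, not necessarily the ones in the original snippet.

```python
x4 = 'Dr. Abhishek Singh'

print(x4[-1])     # last character -> h
print(x4[-5:])    # last five characters -> Singh
print(x4[:3])     # from the start up to (not including) index 3 -> Dr.
print(x4[4:])     # from index 4 to the end -> Abhishek Singh

# split() breaks the string into a list on the given separator
parts = x4.split(' ')
print(parts)      # ['Dr.', 'Abhishek', 'Singh']
print(parts[0])   # first word -> Dr.
print(parts[-1])  # last word -> Singh
```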

Note: split returns a list of all the words in a string, or a list split on a specific character.

Regarding Strings, there is something I always lay special emphasis on while concatenating: make sure you convert objects to strings before concatenating.
‘Chris’ + 2 will raise an error: TypeError: can only concatenate str (not “int”) to str.
Here’s the correct way to do it.

print('Python' + str(2))
#prints -> Python2
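As an alternative to calling str() by hand, Python's string formatting converts the value for you. The sketch below uses an f-string and str.format(); the variable names `count` and `label` are hypothetical.

```python
count = 2  # hypothetical value for illustration

# An f-string converts the value to text while formatting
label = f'Python{count}'
print(label)  # Python2

# str.format() gives the same result
print('Python{}'.format(count))  # Python2
```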

That’s a wrap for Strings. I find these operations essential and worth highlighting, but if you want to learn more, I highly recommend going through the Python documentation.

Dictionaries

In Python, a Dictionary is a collection of data values enclosed in { }, each value being associated with a key, so it works like a map. A Dictionary holds key: value pairs. Since Python 3.7 a dictionary preserves the order in which its keys were inserted; before that, the ordering was not guaranteed.
Because Dictionaries associate keys with values, we must know the key in order to access a particular value. One additional point regarding keys: they must be unique and hashable, so immutable types such as strings, numbers, and tuples of such values can serve as keys, while lists cannot.
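The restrictions on keys can be seen in a short sketch. The dictionaries `d` and `dup` and the flag `list_key_rejected` are hypothetical names used only for this illustration.

```python
# Strings, numbers, and tuples are hashable, so they work as keys
d = {'name': 'Ayush', 7: 'seven', (1, 2): 'point'}
print(d[(1, 2)])  # point

# Repeating a key keeps only the last value assigned to it
dup = {'a': 1, 'a': 2}
print(dup)  # {'a': 2}

# A list is mutable and unhashable, so it is rejected as a key
list_key_rejected = False
try:
    bad = {[1, 2]: 'point'}
except TypeError as err:
    list_key_rejected = True
    print('invalid key:', err)
```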
Let us now work with Dictionaries.
In the code snippet provided below, x5 is a dictionary that holds two values, ‘ayush@x.com’ and ‘abhishek.com’. Since a Dictionary holds key: value pairs, ‘Ayush’ and ‘Abhishek’ are the keys.
So here’s the general format to declare a Dictionary:
var_name = {'<Key>': '<Value>'}
The comments written along with the code will help you understand it better.

x5 = {'Ayush': 'ayush@x.com', 'Abhishek': 'abhishek.com'}
print(x5['Ayush']) # Retrieve a value by using the indexing operator
#prints->ayush@x.com
print(x5['Abhishek'])
#prints ->abhishek.com
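Since a value can only be retrieved through its key, it helps to know what happens when the key is missing. The following is a small sketch using the same dictionary; the key ‘Chris’ and the default ‘unknown’ are hypothetical.

```python
x5 = {'Ayush': 'ayush@x.com', 'Abhishek': 'abhishek.com'}

# Indexing with a key that does not exist raises KeyError
try:
    x5['Chris']
except KeyError:
    print('no entry for Chris')

# The 'in' operator checks whether a key is present
print('Ayush' in x5)  # True

# get() returns a default value instead of raising an error
fallback = x5.get('Chris', 'unknown')
print(fallback)       # unknown
```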

Let us perform a couple of cool operations on Dictionaries.
1. Iterate over all of the keys and values:

for n in x5.keys():
    print(n)
#prints -> Ayush and Abhishek, one per line

for n in x5.values():
    print(n)
#prints -> ayush@x.com and abhishek.com, one per line

2. Access the keys and the values together:

for name, email in x5.items():
    print(name)
    print(email)
#here we can access the keys as well as the values, together
'''
prints-> Ayush
ayush@x.com
Abhishek
abhishek.com
'''

That will be enough for this article, but more articles focusing on Data Science using Python will follow. You can find the entire source code at:
https://github.com/ayush-670/PythonDataScience_basics
Stay tuned if you’re interested in Data Science and liked this article. I would love to hear from you if you have any queries.
Thank you so much for reading this article! I hope it helped you in some way or the other.
