Quick Python for Data Science, Part 1: Introduction and Basics

Swamy Sriharsha
Jun 9, 2018

This is a compilation of Python concepts for readers who have studied Python before. It was prepared with reference to appliedaicourse.com.

The objective of this article is to present the Python concepts needed for data science, short and crisp, in one place.

Why Python?
- simple, readable language
- provides widely used packages for data science and AI: matplotlib, numpy, scipy, scikit-learn, tensorflow
- extensively used in the industry

Keywords : reserved words in Python

#Get all keywords in Python
import keyword #import the keyword module into your file
print(keyword.kwlist) #prints all available keywords as a list
print(len(keyword.kwlist)) #prints the total number of keywords

Identifiers : names given to entities such as classes, functions and variables
- the rules are similar to most programming languages: letters, digits and underscores are allowed, and a name cannot start with a digit
- keywords cannot be used as identifiers
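As a quick check of these rules, here is a minimal sketch that combines the built-in str.isidentifier() method with keyword.iskeyword(). The helper name is_valid_name is made up for this illustration.

```python
import keyword

def is_valid_name(name):
    # hypothetical helper: valid identifier syntax and not a reserved keyword
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_name("total_sum"))  # True
print(is_valid_name("2nd_value"))  # False, starts with a digit
print(is_valid_name("for"))        # False, reserved keyword
```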

Comments : ignored by the interpreter
- #single line comments
- '''this is an example
for multi line comments''' (technically an unassigned triple-quoted string, commonly used as a block comment)

Indentation : a code block starts with indentation and ends with the first unindented line
- the amount of indentation is up to you, but it must be consistent throughout that block
- four spaces are generally used and are preferred over tabs

for i in range(6):
    print(i) #it is in the loop
print("loop ends before this line") #it is not in the loop

Statements : instructions that the Python interpreter can execute

#single statement, which assigns the value 10 to variable a
a = 10
#multi-line statement using a backslash for line continuation
a = 1+2+3+ \
    4+5+6+ \
    7+8
#(or) wrap the expression in parentheses
a = (1+2+3+
     4+5+6+
     7+8)
print(a) #36

#put multiple statements in a single line using ;
a = 10; b = 20; c = 30

Variables : a variable is a name that refers to a value stored in memory; we can treat it as a placeholder for that value.

a = 10
b = 5.5
c = "ML"
#multiple assignments
a, b, c = 10, 5.5, "ML"
#assign the same value to multiple variables at once
a = b = c = "AI"

Storage Locations

x = 3
print(id(x)) #prints the identity (the memory address in CPython) of the object x refers to, e.g. 140372891159288
y = 3
print(id(y)) #prints the same value, e.g. 140372891159288
#x and y point to the same memory location because the value is the same in both (CPython caches small integers)
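The same idea can be seen with a mutable object. In this minimal sketch (the variable names are illustrative), assigning one name to another does not copy the value; both names refer to the same object.

```python
scores = [90, 85]
alias = scores            # no copy is made; both names refer to one list
alias.append(70)

print(scores)                     # [90, 85, 70]
print(id(scores) == id(alias))    # True

copy_of_scores = list(scores)     # an explicit copy is a separate object
print(id(scores) == id(copy_of_scores))  # False
```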

Data Types : every value in Python has a data type
- data types are actually classes, and variables are instances (objects) of these classes
- Python is dynamically typed: the type comes from the value and is not declared

#type(varname) tells which class a variable or value belongs to
a = 10
print(type(a)) #<class 'int'>
#isinstance() checks whether an object belongs to a particular class
print(isinstance(1+2j, complex)) #True

- int, long, float, complex (numbers in Python 2)
- int, float, complex (Python 3)
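A short sketch of type() and isinstance() on a few values, assuming Python 3 (the list values is only an illustration):

```python
values = [10, 5.5, 1 + 2j, "ML", True]
for v in values:
    # type(v).__name__ gives the class name as a plain string
    print(v, type(v).__name__)

print(isinstance(10, int))             # True
print(isinstance(5.5, (int, float)))   # True, a tuple of classes is allowed
print(isinstance(True, int))           # True, bool is a subclass of int
```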

Strings : a string is a sequence of Unicode characters (not just ASCII)
- Strings are immutable: the characters of a string cannot be changed once it has been assigned

s = "This is an online AI course" #remember quotation rules
print(type(s)) #<class 'str'>
#use triple quotes for multi-line strings or strings that contain apostrophes and quotes
print(s[-1]) #prints the last char e; same as print(s[len(s)-1])
print(s[5:]) #slicing: is an online AI course
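To illustrate immutability, this sketch tries to change a character in place and then builds a new string instead. The name lowered is made up for the example.

```python
s = "This is an online AI course"

try:
    s[0] = "t"            # item assignment is not allowed on str
except TypeError as err:
    print("cannot modify a string:", err)

lowered = "t" + s[1:]     # build a new string instead
print(lowered)            # this is an online AI course
print(s)                  # This is an online AI course (original unchanged)
print(s[0:4])             # This
```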

List : a list is an ordered sequence of items
- the items in a list do not need to be of the same type
- Lists are mutable: the items of a list can be altered or deleted

a = [10, 20.5, "hello"]
a[1] = 30.7 #lists are mutable
print(a) #[10, 30.7, 'hello']
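A few more in-place changes on a list, as a minimal sketch of what "mutable" allows:

```python
a = [10, 20.5, "hello"]

a.append(40)        # add an item at the end
print(a)            # [10, 20.5, 'hello', 40]

del a[0]            # delete the item at index 0
print(a)            # [20.5, 'hello', 40]

a[1] = "world"      # replace an item in place
print(a)            # [20.5, 'world', 40]
print(len(a))       # 3
```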

Tuple : a tuple is an ordered sequence of items, same as a list
- the only difference is that tuples are immutable: once created, a tuple cannot be modified

t = (1, 1.5, "ML")
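The sketch below shows that reading from a tuple works like a list, while assignment raises an error. The names count, ratio and label are illustrative.

```python
t = (1, 1.5, "ML")

print(t[2])                 # ML
try:
    t[0] = 100              # tuples do not support item assignment
except TypeError as err:
    print("cannot modify a tuple:", err)

count, ratio, label = t     # unpack the tuple into three names
print(count, ratio, label)  # 1 1.5 ML
```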

Sets : a set is an unordered (cannot be indexed) collection of unique (no duplicates) items
- the set itself is mutable: we can add or remove items from it

a = {10, 30, 20, 40, 50}
print(a) #e.g. {50, 40, 10, 20, 30}; the printed order is arbitrary
print(type(a)) #<class 'set'>
b = {10, 20, 20, 30, 30, 30}
print(b) #{10, 20, 30}, duplicates are dropped
a[2] #TypeError: set objects do not support indexing because they are unordered
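Adding and removing items, plus the usual set operations, can be sketched as follows. sorted() is used only to print the items in a predictable order.

```python
a = {10, 20, 30}
b = {30, 40}

a.add(50)               # add an item
a.discard(10)           # remove an item (no error if it is missing)
print(sorted(a))        # [20, 30, 50]

print(sorted(a | b))    # union: [20, 30, 40, 50]
print(sorted(a & b))    # intersection: [30]
print(20 in a)          # True
```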

Dictionary : a collection of key-value pairs (insertion-ordered since Python 3.7, unordered in earlier versions)

d = {'a': "apple", 'b': "bat"}
print(d['a']) #apple
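A minimal sketch of adding, updating and reading dictionary entries. The printed order assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
d = {'a': "apple", 'b': "bat"}

d['c'] = "cat"          # add a new key-value pair
d['a'] = "ant"          # update the value of an existing key
print(d)                # {'a': 'ant', 'b': 'bat', 'c': 'cat'}

print(d.get('z', "missing"))    # missing (d['z'] would raise KeyError)

for key, value in d.items():    # iterate over the pairs
    print(key, value)
```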

Type Conversions

float(5) #5.0
int(100.5) #100
str(20) #'20'
int('10p') #ValueError: the string must contain a valid integer
list("hello") #['h', 'e', 'l', 'l', 'o'], converts str to list
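When the input may be invalid, the conversion error can be caught rather than left to stop the program. This is only a sketch; the helper to_int and its default parameter are hypothetical names.

```python
def to_int(text, default=None):
    # hypothetical helper: return default instead of raising on bad input
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("10"))          # 10
print(to_int("10p"))         # None
print(to_int("10p", 0))      # 0
print(int(float("100.5")))   # 100, a decimal string must go through float first
```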

Standard Input and Output

#print() sends data to the standard output device
#output formatting
print("the value of a is {} and b is {}".format(a, b)) #default order
print("the value of b is {1} and a is {0}".format(a, b)) #positional indexes
print("hello {name}, {greeting}".format(name="harsha", greeting="good mrng")) #keyword arguments
print("the story of {0}, {1}, and {other}".format("a", "b", other="c")) #both combined
#input
num = int(input("Enter a number: ")) #input() returns a str, so convert it
myinp = raw_input("Enter something: ") #Python 2 only; in Python 3 use input()

Operators : carry out arithmetic or logical computation

- most operators are the same as in other high-level programming languages; in addition, Python has the following. (Note: Python does not support increment/decrement operators)

- floor division (//)
- exponent (**)
- logical operators (and, or, not)
- bitwise operators (&, |, ~, ^, <<, >>)
- assignment (+=, //=, and so on)
- identity operators (is, is not): used to check whether two values (or variables) are located in the same part of memory

a = 5; b = 5
print(a is b) #True
#5 is an object created once; both a and b point to the same object
l1 = [1, 2, 3]
l2 = [1, 2, 3]
print(l1 is l2) #False, equal contents but two separate objects

- membership operators (in, not in): used to test whether a value or variable is found in a sequence or collection (string, list, tuple, set or dictionary)

d = {1: "a", 2: "b"}
print(1 in d) #True, for a dictionary the test is on the keys
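The Python-specific operators listed above can be tried out with a few small values. This is a sketch; the numbers are arbitrary.

```python
print(7 // 2)      # 3, floor division
print(-7 // 2)     # -4, rounds toward negative infinity
print(2 ** 10)     # 1024, exponent
print(5 & 3)       # 1, bitwise and (101 & 011)
print(5 | 3)       # 7, bitwise or
print(5 ^ 3)       # 6, bitwise xor
print(1 << 4)      # 16, left shift

count = 0
count += 1         # used in place of an increment operator
print(count)       # 1

l1 = [1, 2, 3]
l2 = [1, 2, 3]
print(l1 == l2, l1 is l2)   # True False
```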

Decision Making and Control flow :

if else : used for decision making
0, None, False and empty containers ("", [], {}) : false
anything else, including True and non-zero values : true
if..elif..else
while loop: used to iterate over a block of code as long as the test expression (condition) is true.
while loop with else: the else part is executed once the condition becomes false; it is skipped if the loop exits through break
for loop: used to iterate over a sequence (list, tuple, string) or other iterable objects.
for loop with else: the else part is executed after all items have been processed; it is skipped if the loop exits through break
range(n) #generates numbers from 0 to n-1
range(start, end, stepsize)
range(1, 20, 2) #odd numbers 1, 3, ..., 19; the end value 20 is excluded
break and continue: can alter the flow of a normal loop
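The loop constructs above, including the else part and break/continue, can be sketched together as follows. The function first_even is a hypothetical helper written for this illustration.

```python
def first_even(numbers):
    # hypothetical helper: return the first even number, or None
    for n in numbers:
        if n % 2 != 0:
            continue        # skip odd numbers
        found = n
        break               # leave the loop, so the else part is skipped
    else:
        found = None        # runs only when the loop was not broken
    return found

print(first_even([1, 3, 8, 10]))   # 8
print(first_even([1, 3, 5]))       # None
print(list(range(1, 20, 2)))       # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

i = 0
while i < 3:
    i += 1
else:
    print("while loop finished, i =", i)   # while loop finished, i = 3
```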

Thanks for reading, click here for Part 2. cheers :)
