Quick Python for Data Science, Part 2: Inbuilt Data Structures
This is the second part of the series. For Part 1, click here.
Python has four basic inbuilt data structures: lists, tuples, sets, and dictionaries.
Lists
- A list is an ordered sequence of items.
- The items in a list do not all need to be of the same type.
- Lists are mutable, meaning the items of a list can be altered or deleted.
emptylist = [] # creates an empty list
mylst = [1,3.5,'abc',89] # basic list syntax
lsts = [[1,2],[3,4]] # list of lists, or multi-dimensional list
len(mylst) # gives the length of a list, here 4
append, insert and extend
lst = [1,2,3]
lst2 = [4,5]
lst.append(value) # appends value at the end of the list
lst.insert(x, ele) # inserts element ele at index x
lst.append(lst2) # appends lst2 as a single element; starting from [1,2,3] this gives [1,2,3,[4,5]]
lst.extend(lst2) # appends the elements of lst2; starting from [1,2,3] this gives [1,2,3,4,5]
newlst = lst + lst2 # lists can also be extended using '+'
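To make the difference between append, extend, insert, and '+' concrete, here is a minimal sketch that starts from a fresh copy of the same list each time. The names base, appended, extended, combined, and inserted are illustrative only.

```python
base = [1, 2, 3]
lst2 = [4, 5]

appended = base.copy()
appended.append(lst2)    # lst2 is added as one nested element

extended = base.copy()
extended.extend(lst2)    # each element of lst2 is added individually

combined = base + lst2   # '+' builds a new list and leaves base unchanged

inserted = base.copy()
inserted.insert(1, 'x')  # 'x' is placed at index 1, later items shift right

print(appended)  # [1, 2, 3, [4, 5]]
print(extended)  # [1, 2, 3, 4, 5]
print(combined)  # [1, 2, 3, 4, 5]
print(inserted)  # [1, 'x', 2, 3]
```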
del, pop, remove and clear
del lst[i] # deletes the item at index i; it doesn't return the deleted item
a = lst.pop(i) # removes the item at index i and returns it to a
lst.remove(ele) # removes the first occurrence of ele in the list
lst.clear() # removes all items from the list; equivalent to del lst[:]
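A short walk-through of the four removal operations on one list may help. This is only a sketch with made-up values; the comments track the list contents after each step.

```python
lst = [10, 20, 30, 20, 40]

del lst[0]               # lst is now [20, 30, 20, 40]; nothing is returned
popped = lst.pop(1)      # returns 30; lst is now [20, 20, 40]
lst.remove(20)           # removes only the first 20; lst is now [20, 40]
after_remove = list(lst) # snapshot for inspection
last = lst.pop()         # with no index, pop removes the last item (40)
lst.clear()              # lst is now []

print(popped, last, after_remove, lst)  # 30 40 [20, 40] []
```

Note that remove raises a ValueError when the element is not in the list, so it is worth checking with `ele in lst` first when that is possible.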
reverse, sort and sorted
lst.reverse() # reverses the list in place
numbers = [3,1,6,2,8]
sorted_lst = sorted(numbers)
print(sorted_lst) # [1, 2, 3, 6, 8] - sorted in ascending order
print(numbers) # [3, 1, 6, 2, 8] - original list remains unchanged
sorted(numbers, reverse=True) # the sorted function with reverse=True sorts in descending order
lst.sort() # the sort method sorts the list in place, unlike the sorted function
# a list with elements of incomparable data types, such as [1,2,'b',5,'a'], cannot be sorted; it results in a TypeError
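The contrast between the sorted function and the sort method, and the TypeError for mixed types, can be sketched as follows. Variable names such as unchanged and error_name are illustrative.

```python
numbers = [3, 1, 6, 2, 8]

ascending = sorted(numbers)                 # new list, numbers untouched
descending = sorted(numbers, reverse=True)  # new list in descending order
unchanged = list(numbers)                   # snapshot before in-place sort

result = numbers.sort()  # sorts numbers itself and returns None

mixed = [1, 2, 'b', 5, 'a']
try:
    sorted(mixed)        # int and str cannot be compared with '<'
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(ascending)   # [1, 2, 3, 6, 8]
print(descending)  # [8, 6, 3, 2, 1]
print(unchanged)   # [3, 1, 6, 2, 8]
print(result)      # None
print(error_name)  # TypeError
```

A common slip is writing `numbers = numbers.sort()`, which replaces the list with None.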
string to list
s = "one,two,three" # s is a string
slst = s.split(',') # the default separator is whitespace (space or tab); here it is a comma
print(slst) # ['one', 'two', 'three']
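As a small sketch of the two split modes described above: with an explicit separator and with the whitespace default. The join call at the end is an addition for illustration; it is the usual way to reverse a split.

```python
s = "one,two,three"
by_comma = s.split(",")           # explicit separator

spaced = "one two\tthree"
by_whitespace = spaced.split()    # no argument: split on runs of whitespace

joined = ",".join(by_comma)       # list of strings back to one string

print(by_comma)       # ['one', 'two', 'three']
print(by_whitespace)  # ['one', 'two', 'three']
print(joined)         # one,two,three
```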
Slicing
lst[-1] # last element, using negative indexing
# list slicing: lst[start : end : step-size]
numbers[0:4] # from index 0 to index 3
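A few more slices on the same numbers list show how start, end, and step-size interact. This is a minimal sketch; the variable names are illustrative.

```python
numbers = [3, 1, 6, 2, 8]

last = numbers[-1]          # negative index counts from the end
first_four = numbers[0:4]   # end index is exclusive
from_two = numbers[2:]      # omitted end means "to the end"
every_other = numbers[::2]  # step-size of 2
reversed_copy = numbers[::-1]  # negative step walks backwards

print(last)           # 8
print(first_four)     # [3, 1, 6, 2]
print(from_two)       # [6, 2, 8]
print(every_other)    # [3, 6, 8]
print(reversed_copy)  # [8, 2, 6, 1, 3]
```

Slicing always returns a new list, so numbers itself is left unchanged.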
count and looping
marks = [72,67,92,72,85]
marks.count(72) # 2, the frequency (number of times it has occurred) of an element in the list
for ele in marks:
    print(ele)
list comprehension
List comprehensions provide a concise way to create lists.
squares = [i**2 for i in range(5)] # i² for all i from 0 to 4
print(squares) # [0, 1, 4, 9, 16]
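For comparison, here is the same squares list built with an ordinary loop, plus a comprehension with a filter condition. The names squares_loop and even_squares are illustrative.

```python
# Comprehension form
squares = [i**2 for i in range(5)]

# Equivalent explicit loop
squares_loop = []
for i in range(5):
    squares_loop.append(i**2)

# A trailing 'if' keeps only the items that pass the condition
even_squares = [i**2 for i in range(10) if i % 2 == 0]

print(squares)       # [0, 1, 4, 9, 16]
print(squares_loop)  # [0, 1, 4, 9, 16]
print(even_squares)  # [0, 4, 16, 36, 64]
```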
Tuples
- A tuple is an ordered sequence of items, the same as a list.
- The only difference is that tuples are immutable: once created, a tuple cannot be modified.
emptytuple = () # creates an empty tuple
t = (1,'raju',28,'abc') # basic tuple syntax
tl = (1,(2,3,4),[1,'abc',3]) # nested tuple
a common mistake: tuple vs string
t = ('harsha') # it is a string, same as t = "harsha"
t = ('harsha',) # it is a tuple, same as t = "harsha",
- Indexing is the same as for lists: t[i]
- Slicing is the same as for lists: t[:]
Unlike lists, tuples are immutable (we cannot change or delete their items).
t[2] = 'x' # TypeError: 'tuple' object does not support item assignment
# but we can change a list which is inside a tuple
tl[2][0] = 5
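The points above about single-element tuples, immutability, and mutable items inside a tuple can be checked with a short sketch. The values are the ones used in the text; the variable names message, single, and not_a_tuple are illustrative.

```python
tl = (1, (2, 3, 4), [1, 'abc', 3])

try:
    tl[0] = 99           # tuples do not support item assignment
    message = None
except TypeError as exc:
    message = str(exc)

tl[2][0] = 5             # allowed: this mutates the list inside the tuple

single = ('harsha',)     # trailing comma makes a one-element tuple
not_a_tuple = ('harsha') # parentheses alone leave it a plain string

print(message)            # 'tuple' object does not support item assignment
print(tl)                 # (1, (2, 3, 4), [5, 'abc', 3])
print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>
```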
concatenating, del, index
t = (1,2,3)+(4,5,6) # concatenating tuples
t = (('sri',)*3) # repeats the elements in a tuple: ('sri', 'sri', 'sri')
del t # deletes the entire tuple
t = (1,2,3,1,3,3,4,1)
t.index(3) # 2, returns the index of the first occurrence of 3
other basic functions
len(t)
sorted(t)
max(t)
min(t)
sum(t)
Sets
- A set is an unordered (cannot be indexed) collection of unique (no duplicates) items.
- The set itself is mutable: we can add or remove items from it.
# a set doesn't allow duplicates
s = {1,2,3,1,4}
print(s) # {1, 2, 3, 4}
s = set() # empty set
add, update
s = {1,3}
s.add(2) # adds an element to the set s
print(s) # {1, 2, 3}
s.update([5,6,1]) # adds multiple elements at a time
print(s) # {1, 2, 3, 5, 6}
# update also accepts several iterables at once, e.g. s.update([7,8], {9,10})
discard, pop, clear
s.discard(ele) # removes ele from the set s if it is present, else does nothing
s.remove(ele) # removes ele from the set s if it is present, else raises a KeyError
s.pop() # removes and returns an arbitrary element
s.clear() # removes all the elements from the set
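The difference between discard and remove only shows up when the element is missing. A minimal sketch, using 10 as a value that is deliberately not in the set:

```python
s = {1, 2, 3}

s.discard(10)            # 10 is absent: nothing happens, no error

try:
    s.remove(10)         # 10 is absent: remove raises KeyError
    remove_failed = False
except KeyError:
    remove_failed = True

popped = s.pop()         # removes and returns some element; which one is not guaranteed
remaining = len(s)

s.clear()                # s is now an empty set

print(remove_failed)  # True
print(remaining)      # 2
print(s)              # set()
```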
set operations
# Union
set1 | set2 # or set1.union(set2)
# Intersection
set1 & set2 # or set1.intersection(set2)
# Set difference
set1 - set2 # or set1.difference(set2)
# Symmetric difference
set1 ^ set2 # or set1.symmetric_difference(set2)
# Subset
set1.issubset(set2) # True or False
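With two concrete sets, the results of these operations look like this. The set contents are made up for illustration.

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5}

union = set1 | set2          # items in either set
intersection = set1 & set2   # items in both sets
difference = set1 - set2     # items in set1 but not in set2
symmetric = set1 ^ set2      # items in exactly one of the two sets
is_subset = {3, 4}.issubset(set1)

print(union)         # {1, 2, 3, 4, 5}
print(intersection)  # {3, 4}
print(difference)    # {1, 2}
print(symmetric)     # {1, 2, 5}
print(is_subset)     # True
```

The operator forms require both operands to be sets, while the method forms such as `set1.union([5, 6])` accept any iterable.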
frozen sets
Sets, being mutable, are unhashable, so they can't be used as dict keys. Frozensets, on the other hand, are hashable and can be used as keys in a dict.
Just as tuples are immutable lists, frozensets are immutable sets. Being immutable, a frozenset does not have methods that add or remove elements.
setf = frozenset([1,2,3,4])
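A quick sketch of what hashability means in practice: a frozenset works as a dict key, a plain set does not, and a frozenset has no add method. The dictionary lookup and its value 'first four' are hypothetical.

```python
setf = frozenset([1, 2, 3, 4])

lookup = {setf: 'first four'}            # frozenset as a dict key
found = lookup[frozenset([4, 3, 2, 1])]  # equal frozensets find the same entry

try:
    bad = {{1, 2}: 'x'}                  # a plain set is unhashable
    set_key_error = None
except TypeError as exc:
    set_key_error = type(exc).__name__

try:
    setf.add(5)                          # frozenset has no add method
    add_error = None
except AttributeError as exc:
    add_error = type(exc).__name__

print(found)          # first four
print(set_key_error)  # TypeError
print(add_error)      # AttributeError
```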
Dictionaries
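- A dictionary is a collection of key-value pairs. Since Python 3.7, dictionaries preserve insertion order, but items are accessed by key rather than by position.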
- Dictionaries are mutable.
It is like a hash table: each entry is a key: value pair.
my_dict = {} # empty dictionary, same as my_dict = dict()
# dictionary with integer keys
my_dict = {1: 'abc', 2: 'xyz'}
# dictionary with mixed keys
my_dict = {'name': 'harsha', 1: ['abc', 'xyz']}
accessing dictionaries
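my_dict[key] # raises a KeyError if the key is not in the dictionary
my_dict.get(key) # doesn't raise an error; returns None if the key is not in the dictionary
my_dict[key] = value # adds a new key: value pair, or updates the value if the key already exists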
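The two access styles behave differently only for a missing key. The sketch below uses 'age' as a key that is assumed to be absent, and 28 as a made-up value.

```python
my_dict = {'name': 'harsha', 1: ['abc', 'xyz']}

name = my_dict['name']               # normal lookup
missing = my_dict.get('age')         # missing key: get returns None
with_default = my_dict.get('age', 0) # optional second argument is the fallback

try:
    my_dict['age']                   # missing key: square brackets raise KeyError
    lookup_failed = False
except KeyError:
    lookup_failed = True

my_dict['age'] = 28                  # assignment adds the new pair

print(name)            # harsha
print(missing)         # None
print(with_default)    # 0
print(lookup_failed)   # True
print(my_dict['age'])  # 28
```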
pop, popitem, del, clear
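my_dict.pop(key) # removes the element with the given key and returns its value
my_dict.popitem() # takes no parameters; removes and returns the last inserted (key, value) pair (an arbitrary pair before Python 3.7)
del my_dict[key] # removes the element with the given key
my_dict.clear() # removes all items from the dictionary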
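Here is a minimal sketch of the removal methods on one small dictionary. It assumes Python 3.7 or later, where popitem removes the most recently inserted pair; the dictionary d and the key 'zzz' are illustrative.

```python
d = {'a': 1, 'b': 2, 'c': 3}

value = d.pop('a')           # returns 1; d is now {'b': 2, 'c': 3}
pair = d.popitem()           # returns ('c', 3) on Python 3.7+; d is now {'b': 2}
del d['b']                   # d is now {}
fallback = d.pop('zzz', -1)  # a default avoids KeyError for a missing key

d['x'] = 10
d.clear()                    # the dictionary still exists, but it is empty

print(value)     # 1
print(pair)      # ('c', 3)
print(fallback)  # -1
print(d)         # {}
```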
items, keys, values
subjects = {2:4, 3:9, 4:16, 5:25}
print(subjects.items()) # returns a view of the dictionary's (key, value) pairs: dict_items([(2, 4), (3, 9), (4, 16), (5, 25)])
print(subjects.keys()) # returns a view of the dictionary's keys: dict_keys([2, 3, 4, 5])
print(subjects.values()) # returns a view of the dictionary's values: dict_values([4, 9, 16, 25])
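The word "view" matters here: these objects reflect later changes to the dictionary instead of being one-off copies. A small sketch of that behaviour, together with the usual way of looping over items:

```python
subjects = {2: 4, 3: 9, 4: 16, 5: 25}

keys = subjects.keys()       # a view, not a copy
subjects[6] = 36             # change the dictionary afterwards

keys_now = list(keys)        # the view already includes the new key
total = sum(subjects.values())

pairs = []
for key, value in subjects.items():  # unpack each (key, value) pair
    pairs.append(f"{key} -> {value}")

print(keys_now)   # [2, 3, 4, 5, 6]
print(total)      # 90
print(pairs[-1])  # 6 -> 36
```

Wrap a view in list() when a fixed snapshot is needed.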
dictionary comprehension
# creating a new dictionary with only the pairs where the value is larger than 2
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
new_dict = {k:v for k, v in d.items() if v > 2}
print(new_dict) # {'c': 3, 'd': 4}
Thanks for reading. Cheers :)