Quick Python for Data Science, Part 2: Inbuilt Data Structures

Swamy Sriharsha
4 min read · Jun 9, 2018


This is the second part of the series; Part 1 was published as a separate post.

Python has four basic built-in data structures: lists, tuples, sets, and dictionaries.

Lists

  • A list is an ordered sequence of items.
  • The items in a list do not all need to be of the same type.
  • Lists are mutable, which means the items of a list can be altered or deleted.
emptylist = []  # create an empty list
mylst = [1, 3.5, 'abc', 89]  # basic list syntax
lsts = [[1, 2], [3, 4]]  # list of lists, or multi-dimensional list
len(mylst)  # gives the length of a list (4 here)

append, insert and extend

lst = [1,2,3]
lst2 = [4,5]
lst.append(value)  # appends value at the end of the list
lst.insert(x, ele)  # inserts element ele at index x
lst.append(lst2)
# appends lst2 itself as a single element; starting from [1,2,3] this gives [1,2,3,[4,5]]
lst.extend(lst2)
# appends each element of lst2; starting from [1,2,3] this gives [1,2,3,4,5]
# lists can also be extended by using '+': newlst = lst + lst2
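To see append, extend, insert and '+' side by side, here is a small runnable sketch. The variable names (base, extra and so on) are made up for illustration and are not from the snippets above.

```python
# Illustrative names only; each operation starts from a fresh copy of base.
base = [1, 2, 3]
extra = [4, 5]

appended = base.copy()
appended.append(extra)       # the whole list becomes one element
print(appended)              # [1, 2, 3, [4, 5]]

extended = base.copy()
extended.extend(extra)       # each element is added individually
print(extended)              # [1, 2, 3, 4, 5]

inserted = base.copy()
inserted.insert(1, 'x')      # put 'x' at index 1, shifting the rest right
print(inserted)              # [1, 'x', 2, 3]

combined = base + extra      # '+' builds a new list; base is unchanged
print(combined)              # [1, 2, 3, 4, 5]
print(base)                  # [1, 2, 3]
```

Note that append, extend and insert change the list in place and return None, while '+' leaves both operands alone.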

del, pop, remove and clear

del lst[i]
# deletes the item at index i and doesn't return the deleted item
a = lst.pop(i)  # pops the item at index i and returns it to a
lst.remove(ele)  # removes the first occurrence of ele in the list
lst.clear()  # removes all items from the list; equivalent to del lst[:]
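A quick run-through of the four removal operations on one list may help. The sample data and the names items and popped are hypothetical.

```python
items = ['a', 'b', 'c', 'b', 'd']   # hypothetical sample data

del items[0]               # removes 'a' and returns nothing
print(items)               # ['b', 'c', 'b', 'd']

popped = items.pop(1)      # removes and returns the item at index 1
print(popped)              # c
print(items)               # ['b', 'b', 'd']

items.remove('b')          # removes only the first 'b'
print(items)               # ['b', 'd']

items.clear()              # empties the list in place
print(items)               # []
```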

reverse, sort and sorted

lst.reverse()  # reverses the list in place
numbers = [3,1,6,2,8]
sorted_lst = sorted(numbers)
print(sorted_lst)  # [1,2,3,6,8] - sorted in ascending order
print(numbers)  # [3,1,6,2,8] - original list remains unchanged
sorted(numbers, reverse=True)  # sorts in descending order (function)
lst.sort()
# sorts (method) the list in place, unlike the sorted function, which returns a new list
# a list with elements of incomparable types, such as [1,2,'b',5,'a'], cannot be sorted: it results in a TypeError
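The difference between the sorted function and the sort method, and the TypeError for mixed types, can be checked with a short sketch like this one. The names ascending, result and sort_error are illustrative.

```python
numbers = [3, 1, 6, 2, 8]

ascending = sorted(numbers)     # returns a new list; numbers is untouched
print(ascending)                # [1, 2, 3, 6, 8]
print(numbers)                  # [3, 1, 6, 2, 8]

result = numbers.sort()         # sorts in place and returns None
print(result)                   # None
print(numbers)                  # [1, 2, 3, 6, 8]

mixed = [1, 2, 'b', 5, 'a']
sort_error = None
try:
    sorted(mixed)               # int and str cannot be compared in Python 3
except TypeError as err:
    sort_error = err
    print("cannot sort:", err)
```

A common slip is writing numbers = numbers.sort(), which replaces the list with None.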

string to list

s = "one,two,three"  # s is a string
slst = s.split(',')
# the default separator is whitespace (space or tab); here it is a comma
print(slst)  # ['one','two','three']
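As a small sketch of how the separator argument changes the result, compare splitting on a comma with the default whitespace split. The strings and names here are hypothetical examples.

```python
csv_line = "one,two,three"            # hypothetical input
by_comma = csv_line.split(',')
print(by_comma)                       # ['one', 'two', 'three']

sentence = "one  two\tthree"          # two spaces and a tab
by_space = sentence.split()           # no argument: split on runs of whitespace
print(by_space)                       # ['one', 'two', 'three']

rejoined = ",".join(by_space)         # join goes the other way, list to string
print(rejoined)                       # one,two,three
```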

Slicing

lst[-1]  # last element, using negative indexing
# list slicing: lst[start : end : step-size]
numbers[0:4]  # from index 0 to index 3
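A few more slices on a concrete list show how start, end and step-size behave. The list below is hypothetical sample data.

```python
numbers = [10, 20, 30, 40, 50, 60]   # hypothetical sample data

print(numbers[-1])      # 60
print(numbers[0:4])     # [10, 20, 30, 40]  (end index 4 is excluded)
print(numbers[:3])      # [10, 20, 30]      (start defaults to 0)
print(numbers[2:])      # [30, 40, 50, 60]  (end defaults to the list length)
print(numbers[::2])     # [10, 30, 50]      (every second element)
print(numbers[::-1])    # [60, 50, 40, 30, 20, 10]  (reversed copy)
```

Slicing always returns a new list, so the original stays unchanged.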

count and looping

marks = [72,67,92,72,85]
marks.count(72)  # 2
# frequency (number of times it has occurred) of ele in a list
for ele in marks:
    print(ele)

list comprehension

List comprehensions provide a concise way to create lists.

squares = [i**2 for i in range(5)] #i² for all, i=0 to 4
print(squares) #[0,1,4,9,16]
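For comparison, here is the same list built with an explicit loop, plus a comprehension with a filter. The names squares_loop and even_squares are illustrative.

```python
squares = [i**2 for i in range(5)]

# The equivalent explicit loop
squares_loop = []
for i in range(5):
    squares_loop.append(i**2)
print(squares_loop == squares)   # True

# An optional 'if' keeps only the items that pass the test
even_squares = [i**2 for i in range(10) if i % 2 == 0]
print(even_squares)              # [0, 4, 16, 36, 64]
```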

Tuples

  • A tuple is an ordered sequence of items, the same as a list.
  • The only difference is that tuples are immutable: once created, a tuple cannot be modified.
emptytuple = ()  # creation of an empty tuple
t = (1,'raju',28,'abc')  # basic tuple syntax
tl = (1,(2,3,4),[1,'abc',3])  # nested tuple

general mistake in using tuple vs string

t = ('harsha')  # it is a string, same as t = "harsha"
t = ('harsha',)  # it is a tuple, same as t = "harsha",
  • Indexing is the same as for lists: t[i]
  • Slicing is the same as for lists: t[:]

Unlike lists, tuples are immutable (we cannot change or delete their items).

t[2] = 'x'
# TypeError: 'tuple' object does not support item assignment
# but we can change a list which is inside a tuple:
tl[2][0] = 5
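The single-element pitfall and the immutability rules can be verified with a short sketch. The variable names not_a_tuple, one_tuple and message are hypothetical.

```python
not_a_tuple = ('harsha')      # parentheses alone do not make a tuple
one_tuple = ('harsha',)       # the trailing comma does
print(type(not_a_tuple).__name__)   # str
print(type(one_tuple).__name__)     # tuple

tl = (1, (2, 3, 4), [1, 'abc', 3])
message = None
try:
    tl[0] = 99                # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print(message)

tl[2][0] = 5                  # the inner list is still mutable
print(tl)                     # (1, (2, 3, 4), [5, 'abc', 3])
```

The tuple only fixes which objects it holds; a mutable object inside it can still change.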

concatenating, del, index

t = (1,2,3)+(4,5,6)  # concatenating tuples
t = (('sri',)*3)  # repeat the elements in a tuple: ('sri','sri','sri')
del t  # delete the entire tuple
t = (1,2,3,1,3,3,4,1)
t.index(3)  # 2 - returns the index of the first occurrence of 3

other basic functions

len(t)
sorted(t)
max(t)
min(t)
sum(t)
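Putting the tuple operations and helper functions together on one example gives the following values. This is only a sketch using the same sample tuple; the names joined and repeated are made up.

```python
joined = (1, 2, 3) + (4, 5, 6)
repeated = ('sri',) * 3
print(joined)        # (1, 2, 3, 4, 5, 6)
print(repeated)      # ('sri', 'sri', 'sri')

t = (1, 2, 3, 1, 3, 3, 4, 1)
print(t.index(3))    # 2
print(t.count(1))    # 3
print(len(t))        # 8
print(sorted(t))     # [1, 1, 1, 2, 3, 3, 3, 4]  (sorted returns a list, not a tuple)
print(max(t), min(t), sum(t))   # 4 1 18
```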

Sets

  • A set is an unordered (cannot be indexed) collection of unique (no duplicates) items.
  • The set itself is mutable: we can add items to it or remove items from it.
#set doesn’t allow duplicates
s = {1,2,3,1,4}
print(s) #{1,2,3,4}
s = set() #empty set

add, update

s = {1,3}
s.add(2)  # add an element to set s
print(s) #{1,2,3}
s.update([5,6,1]) #adds multiple elements at a time
print(s) #{1,2,3,5,6}
# s.update([list], {set}) is also possible: update accepts several iterables at once

discard, remove, pop, clear

s.discard(ele)
# ele is removed from the set s if it is present; otherwise nothing happens
s.remove(ele)
# ele is removed from the set s if it is present; otherwise a KeyError is raised
s.pop()  # removes and returns an arbitrary element
s.clear()  # removes all the elements from the set
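The following sketch contrasts discard and remove on a missing element and shows update with more than one iterable. The value 42 and the names raised and taken are hypothetical.

```python
s = {1, 3}
s.add(2)
s.update([5, 6, 1], {7})     # several iterables at once; duplicates are ignored
print(sorted(s))             # [1, 2, 3, 5, 6, 7]

s.discard(42)                # 42 is not in the set: nothing happens
raised = False
try:
    s.remove(42)             # same situation, but remove raises KeyError
except KeyError:
    raised = True
print(raised)                # True

size_before = len(s)
taken = s.pop()              # removes and returns an arbitrary element
print(len(s) == size_before - 1)   # True

s.clear()
print(s)                     # set()
```

sorted is used for printing because a set makes no promise about the order of its elements.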

set operations

#Union
set1 | set2 or set1.union(set2)
#Intersection
set1 & set2 or set1.intersection(set2)
#Set difference
set1 - set2 or set1.difference(set2)
#symmetric difference
set1 ^ set2 or set1.symmetric_difference(set2)
#subset
set1.issubset(set2) #True or False
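Here is a small worked example of the set operations, assuming two hypothetical sets set1 and set2. The operator and the method form give the same result.

```python
set1 = {1, 2, 3, 4}      # hypothetical sample sets
set2 = {3, 4, 5}

union = set1 | set2
common = set1 & set2
only_in_1 = set1 - set2
either_not_both = set1 ^ set2

print(sorted(union))             # [1, 2, 3, 4, 5]
print(sorted(common))            # [3, 4]
print(sorted(only_in_1))         # [1, 2]
print(sorted(either_not_both))   # [1, 2, 5]

print(union == set1.union(set2))     # True
print({3, 4}.issubset(set1))         # True
print(set2.issubset(set1))           # False
```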

frozen sets

Sets, being mutable, are unhashable, so they can't be used as dict keys. Frozensets, on the other hand, are hashable and can be used as keys in a dict. Being immutable, a frozenset does not have methods that add or remove elements.

While tuples are immutable lists, frozensets are immutable sets.
setf = frozenset([1,2,3,4])
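A minimal sketch of a frozenset used as a dict key follows. The pair_scores dictionary and its values are invented for illustration.

```python
# Hypothetical example: a score stored per unordered pair of labels
pair_scores = {frozenset(['a', 'b']): 0.9}
print(pair_scores[frozenset(['b', 'a'])])   # 0.9 (element order does not matter)

key_error = None
try:
    bad = {{1, 2}: 'x'}                     # a plain set cannot be a key
except TypeError as err:
    key_error = err
    print(err)                              # unhashable type: 'set'

setf = frozenset([1, 2, 3, 4])
print(hasattr(setf, 'add'))                 # False: no mutating methods
print(sorted(setf | {5}))                   # [1, 2, 3, 4, 5] (operations return new sets)
```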

Dictionaries

  • A dictionary is a collection of key-value pairs. It was unordered in older Python versions; since Python 3.7 it preserves insertion order.
  • Dictionaries are mutable.
  • It is like a hash table: each entry is a key: value pair.
my_dict = {}  # empty dictionary; my_dict = dict() also works
# dictionary with integer keys
my_dict = {1: 'abc', 2: 'xyz'}
# dictionary with mixed keys
my_dict = {'name': 'harsha', 1: ['abc', 'xyz']}

accessing dictionaries

my_dict[key] 
#gives error if key is invalid

my_dict.get(key)
# doesn't give an error and returns None if the key is missing
my_dict[key] = value  # adds a new key: value pair, or updates the value if the key already exists
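The difference between square-bracket access and get is easiest to see with a missing key. In this sketch the 'age' key and its values are hypothetical.

```python
my_dict = {'name': 'harsha', 1: ['abc', 'xyz']}

print(my_dict['name'])          # harsha
print(my_dict.get('age'))       # None (missing key, no error)
print(my_dict.get('age', 0))    # 0 (a default can be supplied)

lookup_failed = False
try:
    my_dict['age']              # missing key with [] raises KeyError
except KeyError:
    lookup_failed = True
print(lookup_failed)            # True

my_dict['age'] = 28             # adds a new pair
my_dict['age'] = 29             # same key again: the value is overwritten
print(my_dict['age'])           # 29
```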

pop, popitem, del, clear

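my_dict.pop(key)  # removes the element with the given key and returns its value

my_dict.popitem()  # removes and returns a (key, value) pair, the last inserted one since Python 3.7; takes no parameters
del my_dict[key]  # removes the element with the given key

my_dict.clear()  # removes all items, leaving an empty dictionary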
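The removal methods can be traced step by step on a small dictionary. This sketch assumes Python 3.7 or later, where popitem removes the most recently inserted pair; the keys and names are illustrative.

```python
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}   # hypothetical sample data

removed = d.pop('a')            # returns the value for 'a'
print(removed)                  # 1

missing = d.pop('zzz', None)    # a default avoids KeyError for a missing key
print(missing)                  # None

last = d.popitem()              # last inserted pair on Python 3.7+
print(last)                     # ('d', 4)

del d['b']                      # removes the pair, returns nothing
print(d)                        # {'c': 3}

d.clear()
print(d)                        # {}
```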

items, keys, values

subjects = {2:4, 3:9, 4:16, 5:25}
print(subjects.items())
#return a new view of the dictionary items (key, value) as dict_items([(2, 4), (3, 9), (4, 16), (5, 25)])
print(subjects.keys())
#return a new view of the dictionary keys dict_keys([2, 3, 4, 5])
print(subjects.values())
#return a new view of the dictionary values dict_values([4, 9, 16, 25])
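Because items, keys and values return views rather than copies, they reflect later changes to the dictionary. A short sketch, with the extra entry 6: 36 added purely for illustration:

```python
subjects = {2: 4, 3: 9, 4: 16, 5: 25}

keys = subjects.keys()          # a view, not a snapshot
subjects[6] = 36                # hypothetical extra entry
print(list(keys))               # [2, 3, 4, 5, 6] (the view sees the new key)

print(3 in subjects)            # True (membership tests look at keys)
print(sum(subjects.values()))   # 90

# items() is the usual way to loop over pairs
mismatches = [n for n, sq in subjects.items() if sq != n**2]
print(mismatches)               # []
```

Wrapping a view in list() gives an independent copy when a fixed snapshot is needed.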

dictionary comprehension

# Creating a new dictionary with only the pairs where the value is larger than 2
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
new_dict = {k:v for k, v in d.items() if v > 2}
print(new_dict)  # {'c': 3, 'd': 4}
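Two more dictionary comprehensions in the same spirit: building a dictionary from a range, and swapping keys with values. The names squares and inverted are made up for this sketch.

```python
# Build number -> square pairs from a range
squares = {n: n**2 for n in range(2, 6)}
print(squares)      # {2: 4, 3: 9, 4: 16, 5: 25}

# Swap keys and values (safe here because the values are unique)
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
inverted = {v: k for k, v in d.items()}
print(inverted)     # {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
```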

Thanks for reading. Cheers :)
