Python for Machine Learning Part 3
Welcome to the third installment of Python fundamentals for those diving into Machine Learning. In the previous segment, we covered Lists and their corresponding functions and methods. In this blog, our exploration continues with a focus on dictionaries, tuples, and sets.
Dictionary
A dictionary in Python is analogous to a hash table or map in other programming languages. It stores data as key-value pairs. Since Python 3.7 a dictionary preserves insertion order, but items are accessed by key rather than by position. You create a dictionary by placing key: value pairs inside curly braces {}, separated by commas. Dictionary keys must be hashable, which in practice means immutable types such as strings, integers, and tuples (provided the tuple itself contains only hashable items). Mutable objects, such as lists, cannot be used as keys. Keys are also case-sensitive.
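As a quick illustration of the key rules above, here is a minimal sketch. The dictionary name `grid` and its contents are made up for this example; only the behavior of the keys matters.

```python
# Hypothetical example data; the names and values are for illustration only.
grid = {
    (0, 0): 'origin',      # a tuple of integers is hashable, so it works as a key
    'aus': 'lowercase',    # string key
    'AUS': 'uppercase',    # a different key, because keys are case-sensitive
    42: 'integer key',
}

print(grid[(0, 0)])                # prints origin
print(grid['aus'] == grid['AUS'])  # prints False
print(len(grid))                   # prints 4

# A list is mutable and therefore unhashable, so it is rejected as a key.
try:
    bad = {[1, 2]: 'value'}
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)
print(list_key_error is not None)  # prints True
```

The exact wording of the TypeError message differs between Python versions, so the sketch only checks that an error was raised.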
A useful feature of Python’s data structures is that they can be nested inside one another. For instance, a list can contain dictionaries, and a dictionary value can be a list. Let’s explore this concept through some examples.
city_airport_dictionary = {
'AUS': 1,
'NYK': 3,
'SFO': 2
}
print(city_airport_dictionary['AUS']) # prints 1
print(city_airport_dictionary['NYK']) # prints 3
print(city_airport_dictionary) # prints {'AUS': 1, 'NYK': 3, 'SFO': 2}
sample_dictionary = {
'a' : [1,2,3],
'b' : 'Hello',
'c' : 4.56,
'd' : True
}
print(sample_dictionary['b']) # prints Hello
print(sample_dictionary['a']) # prints [1, 2, 3]
# note: a dictionary can also be an item in a list
dict_list = [
{
'a' : [1,2,3],
'b' : 'Hello',
'c' : 4.56,
'd' : True
},
{
'AUS': 1,
'NYK': 3,
'SFO': 2
}
]
print(dict_list[0]['c']) # prints 4.56
print(dict_list[1]['SFO']) # prints 2
print(dict_list[0]['a'][2]) # prints 3
When should you use a list versus a dictionary? Use a list when the order of the items matters and you access them by position, or when you need to sort them in place. Use a dictionary when you want to look up values by a meaningful key rather than by index.
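To make the trade-off concrete, here is a small sketch that stores the same records both ways. The variable names are hypothetical and the values are reused from the airport example above.

```python
# The same hypothetical records stored as a list of pairs and as a dictionary.
airport_list = [('AUS', 1), ('NYK', 3), ('SFO', 2)]
airport_dict = {'AUS': 1, 'NYK': 3, 'SFO': 2}

# With a list we need the position, or a scan to find the matching code.
print(airport_list[2][1])  # prints 2
sfo_from_list = next(count for code, count in airport_list if code == 'SFO')
print(sfo_from_list)       # prints 2

# With a dictionary we ask for the key directly.
print(airport_dict['SFO'])  # prints 2

# A list can be sorted in place; a dictionary has no sort method,
# but sorted() returns its keys as a new sorted list.
airport_list.sort(key=lambda pair: pair[1])
print(airport_list)          # prints [('AUS', 1), ('SFO', 2), ('NYK', 3)]
print(sorted(airport_dict))  # prints ['AUS', 'NYK', 'SFO']
```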
test_dic = {
10: [1,2,3],
True: '1',
'Name': 'Hello',
3.14: 'Pi'
}
print(test_dic[10]) # prints [1, 2, 3]
print(test_dic[True]) # prints 1
print(test_dic[1]) # prints 1, since 1 == True and both have the same hash, so they refer to the same key
print(type(test_dic[1])) # prints <class 'str'>, confirming the value came from the dictionary
print(test_dic['Name']) # prints Hello
print(test_dic[3.14]) # prints Pi
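The `True` versus `1` lookup above can be surprising, so here is a short sketch of why it happens. The dictionary name `overlap` is made up for illustration. Keys that compare equal and hash equally count as one key, so later values overwrite earlier ones while the first key object is kept.

```python
# True, 1 and 1.0 compare equal and share a hash, so a dictionary treats them as one key.
print(True == 1 == 1.0)                    # prints True
print(hash(True) == hash(1) == hash(1.0))  # prints True

# Hypothetical dictionary: three "different" keys collapse into a single entry.
overlap = {True: 'from True', 1: 'from 1', 1.0: 'from 1.0'}
print(overlap)       # prints {True: 'from 1.0'}
print(len(overlap))  # prints 1
```

If you need booleans and integers to stay distinct, one option is to use string keys such as 'True' and '1' instead.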
Dictionary Methods
Let’s go through some of the dictionary methods. For a complete list, see https://www.w3schools.com/python/python_ref_dictionary.asp
user_info = {
'fname': 'Super',
'lname': 'Coder',
'exp': '15',
'Lang': ['Python', 'html', 'C#', 'JS']
}
print(user_info.get('age')) # prints None and does not raise an error
print(user_info.get('age', 'Not Found')) # prints Not Found and does not raise an error
user_info_1 = dict(fname='Super') # another way to define dictionary
print(user_info_1) # prints {'fname': 'Super'}
print('fname' in user_info) # prints True
print('address' in user_info.keys()) # prints False
print('Coder' in user_info.values()) # prints True
print(user_info.items()) # prints all key value pairs as tuples, like dict_items([('fname', 'Super'), ('lname', 'Coder'), ('exp', '15'), ('Lang', ['Python', 'html', 'C#', 'JS'])])
user_info_1.clear()
print(user_info_1) # prints {}
user_info_1 = user_info.copy()
print(user_info_1) # prints {'fname': 'Super', 'lname': 'Coder', 'exp': '15', 'Lang': ['Python', 'html', 'C#', 'JS']}
print(user_info_1.pop('exp')) # prints 15 since that is the value being removed
print(user_info_1) # prints {'fname': 'Super', 'lname': 'Coder', 'Lang': ['Python', 'html', 'C#', 'JS']}
print(user_info_1.update({'exp': 17})) # prints None, since update() changes the dictionary in place (here it adds a key-value pair) and returns None
print(user_info_1) # prints {'fname': 'Super', 'lname': 'Coder', 'Lang': ['Python', 'html', 'C#', 'JS'], 'exp': 17}
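One detail worth knowing about `copy()` is that it makes a shallow copy: the new dictionary gets its own keys, but nested mutable values such as lists are shared with the original. The sketch below uses a hypothetical `user` dictionary and the standard library's `copy.deepcopy` to show the difference.

```python
import copy

# Hypothetical data with a nested, mutable value.
user = {'fname': 'Super', 'Lang': ['Python', 'JS']}

shallow = user.copy()       # new dictionary, but the inner list is shared
deep = copy.deepcopy(user)  # new dictionary with its own copy of the inner list

shallow['fname'] = 'Other'    # rebinding a key affects only the copy
shallow['Lang'].append('Go')  # mutating the shared list affects the original too

print(user['fname'])  # prints Super
print(user['Lang'])   # prints ['Python', 'JS', 'Go']
print(deep['Lang'])   # prints ['Python', 'JS']
```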
Tuple
In Python, a tuple is similar to a list, with one crucial distinction: the elements of a tuple cannot be reassigned once the tuple is created, whereas a list can be modified. To create a tuple, enclose the items in parentheses (), separated by commas. The parentheses are optional in most contexts, but using them is considered good practice. Tuples can hold any number of items, and these items may vary in type (integer, float, list, string, etc.). Like strings, tuples are immutable.
tup_1 = (1,2,3,4,5)
# tup_1[0] = 0 # not possible: raises TypeError because tuples are immutable
print(5 in tup_1) # prints True
Because tuples are immutable, they cannot be sorted or reversed in place. In return, they are generally lighter-weight than lists, and a tuple whose items are all hashable can be used as a dictionary key.
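Although a tuple cannot be changed in place, you can still build a sorted or reversed result from it. Here is a minimal sketch of that, plus a tuple used as a dictionary key. The names `tup` and `routes` and their values are invented for illustration.

```python
tup = (3, 1, 2)

# sorted() never modifies its input; it returns a new list.
print(sorted(tup))         # prints [1, 2, 3]
print(tuple(sorted(tup)))  # prints (1, 2, 3)

# Slicing with a negative step builds a new, reversed tuple.
print(tup[::-1])  # prints (2, 1, 3)
print(tup)        # prints (3, 1, 2), the original is unchanged

# A tuple of hashable items can serve as a dictionary key (hypothetical data).
routes = {('AUS', 'SFO'): 'route A', ('AUS', 'NYK'): 'route B'}
print(routes[('AUS', 'SFO')])  # prints route A
```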
new_tuple = tup_1[1:2]
print(new_tuple) # prints (2,)
new_tuple = tup_1[1:4]
print(new_tuple) # prints (2, 3, 4)
x, y, z, *other = (1, 2, 3, 4, 5)
print(other) # prints [4, 5]
print(tup_1.count(4)) # prints 1, since 4 appears only once
print(tup_1.index(5)) # prints 4, the index of the value 5
print(len(tup_1)) # prints 5
Sets
A set in Python is an unordered collection of unique elements; it cannot contain duplicates. To create a set, enclose the elements in curly braces {}, separated by commas. An empty set must be created with set(), since {} creates an empty dictionary. Sets can hold any number of items of various hashable types (integer, float, tuple, string, etc.). However, sets cannot contain mutable elements such as lists, sets, or dictionaries.
test_set = {1,2,3,4,4,5}
print(test_set) # prints {1, 2, 3, 4, 5}, duplicates are removed
test_set.add(100)
test_set.add(5) # no effect, since 5 is already in the set
print(test_set) # prints {1, 2, 3, 4, 5, 100}
# we can remove duplicates from a list using the set() function
test_list = [1,2,3,4,4,5,5,5,6]
print(set(test_list)) # prints {1, 2, 3, 4, 5, 6}
# print(test_set[0]) # raises TypeError, since sets do not support indexing
print(1 in test_set) # prints True
print(len(test_set)) # prints 6
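Two rules about sets are easy to trip over: empty braces create a dictionary rather than a set, and set elements must be hashable. The following sketch illustrates both; the variable names are made up for this example.

```python
# {} is an empty dictionary; use set() for an empty set.
empty_dict = {}
empty_set = set()
print(type(empty_dict))  # prints <class 'dict'>
print(type(empty_set))   # prints <class 'set'>

# Hashable items of different types can live in the same set.
mixed = {1, 2.5, 'text', (1, 2)}
print(len(mixed))  # prints 4

# A list is mutable (unhashable), so adding it raises TypeError.
try:
    mixed.add([3, 4])
    set_error = None
except TypeError as err:
    set_error = str(err)
print(set_error is not None)  # prints True
print(len(mixed))             # prints 4, the failed add changed nothing
```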
Up to this point, a set may not seem markedly distinct from other data structures. However, the true potency of sets lies in their unique methods, including discard, difference_update, intersection, isdisjoint, issubset, issuperset, and union. Let’s explore these through examples.
set_1 = {1, 2, 3, 4, 5}
set_2 = {4, 5, 6, 7, 8, 9, 10}
print(set_1.difference(set_2)) # prints {1, 2, 3}, the items in set_1 that are not in set_2
set_1.discard(5) # removes 5 if present (no error if it is missing)
print(set_1) # prints {1, 2, 3, 4}
set_1.difference_update(set_2) # removes from set_1 every item that is also in set_2
print(set_1) # prints {1, 2, 3}
set_1 = {1, 2, 3, 4, 5}
set_2 = {4, 5, 6, 7, 8, 9, 10}
print(set_1.intersection(set_2)) # prints {4, 5} since those are common items
print(set_1 & set_2) # prints {4, 5} since those are common items, & is shorthand for intersection
print(set_1.isdisjoint(set_2)) # prints False, since 4 and 5 are common to both
print(set_1.union(set_2)) # prints {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
print(set_1 | set_2) # prints {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, | is shorthand for union
set_1 = {4, 5}
set_2 = {4, 5, 6, 7, 8, 9, 10}
print(set_1.issubset(set_2)) # prints True
print(set_2.issuperset(set_1)) # prints True
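Since this series is aimed at machine learning, here is one way these set methods might show up in practice: checking whether a test split contains labels that never appear in the training split. The label lists below are invented for illustration and do not come from any real dataset.

```python
# Hypothetical class labels from a training split and a test split.
train_labels = ['cat', 'dog', 'cat', 'bird', 'dog']
test_labels = ['dog', 'fish', 'cat']

train_classes = set(train_labels)  # duplicates collapse to unique classes
test_classes = set(test_labels)

# Labels present in the test split but never seen during training.
unseen = test_classes.difference(train_classes)
print(unseen)  # prints {'fish'}

# sorted() is used below only to get a stable display order.
print(sorted(train_classes.intersection(test_classes)))  # prints ['cat', 'dog']
print(sorted(train_classes.union(test_classes)))         # prints ['bird', 'cat', 'dog', 'fish']

# Every test label is covered only if test_classes is a subset of train_classes.
print(test_classes.issubset(train_classes))  # prints False
```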
Final Word
We’ve now covered Python’s core built-in data types and structures, underscoring the importance of selecting the right one. Notably, integers, floats, strings, and tuples are immutable, while lists, sets, and dictionaries are mutable. In our upcoming blog, we’ll look at conditionals and flow control in Python.
🙏Thanks for taking the time to read the article. If you found it helpful and would like to show support, please consider:
- 👏👏👏Clap for the story and bookmark for future reference
- Follow me on Chaitanya (Chey) Penmetsa for more content
- Stay connected on LinkedIn.
Wishing you a happy learning journey 📈, and I look forward to sharing new articles with you soon.