MUTABILITY, OBJECTS, AND A DOSE OF STRANGENESS

The Paradigms of Programming

There are various approaches to the programming process, referred to as programming paradigms. These paradigms constitute fundamentally different approaches to building solutions to problems. Having been exposed to C, Python, and node.js, I thought it helpful to explore object oriented programming and illustrate how it differs to procedural programming.

Procedural programming (also known as imperative programming) uses a list of instructions that tell the computer what to do step-by-step. A program is akin to opening a cook book and following a recipe — one would conventionally begin from the top of the page and make their way down. First the recipe informs the user on what ingredients they need (declaring variables) and then how to use those variables to produce the desired result. Most early programming languages are procedural (Fortran, C, COBOL).

Object Oriented Programming (OOP) involved pieces of code called objects, that allows one to create many similar pieces of code without having to re-write it each time. This is carried out by first writing a class which is a blueprint of the object. A class can have both methods (simply, functions) and properties (think of these are characteristics). For example, a program involving cars may have a Quokka class. The Quokka can contain methods for hopping, eating, sleeping, selfie-taking. These are actions that quokkas can do. The properties of the Quokka class can be color, size, weight, name, place of birth. If we wanted to document all the quokkas on Bald Island, rather than create separate code for each individual, we can just create quokkas from our Quokka class. The eating and sleeping methods for example only need to be written once but every Quokka object created from the Quokka class can use these functions. OOP is more efficient because if we wanted to make a change to a quokka, for example adding another characteristic or another function, it would be easy and wouldn’t break the rest of the program.

id and type

In Python, objects are given unique identification numbers that are “guaranteed to be unique and constant for the object during its lifetime.”

To illustrate this, we create two objects below:

str1 and str2 are unique strings that we have initialized. These objects, are given identification numbers the moment they are created. When we call the id() function on the strings, we can see they have unique id numbers.

Interestingly, if an opportunity to economize exists in Python, it will. Python will do something known as “aliasing” which is the condition when one variable is assigned to another, both variables will refer to the same object in memory. Below, we create an example to illustrate this:

We declare two variables and initialize them to the same strings. On line 3, when we run str3 is str4 we are conducting an object comparison. The REPL returns True which denotes that both variables are referring to the same object. Calling the id() function on these strings confirms this — but only for certain cases!!

The strange thing about aliasing in Python

As in the previous example, we see that variables don’t always refer to the same object in memory. When two variables refer to the same object, this is known as an alias. In the previous example, we used the form identifier = expression to assign variables to strings. By creating variables set to identical strings, Python set them to refer to the same object.

However, I soon found out that not all equivalent strings set to different variables point to the same object!

Take the above as an example. Both str1 and str2are set to equivalent strings. We test str2 == str2 to see that the strings stored are the same. Strangely, strings with special characters or spaces get assigned differently from “simple” strings.

The same can be observed with integers:

Two variables assigned to the number 42point to the same object.

…but, two variables assigned to 420 don’t!

To investigate this phenomenon, I obtained the id value for num1 when set to 42 which is 1005684. I went to another computer and set a variable num1 = 42 and obtained its id. Sure enough, its id is 1005684. I then tested the id of num1 when set to 420 and obtained 140070849310096 and ran the same on another computer and received 139817727223824 as the id number.

We can set variables point to the same object by using the expression: identifier = identifier.

This tells us that both num1 and num2refer to the same object using the is operator.

NSMALLPOSITS, NSMALLNEGINTS

Upon startup, the Python interpretor creates empty pools of heap memory called arenas. These arenas contain empty, allocated objects (dictionaries, lists, sets, strings, etc) to avoid unnecessary computational cycles later on.

In the source code for the Python int object, there are macros defined called NSMALLPOSINTS and NSMALLNEGINTS.

It turns out Python keeps an array of integer objects for “all integers between -5 and 256”. When we create an int in that range, we’re actually just getting a reference to the existing object in memory.

If we set x = 42, we are actually performing a search in the integer block for the value in the range -5 to +257. Once x falls out of the scope of this range, it will be garbage collected (destroyed) and be an entirely different object. The process of creating a new integer object and then destroying it immediately creates a lot of useless calculation cycles, so Python preallocated a range of commonly used integers.

What’s happening in memory?

In the above example, the value of x is 42 and its id is 505494448 which specifies the location in memory for the value 42. Initially, x and y refer to the same location in memory. Again, python makes assignment efficient by only copying the reference and not the data. Changes to x do not affect y. In the image to the right, x gets reassigned with the value 43. x now points to a different area in memory with an id value of 505494464.

Modifying elements in a list (mutable object) referenced by two variables

We initialize list1 as a list holding the int object x which stores the value 42. list2 points to the same object upon using the operator =. We then update the value of list1 to 43.

Note that list2[0] changes as a result of changes made to list1. Because list2 is an alias of list1, changes made to the data component of one variable will be reflected in other aliased variables, since both variables refer to the same object in memory.

Mutability and Immutability

Immutable data is data that does not change once it is declared. Immutable variables are commonly used for strings (in C), integers, tuples, since these values do not change and we wouldn’t want them to change. An immutable object is when we have reference to an instance of an object, but the contents of that instance are fixed once created.

Mutable objects on the other hand can be changed after initialized.

The following are some immutable objects:

  • int, float, decimal, complex, bool, string, tuple, range, frozen set, and bytes

The following are some mutable objects:

  • list, dict, set, bytearray, Classes

In general, Python attempts to optimize resources with immutable data types. Strings and ints as we have seen in the previous examples, are immutable data types that objects can refer to without risk that changes to one will affect changes to another.

Mutable types such as lists will not refer to the same object. (Any changes made to list_1 would directly affect list_2 which could have dire consequences if over-looked).

It is important to keep these rules in mind but not adhere to it without testing, as there are quite a few obvious and obscure exceptions.

Mutable immutable objects: Tuples and Frozen Sets

Above, we set two variables to equivalent tuples; however they don’t refer to the same object even though tuples are an immutable data type.

However, when we set two variables as empty tuples, they do refer to the same object!

…but, variables set to tuples with two empty tuples in each, don’t refer to the same object.

Since tuples can hold mutable data types — the immutable tuple can be changed indirectly. Take below as an example:

I declared a weird_tupletuple initialized with an integer (immutable), string (immutable), and a list (mutable).

Let’s say the list contains three of my favorite fruits bananas, apples, pears but I decide that I want to replace “apples” with “Mangoes”. Since lists are mutable, I am able to change apples with Mangoes.

By mutating a mutable data type within an immutable object types, I’ve indirectly mutated the “immutable”!

Even though tuples and frozen sets are immutable object types by nature, Python handles them similar to mutable object types due to the fact that they may contain mutable objects.