Immutable objects and static interning in python.

Ben Bogart
Dec 7, 2020 · 7 min read

Why "apple pie" is not "apple pie"

Photo by Danil Aksenov on Unsplash

The is operator in Python is fascinating when you start to play with it. After looking at the example below, you should be left questioning reality, wondering what is going on, and asking yourself whether an apple is an apple.
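Here is a minimal sketch of the kind of session I mean. The helper at_prompt is a hypothetical name of my own: it evaluates each expression separately, roughly the way the interactive prompt treats each line you type. The results in the comments assume CPython and may differ on other interpreters.

```python
def at_prompt(source):
    # Hypothetical helper: evaluate one expression on its own,
    # roughly like typing it at the interactive prompt.
    return eval(source)

apple_is_apple = at_prompt("'apple'") is at_prompt("'apple'")
pie_is_pie = at_prompt("'apple pie'") is at_prompt("'apple pie'")
list_is_list = at_prompt("['apple']") is at_prompt("['apple']")
joined_is_apple = at_prompt("'apple'") is at_prompt("''.join(['a', 'pple'])")

print(apple_is_apple)   # True on CPython
print(pie_is_pie)       # False on CPython 3.7, the version used in this article
print(list_is_list)     # False
print(joined_is_apple)  # False on CPython
```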

(All examples are python 3.7.9)

While you might think an apple is always an apple, Python doesn’t agree. A warning from the start: in nearly all cases you should compare objects for equality with the == operator, not the is operator, for exactly the reasons illustrated above. But the interesting question is WHY? What is the is operator actually doing, and why does it behave the way it does?

There are several layers to dig through here to understand how Python determines if a constant is another constant. Grab an 🍏 and let’s get started.

What does the ‘is’ operator actually do?

Everything in python is an object. Each object gets allocated a little homestead in memory with its own unique address. If we had to remember those addresses our code would be unreadable. Luckily we get to assign nice names to objects in the form of variables like mystring or myfunction.

Python’s built-in id() function returns an integer that identifies an object for as long as that object exists. In CPython, that integer is the object’s memory address.

The is operator compares the ids (or memory locations) of two objects and returns True if they are the same. Because only one object can live at a particular memory address at a time, two names that share an address refer to one and the same object. The expression a is b is equivalent to id(a) == id(b).
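A small sketch of that equivalence. The helper same_object is a hypothetical name that just spells out what is checks, and x and y are two throwaway objects added for contrast.

```python
def same_object(x, y):
    # Hypothetical helper: the comparison that `is` performs
    return id(x) == id(y)

a = 'apple'
b = a  # b is a second name for the same object

print(a is b)             # True
print(same_object(a, b))  # True

x = object()
y = object()  # a separate object with its own id

print(x is y)             # False
print(same_object(x, y))  # False
```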

That was expected. What about this:

But….

At this point we can see that is is doing its job. It compares the memory locations of the objects on either side of it and evaluates to True if they are equal, or False otherwise.

That still doesn’t answer why 'apple sauce' is not 'apple sauce'. Let’s bite off a little more to figure it out.

Immutable vs. mutable objects.

Mutable objects are objects that can (and are expected to) change. In Python, lists, sets, and dicts are mutable.

Immutable objects are static. They don’t change. Strings, integers, floats, and tuples are immutable objects.

The idea of an immutable object can be confusing. Your first reaction might be, “I can change a string. Watch me!”
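A sketch of what that attempt might look like. The extra name original is my own addition so that we can still reach the first string afterwards.

```python
a = 'apple'
original = a      # keep a second name on the first string

a = a + ' pie'    # looks like we changed the string...

print(a)              # apple pie
print(original)       # apple
print(a is original)  # False
```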

But this example didn’t change the string. It created a new string and reassigned the variable name a to point to it.

A string is immutable. It can be replaced, but not changed. The original ‘apple’ object is untouched, and it stays in memory for as long as anything still refers to it.

A list on the other hand has nifty methods like .append() which do change the mutable list type object.
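For instance, something along these lines (the variable names are placeholders):

```python
fruits = ['apple']
id_before = id(fruits)

fruits.append('pie')  # mutates the existing list in place
id_after = id(fruits)

print(fruits)                 # ['apple', 'pie']
print(id_before == id_after)  # True
```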

The values changed, but the ids are the same. The mutated list object lives in the same place as the original list object. It wasn’t replaced, it changed.

This explains some, but not all, of the strange behavior we saw at the beginning of this article. At least now we can see why the following happens.
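A minimal version of that comparison, using two names so we can poke at both lists afterwards:

```python
a = ['apple']
b = ['apple']

print(a == b)  # True: the contents are equal
print(a is b)  # False: two separate list objects

b.append('pie')
print(a)       # ['apple'], changing b did not touch a
```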

Because ['apple'] is a list and therefore mutable, each instance of it needs its own memory location. If two lists shared one, a change to either would show up in both. Since is compares memory locations to see if they are the same, False is the expected answer.

But what about:

or

Let’s just chew on this a little longer.

Interning

The core of the issue is object interning. Immutable objects can be interned in memory. Because the objects are immutable and identical, Python can save memory by keeping a single copy and pointing every reference to it… sometimes.

As we’ve come to expect, not all ‘apples’ are interned equally.
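A sketch of the two cases. I use eval here as a stand-in for typing each assignment at the prompt, because each prompt line is compiled on its own, while equal constants inside a single file may be merged by the compiler. The results in the comments assume CPython.

```python
# First example: a constant that looks like an identifier
a = eval("'apple'")
b = eval("'apple'")
first = a is b
print(first)   # True on CPython: both names point at the interned constant

# Second example: a constant containing a space
a = eval("'apple pie'")
b = eval("'apple pie'")
second = a is b
print(second)  # False on CPython 3.7, the version used in this article
print(a == b)  # True: the values are still equal
```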

In the first example a and b point to the same memory location containing the constant apple.

In the second example a and b point to different memory locations, each containing a copy of apple pie 😱

I know no one is going to complain about the extra apple pie, but seriously WTF?

Different kinds of immutable objects are interned differently. As of python 3.7 the following are automatically interned.

  • Strings up to 4096 characters long containing only digits, underscores, or upper- or lowercase letters
  • Integers from -5 to 256 (interned at startup whether or not they are used)

This explains why the apple pie example above was False. apple pie contains a space. A space is not an underscore or alphanumeric character, so the string is not interned. When a constant is not interned, Python creates a new object for it each time it is assigned.
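The integer rule from the list above can be checked in a similar way. In this sketch the integers are built at runtime with int() so the compiler has no chance to merge equal constants, and the large value is simply one I picked well outside the cached range. The results assume CPython.

```python
small_a = int('256')
small_b = int('256')
large_a = int('100000')
large_b = int('100000')

print(small_a is small_b)  # True on CPython: 256 is in the preallocated range
print(large_a is large_b)  # False on CPython: two separate int objects
print(large_a == large_b)  # True: equal values all the same
```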

It’s important to mention that you can explicitly intern strings with the sys.intern() function.
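A short sketch of sys.intern() at work. The strings are joined at runtime so that we start from two genuinely separate objects; the variable names are placeholders.

```python
import sys

plain_a = ' '.join(['apple', 'pie'])
plain_b = ' '.join(['apple', 'pie'])
print(plain_a is plain_b)        # False: two separate objects

interned_a = sys.intern(plain_a)
interned_b = sys.intern(plain_b)
print(interned_a is interned_b)  # True: both names share the interned copy
```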

Why would you want to do this? I’ll refer you to the What are the benefits of interning strings? section of this excellent although outdated article by Adrian Guilload.

Ok, that answers almost everything. We just have to plant one more seed.

And lastly, constant folding!

Probably at this point you know enough, but there is one more curiosity shown at the beginning of this article.

Python also tries to optimize your sloppy code using something called constant folding. The idea is that if an expression is built only from constants and can be computed ahead of time, Python replaces the expression with the value it evaluates to. We can look into this through a function’s .__code__.co_consts attribute, which tells us which constants Python is storing for the function.
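A sketch of two such functions. The names folded and joined are my own, and the exact contents and order of co_consts vary between Python versions, so the tuples are printed without a promised output.

```python
def folded():
    a = 'apple'
    b = 'a' + 'pple'            # built only from constants
    return a is b

def joined():
    a = 'apple'
    b = ''.join(['a', 'pple'])  # method call, evaluated at runtime
    return a is b

print(folded.__code__.co_consts)
print(joined.__code__.co_consts)

print(folded())  # True on CPython: both names share one stored constant
print(joined())  # False on CPython: join builds a new string at runtime
```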

The example above shows that for the first function only the constant 'apple' is stored in memory. The second stores 'apple', 'a', and 'pple'. This is because the compiler can evaluate the expression 'a' + 'pple' at compile time and realizes that it is the same as 'apple', so the interned string constant is only stored once.

The join example, however, cannot be “folded” at compile time because it contains a method call. Calls are evaluated at runtime, so both 'a' and 'pple' must also be stored in memory so they can be joined at runtime into 'apple'.

Conclusion

While at first the behavior of is can seem arbitrary, after deeper exploration we can see that it is not. These strange results come from Python helping us out! Of course this exploration is incomplete (it’s a Medium article), but if you want more information I’ll link to several resources below.

I hope this seed grows deep roots and yields much fruit for you.

exit()

Further Reading

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data…

Ben Bogart

Written by

Data Scientist/ML Engineer and Bandoneonista.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
