Making Data Trees in Python

Learn about trees and how to implement them.

Keno Leon
Keno Leon
Feb 25, 2020 · 10 min read

Roots

A tree as a data structure can quickly become a complex mathematical subject ( 👀 ), we are surrounded by real and virtual things (data really) that can be modeled and represented by a tree, so it is a very handy subject to understand even at a basic level.

Personally, I recently failed a code interview question due to my lack of experience with trees in python, is not that a tree is hard to understand, but implementing it on the other hand, specifically in python from scratch is where I stumbled, I hope that by the end of this post both you and I will understand the subject at a comfortable level, and of course be able to implement one for code interviews and other needs.
⏭ ⏭ In a hurry ?If you know the theory and/or just want a working tree implementation in Python you can jump to the later sections and skip the theory.

A tree you say ?

Most likely you are already using a tree or have used one, well at least a simple tree like this one:

We’ll get into the terminology in a minute, but the important thing here is the parent child relationship which might make more sense with a couple of examples:

Dad -> Son
Boss -> Employee
Favorite Food -> Chinese takeout
3 -> Silver Medal

Implementing this simple tree is fairly common in Python with dictionaries and lists :

⚠️ NOTE:The following examples are actually multiple trees ( technically a forest), in reality you are more likely to find this structure rather than a single tree ( at least at this level of tree complexity), if you remove say Jim and Carlos in the following example you would have a single tree.  # Dictionary:Families = {'Peter':'Paul', 'Jim':'Tommy', 'Carlos':'Diego'}for Parent, Son in Families.items():
print(f"{Parent} is {Son}'s Dad")

OUTPUT:
Peter is Paul's Dad
Jim is Tommy's Dad
Carlos is Diego's Dad


---------------------////------------------
#List:Prizes = ['Gold','Silver','Bronze','Nothing','Zilch']for place, prize in enumerate(Prizes):
print(f"Place number {place+1} gets {prize}")
OUTPUT:Place number 1 gets Gold
Place number 2 gets Silver
Place number 3 gets Bronze
Place number 4 gets Nothing
Place number 5 gets Zilch

But that looks nothing like a tree and these are simple linear relationships you might say. Well, yes but a tree starts to take shape and make sense once we add more children to the root node:

The key thing here is that these children have only one parent, if they had more this wouldn’t strictly be a tree ( it would be some sort of graph ), some examples:

Dad -> Son, Daughter
Boss -> Manager_1, Manager_2, Manager_3
Favorite Foods -> Chinese, Pizza, Tacos
1 -> Gold Medal,$10000,New Car,Sponsorship

Implementing these on Python should be straightforward since they are just expansions of the previous examples:

# Dictionary #(Same note as before, these are multiple trees, family trees in this example) :Families = {'Peter':['Paul','Patty'], 'Jim':['Tommy','Timmy','Tammy'], 'Carlos':['Diego']}for Parent, Children in Families.items():
print(f"{Parent} has {len(Children)} kid(s):" )
print(f"{', and '.join([str(Child) for Child in [*Children]])}")
OUTPUT:Peter has 2 kid(s):
Paul, and Patty
Jim has 3 kid(s):
Tommy, and Timmy, and Tammy
Carlos has 1 kid(s):
Diego
# Reduced/Alternative way of saying the same without thing formatting:for Parent, Children in Families.items():
print(str(Parent) + ' has ' + str(len(Children)) + ' kid(s):')
print(*Children)
Note the use of the * for unpacking the list.
---------------------\\ | // ------------------# List:Prizes = [['Gold Medal','$10000','Sports Car','Brand Sponsorship'],
['Silver Medal','$5000','Budget Car'],
['Bronze Medal','$2500','Motorcycle'],
['Participation Trophy','Swag'],
['Swag']]
for place, prizelist in enumerate(Prizes):
print(f"Place # {place+1} gets the following prize(s)")
print(f"{', and '.join([str(prize) for prize in [*prizelist]])}")
OUTPUT:
Place # 1 gets the following prize(s)
Gold Medal, and $10000, and Sports Car, and Brand Sponsorship
Place # 2 gets the following prize(s)
Silver Medal, and $5000, and Budget Car
Place # 3 gets the following prize(s)
Bronze Medal, and $2500, and Motorcycle
Place # 4 gets the following prize(s)
Participation Trophy, and Swag
Place # 5 gets the following prize(s)
Swag

A more common tree then has a root element and multiple nodes which themselves have children of their own :

Some tree terminology ( Treeminology ?):

Part of what makes implementing a tree difficult (in general), is the hidden complexity a tree carries with it, here are some common terms you might encounter along with their graphical description:

Yeah, it's a lot of terminology to take in,  and use this as a quick refresher. While we are here, a  is a tree in which each node has at most 2 child nodes ( usually named left,right ), the right branch on the above illustration by itself could be a binary tree, while the left side wouldn't. 
By the way, you might be asking yourself why are these structures called trees, well, if you were to invert or vertically flip the structure it would look like one 🙃

Tree vs graph ( Fight ! )

A graph is a related data structure that is also quite popular (think complex networks of any kind), unlike a tree, the main difference is that there are bi-directional connections and no root :

Trees in real life (some examples):- Some neurons and their connections ( but by many accounts the brain can be considered a complex dynamic graph).
- AI/CS: Neural Networks.
- AI/Neuroscience: Semantic Trees.
- AI/CS: Pathfinding.
- ML/Statistics/AI/CS: Search.
- Web/General use: Any nested relationship, (e.g. Ad/site/tracking)
- General use: Classification ( e.g. Family, Evolutionary/Biological Trees).
And many many more, the point here is that in between a tree and a graph ( ) you can cover a lot of complex data relationships not covered by .

Break.

So that should hopefully take care of the theory and some code groundwork for implementing trees, before moving on I’d like to point out that the main difficulty in implementing a tree is twofold , the recursive nature and the number of optional features you might require, ( see the treeminology illustration).

So our plan of attack is as follows, we’ll make a basic tree implementation from scratch which later you can expand to suit your needs and coding interviews, additionally we will also look at fully featured libraries which should already have all the features needed, this is also a good point to take a break if you need one.

⏭ If you are still in a hurry and or don't care about implementing your own Python data tree from scratch you can skip the following section.

Implementing the Hard Way : from scratch

Let’s first deal with the recursion problem, by that I mean that a tree can grow by adding nodes at any level below the root, let’s first grow one of our previous trees to get a feel for what’s needed:

# Dictionary (once more this is a forest of 3 trees:)Families = {
'Peter':
{'Paul':{'Dog','Toucan'} ,
'Patty': {'Turtle'}},
'Jim':
{'Tommy':{'Hamster'},
'Timmy':{'Hamster'},
'Tammy':{'Hamster'}},
'Carlos':
{'Diego':'Cat','Ferret','Fox'}}
}
for Parent, Children in Families.items():
print(f"{Parent} has {len(Children)} kid(s):" )
print(f" {', and '.join([str(Child) for Child in [*Children]])}")
for Child, pets in Children.items():
print(f" {Child} has {len(pets)} pet(s):")
print(f" {', and '.join([str(pet) for pet in [*pets]])}")
OUTPUT:Peter has 2 kid(s):
Paul, and Patty
Paul has 2 pet(s):
Dog, and Toucan
Patty has 1 pet(s):
Turtle
Jim has 3 kid(s):
Tommy, and Timmy, and Tammy
Tommy has 1 pet(s):
Hamster
Timmy has 1 pet(s):
Hamster
Tammy has 1 pet(s):
Hamster
Carlos has 1 kid(s):
Diego
Diego has 3 pet(s):
Cat, and Fox, and Ferret

One solution to the level problem is nesting more dictionaries or lists and adding the same amount of loops to read said dictionaries, we’ll to automate the process soon, but you might be wondering how do we operate on a tree, that is how do we add or remove things at any level :


- Removing (Let's say a Hamster pandemic hit Jim's house and Diego's Fox escaped ):
# Within a loop:for Parent, Children in Families.items():
for Child, pets in Children.items():
for pet in pets:
if pet == 'Hamster':
Families[Parent][Child] = {}
# Directly Updating:Families['Carlos']['Diego'] = {'Cat','Ferret'}- Addition can work in the same way: Families[Parent][Child] = {'Snake'}
Families['Carlos']['Diego'] = {'Cat','Ferret', 'Fox'}
You could also use any other Dictionary or iterable method to suit your needs, if for instance you wanted to delete whole branch or family tree:del Families['Peter'] ['Paul’]
#or
del Families['Peter’]

Let’s now start moving everything into classes for reuse:

"""Barebones minimal general Tree & Node, using lists, but can also use dictionaries if you need key value pairs"""class Tree():
def __init__(self,root):
self.root = root
self.children = []
def addNode(self,obj):
self.children.append(obj)
class Node():
def __init__(self, data):
self.data = data
self.children = []
def addNode(self,obj):
self.children.append(obj)
USAGE:FunCorp = Tree('Head Honcho') # Create a tree and add root data.
print(FunCorp.root) # ask the Tree for it's root.
>> Head Honcho# Add children to root:FunCorp.addNode(Node('VP of Stuff'))
FunCorp.addNode(Node('VP of Shenanigans'))
FunCorp.addNode(Node('VP of Hootenanny'))
# Get children of root:print(f'C suite: {", ".join(str(child.data) for child in FunCorp.children)}')>> C suite: VP of Stuff, VP of Shenanigans, VP of Hootenanny# Add Node to the first child of the Tree:FunCorp.children[0].addNode(Node('General manager of Fun'))# Get the first child of the first child of the Tree:print(f'The position under {FunCorp.children[0].data} is: {FunCorp.children[0].children[0].data}')>> The position under VP of Stuff is: General manager of Fun

This is a minimal implementation, you’d need to add methods to either the tree or node classes to make it more user friendly or implement a specific feature, one such feature which can serve as a template for other ones is asking the tree for all it’s nodes:

"""Barebones general Tree & Node"""
class Tree():
def __init__(self,root):
self.root = root
self.children = []
self.Nodes = []
def addNode(self,obj):
self.children.append(obj)
def getAllNodes(self):
self.Nodes.append(self.root)
for child in self.children:
self.Nodes.append(child.data)
for child in self.children:
if child.getChildNodes(self.Nodes) != None:
child.getChildNodes(self.Nodes)
print(*self.Nodes, sep = "\n")
print('Tree Size:' + str(len(self.Nodes)))
class Node():
def __init__(self, data):
self.data = data
self.children = []
def addNode(self,obj):
self.children.append(obj)
def getChildNodes(self,Tree):
for child in self.children:
if child.children:
child.getChildNodes(Tree)
Tree.append(child.data)
else:
Tree.append(child.data)
# Add a bunch of nodesFunCorp = Tree('Head Honcho')
FunCorp.addNode(Node('VP of Stuff'))
FunCorp.addNode(Node('VP of Shenanigans'))
FunCorp.addNode(Node('VP of Hootenanny'))
FunCorp.children[0].addNode(Node('General manager of Fun'))
FunCorp.children[1].addNode(Node('General manager Shindings'))
FunCorp.children[0].children[0].addNode(Node('Sub manager of Fun'))
FunCorp.children[0].children[0].children[0].addNode(Node('Employee of the month'))
# Get all nodes (unordered):
FunCorp.getAllNodes()
>>
Head Honcho
VP of Stuff
VP of Shenanigans
VP of Hootenanny
Employee of the month
Sub manager of Fun
General manager of Fun
General manager Shindings
Tree Size:8

There are of course more methods you could add or fine tune ( ordering and adding key value pairs for instance ) and ways to implement a tree from scratch (check this for more ideas):

Implementing the Easy way: with Libraries.

ℹ️ We'll try a couple of popular libraries:: 
:
You might also want to check:: :

Using Treelib:

from treelib import Node, Treetree = Tree()
tree.create_node("CEO","CEO") #root
tree.create_node("VP_1","VP_1",parent="CEO" )
tree.create_node("VP_2","VP_2",parent="CEO" )
tree.create_node("GM_1","GM_1",parent="VP_1" )
tree.create_node("GM_2","GM_2",parent="VP_2" )
tree.show()>>CEO
├── VP_1
│ └── GM_1
└── VP_2
└── GM_2

Using Anytree:

from anytree import Node, RenderTreeceo = Node("CEO") #root
vp_1 = Node("VP_1", parent=ceo)
vp_2 = Node("VP_2", parent=ceo)
gm_1 = Node("GM_1", parent=vp_1)
gm_2 = Node("GM_2", parent=vp_2)
for pre, fill, node in RenderTree(ceo):
print("%s%s" % (pre, node.name))
>>>CEO
├── VP_1
│ └── GM_1
└── VP_2
└── GM_2

Creating trees with either seems fairly easy, they have a slightly different syntax and feel though : AnyTree felt more feature packed and TreeLib more streamlined in my short experience , so I suggest you explore them both and see which one works for you.

Extra Credit: Tree visualization

Using  and Anytree : from anytree import Node, RenderTree
from anytree.exporter import DotExporter
ceo = Node("CEO") #root
vp_1 = Node("VP_1", parent=ceo)
vp_2 = Node("VP_2", parent=ceo)
gm_1 = Node("GM_1", parent=vp_1)
gm_2 = Node("GM_2", parent=vp_2)
m_1 = Node("M_1", parent=gm_2)
DotExporter(ceo).to_picture("ceo.png")Output: >>
For complex ( and prettier ) visual tree graphics check  which uses a different tree/graph notation. 

Conclusions :

My primary goal here has been to go from zero knowledge of trees in python to a good/basic understanding of them ( I hope we did ok there ), a secondary one has been to show you how to implement one from scratch, unfortunately due to how feature rich trees can get we made just a simple template you might or might not want to follow and develop further ( but I hope you get inspired by it) and lastly we overviewed a couple of tree libraries which provide a much easier and friendlier experience.

The true power of trees comes from their implementation in specific problems ( or nailing that technical interview ), if on the other hand you were here ( maybe in part ) for the CS knowledge I’d recommend you check out other data structures like graphs, stacks and queues in python.

Thanks for reading !

Keno

The Startup

Get smarter at building your thing. Join The Startup’s +724K followers.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface.

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox.

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store