# Making Data Trees in Python

## Learn about trees and how to implement them.

Feb 25, 2020 · 10 min read

## Roots

A tree as a data structure can quickly become a complex mathematical subject ( 👀 ), we are surrounded by real and virtual things (data really) that can be modeled and represented by a tree, so it is a very handy subject to understand even at a basic level.

`Personally, I recently failed a code interview question due to my lack of experience with trees in python, is not that a tree is hard to understand, but implementing it on the other hand, specifically in python from scratch is where I stumbled, I hope that by the end of this post both you and I will understand the subject at a comfortable level, and of course be able to implement one for code interviews and other needs.`
`⏭ ⏭ In a hurry ?If you know the theory and/or just want a working tree implementation in Python you can jump to the later sections and skip the theory.`

## A tree you say ?

Most likely you are already using a tree or have used one, well at least a simple tree like this one:

We’ll get into the terminology in a minute, but the important thing here is the parent child relationship which might make more sense with a couple of examples:

`Dad -> SonBoss -> EmployeeFavorite Food -> Chinese takeout3 -> Silver Medal`

Implementing this simple tree is fairly common in Python with dictionaries and lists :

`⚠️ NOTE:The following examples are actually multiple trees ( technically a forest), in reality you are more likely to find this structure rather than a single tree ( at least at this level of tree complexity), if you remove say Jim and Carlos in the following example you would have a single tree.  # Dictionary:Families = {'Peter':'Paul', 'Jim':'Tommy', 'Carlos':'Diego'}for Parent, Son in Families.items():  print(f"{Parent} is {Son}'s Dad")OUTPUT:Peter is Paul's DadJim is Tommy's DadCarlos is Diego's Dad                                 ---------------------////------------------#List:Prizes = ['Gold','Silver','Bronze','Nothing','Zilch']for place, prize in enumerate(Prizes):    print(f"Place number {place+1} gets {prize}")OUTPUT:Place number 1 gets GoldPlace number 2 gets SilverPlace number 3 gets BronzePlace number 4 gets NothingPlace number 5 gets Zilch`

But that looks nothing like a tree and these are simple linear relationships you might say. Well, yes but a tree starts to take shape and make sense once we add more children to the root node:

The key thing here is that these children have only one parent, if they had more this wouldn’t strictly be a tree ( it would be some sort of graph ), some examples:

`Dad -> Son, DaughterBoss -> Manager_1, Manager_2, Manager_3Favorite Foods -> Chinese, Pizza, Tacos1 -> Gold Medal,\$10000,New Car,Sponsorship`

Implementing these on Python should be straightforward since they are just expansions of the previous examples:

`# Dictionary #(Same note as before, these are multiple trees, family trees in this example) :Families = {'Peter':['Paul','Patty'], 'Jim':['Tommy','Timmy','Tammy'], 'Carlos':['Diego']}for Parent, Children in Families.items():        print(f"{Parent} has {len(Children)} kid(s):" )        print(f"{', and '.join([str(Child) for Child in [*Children]])}")OUTPUT:Peter has 2 kid(s):Paul, and PattyJim has 3 kid(s):Tommy, and Timmy, and TammyCarlos has 1 kid(s):Diego# Reduced/Alternative way of saying the same without thing formatting:for Parent, Children in Families.items():        print(str(Parent) + ' has ' + str(len(Children)) + ' kid(s):')        print(*Children)Note the use of the *Operator for unpacking the list.           ---------------------\\ | // ------------------# List:Prizes = [['Gold Medal','\$10000','Sports Car','Brand Sponsorship'],          ['Silver Medal','\$5000','Budget Car'],          ['Bronze Medal','\$2500','Motorcycle'],          ['Participation Trophy','Swag'],          ['Swag']]for place, prizelist in enumerate(Prizes):    print(f"Place # {place+1} gets the following prize(s)")    print(f"{', and '.join([str(prize) for prize in [*prizelist]])}")OUTPUT:Place # 1 gets the following prize(s)Gold Medal, and \$10000, and Sports Car, and Brand SponsorshipPlace # 2 gets the following prize(s)Silver Medal, and \$5000, and Budget CarPlace # 3 gets the following prize(s)Bronze Medal, and \$2500, and MotorcyclePlace # 4 gets the following prize(s)Participation Trophy, and SwagPlace # 5 gets the following prize(s)Swag`

A more common tree then has a root element and multiple nodes which themselves have children of their own :

## Some tree terminology ( Treeminology ?):

Part of what makes implementing a tree difficult (in general), is the hidden complexity a tree carries with it, here are some common terms you might encounter along with their graphical description:

`Yeah, it's a lot of terminology to take in, consult the wiki for detailed definitions and use this as a quick refresher. While we are here, a Binary Tree is a tree in which each node has at most 2 child nodes ( usually named left,right ), the right branch on the above illustration by itself could be a binary tree, while the left side wouldn't. By the way, you might be asking yourself why are these structures called trees, well, if you were to invert or vertically flip the structure it would look like one 🙃 `

## Tree vs graph ( Fight ! )

A graph is a related data structure that is also quite popular (think complex networks of any kind), unlike a tree, the main difference is that there are bi-directional connections and no root :

`Trees in real life (some examples):- Some neurons and their connections ( but by many accounts the brain can be considered a complex dynamic graph).- AI/CS: Neural Networks.- AI/Neuroscience: Semantic Trees.- AI/CS: Pathfinding.- ML/Statistics/AI/CS: Search.- Web/General use: Any nested relationship, (e.g. Ad/site/tracking)- General use: Classification ( e.g. Family, Evolutionary/Biological Trees).And many many more, the point here is that in between a tree and a graph ( non linear data structures ) you can cover a lot of complex data relationships not covered by simpler linear ones.`

## Break.

So that should hopefully take care of the theory and some code groundwork for implementing trees, before moving on I’d like to point out that the main difficulty in implementing a tree is twofold , the recursive nature and the number of optional features you might require, ( see the treeminology illustration).

So our plan of attack is as follows, we’ll make a basic tree implementation from scratch which later you can expand to suit your needs and coding interviews, additionally we will also look at fully featured libraries which should already have all the features needed, this is also a good point to take a break if you need one.

`⏭ If you are still in a hurry and or don't care about implementing your own Python data tree from scratch you can skip the following section.`

## Implementing the Hard Way : from scratch

Let’s first deal with the recursion problem, by that I mean that a tree can grow by adding nodes at any level below the root, let’s first grow one of our previous trees to get a feel for what’s needed:

`# Dictionary (once more this is a forest of 3 trees:)Families = {            'Peter':                   {'Paul':{'Dog','Toucan'} ,                    'Patty': {'Turtle'}},            'Jim':                   {'Tommy':{'Hamster'},                    'Timmy':{'Hamster'},                    'Tammy':{'Hamster'}},            'Carlos':                   {'Diego':'Cat','Ferret','Fox'}}            }for Parent, Children in Families.items():        print(f"{Parent} has {len(Children)} kid(s):" )        print(f" {', and '.join([str(Child) for Child in [*Children]])}")        for Child, pets in Children.items():            print(f"  {Child} has {len(pets)} pet(s):")            print(f"    {', and '.join([str(pet) for pet in [*pets]])}")OUTPUT:Peter has 2 kid(s): Paul, and Patty  Paul has 2 pet(s):    Dog, and Toucan  Patty has 1 pet(s):    TurtleJim has 3 kid(s): Tommy, and Timmy, and Tammy  Tommy has 1 pet(s):    Hamster  Timmy has 1 pet(s):    Hamster  Tammy has 1 pet(s):    HamsterCarlos has 1 kid(s): Diego  Diego has 3 pet(s):    Cat, and Fox, and Ferret`

One solution to the level problem is nesting more dictionaries or lists and adding the same amount of loops to read said dictionaries, we’ll to automate the process soon, but you might be wondering how do we operate on a tree, that is how do we add or remove things at any level :

`- Removing (Let's say a Hamster pandemic hit Jim's house and Diego's Fox escaped ):# Within a loop:for Parent, Children in Families.items():    for Child, pets in Children.items():        for pet in pets:            if pet == 'Hamster':                Families[Parent][Child] = {}# Directly Updating:Families['Carlos']['Diego']  =  {'Cat','Ferret'}- Addition can work in the same way: Families[Parent][Child] = {'Snake'}Families['Carlos']['Diego']  =  {'Cat','Ferret', 'Fox'}You could also use any other Dictionary or iterable method to suit your needs, if for instance you wanted to delete whole branch or family tree:del Families['Peter'] ['Paul’]#ordel Families['Peter’]`

Let’s now start moving everything into classes for reuse:

`"""Barebones minimal general Tree & Node, using lists, but can also use dictionaries if you need key value pairs"""class Tree():    def __init__(self,root):        self.root = root        self.children = []    def addNode(self,obj):        self.children.append(obj)class Node():    def __init__(self, data):        self.data = data        self.children = []    def addNode(self,obj):        self.children.append(obj)USAGE:FunCorp =  Tree('Head Honcho') # Create a tree and add root data.print(FunCorp.root) # ask the Tree for it's root.>> Head Honcho# Add children to root:FunCorp.addNode(Node('VP of Stuff'))FunCorp.addNode(Node('VP of Shenanigans'))FunCorp.addNode(Node('VP of Hootenanny'))# Get children of root:print(f'C suite: {", ".join(str(child.data) for child in FunCorp.children)}')>> C suite: VP of Stuff, VP of Shenanigans, VP of Hootenanny# Add Node to the first child of the Tree:FunCorp.children[0].addNode(Node('General manager of Fun'))# Get the first child of the first child of the Tree:print(f'The position under {FunCorp.children[0].data} is: {FunCorp.children[0].children[0].data}')>> The position under VP of Stuff is: General manager of Fun`

This is a minimal implementation, you’d need to add methods to either the tree or node classes to make it more user friendly or implement a specific feature, one such feature which can serve as a template for other ones is asking the tree for all it’s nodes:

`"""Barebones general Tree & Node"""class Tree():    def __init__(self,root):        self.root = root        self.children = []        self.Nodes = []    def addNode(self,obj):        self.children.append(obj)    def getAllNodes(self):        self.Nodes.append(self.root)        for child in self.children:            self.Nodes.append(child.data)        for child in self.children:            if child.getChildNodes(self.Nodes) != None:                child.getChildNodes(self.Nodes)        print(*self.Nodes, sep = "\n")        print('Tree Size:' + str(len(self.Nodes)))class Node():    def __init__(self, data):        self.data = data        self.children = []    def addNode(self,obj):        self.children.append(obj)    def getChildNodes(self,Tree):        for child in self.children:            if child.children:                child.getChildNodes(Tree)                Tree.append(child.data)            else:                Tree.append(child.data)# Add a bunch of nodesFunCorp =  Tree('Head Honcho')FunCorp.addNode(Node('VP of Stuff'))FunCorp.addNode(Node('VP of Shenanigans'))FunCorp.addNode(Node('VP of Hootenanny'))FunCorp.children[0].addNode(Node('General manager of Fun'))FunCorp.children[1].addNode(Node('General manager Shindings'))FunCorp.children[0].children[0].addNode(Node('Sub manager of Fun'))FunCorp.children[0].children[0].children[0].addNode(Node('Employee of the month'))# Get all nodes (unordered):FunCorp.getAllNodes()>>Head HonchoVP of StuffVP of ShenanigansVP of HootenannyEmployee of the monthSub manager of FunGeneral manager of FunGeneral manager ShindingsTree Size:8`

There are of course more methods you could add or fine tune ( ordering and adding key value pairs for instance ) and ways to implement a tree from scratch (check this for more ideas):

## Implementing the Easy way: with Libraries.

`ℹ️ We'll try a couple of popular libraries:Treelib: https://treelib.readthedocs.io/en/latest/AnyTree: https://github.com/c0fec0de/anytreeYou might also want to check:Binarytree: https://github.com/joowani/binarytreeGraphs: https://wiki.python.org/moin/PythonGraphApi`

Using Treelib:

`from treelib import Node, Treetree = Tree()tree.create_node("CEO","CEO") #roottree.create_node("VP_1","VP_1",parent="CEO" )tree.create_node("VP_2","VP_2",parent="CEO" )tree.create_node("GM_1","GM_1",parent="VP_1" )tree.create_node("GM_2","GM_2",parent="VP_2" )tree.show()>>CEO├── VP_1│   └── GM_1└── VP_2    └── GM_2`

Using Anytree:

`from anytree import Node, RenderTreeceo = Node("CEO") #rootvp_1 = Node("VP_1", parent=ceo)vp_2 = Node("VP_2", parent=ceo)gm_1 = Node("GM_1", parent=vp_1)gm_2 = Node("GM_2", parent=vp_2)for pre, fill, node in RenderTree(ceo):    print("%s%s" % (pre, node.name))>>>CEO├── VP_1│   └── GM_1└── VP_2    └── GM_2`

Creating trees with either seems fairly easy, they have a slightly different syntax and feel though : AnyTree felt more feature packed and TreeLib more streamlined in my short experience , so I suggest you explore them both and see which one works for you.

## Extra Credit: Tree visualization

`Using Graphviz and Anytree : https://github.com/xflr6/graphvizfrom anytree import Node, RenderTreefrom anytree.exporter import DotExporterceo = Node("CEO") #rootvp_1 = Node("VP_1", parent=ceo)vp_2 = Node("VP_2", parent=ceo)gm_1 = Node("GM_1", parent=vp_1)gm_2 = Node("GM_2", parent=vp_2)m_1 = Node("M_1", parent=gm_2)DotExporter(ceo).to_picture("ceo.png")Output: >>`
`For complex ( and prettier ) visual tree graphics check plotly + igraph which uses a different tree/graph notation. `

## Conclusions :

My primary goal here has been to go from zero knowledge of trees in python to a good/basic understanding of them ( I hope we did ok there ), a secondary one has been to show you how to implement one from scratch, unfortunately due to how feature rich trees can get we made just a simple template you might or might not want to follow and develop further ( but I hope you get inspired by it) and lastly we overviewed a couple of tree libraries which provide a much easier and friendlier experience.

The true power of trees comes from their implementation in specific problems ( or nailing that technical interview ), if on the other hand you were here ( maybe in part ) for the CS knowledge I’d recommend you check out other data structures like graphs, stacks and queues in python.

Keno

## The Startup

Get smarter at building your thing. Join The Startup’s +724K followers.

Written by

## Keno Leon

AI, Software Developer, Designer : www.k3no.com

## The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +724K followers.

Written by

## Keno Leon

AI, Software Developer, Designer : www.k3no.com

## The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +724K followers.

## What to Know Before Using Amazon EKS

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface.

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox.

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic.

Get the Medium app