Zoom in the Decision Tree By Networkx and Make Strategic Decisions

Karàn Kokabisaghi
8 min read · Nov 14, 2022


Part.1. Reconfigure the decision tree:

The decision tree is a common analytical tool that helps businesses widen their view to all available options. It is used to evaluate alternatives through every phase of solving a problem by analyzing the risks, probabilities, and costs associated with the expected outcomes. However, decision tree analysis is subject to anchoring bias and the optimistic guesses of experts, and it copes poorly with complex situations and uncertainty. One way to address these weaknesses is to zoom in on the individual steps: restructuring the decision tree and adding more dimensions to it reveals more of the information behind each decision.

In this article, I use Networkx to explain how to reconfigure a simple decision tree with a sequence of actions to a network that is flexible to further updates (See figure below).

The top part of the figure shows a decision as a sequence of actions/strategies. In the middle, adding details such as subjective probabilities (the experts' best guesses) makes the overview of the decision process more precise. Zooming in on the decision tree (the dialogue), the decision maker first chooses whether to select the strategy. If the answer is yes, she fills in a checklist of the strategy's requirements and progress, and an external party (e.g. a manager) then estimates whether it is wise or possible to take the next step. At the bottom, the decision tree is reconfigured into a directed graph, which is the conceptual framework that we use as the input of our network.

Before we start with the implementation of the decision tree as a network, I briefly explain what a directed graph is.

A directed graph is composed of nodes that are connected by directed edges. The direction of the edges defines unidirectional paths through the graph. In general, both nodes and edges have attributes that encode important additional information (e.g. costs or probabilities, see figure below). Roughly speaking, one can make the following distinction.

  • Nodes can be seen as moments at which different possible outcomes will determine the future course of the process. For instance, the decision maker needs to decide between two alternative courses of action.
  • Edges correspond to the various actions that can be taken. Intuitively, one can think of nodes as “what can happen next?” questions, with the directed edges as the possible “answers” or outcomes. For each of the outgoing edges (i.e. outcomes), there is an attribute specifying the corresponding cost/reward. In addition, for chance nodes, it also makes sense to specify a probability that captures how likely this outcome is.
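To make the distinction concrete, here is a minimal sketch of a directed graph with attributes on nodes and edges. The node ids, action names, and costs are made up for illustration and are not taken from the figure.

```python
import networkx as nx

# Hypothetical example: ids, actions and costs are invented for illustration.
G = nx.DiGraph()
G.add_node(1, node_type="decision")
G.add_node(2, node_type="chance")
G.add_node(3, node_type="terminal")

# Each directed edge carries the action taken and its cost.
G.add_edge(1, 2, action="start_yes", cost=100)
G.add_edge(1, 3, action="start_no", cost=0)

# Edges point one way only: 1 -> 2 exists, 2 -> 1 does not.
has_forward = G.has_edge(1, 2)
has_backward = G.has_edge(2, 1)

# The outgoing edges of node 1 are the possible "answers" at that node.
answers = {child: attrs["action"] for child, attrs in G[1].items()}
print(has_forward, has_backward, answers)
```

Indexing the graph with a node, as in `G[1]`, returns its children together with the attributes of the connecting edges, which is the access pattern used later in this article.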

Node types: From the point of view of the decision maker, it is helpful to distinguish between the following node types:
1. Decision nodes represent moments at which the decision maker (the ego party) can autonomously decide on the next course of action, in line with her own objectives. Put differently, the ego party is in control, so it does not make sense to associate probabilities with the different outcomes. In Fig. 1, the decision nodes are indicated by ovals.
2. Chance nodes are moments at which the ego party is subject to external circumstances, such as actions taken by an external party. Put differently, she is not in control! The outgoing edges therefore represent events she will undergo. The various outcomes that determine what happens next might have different probabilities, the values of which might be based on expert knowledge (e.g. prior beliefs) or historical data. As a consequence, it makes sense to associate probabilities with the edges emanating from chance nodes. In Fig. 1, the chance nodes are indicated by trapezoids. Chance nodes can be further subdivided into:
   a. Opponent decisions: the relevant decision is taken by the opponent.
   b. External events: neither the ego party nor the external party is in control (e.g. decisions made by the manager, or fluctuations in the market).
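One practical consequence of this distinction is that the probabilities on the edges leaving a chance node should add up to 1, while decision nodes need no probabilities at all. The sketch below checks this on a small hypothetical graph; the helper name `chance_nodes_are_consistent` and all values are assumptions for illustration, and the check only makes sense once the probabilities have been filled in.

```python
import math
import networkx as nx

# Hypothetical graph: one decision node followed by one chance node.
G = nx.DiGraph()
G.add_node(1, node_type="decision")
G.add_node(2, node_type="chance")
G.add_edge(1, 2, action="start_yes", cost=100)   # no prob on decision edges
G.add_edge(1, 5, action="start_no", cost=0)
G.add_edge(2, 3, action="meet_criteria_yes", prob=0.5, cost=0)
G.add_edge(2, 5, action="meet_criteria_no", prob=0.5, cost=0)


def chance_nodes_are_consistent(graph):
    """Return True if the outgoing probabilities of every chance node sum to 1."""
    for node, node_type in graph.nodes(data="node_type"):
        if node_type != "chance":
            continue  # decision and terminal nodes carry no probabilities
        total = sum(attrs.get("prob", 0.0) for attrs in graph[node].values())
        if not math.isclose(total, 1.0):
            return False
    return True


consistent = chance_nodes_are_consistent(G)
print(consistent)
```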

Networkx Input:

The input for Networkx is the conceptual framework of the network and the checklists for each strategy. Here I use the Excel format (read with the xlrd package) so that the data can be read from the checklists directly.

The structure of the Excel file:

In the example below, a sequence of actions is represented as nodes. Every node has an ID, a type, a checklist name (to update the information and check the progress towards the objectives), the number of child nodes (the edges leaving the strategy), and the features of those edges (e.g. the cost and probability of proceeding with a strategy or not). See an example of the Excel file below:

Snippet of the Excel file that specifies the nodes, their type, actions and attributes. Directed edges can be recovered by following the path from a parent node to one of its children.

In the above example,

  • Strategy no.1 is shown as node no.1, which is a decision node. A decision node asks the decision maker whether she would like to start with strategy no.1. Node no.1 has 2 edges, which represent the connections to nodes no.2 and no.5, and each edge has 4 features: the child node (the next node), the action, the cost associated with the strategy, and the probability, which indicates what percentage of the criteria for taking the next step are fulfilled. In addition, in the checklist column, “checklist_1” keeps track of the information and criteria related to strategy no.1.
  • Node no.2 is a chance node that checks whether the strategy in node no.1 has been carried out properly and whether the criteria for proceeding to the next strategy and the ultimate objectives are ticked.

Important note: the probability of the chance node is set to zero at the beginning of the decision process, when the checklist is empty, and is later computed from the checklist that precedes the chance node.
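The row layout can be sketched in plain Python, without the Excel file. The column positions below follow the parsing code later in this article (node id in column 0, type in column 1, number of children in column 2, checklist name in column 4, edge features from column 5 in groups of four). The cell values, the meaning of column 3, and the helper name `parse_row` are assumptions for illustration.

```python
# Hypothetical row for node no.1; the values are invented, only the
# column order follows the parsing code in this article.
row = [1, "decision", 2, 4, "checklist_1",
       2, "start_yes", 0.0, 100,   # edge to child node 2
       5, "start_no", 0.0, 0]      # edge to child node 5

COL_OFFSET = 5          # index of the first edge column
FEATURES_PER_EDGE = 4   # child id, action, prob, cost


def parse_row(row):
    """Split one sheet row into node attributes and a list of edges."""
    node = {
        "node_id": int(row[0]),
        "node_type": row[1],
        "checklist_name": row[4],
    }
    nr_children = int(row[2])
    edges = []
    for k in range(nr_children):
        start = COL_OFFSET + k * FEATURES_PER_EDGE
        child_id, action, prob, cost = row[start:start + FEATURES_PER_EDGE]
        edges.append({"child": int(child_id), "action": action,
                      "prob": prob, "cost": cost})
    return node, edges


node, edges = parse_row(row)
print(node)
print(edges)
```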

Implementation of the Networkx:

# Import required libraries:
import networkx as nx
import xlrd

Import network information:

  • The node information is shown in the above figure.
  • The checklist. For every strategy, there is a checklist that is stored in one or more excel files.

checklist_sheet = {'checklist_1': content, 'checklist_2': content}  # schematic: content stands for the data of each checklist

NOTE: In the next article, I explain an example of how the checklists are read directly from the excel file and a way of computing probability.
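Until then, one simple way to turn a checklist into a number is the share of ticked criteria. This is only a sketch under that assumption, not the method of the follow-up article; the criteria names and the helper `ticked_fraction` are hypothetical.

```python
# Hypothetical checklists: each criterion maps to ticked (True) or not (False).
checklists = {
    "checklist_1": {"budget approved": True, "team assigned": False},
    "checklist_2": {"prototype tested": True, "client feedback": True,
                    "legal check": False, "pricing set": False},
}


def ticked_fraction(items):
    """Share of ticked criteria; an empty checklist counts as 0."""
    if not items:
        return 0.0
    return sum(1 for ticked in items.values() if ticked) / len(items)


checklist_sheet = {name: ticked_fraction(items)
                   for name, items in checklists.items()}
print(checklist_sheet)
```

An empty checklist gives 0, which matches the note above that the probability of a chance node starts at zero.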

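# import network_info from Google colab
# Note: xlrd 2.0 and later read only .xls files; reading .xlsx
# needs xlrd < 2.0 or another reader such as openpyxl.
location = r"C:mydrive/pyfolder"
workbook = xlrd.open_workbook(location + "/network_info.xlsx")
sheet_nodes = workbook.sheet_by_name("node_info")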

Build the Networkx:

  • Read the nodes and create the network:
# Read Nodes:
G = nx.DiGraph()  # create an empty directed graph
checklist_col = {}  # to store checklist names
for row_id in range(1, sheet_nodes.nrows):

    # 1. get the node_id for this row and add it to the network
    # Note: row = 0 is the titles
    # node_id in Excel first column
    node_id = int(sheet_nodes.cell_value(row_id, 0))
    G.add_node(node_id)
    # 1.1. add node type to the network:
    node_type = {}
    node_type.update({node_id: dict(zip(['node_type'],
                                        [sheet_nodes.cell_value(row_id, 1)]))})
    nx.set_node_attributes(G, node_type)
    # read important features of the nodes: nr.children and checklists
    nr_children = int(sheet_nodes.cell_value(row_id, 2))

    checklist_col.update({node_id: dict(zip(['checklist_name'],
                                            [sheet_nodes.cell_value(row_id, 4)]))})
    nx.set_node_attributes(G, checklist_col)
  • Add edges and attributes to the network:
    # (this block continues inside the `for row_id` loop above)
    # 2. CYCLE OVER ALL THE Children
    children_of_this_node = []  # a list to collect each loop
    col_offset = 5
    nr_features_per_edge = 4
    for k in range(1, nr_children + 1):
        # read the k-th child's node_id at the following location
        # in the excel sheet_nodes.
        child_node_id_col0 = col_offset - 1 + (k - 1) * nr_features_per_edge + 1
        child_node_id = int(sheet_nodes.cell(row_id, child_node_id_col0).value)
        children_of_this_node.append(child_node_id)

        # 3. Find the corresponding attributes:
        # child_id, action, prob, cost
        attr_action_col0 = child_node_id_col0 + 1
        edge_action = sheet_nodes.cell(row_id, attr_action_col0).value
        attr_prob_col0 = child_node_id_col0 + 2
        edge_prob = sheet_nodes.cell(row_id, attr_prob_col0).value
        attr_cost_col0 = child_node_id_col0 + 3
        edge_cost = sheet_nodes.cell(row_id, attr_cost_col0).value
        print('edge from ', node_id, 'to ', child_node_id,
              'action:', edge_action, ', prob', edge_prob,
              ', cost', edge_cost)
        # 4. add the corresponding edge
        G.add_edge(node_id, child_node_id, action=edge_action,
                   prob=edge_prob, cost=edge_cost)

Print the Network:

nx.draw(G, with_labels=True)  # drawing requires matplotlib
An overview of all possible paths, node connections, and interconnections
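Besides drawing the graph, the possible paths can also be listed as text. The following sketch uses `nx.all_simple_paths` on a small hypothetical graph; the node ids and action names are assumptions that mirror the example in this article, not the content of the Excel file.

```python
import networkx as nx

# Hypothetical graph with the same shape as the first strategy in this article.
G = nx.DiGraph()
G.add_edge(1, 2, action="start_yes")
G.add_edge(1, 5, action="start_no")
G.add_edge(2, 3, action="meet_criteria_yes")
G.add_edge(2, 5, action="meet_criteria_no")

# Roots have no incoming edges, leaves have no outgoing edges.
roots = [n for n in G.nodes if G.in_degree(n) == 0]
leaves = [n for n in G.nodes if G.out_degree(n) == 0]

# Collect every path from a root to a leaf.
paths = []
for root in roots:
    for leaf in leaves:
        paths.extend(nx.all_simple_paths(G, source=root, target=leaf))

# Print each path together with the actions taken along it.
for path in sorted(paths):
    actions = [G[u][v]["action"] for u, v in zip(path, path[1:])]
    print(path, actions)
```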

Read the checklist and update the probability of the chance nodes:

To pass the checklist to the network, we first need to find the nodes that have a checklist (in our example, there are checklists in nodes no.1 and no.3).

Note: here we assume that 50% of checklist_1 and 50% of checklist_2 are ticked. This means that the probability on the edges leaving the chance nodes (meet_criteria_yes, meet_criteria_no) is set to 50%.

checklist_sheet = {'checklist_1': 0.5, 'checklist_2': 0.5}
# Find the node with the checklist
for node in G.nodes:
    checklist_name = G.nodes[node]["checklist_name"]
    for sheet_tab, sheet_content in checklist_sheet.items():
        if sheet_tab == checklist_name:
            update_prob = sheet_content
            print('I am node:', node,
                  ', my checklist is:', sheet_tab,
                  'with probability:', update_prob)
Here, you see the decision nodes with their corresponding checklists; the probability is the outcome of the checklist (in this case, 50% of each checklist is ticked)

Now that we know the node id, the corresponding checklist, and the weight of the ticked criteria in every checklist, we need to find the edges from the decision node to the child nodes (the terminal node if no action is taken or a chance node if the action is taken). Here we assume that the decision is to proceed with strategy no.1. Therefore, we need to find the edges that include “…yes” in the action.

            # (continues inside the checklist-matching `if` block above)
            # find the edges with "yes"
            for i, j in G[node].items():
                print('I am node:', node, 'my child is :', i)
                for jj in j.values():
                    if isinstance(jj, str):
                        if 'yes' in jj:
                            print('I am node:', node, ', my child is :', i,
                                  ',The right edge is from node:', node,
                                  ', with action:', jj)

Next, we need to find the attributes of the edges and update the probability based on the checklist.

                            # (continues inside the `if 'yes' in jj:` block above)
                            for chance_id, chance_attr in G[i].items():
                                print('I am edge to the node : ', chance_id,
                                      '& my parent is chance node', i,
                                      '& my attributes are', chance_attr)
                                if chance_attr['action'] == 'meet_criteria_yes':
                                    chance_attr['prob'] = update_prob
                                    print('updated prob:', chance_attr['prob'])
                                else:
                                    chance_attr['prob'] = 1 - update_prob
                                    print('updated prob:', chance_attr['prob'])
Here you see that the probabilities of the chance nodes are initially zero and are then updated to 0.5

If we print the edges of the network, we see that the probabilities of the chance nodes are updated to 0.5.

for node_from, node_to, edge_attr in G.edges.data():
    print(node_from, 'to', node_to, ':', edge_attr)
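The update steps can also be collected into a single function, which makes the nesting easier to follow. This is a sketch on a small hypothetical graph rather than the Excel data; the function name `update_chance_probs` is an assumption, and unlike the loops above it looks for "yes" only in the `action` attribute instead of in every string attribute.

```python
import networkx as nx


def update_chance_probs(graph, checklist_sheet):
    """Update the probs on edges leaving chance nodes reached by a '...yes' action."""
    for node, checklist_name in graph.nodes(data="checklist_name"):
        if checklist_name not in checklist_sheet:
            continue  # this node has no checklist
        update_prob = checklist_sheet[checklist_name]
        for child, edge_attr in graph[node].items():
            if "yes" not in str(edge_attr.get("action", "")):
                continue  # only follow the edge where the action is taken
            for chance_attr in graph[child].values():
                if chance_attr["action"] == "meet_criteria_yes":
                    chance_attr["prob"] = update_prob
                else:
                    chance_attr["prob"] = 1 - update_prob


# Hypothetical graph: decision node 1 with a checklist, chance node 2.
G = nx.DiGraph()
G.add_node(1, node_type="decision", checklist_name="checklist_1")
G.add_node(2, node_type="chance", checklist_name="")
G.add_edge(1, 2, action="start_yes", prob=0, cost=100)
G.add_edge(1, 5, action="start_no", prob=0, cost=0)
G.add_edge(2, 3, action="meet_criteria_yes", prob=0, cost=0)
G.add_edge(2, 5, action="meet_criteria_no", prob=0, cost=0)

update_chance_probs(G, {"checklist_1": 0.5})
for node_from, node_to, edge_attr in G.edges.data():
    print(node_from, "to", node_to, ":", edge_attr)
```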

Now, the decision maker has a better estimate of the results of taking each action based on both the available information and external events.
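One way to turn the updated probabilities into such an estimate is to compute the expected cost of each action by working backwards from the terminal nodes. The sketch below is an illustration under several assumptions that are not part of this article: the graph has no cycles, every edge has a `cost`, edges leaving chance nodes have a `prob`, and at a decision node the cheapest action is taken. The function names and all numbers are hypothetical.

```python
import networkx as nx


def expected_cost(graph, node):
    """Expected cost from `node` to the end of the process (acyclic graph assumed)."""
    totals = [(attrs, attrs["cost"] + expected_cost(graph, child))
              for child, attrs in graph[node].items()]
    if not totals:
        return 0.0  # terminal node: nothing left to pay
    if graph.nodes[node].get("node_type") == "chance":
        # Chance node: probability-weighted average over the outcomes.
        return sum(attrs["prob"] * total for attrs, total in totals)
    # Decision node: assume the cheapest action is chosen.
    return min(total for _, total in totals)


def action_costs(graph, node):
    """Expected cost of each action available at a decision node."""
    return {attrs["action"]: attrs["cost"] + expected_cost(graph, child)
            for child, attrs in graph[node].items()}


# Hypothetical costs; the probabilities are the updated values of 0.5.
G = nx.DiGraph()
G.add_node(1, node_type="decision")
G.add_node(2, node_type="chance")
G.add_edge(1, 2, action="start_yes", cost=100)
G.add_edge(1, 5, action="start_no", cost=0)
G.add_edge(2, 3, action="meet_criteria_yes", prob=0.5, cost=20)
G.add_edge(2, 5, action="meet_criteria_no", prob=0.5, cost=60)

costs = action_costs(G, 1)
print(costs)
```

With rewards instead of costs, the same idea applies with `max` in place of `min` at the decision nodes.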

This article shows how the efficiency of a decision-making process can be improved by taking some simple but important steps. Moreover, Networkx is well suited to zooming in on the decision process and is flexible enough to add more dimensions to the decision along the way.

Thanks for reading!

You can find my GitHub repository for the python code in the link below:

Don’t forget to follow me on medium!

Also, feel free to contact me on LinkedIn.
