Introduction to Graphs for CP (Part I, Basic Theory)

Published in

Programming Club, IIT Kanpur

6 min readMar 17, 2021

A graph is essentially a collection of vertices (sometimes called nodes) and edges. The vertices represent individual entities of sorts, and each edge connects a pair of vertices and denotes that they are related in some way.

A bit more formally, a graph G is a pair (V, E) where V is the set of vertices and E is the set of edges (pairs (a, b) such that a, b ∈ V).

Here is an illustration of a graph:

The blue dots denote the vertices, and each black arrow indicates an edge connecting one vertex to another.

Graphs are a handy tool that can be used to model various scenarios: for example, the map of a state (individual cities as vertices, roads connecting cities as edges), research papers citing each other (vertices as papers, an edge from paper A to paper B if A cites B), and many more.

For studying graphs more deeply, we divide them into some basic categories:

Undirected Graphs

These are the kind of graphs in which an edge connects two vertices without caring about the order of the vertices connected. For example, at a party, we can say there is an edge between two people if they shook each other’s hands. Person A shaking Person B’s hand is equivalent to B shaking A’s hand.

Directed Graphs

Contrary to an undirected graph, here, the edges have a definite starting and ending vertex. For example, in the research paper graph, if A cites B, it doesn’t imply that B cites A. So this graph is directed.

It is worth noting that an undirected graph can also be seen as a directed graph in which for every edge (A, B) in the undirected graph, there are 2 edges (A, B) and (B, A) in the directed version.

Some more categories:

Weighted graph: Each edge has a “weight”/”cost” associated with it. For example, the road’s length could be each road’s weight on the map of a state.

Unweighted graph: Graph without weights. It can also be treated as a weighted graph with all edges having equal weights.

For further discussion on the theory, let us restrict ourselves to unweighted graphs, as adding weights won’t affect our analysis much here.

Let us see some essential characteristics of a graph:

Degree of a vertex

The degree of a vertex v in an undirected graph G is defined as the number of vertices adjacent, i.e. “directly connected” to v.

The degree of vertex 1 is 3, while that of vertex 2 is 2.

For a directed graph, the degree is of 2 kinds: outdegree (number of arrows/edges emanating from v) and indegree (number of arrowheads pointing towards v).

Here’s a quick problem: show that in an undirected graph, the sum of all degrees = 2|E| (twice the number of edges). This is also known as the handshaking lemma. What can you say about the sum of indegrees of all vertices in a directed graph? What about the sum of all outdegrees?

Subgraph

A subgraph of a graph G = (V, E) is defined as a graph G’ = (V’, E’), where V’ ⊆ V, and E’ ⊆ E such that the vertices connected by edges in E’ are present in V’. More intuitively, to get a subgraph, we take some (possibly none or all) vertices of G, remove all edges that connect vertices not taken, and can then remove more edges (perhaps none or all) as well.

Path

As the name suggests, it is a sequence of vertices ⟨v₀, v₁, …, vₖ⟩ where there is an edge from the first to the second vertex for each consecutive pair of vertices. Here k is known as the path length (the number of edges included). The path is called simple if all vertices in the sequence are distinct.

Cycle

A cycle in a graph is defined as a path ⟨v₀, v₁, …, vₖ⟩ where v₀ = vₖ, and there is at least one edge, i.e. a path with the same starting and ending vertex.

With this feature, we can also give another classification of graphs:

Cyclic graph: Graph in which at least one cycle exists.

Acyclic graph: Opposite of cyclic graph, these are graphs containing no cycles.

Connectedness

Vertices 1, 2, 3 & 4 and the edges connecting them form one of this graph’s connected components.

For a general graph, we can see that certain vertices are reachable from other vertices.

In an undirected graph G, we say vertices a and b are connected if a and b are reachable from each other, i.e. there exists a path in the graph from a to b. Otherwise, they are disconnected. Similarly, the undirected graph G is said to be a connected graph if every 2 distinct vertices of G are connected.

This leads to a notion of connected components in an undirected graph G: these are the different components of a partition of G such that each subgraph is connected, and there is no edge between vertices of 2 different components.

{1, 4} and {0, 2, 3, 5} are the 2 connected components of this graph

So 2 vertices of G are connected if and only if they belong to G’s same connected component.

There is also a particular kind of graph worth mentioning:

Tree

A tree is defined to be an undirected acyclic connected graph. Woah, I bombarded you with terms you just learnt, so let’s go a bit easy. There’s not much to be explained through the definition, so let us peek at the visualization:

As we can see intuitively, alongside being acyclic and connected, some properties of a tree make it interesting:

|E| = |V|-1 (the number of edges is 1 less than the number of vertices) i.e. m = n-1.
Every 2 distinct vertices are connected to each other via a unique simple path.

There are variants of trees, like rooted trees, ordered trees, binary (and in general k-ary) trees. Let’s discuss rooted trees.

Rooted Tree

This is a kind of tree in which one of the vertices is distinguished from the others and is titled the root of the tree. See this illustration below for better visualization:

Intuitively, you can think of it as a family tree for a species that reproduces by itself (single parent, no/single/multiple children). Some things can be noticed here:

There is a parent-child relationship between adjacent vertices. The vertex closer (i.e. having a smaller shortest distance) to the root is the parent, and the other is the child.
Each parent can have an arbitrary number of children, but each vertex has a single parent, except the root, which has no parents. If a vertex has no children, then it is known as a leaf.
We can call a a descendant of b if b lies on the unique simple path from a to the root. In other words, we will reach b if we successively go to the parent of the vertex, starting from a. In such a scenario, b is known as an ancestor of a.
For any vertex v in the tree T, the subgraph containing v and its descendants (along with the edges joining these vertices) is itself a tree rooted at v. We can call it the subtree of T rooted at v.
Starting from the root and going below, we can assign each vertex a depth. The depth of the root is 0, and the depth of any other vertex is 1+the depth of its parent. Essentially, the depth of vertex v gives us the shortest distance of v from the root.

Whew, enough theory for now. Having done all this, let us move on to implementing graphs and various algorithms associated with it in Part II.

Part II