Networks in Biology: an introduction to Gene Coexpression Networks

Haz
Published in Beautiful Biology
3 min read, Jan 4, 2023

The development of high-throughput experiments has produced vast amounts of genomic and transcriptomic data. The availability of such big data has led to the use of network methods to construct gene expression networks.

Both gene expression and co-expression networks can be used to understand the relationships between genes and their functions and identify key players in biological processes. However, they differ in their construction and the information they provide.

Gene expression networks focus on how genes are expressed and how that expression is regulated. Gene coexpression networks, on the other hand, are constructed by identifying pairs of genes that are consistently coexpressed across several samples or conditions.
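To make "pairs of genes consistently coexpressed across samples" concrete, here is a minimal sketch in plain Python. It assumes Pearson correlation as the coexpression measure and a hard cutoff of 0.8 on its absolute value; the gene names, the expression values, and the cutoff are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical toy data: expression of four genes across five samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],  # rises together with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 3.0],  # no clear relationship to the others
}

edges = coexpression_edges(toy_expression, threshold=0.8)
for gene_a, gene_b, r in edges:
    print(gene_a, gene_b, r)
```

Because the cutoff is applied to the absolute correlation, strongly anti-correlated genes (geneA and geneC here) are also linked. That corresponds to an unsigned network; a signed network would keep only positive correlations.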

source: an illustration of a simple gene coexpression network (https://en.wikipedia.org/wiki/Gene_co-expression_network)

Gene co-expression network analysis has been widely used to find out which genes are highly co-expressed during specific biological processes or differentially expressed under various conditions. It can also be used to identify potential functional relationships between genes, such as whether one gene might regulate the expression of another.

WGCNA (Weighted Gene Co-expression Network Analysis)

WGCNA is a method for constructing a gene co-expression network from high-throughput gene expression data. It is based on the idea that genes co-expressed across different experimental conditions are likely to have related functions or to be regulated by common pathways.

WGCNA has been widely used to identify clusters of co-expressed genes, called “modules”, and to analyze the relationships between gene expression and various traits of interest (such as disease status).
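One common way to relate a module to a trait is to summarize the module by its "eigengene" (the first principal component of the module's expression) and correlate that summary with the trait. The sketch below assumes NumPy is available and uses simulated data; the function name `module_eigengene`, the sample sizes, and the noise level are illustrative assumptions, not part of the WGCNA package.

```python
import numpy as np


def module_eigengene(module_expr):
    """First principal component of a (samples x genes) module matrix."""
    # Standardize each gene so highly expressed genes do not dominate.
    z = (module_expr - module_expr.mean(axis=0)) / module_expr.std(axis=0)
    # The first left singular vector holds one score per sample.
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a principal component is arbitrary; orient it so that it
    # agrees with the average standardized expression of the module.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


rng = np.random.default_rng(0)

# Hypothetical trait: 6 control samples (0) followed by 6 disease samples (1).
disease_status = np.array([0] * 6 + [1] * 6, dtype=float)

# Simulated module of 5 genes whose expression shifts with disease status.
module_expr = disease_status[:, None] * 2.0 + rng.normal(0.0, 0.3, size=(12, 5))

eigengene = module_eigengene(module_expr)
trait_correlation = np.corrcoef(eigengene, disease_status)[0, 1]
print(round(trait_correlation, 2))
```

A module whose eigengene correlates strongly with the trait is a candidate for follow-up. With real data the correlation would also need a significance test and correction for the number of modules examined.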

Core Steps in WGCNA:

Step 1: Data preprocessing. Cleaning and normalizing the gene expression data.

Step 2: Network construction. Calculating the pairwise correlations (Pearson or Spearman) between genes and using them to build a network of coexpressed genes. WGCNA raises the absolute correlations to a soft-thresholding power, so edges are weighted rather than simply present or absent.

Step 3: Module detection. Identifying groups of genes (modules) that are strongly coexpressed within the network, using methods such as hierarchical clustering, k-means clustering, and dynamic tree cutting.

Step 4: Module annotation. Annotating the genes in each module with functional information (Gene Ontology terms or pathways).

Step 5: Network visualization. Visualizing the network, with genes as nodes and the coexpression weights between them as edges, for example by exporting the data to Cytoscape or Gephi.

source: an overview of WGCNA methodology (Langfelder & Horvath 2008)
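Steps 1 to 3 can be sketched in a few lines of Python with NumPy and SciPy. This is a simplified illustration under stated assumptions, not the WGCNA R package itself: it uses `1 - adjacency` as the dissimilarity instead of the topological overlap measure, and it cuts the tree into a fixed number of modules with `fcluster` instead of dynamic tree cutting. The function name `detect_modules`, the soft-thresholding power of 6, and the simulated counts are illustrative choices.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def detect_modules(counts, soft_power=6, n_modules=2):
    """Return one module label per gene for a (samples x genes) count matrix."""
    # Step 1: preprocessing, reduced here to a log2 transform.
    log_expr = np.log2(counts + 1.0)

    # Step 2: pairwise Pearson correlations between genes (columns),
    # turned into a weighted adjacency by soft thresholding.
    cor = np.corrcoef(log_expr, rowvar=False)
    adjacency = np.abs(cor) ** soft_power

    # Step 3: average-linkage hierarchical clustering on 1 - adjacency,
    # then a fixed cut of the tree into n_modules groups.
    dissimilarity = 1.0 - adjacency
    np.fill_diagonal(dissimilarity, 0.0)
    condensed = squareform(dissimilarity, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_modules, criterion="maxclust")


rng = np.random.default_rng(1)

# Simulated data: 20 samples, 6 genes. Genes 0-2 follow one hidden factor,
# genes 3-5 follow another, so two modules are expected.
factor_a = rng.normal(size=20)
factor_b = rng.normal(size=20)
counts = np.column_stack(
    [2.0 ** (8 + f + rng.normal(0.0, 0.2, size=20))
     for f in (factor_a, factor_a, factor_a, factor_b, factor_b, factor_b)]
)

labels = detect_modules(counts, soft_power=6, n_modules=2)
print(labels)
```

Genes that share a label belong to the same module. In a real analysis the soft-thresholding power is chosen from the data rather than fixed in advance, and genes with zero variance need to be removed before the correlation step.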

Gene coexpression networks are important because they help identify groups of genes that are coordinately regulated. This can provide insight into the mechanisms underlying biological processes and assist in identifying potential therapeutic targets for diseases. Gene coexpression networks can also be used to predict the function of uncharacterized genes and to prioritize genes for further study.
