GSoC’23 @ NRNB : Week 4 (Data Extraction and Preparation)

Aug 9, 2023

AIM : Data Extraction and Preparation with prepare_pc_data

Mentors : Guangchuang Yu, Augustin Luna

Week 4 : Jun 19 to Jun 25

Introduction:

Welcome to the fourth week of my GSoC journey! The previous weeks covered the basics of Pathway Commons analysis. This week focuses on extracting and preparing data for downstream analysis with the prepare_pc_data function.

Progress Made:

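This week, I dug into prepare_pc_data. The function retrieves Pathway Commons data for a given source and key type, then splits it into two data frames: PCID2GENE, which maps each pathway ID to its member genes, and PCID2NAME, which maps each pathway ID to its pathway name. As the comments in the code indicate, these play the roles of the TERM2GENE and TERM2NAME tables used by clusterProfiler's enrichment functions, so they form the input for the analyses that follow.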

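prepare_pc_data <- function(source, keyType) {
    # Pathway Commons table, expected to have id, name, and gene columns
    pc2gene <- get_pc_data(source, keyType, output = 'data.frame')
    # TERM2GENE: pathway ID -> gene
    pcid2gene <- pc2gene[, c("id", "gene")]
    # TERM2NAME: pathway ID -> pathway name, duplicate rows removed
    pcid2name <- unique(pc2gene[, c("id", "name")])

    list(PCID2GENE = pcid2gene,
         PCID2NAME = pcid2name)
}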
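
To make the two outputs concrete, here is a minimal sketch that applies the same two steps to a small hand-made table. The table stands in for what get_pc_data is assumed to return (one row per pathway-gene pair with id, name, and gene columns). The pathway IDs, names, and gene assignments are invented for illustration and are not real Pathway Commons records.

```r
# Toy stand-in for the table get_pc_data() is assumed to return.
# All IDs, names, and gene memberships below are made up.
pc2gene <- data.frame(
  id   = c("PC_0001", "PC_0001", "PC_0001", "PC_0002", "PC_0002"),
  name = c("Pathway A", "Pathway A", "Pathway A", "Pathway B", "Pathway B"),
  gene = c("TP53", "MDM2", "CDKN1A", "TP53", "EGFR"),
  stringsAsFactors = FALSE
)

# Same two steps as in prepare_pc_data()
pcid2gene <- pc2gene[, c("id", "gene")]           # pathway ID -> gene
pcid2name <- unique(pc2gene[, c("id", "name")])   # pathway ID -> name

pc_data <- list(PCID2GENE = pcid2gene,
                PCID2NAME = pcid2name)

nrow(pc_data$PCID2GENE)  # 5: one row per pathway-gene pair
nrow(pc_data$PCID2NAME)  # 2: one row per pathway

# Group genes by pathway ID to view each gene set
gene_sets <- split(pc_data$PCID2GENE$gene, pc_data$PCID2GENE$id)
print(gene_sets)
```

The unique() call matters because the pathway name repeats on every gene row. Without it, PCID2NAME would carry one row per gene instead of one row per pathway.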

Next Week Plan:

In the upcoming week, we’ll explore Over-Representation Analysis (ORA) using the data we’ve prepared here.

Conclusion:

Preparing data is akin to laying the tracks for an insightful analysis journey. Join me next week as we navigate the pathways of Over-Representation Analysis using the groundwork we’ve established.

Repository :

https://github.com/YuLab-SMU/clusterProfiler/blob/devel/R/pathwayCommons.R
