Downloading multiple samples from an entry at GEO Dataset

Yen-Chung Chen
Sep 29, 2018 · 3 min read

Batch download of supplementary tables

At the bottom of the GSE series page, there are several files containing the metadata of the series. Each of these files contains the details of the experiments, and every entry has the form ![feature name] = [feature description]. Luckily, the metadata include a list of download links for the supplementary files.
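To get a feel for which features a metadata file contains, it can help to count how often each feature name appears. The following is a minimal sketch, assuming the file follows the "!name = value" layout described above; the function name count_soft_keys is made up for illustration.

```shell
# Count how many times each "!feature name" occurs in a SOFT metadata file.
# count_soft_keys is a hypothetical helper name, not part of any GEO tool.
count_soft_keys() {
    awk '/^!/ {
             # Everything before the first " = " is the feature name.
             key = substr($0, 1, index($0, " = ") - 1)
             if (key != "") n[key]++
         }
         END { for (k in n) print n[k], k }' "$1" | LC_ALL=C sort -k2
}
```

Running count_soft_keys GSE94883_family.soft would print one line per feature name with its count, which makes it easier to spot features such as !Series_supplementary_file.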

Link for downloading metadata
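# This command shows all the links in the terminal.
# substr($0, 30) skips the 29-character prefix "!Series_supplementary_file = ".
awk '$0 ~ "^!Series_supplementary_file" {print substr($0, 30)}' GSE94883_family.soft
# This command downloads every link with curl instead of printing it.
awk '$0 ~ "^!Series_supplementary_file" {system ("curl -O "  substr($0, 30))}' GSE94883_family.soft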
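The fixed offset of 30 only works for this exact feature name. As a hedged alternative, the sketch below finds the " = " separator instead of counting characters, and hands the links to curl through xargs rather than building a shell command inside awk. The function names are hypothetical, and the sketch assumes each link contains no spaces.

```shell
# Print the value of every "!Series_supplementary_file = <link>" line.
# list_supplementary_links is a hypothetical helper name.
list_supplementary_links() {
    awk '/^!Series_supplementary_file/ {
             # Keep everything after the first " = ".
             print substr($0, index($0, " = ") + 3)
         }' "$1"
}

# Download every listed link, one curl call per link.
download_supplementary_files() {
    list_supplementary_links "$1" | xargs -n 1 curl -O
}
```

Calling list_supplementary_links GSE94883_family.soft first is a cheap way to check the links before download_supplementary_files GSE94883_family.soft fetches them.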

Batch download of raw data

What if I want the raw data, so I can do the analysis from scratch? The metadata also give you an SRA accession for that. The link to start with is saved as !Series_relation. It leads to a page that lists all the SRA results, and this page contains a link to the Run Selector, where you can download an accession list. Every run accession number is listed in this file.
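The starting link can be pulled out of the metadata file in the same way as the supplementary links. This is a small sketch, assuming the !Series_relation entries use the same "!name = value" layout; a series may carry more than one relation, so all of them are printed. The function name is made up for illustration.

```shell
# Print the value of every "!Series_relation = <value>" line.
# list_series_relations is a hypothetical helper name.
list_series_relations() {
    awk '/^!Series_relation/ {
             # Keep everything after the first " = ".
             print substr($0, index($0, " = ") + 3)
         }' "$1"
}
```

Running list_series_relations GSE94883_family.soft should show the relation entries, and the one that points to SRA is the link to open in a browser.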

Link to SRA run selector.
awk '{system ("fastq-dump " $0)}' SRR_Acc_List.txt
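The one-liner above passes every line of the file to the shell, including blank ones. A slightly more defensive sketch is a plain read loop that skips empty lines and strips Windows line endings before calling the tool. The function name dump_accessions and its optional second argument are assumptions added here, so the loop can be tried with echo before running fastq-dump for real.

```shell
# Run a command once per accession listed in a file.
# $1: accession list, one run accession per line
# $2: command to run for each accession (defaults to fastq-dump)
# dump_accessions is a hypothetical helper name.
dump_accessions() {
    list=$1
    tool=${2:-fastq-dump}
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Strip carriage returns in case the list was saved on Windows.
        acc=$(printf '%s' "$acc" | tr -d '\r')
        # Skip blank lines.
        [ -z "$acc" ] && continue
        # Redirect stdin so the tool cannot consume the rest of the list.
        "$tool" "$acc" < /dev/null
    done < "$list"
}
```

dump_accessions SRR_Acc_List.txt echo prints the accessions that would be processed, and dump_accessions SRR_Acc_List.txt runs fastq-dump on each of them.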

biosyntax

Notes, thoughts, and random experiments in life science.
