3 Simple Ways to Download FASTQ files

A detailed overview of 3 ways to download FASTQ files of SRA runs from NCBI


As bioinformaticians, we rely on the National Center for Biotechnology Information (NCBI) as one of our most important sources of data. NCBI plays a crucial role in our research community thanks to its extensive databases and bioinformatics tools. Of all the NCBI databases, the one I use most often is the Sequence Read Archive (SRA), a massive database of high-throughput sequencing data. To quote [1]:

Released in 2009, the SRA contains 9 million records and 12 petabytes of data.

I develop bioinformatics software for analysing genomic data, and my work requires me to test it on numerous datasets available on SRA. Over the past few years, I have come across many ways to download data from SRA, and I thought I would write them up in a blog post, hoping it will be useful for those who are starting out in bioinformatics. This article walks you through three easy ways to download FASTQ files of SRA runs. Let’s get started.


1. SRA Toolkit

The SRA Toolkit provided by NCBI is a set of utilities for accessing SRA data.
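To give a feel for how the toolkit is typically used, here is a minimal Python sketch that builds and runs the two toolkit commands commonly used to fetch a run: `prefetch` to download the run, then `fasterq-dump` to convert it to FASTQ. The helper names (`build_sra_commands`, `download_fastq`), the output directory and the thread count are my own illustrative choices and not part of the toolkit, and the sketch assumes the SRA Toolkit is already installed and on your `PATH`.

```python
import re
import shutil
import subprocess

# Run accessions start with SRR (NCBI), ERR (ENA) or DRR (DDBJ), followed by digits.
RUN_ID_PATTERN = re.compile(r"[SED]RR\d+")


def build_sra_commands(run_id, outdir="fastq", threads=4):
    """Return the prefetch and fasterq-dump command lines for one run.

    This is a hypothetical helper for illustration; outdir and threads
    are assumed defaults, not values recommended by the toolkit.
    """
    if not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError(f"not a valid SRA run accession: {run_id!r}")
    return [
        # Step 1: download the run in SRA format.
        ["prefetch", run_id],
        # Step 2: convert it to FASTQ, splitting paired-end reads into separate files.
        ["fasterq-dump", run_id, "--split-files",
         "--outdir", outdir, "--threads", str(threads)],
    ]


def download_fastq(run_id, outdir="fastq", threads=4):
    """Run both commands in order; needs the SRA Toolkit on PATH."""
    commands = build_sra_commands(run_id, outdir, threads)
    # Check that every tool is available before starting a long download.
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(
                f"{cmd[0]} not found; is the SRA Toolkit installed?"
            )
    for cmd in commands:
        # check=True raises CalledProcessError if a command fails.
        subprocess.run(cmd, check=True)
```

Calling `download_fastq("SRR000001")` (an example accession; substitute your own run ID) would run the two commands one after the other and write the FASTQ files into the `fastq` directory. Building the command lists separately from running them makes it easy to print and inspect the commands first.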

How to get SRA run IDs?


Vijini Mallawaarachchi
The Computational Biology Magazine

Bioinformatician | Computational Genomics 🧬 | Data Science 👩🏻‍💻 | Music 🎵 | Astronomy 🔭 | Travel 🎒 | vijinimallawaarachchi.com