CodeX

Everything connected with Tech & Code. Follow to join our 1M+ monthly readers


BLAST on the Cloud with NCBI’s ElasticBLAST

10 min read · Jun 17, 2021


Photo by tian kuan on Unsplash

Bioinformatic programs come and go, but BLAST stays.

Figure 1. NCBI’s online BLAST page. Screenshot from https://blast.ncbi.nlm.nih.gov/Blast.cgi. Image by author.
Figure 2. ElasticBLAST’s webpage. https://blast.ncbi.nlm.nih.gov/doc/elastic-blast/index.html. Image by author.
Figure 3. Architecture for this project. Image by author.

1. Set up an S3 bucket with SNS email notification

Figure 4. SNS setup for S3 file finalization. Image by author.
Figure 5. Create a subscription for the topic. Image by author.
Figure 6. Event notification settings. Image by author.
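
The event notification in Figure 6 can also be expressed as a configuration document rather than console clicks. The following is a minimal sketch, not the author's exact settings: the event type, the `results/` prefix, the `.out.gz` suffix, and the topic ARN are all assumptions chosen for illustration. The dictionary follows the shape that S3 expects for a bucket notification configuration.

```python
import json


def build_sns_notification(topic_arn, prefix="results/", suffix=".out.gz",
                           events=("s3:ObjectCreated:*",)):
    """Build an S3 bucket notification configuration that publishes to SNS.

    The prefix, suffix, and event list are illustrative assumptions; adjust
    them to whatever was chosen in the console.
    """
    return {
        "TopicConfigurations": [
            {
                "TopicArn": topic_arn,
                "Events": list(events),
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical topic ARN, shown only to make the output concrete.
    config = build_sns_notification(
        "arn:aws:sns:us-east-1:123456789012:example-topic"
    )
    print(json.dumps(config, indent=2))
```

A dictionary like this could be passed as `NotificationConfiguration` to boto3's `put_bucket_notification_configuration`. The SNS topic's access policy must also allow S3 to publish to it, otherwise S3 rejects the configuration.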

2. Run ElasticBLAST in CloudShell

3. Analyze the results in DataBrew

s3://[YourBucketName]/results/[RunName]/<[^/]+>.out.gz
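
The angle-bracketed segment of the path above is a regular expression: `[^/]+` matches any file name directly inside the run folder, so the pattern selects every `.out.gz` result file and nothing in subfolders. Here is a minimal Python sketch of that selection logic, assuming the bracketed segment behaves like an ordinary regex. The bucket name, run name, and file names used in the example are hypothetical.

```python
import re


def build_key_pattern(bucket, run_name):
    """Compile a regex mirroring s3://<bucket>/results/<run>/<[^/]+>.out.gz."""
    prefix = f"s3://{bucket}/results/{run_name}/"
    # Escape the literal parts; only the middle segment is a real regex.
    return re.compile(re.escape(prefix) + r"[^/]+" + re.escape(".out.gz"))


def select_result_files(uris, bucket, run_name):
    """Return the URIs that the parameterized path would pick up."""
    pattern = build_key_pattern(bucket, run_name)
    return [uri for uri in uris if pattern.fullmatch(uri)]


if __name__ == "__main__":
    candidates = [
        "s3://my-bucket/results/run1/batch_000.out.gz",    # selected
        "s3://my-bucket/results/run1/logs/batch.out.gz",   # nested: skipped
        "s3://my-bucket/results/run1/metadata.json",       # wrong suffix
    ]
    for uri in select_result_files(candidates, "my-bucket", "run1"):
        print(uri)
```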
Figure 7. DataBrew setup. Image by author.
qaccver, saccver, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore, sskingdoms, ssciname
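
The same fourteen column names can be applied outside DataBrew when inspecting a result file locally. The sketch below assumes the results are tab-separated BLAST tabular output, possibly with `#` comment lines as produced by output format 7, and that the fields appear in the order listed above. The helper names are made up for this example.

```python
import csv
import gzip

COLUMNS = [
    "qaccver", "saccver", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
    "sskingdoms", "ssciname",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def parse_blast_tabular(stream):
    """Yield one dict per hit from a text stream of tab-separated BLAST output."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and the '#' comment lines that format 7 emits.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        record = dict(zip(COLUMNS, row))
        for name in INT_COLUMNS:
            record[name] = int(record[name])
        for name in FLOAT_COLUMNS:
            record[name] = float(record[name])
        yield record


def read_gzipped_results(path_or_file):
    """Read a .out.gz result file (path or binary file object) into a list."""
    with gzip.open(path_or_file, "rt", newline="") as handle:
        return list(parse_blast_tabular(handle))
```

Calling `read_gzipped_results("batch_000.out.gz")` on a downloaded file (the file name is hypothetical) returns a list of dictionaries keyed by the column names, with the numeric fields already converted.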
Figure 8. Schema editing in DataBrew. Image by author.
Figure 9. ElasticBLAST results in DataBrew. Image by author.
Figure 10. Remove duplicate values in DataBrew. Image by author.
Figure 11. Percentage identities, subject kingdoms, and subject scientific names of the ElasticBLAST results in DataBrew. Image by author.

Conclusion


Sixing Huang

A Neo4j Ninja, German bioinformatician in Gemini Data. I like to try things: Cloud, ML, satellite imagery, Japanese, plants, and travel the world.
