An Open Tool for Mapping Gene Expression

HBASet | An Open Project Spotlight

Mozilla Open Leaders
Read, Write, Participate
Dec 13, 2017

Derek Howard (@internautderek) has broad interests in cellular/molecular neuroscience and mental health. Derek was selected to join our current round of Mozilla Open Leaders with his project, HBASet, which reflects his interest in using open data for systems analysis.

I interviewed Derek to learn more about HBASet and how you can help.

What is HBASet?

HBASet is a tool to help make molecular maps of where and when sets of genes are expressed in the brain. Many scientific studies related to mental illness produce lists of genes that may be disrupted in disease; however, it’s not always clear how those disruptions may interact. With this tool, a user can input a set of genes and get a better idea of which specific brain structures may be affected, at what time during development and, hopefully down the line, which specific type of cell in the brain may be most impacted.
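To make the idea of "input a set of genes, get back brain structures" concrete, here is a minimal sketch in Python. It is not HBASet's actual code or method: the expression values, gene names, region names and the `rank_regions` helper are all invented for illustration, and the scoring rule (average per-gene z-score across regions) is just one simple assumption about how such a ranking could work.

```python
from statistics import mean, pstdev

# Toy expression table: gene -> {region: expression level}.
# All names and numbers are made up for illustration.
EXPRESSION = {
    "GENE_A": {"cortex": 8.0, "hippocampus": 5.0, "cerebellum": 2.0},
    "GENE_B": {"cortex": 7.0, "hippocampus": 6.0, "cerebellum": 2.0},
    "GENE_C": {"cortex": 3.0, "hippocampus": 3.0, "cerebellum": 9.0},
}


def zscore_across_regions(profile):
    """Scale one gene's expression so regions are comparable between genes."""
    values = list(profile.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {region: 0.0 for region in profile}
    return {region: (value - mu) / sigma for region, value in profile.items()}


def rank_regions(gene_set, expression):
    """Rank regions by the mean z-score of the genes in gene_set (hypothetical rule)."""
    known = [gene for gene in gene_set if gene in expression]
    if not known:
        raise ValueError("none of the input genes are in the expression table")
    scaled = [zscore_across_regions(expression[gene]) for gene in known]
    regions = list(scaled[0].keys())
    scores = {region: mean(z[region] for z in scaled) for region in regions}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# GENE_X is not in the table and is silently skipped.
ranking = rank_regions(["GENE_A", "GENE_B", "GENE_X"], EXPRESSION)
for region, score in ranking:
    print(f"{region}: {score:+.2f}")
```

With this toy data the two known genes are both highest in the cortex, so the cortex comes out on top of the ranking and the cerebellum at the bottom.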

Why did you start HBASet?

This project was started by Dr. Leon French at the Centre for Addiction and Mental Health (CAMH) in Toronto, and I’ve learned a lot about it from working closely with him. The idea behind the project appealed to me because it seems like a practical way to reuse valuable research data that has already been collected, allowing it to be recombined and put to use in a wide variety of cases.

Many psychiatric diseases are highly complex and caused by many genes and their interactions with the environment. HBASet is being developed to get a better understanding of how these complex diseases manifest in the brain. Hopefully the tool can help provide quick and early information that distills knowledge from open data to focus future research on specific parts of the brain that are involved in psychiatric diseases.

Why did you decide to start with mouse, monkey and human brains?

I’ve initially focused on using open gene expression data collected and provided by the Allen Brain Institute, which includes mouse, monkey and human brain datasets. These resources are extremely valuable because they provide genome-wide microarray expression measures in fine neuroanatomical detail from multiple subjects.

The mouse and monkey are mammals that are commonly used in experimental science as models for understanding similar processes in humans. If results from analyses with this tool agree across species, then they are likely to be more robust and to reflect fundamental processes that are worth a deeper dive. Using datasets from multiple species can help translate experimental work to better inform disease processes in humans.

What challenges have you faced working on this project?

One challenge is just getting acquainted with the main data sources and understanding how the signal in the data can be augmented or improved by pulling in other data sources (e.g., filtering out or scaling probes in the arrays based on newer RNA-Seq data that is also provided by the Allen Institute).
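As a rough illustration of what "filtering out probes based on RNA-Seq data" could look like, the sketch below keeps a microarray probe only if its values across samples correlate with RNA-Seq measurements for the same gene. This is an assumption about one plausible approach, not a description of the project's pipeline: the probe IDs, the data, the `keep_probes` helper and the 0.5 cutoff are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


# Hypothetical microarray probes: probe id -> (gene, values across four samples).
PROBES = {
    "probe_1": ("GENE_A", [2.0, 4.0, 6.0, 8.0]),
    "probe_2": ("GENE_A", [5.0, 5.1, 4.9, 5.0]),
    "probe_3": ("GENE_B", [9.0, 7.0, 4.0, 1.0]),
}

# Hypothetical RNA-Seq values for the same four samples, keyed by gene.
RNASEQ = {
    "GENE_A": [1.0, 2.1, 2.9, 4.2],
    "GENE_B": [1.0, 3.0, 6.0, 8.0],
}


def keep_probes(probes, rnaseq, min_r=0.5):
    """Return ids of probes whose signal agrees with RNA-Seq (r >= min_r)."""
    kept = []
    for probe_id, (gene, values) in probes.items():
        reference = rnaseq.get(gene)
        if reference is None:
            continue  # no RNA-Seq data to compare against
        if pearson(values, reference) >= min_r:
            kept.append(probe_id)
    return kept


kept = keep_probes(PROBES, RNASEQ)
print(kept)
```

Here only `probe_1` tracks the RNA-Seq signal for its gene; `probe_2` is essentially flat noise and `probe_3` runs in the opposite direction, so both are dropped. Scaling probes instead of dropping them would follow the same pattern, using the correlation as a weight rather than a cutoff.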

Also, I found myself changing the scope of the project while working on it. Initially, we had proposed developing a web-app to facilitate ease of use of the tool. However, given that more of the work was focused on figuring out the data processing, analysis and visualisation methods, we shifted priority to developing a reproducible workflow in Jupyter notebooks.

How has your project been impacted by Mozilla Open Leaders?

Mozilla Open Leaders helped in a number of ways with general project planning, developing documentation and generally making it easier for others to understand the purpose of the project and contribute.

I learned the importance of specifically defining issues to break down problems so that someone who isn’t intimately familiar with the project can contribute. It was also great to zoom out from the details and develop a roadmap that gives context for the overall direction of the project, what stage it is at and where we plan to go. Lastly, I enjoyed the section discussing how to facilitate running open events. I’m definitely looking forward to a BrainHack in the future!

How can others help you continue the work on HBASet?

The first step is checking out the repo on GitHub to get acquainted with the project. There are different ways to contribute: identifying datasets that may be of interest, simplifying some of the data processing, testing out the workflow or reporting any issues. All of these are always welcome.

What meme or gif best represents your project?

[GIF from giphy]

Mozilla Open Leaders is a 14-week online program offering mentorship and training on working open through the Mozilla community. Join a cohort of project leads fueling the Internet Health movement. Apply today!
