Penn Engineers Identify Protein Implicated in 3-D Epigenetics of Brain Development

The vast majority of genetic mutations that are associated with disease occur at sites in the genome that aren’t genes. These sequences of DNA don’t code for proteins themselves, but provide an additional layer of instructions that determine if and when particular genes are expressed. Researchers are only beginning to understand how the non-coding regions of the genome influence gene expression and might be disrupted in disease.

Jennifer Phillips-Cremins, assistant professor in the Department of Bioengineering in the University of Pennsylvania’s School of Engineering and Applied Science, studies the three-dimensional folding of the genome and the role it plays in brain development. When a stretch of DNA folds, it creates a higher-order structure called a looping interaction, or “loop.” In doing so, it brings non-coding sites into physical contact with their target genes, precisely regulating gene expression in space and time during development.

Phillips-Cremins and lab member Jonathan Beagan have led a new study identifying a protein that connects loops in embryonic stem cells as they begin to differentiate into types of neurons. Though the study was conducted in mice, the findings inform aspects of human brain development, including how the genetic material folds in the 3-D nucleus and is reconfigured as stem cells become specialized. A better understanding of these mechanisms may be relevant to a wide range of neurodevelopmental disorders.

Cremins lab members Michael Duong, Katelyn Titus, Linda Zhou, Zhendong Cao, Jingjing Ma, Caroline Lachanski and Daniel Gillis also contributed to the study, which was published in the journal Genome Research.

“In this paper we create detailed maps of how the genetic material, the DNA, folds in three dimensions inside cells in the brain. We uncover a new class of looping interactions that emerge only when embryonic stem cells turn into neural stem cells in the brain,” Phillips-Cremins said. “These neural stem-cell-specific loops are important because they connect non-coding regulatory elements to their target genes at a developmental stage when brain-specific gene expression patterns are initially established.”

“We also discovered that most new neural stem-cell-specific loops arise within a larger framework of pre-existing loops that are established far earlier in development and present in most cell types in the body,” Beagan said.

A protein named CTCF is known to be the main connector of loops that are stable throughout development. The researchers discovered that CTCF is sharply reduced in the transition from early development to neural stem cells, leading to a global pruning back of loops that don’t matter for the brain and leaving only the stable framework in place.

The Cremins lab creates experimental “heat maps” of the higher-order structure of the genome. The researchers fix the DNA so that its 3-D folding patterns are preserved prior to sequencing. As a result, two parts of the genome that are distant in the linear sequence but in physical contact end up in the same string of hybrid DNA and are detected together when the DNA is sequenced.
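
For readers curious how co-detected DNA fragments become a heat map, the idea can be sketched in a few lines of Python. This is a minimal illustration, not the lab's actual pipeline: the function name `contact_matrix`, the bin size and the read-pair coordinates are all hypothetical, chosen only to show the counting step. Each pair of positions sequenced together adds one count to the cell of the map for the two genomic bins they fall in.

```python
def contact_matrix(read_pairs, region_start, region_end, bin_size):
    """Count co-detected read pairs per pair of genomic bins.

    Returns a symmetric list-of-lists matrix; cell [i][j] is the number of
    read pairs linking bin i and bin j. All inputs are illustrative.
    """
    span = region_end - region_start
    n_bins = -(-span // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        # Ignore pairs with either end outside the region of interest.
        if not (region_start <= pos_a < region_end):
            continue
        if not (region_start <= pos_b < region_end):
            continue
        i = (pos_a - region_start) // bin_size
        j = (pos_b - region_start) // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the map symmetric
    return matrix


# Made-up coordinates: two pairs link the first and last 10 kb bins.
pairs = [(1_000, 41_000), (2_500, 43_000), (12_000, 13_000), (5_000, 95_000)]
heat_map = contact_matrix(pairs, 0, 50_000, 10_000)
for row in heat_map:
    print(row)
```

A cell far from the diagonal with a high count, like the corner cells in this toy example, is the kind of signal that suggests a loop between two distant sites. Real analyses also correct for sequencing biases and for the expected drop in contacts with distance, which this sketch omits.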

“In this study we integrated maps of protein binding to the DNA with our 3-D genome heat maps. We unexpectedly discovered that the traditional architectural protein CTCF does not connect brain-specific loops,” Beagan said. “Rather, we found a new protein, Yin Yang 1, or YY1, that is essential for connecting the 3-D genome specifically in early stages of brain development. Disruption of this protein has been implicated in brain diseases in early human development.”
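
One simple way to picture the integration Beagan describes is to ask, for each loop, whether a given protein's binding sites overlap both of the loop's anchor points. The sketch below is an assumption about the general approach, not the study's method: the helper names `overlaps` and `both_anchors_bound` are hypothetical, and every coordinate is invented for illustration rather than taken from the paper.

```python
def overlaps(interval_a, interval_b):
    """True if two half-open (start, end) intervals share any position."""
    return interval_a[0] < interval_b[1] and interval_b[0] < interval_a[1]


def both_anchors_bound(loop, peaks):
    """True if each of the loop's two anchors overlaps at least one peak."""
    return all(any(overlaps(anchor, peak) for peak in peaks) for anchor in loop)


# Invented toy data: each loop is a pair of anchor intervals.
loops = [
    ((1_000, 2_000), (50_000, 51_000)),
    ((8_000, 9_000), (30_000, 31_000)),
]
yy1_peaks = [(1_500, 1_700), (50_200, 50_400)]
ctcf_peaks = [(8_100, 8_300), (30_500, 30_700)]

for loop in loops:
    print(
        loop,
        "YY1 at both anchors:", both_anchors_bound(loop, yy1_peaks),
        "CTCF at both anchors:", both_anchors_bound(loop, ctcf_peaks),
    )
```

In this toy data the first loop is bound by YY1 but not CTCF, and the second by CTCF but not YY1. Overlap alone shows only that a protein sits at a loop's anchors; it does not by itself show that the protein is required for the loop.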

“At this early stage, we can only say YY1 plays an essential role in connecting brain-specific loops at the earliest stages of neurodevelopment. However, brain development and maturation is a complex process and we’re excited to continue to unravel the organizing principles governing genome folding in fully differentiated neurons in the human brain,” Phillips-Cremins said. “Because the large majority of disease-associated mutations are located in the non-coding regions of the genome, these results might eventually shed light on the mechanisms underlying the onset and progression of a wide range of neurodevelopmental diseases.”