Researchers from the Broad Institute of Harvard and MIT and Beth Israel Deaconess Medical Center have discovered a vast new class of previously unrecognized mammalian genes that do not encode proteins, but instead function as long RNA molecules. The findings, appearing in the journal Nature, show that these “large intervening non-coding RNAs” (lincRNAs) play critical roles in both health and disease, including cancer, immune signaling and stem cell biology.
“We’ve known that the human genome still has many tricks up its sleeve,” said the Broad Institute’s Eric Lander. “But, it is astounding to realize that there is a huge class of RNA-based genes that we have almost entirely missed until now.”
The newly discovered lincRNAs are thousands of bases long. Because only about ten examples of functional lincRNAs were known previously, they seemed more like genomic oddities than critical components. But the new Nature study shows that there are actually thousands of such genes, and that they have been conserved across mammalian evolution.
“The challenge in finding these lincRNAs is that they have been hiding in plain sight,” said Harvard co-researcher John Rinn. “The human and mouse genomes are already known to produce many large RNA molecules, but the vast majority show no evolutionary conservation across species, suggesting that they may simply be ‘genomic noise’ without any biological function.”
To uncover the new genes, the team looked not at the RNA molecules themselves but at telltale signs in the DNA called chromatin modifications or epigenomic marks. They searched for genomic regions that have the same chromatin patterns as protein-coding genes, but do not encode proteins. By surveying the genomes of four different types of mouse cells, they found an astonishing 1,586 such loci that had not been previously described. The researchers also found that the vast majority of these genomic regions are transcribed into lincRNAs, and that these are conserved across mammals.
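The search described above, keeping regions that carry gene-like chromatin marks but do not coincide with protein-coding genes, can be illustrated with a minimal sketch. This is not the team's actual pipeline: the `Interval` type, the function names, the `min_length` parameter, and the toy coordinates are all hypothetical, and the chromatin-marked domains are assumed to have already been called from the experimental data.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A genomic region (hypothetical representation for this sketch)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def candidate_noncoding_loci(
    marked_domains: List[Interval],
    coding_genes: List[Interval],
    min_length: int = 0,  # assumed size cutoff, not taken from the study
) -> List[Interval]:
    """Keep chromatin-marked domains that overlap no known protein-coding gene."""
    return [
        d
        for d in marked_domains
        if d.end - d.start >= min_length
        and not any(overlaps(d, g) for g in coding_genes)
    ]


# Toy data, invented purely for illustration.
marked = [
    Interval("chr1", 100, 5000),
    Interval("chr1", 10000, 16000),
    Interval("chr2", 0, 3000),
]
genes = [Interval("chr1", 4000, 9000), Interval("chr2", 5000, 8000)]

candidates = candidate_noncoding_loci(marked, genes)
long_candidates = candidate_noncoding_loci(marked, genes, min_length=5000)
```

The pairwise overlap check is quadratic and only suitable for small inputs; a genome-scale analysis would normally sort the intervals or use an interval tree.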
“The epigenomic marks revealed where these genes were hiding,” said MIT’s Mitch Guttman. “Analysis of their sequence then revealed that the genes are highly conserved in mammalian genomes, which strongly suggested that these genes play critical biological functions.”
By correlating the expression patterns of lincRNAs in various cell types with the expression patterns of known critical protein-coding genes in those same cells, the researchers observed that lincRNAs likely help regulate a variety of cellular processes, including cell proliferation, immune surveillance, maintenance of embryonic stem cell pluripotency, neuronal and muscle development, and gametogenesis.
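The idea of comparing expression patterns across cell types can be sketched as a simple correlation ranking. This is a rough illustration, not the analysis used in the study: Pearson correlation is an assumed choice of similarity measure, and the function names, gene names, and expression values are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def rank_coexpressed(
    linc_profile: List[float],
    gene_profiles: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Rank protein-coding genes by how closely their expression tracks the lincRNA."""
    scores = {name: pearson(linc_profile, p) for name, p in gene_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy expression levels across four cell types, invented for illustration.
linc = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with the lincRNA
    "geneB": [4.0, 3.0, 2.0, 1.0],  # falls as the lincRNA rises
    "geneC": [1.0, 1.0, 1.0, 1.0],  # constant across cell types
}

ranked = rank_coexpressed(linc, genes)
```

Genes at the top of such a ranking share an expression pattern with the lincRNA, which hints at a related process but does not by itself establish function.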
Tantalizingly, because of the stringent experimental conditions the researchers imposed in identifying the roughly 1,600 lincRNAs in the Nature study, it is likely that many more lincRNA genes are hiding in plain sight in the genome, along with other RNA-encoding genes that are as important to genome function as their better-recognized protein-coding counterparts.