Almost two decades have passed since the first sequence of a complete set of DNA was released as part of the Human Genome Project. That initiative expanded understanding of certain cancers, boosted the effectiveness of some pharmaceuticals, and spurred discovery of how genetic variation can influence diseases, among other breakthroughs. Yet scientists are now learning that genomics is necessary but not sufficient to advance precision medicine and in-depth knowledge of how our genes interact with the environment. This is where RNA comes into play.
“RNA determines cell identity and mediates responses to cellular needs,” wrote NIEHS grantee Vivian Cheung, M.D., from the University of Michigan, and her colleagues in a commentary published last year in Nature Genetics. “Such diverse cellular functions arise from the vast chemical composition of RNA comprising four [major] ribonucleotides…and more than 140 modified ribonucleotides,” they explained.
“Many years of RNA research laid the foundation for the development of RNA therapeutics as diverse as antisense oligonucleotide therapy for spinal muscular atrophy, and mRNA [messenger RNA] vaccines,” the authors continued. “[Such] accomplishments were enabled by modified ribonucleotides, yet the ‘true’ sequence of RNA, i.e., the ‘RNome,’ remains unknown. This key knowledge gap in understanding the building blocks of RNA must be filled.”
I recently spoke with Cheung, a researcher and pediatric neurologist who is a member of the National Academy of Medicine, to learn why she believes that an effort akin to the Human Genome Project — but focused on RNA — would greatly enhance biomedical research. Cheung discussed why RNA modifications, some of which are influenced by environmental exposures, represent a missing link in our understanding of genetic variation and the origins of disease.
She explained how RNome sequencing will lead to new therapeutics, strengthen the scientific response to COVID-19, and even bolster food security. We spoke about the promise of relevant technologies such as nanopore sequencing, and I asked Cheung what inspired her to pursue a career as a physician-scientist (see sidebar).
Building knowledge to advance therapeutics
Rick Woychik: Can you explain why greater understanding of RNA is important for the advancement of biomedical research?
Vivian Cheung: Sure. Although your DNA sequence is different from mine, within each of us, all of our cells have the same DNA even though they have different functions. A lymphocyte, which is a type of immune cell, helps our bodies fight viruses, whereas a motor neuron has an entirely different shape and biological role. But both the lymphocyte and the motor neuron have the same DNA. What allows them to have different functions? A lot of that information is in the RNA.
Although we have a basic understanding that RNA is the regulatory code of our cells, we don’t yet know the exact details of that code. It is difficult to study the function of RNA and how it regulates cellular processes because we don’t have complete knowledge of its sequence. If RNA were a book, I would say that we are reading it with only a small part of the alphabet available. We can piece together the gist of the book, but we don’t know all the subtleties.
Many years ago, researchers thought that there was a one-to-one relationship between DNA and RNA — that RNA is just an exact copy of DNA. But it turns out to be much more complicated than that. Although it is true that RNA is an exact copy of DNA when it is made, it is modified very quickly. Today, we know that there are more than 140 different modifications on RNA.
We also know that exposures involving heavy metals, endocrine-disrupting chemicals, and arsenite can affect RNA modifications. But is that a good biological response or what leads to cell toxicity? We don’t yet know. Until we learn the complete RNA sequence — what I call the RNome — it will be very hard to comprehensively assess biological processes affected by exposures, and we will not understand how our cells and genes are regulated. That lack of knowledge will limit the biomedical community’s efforts to develop effective therapeutics.
Genetic basis of Alzheimer’s disease
RW: Among other topics, your lab at the University of Michigan studies RNA processing and genetic variation caused by environmental stress. Can you discuss some of your latest work?
VC: We recently discovered a noncoding RNA [ncRNA] that regulates the gene APOE, identified 30 years ago as the strongest genetic risk factor for Alzheimer’s disease. Today, we still do not have an APOE-targeted therapy for that disease. And until recently, when we discovered this ncRNA, it was not known how the gene is regulated. That is important because we can’t target a gene for therapy if we don’t know how it is controlled.
So, this ncRNA normally is folded, and part of the sequence is modified to keep it from making APOE, a gene that is normally expressed only in the liver and in some cell types in the nervous system. But upon stress, and that could be environmental stress, this RNA is unfolded to allow APOE to be transcribed to respond to the stress. By identifying this ncRNA and how it folds and is modified, we are beginning to better understand the genetic basis of Alzheimer’s.
RNA modifications, COVID-19, and food security
RW: That is fascinating, and it raises an important point, which is that a significant portion of the genome gets transcribed and produces RNA that doesn’t make a protein but still has a critically important biological function. Understanding more about such RNA is an essential aspect of the RNome, in my view, and the ncRNA that your team identified is a great example of how such knowledge will advance biomedical research.
The other key components of the RNome involve identifying all the different kinds of mRNA that occur within each cell type, and all the different RNA modifications. Can you discuss why those are important aspects of the RNome?
VC: There are more than 140 different types of RNA modifications, and those are important in several ways, illustrated most recently by mRNA vaccines developed during the COVID-19 pandemic. Vaccines in which mRNA was not modified were shown to be at least 48% less effective than ones that were modified. The modification is necessary to ensure that the immune system responds properly to the COVID vaccine.
Another example involves the m6A RNA modification. It has been associated with autoimmune conditions and cancer. There is now good evidence that heavy metals, endocrine disruptors, and arsenite decrease m6A in the RNA of human cells, and I assume that is bad for the cells. A decrease in m6A may be a mechanism by which these toxins affect human health.
Interestingly, recent research has shown that m6A can affect barley’s propensity to absorb cadmium, a heavy metal, from the soil. So, understanding m6A and other modifications could shed light not just on human diseases but also on agricultural challenges and food security issues.
Back to the topic of COVID, I think we need a way for the scientific community to better respond to the pandemic, and sequencing RNA is a way to do that. After all, the disease is caused by an RNA virus, and there are 30 to 40 RNA bases in SARS-CoV-2 that are modified. There is still much to learn about the implications of the virus. We can all just say, “Well, we have the mRNA vaccine, look at what we have done.” Or we can make the RNome project a way to bolster our scientific response, engage the public, and build greater trust in our research.
Innovation on the horizon
RW: What technologies do you think hold promise in terms of advancing the RNome?
VC: On the market today, we have nanopore technology, which is very promising. It involves tiny pores with electric currents that pass through them. If you put the RNA through the pores, the RNA disrupts the current in different ways based on its sequence, and that allows for direct RNA sequencing. However, the machine doesn’t know how to read all RNA modifications, so the critical step at this point is to teach it how to read them.
There is also mass spectrometry, which is a very precise way of identifying RNA. But to date, mass spectrometry cannot sequence RNA that is long. We can probably sequence 20 nucleotides. Given that our human DNA is 3 billion nucleotides and RNA is much more complex than that, we are far from where we need to be. Nevertheless, I have faith that 10 years from now, we will easily be able to sequence RNA.
I was just starting my career when the Human Genome Project was near its peak, and I think most people at that time thought there was no way we could sequence DNA, especially given technological challenges back then. Yet researchers overcame those challenges. That is one reason I am hopeful that a large-scale initiative to sequence RNA will lead to innovation — perhaps even game-changing technology that we cannot imagine today.
Citation: Alfonzo JD, Brown JA, Byers PH, Cheung VG, Maraia RJ, Ross RL. 2021. A call for direct sequencing of full-length RNAs to identify all modifications. Nat Genet 53(8):1113–1116.
(Rick Woychik, Ph.D., directs NIEHS and the National Toxicology Program.)