Benedict Paten, senior author on the study and associate professor of biomolecular engineering. (Photo by Carolyn Lagattuta)

UCSC prof Paten creates a toolkit for RNA sequencing analysis using a pantranscriptome

Analyzing a person’s gene expression requires mapping their RNA landscape to a standard reference to gain insight into the degree to which genes are “turned on” and perform functions in the body. But researchers can run into issues when the reference does not provide enough information to allow for accurate mapping, an issue known as reference bias.

Diagram of the haplotype-aware transcriptome analysis pipeline.

Researchers at UC Santa Cruz introduce the first-ever method for analyzing RNA sequencing data genome-wide using a “pantranscriptome,” which combines a transcriptome and a pangenome – a reference that contains genetic material from a cohort of diverse individuals, rather than just a single linear strand. A group of scientists led by UCSC Associate Professor of Biomolecular Engineering Benedict Paten has released a toolkit that allows researchers to map an individual's RNA data to a much richer reference, addressing reference bias and leading to much more accurate mapping. 

“This is pangenome plus transcriptome – that combination has never really been done before until now,” said Jordan Eizenga, the paper’s co-first author and a postdoctoral scholar in the UCSC Computational Genomics Lab. “This is the first time anyone has attempted to incorporate the pangenome as a standard feature of the RNA sequencing mapping.”

This toolkit will aid researchers around the world who are working to understand gene expression through RNA sequencing analysis. The tools are publicly available and can be accessed via GitHub.

“With this toolkit, we are employing this more diverse data that we can now get from the pangenome to improve the measurement of gene expression data, something that can widely vary between individuals,” Paten said. “The aim is to make the impact of this more diverse data felt on studies that are looking at gene expression, resulting in better analysis for cell models, organoid models, and other research applications.”

RNA’s most commonly recognized function is to translate DNA into proteins, but scientists now understand that the vast majority of RNA is noncoding: rather than making proteins, it can play roles such as influencing cell structure or regulating genes. The entire RNA landscape is known collectively as the transcriptome, and mapping it allows researchers to better understand an individual’s gene expression.

The pantranscriptome builds on the emerging concept of “pangenomics” in the genomics field. Typically, when evaluating an individual’s genomic data for variation, scientists compare the individual’s genome to a reference made up of a single linear strand of DNA bases. Using a pangenome allows researchers to compare an individual’s genome to a genetically diverse cohort of reference sequences all at once, sourced from individuals representing a diversity of biogeographic ancestry. This gives scientists more points of comparison with which to understand an individual’s genomic variation.

Mapping RNA sequencing data to understand gene expression can be difficult because the RNA sequences are spliced by cellular mechanisms, meaning one set of RNA data can come from non-connected areas of the genome, making it challenging to correctly align them to a reference. These splicing sites are not uniform across the human population but vary between individuals. It is also difficult to know which haplotype the RNA comes from – whether the group of genes comes specifically from the set of chromosomes inherited from the individual’s mother or the set inherited from the father. 

But with the new pipeline of open source tools, the researchers can take the spliced segments of an individual’s RNA, map where they align on a pangenome, identify which haplotype the data belongs to, and analyze gene expression. 

First, the pipeline identifies which areas of the genome the RNA sequencing data comes from, including the splice sites, and marks those points on the pangenome reference. Those marked points are then compared to a pantranscriptome consisting of haplotype-specific transcripts generated from the reference data contained within the pangenome. This step requires specialized, challenging algorithmic methods.

Finally, it generates estimates of levels of gene expression based on this comparison between the mapped data and the transcripts in the pantranscriptome and identifies which haplotypes the genes come from.

“It's definitely a very forward-looking study in that other genome-wide expression methods are not yet really utilizing pangenomes and haplotype information,” said Jonas Sibbesen, co-first author on the study and a former postdoctoral scholar in the UCSC Computational Genomics Lab who is now an assistant professor at the University of Copenhagen. “We're now thinking ahead as to what pangenomics might additionally bring to the table in transcriptomic analyses.” 

Going forward, the researchers are interested in further developing these tools to be useful for downstream informatics analysis, and in tailoring the tools to the particularities of research on single-cell data. For now, the group hopes their new toolkit will serve to show how useful pangenomics-derived analysis can be.

“We need to be able to explain to some researchers how a pangenome reference will benefit them,” Paten said. “This pipeline is really a first go at doing this for RNA, for functional data, for expression data.”

FAU researchers analyze resistance training in older adults at the cellular level

Aging involves a balance between oxidants and antioxidants, low-grade inflammation, and a protein response that occurs at the cellular level, which is responsible for many health disorders.

Exercise has been shown to regulate the inflammatory response, balance oxidants such as free radicals that build up in the cells and damage DNA, and ameliorate the process by which cells protect themselves against these stressors. Furthermore, resistance training in older adults is recommended to help maintain muscle, flexibility, and balance.

Aging and related diseases are associated with alterations in oxidative status and low-grade inflammation, as well as a decreased endoplasmic reticulum (ER) unfolded protein response (UPR). UPR is a functional mechanism by which cells attempt to protect themselves against ER stress, resulting from the accumulation of unfolded/misfolded proteins.

One group of proteins associated with the aging process is the mitochondrial heat shock protein 60 (HSP60), which has been demonstrated to play a protective role in the ability of cells to remain active and healthy. Currently, there is limited research investigating the effects of resistance training in older adults on the expression of HSP60 and Klotho, a gene involved in the aging process in mammals.

A new study by researchers at Florida Atlantic University, in collaboration with the University of León in Spain, examined whether an eight-week resistance training program would modulate the oxidative status, the UPR activation, and key inflammatory pathways as well as their relationships with HSP60 and Klotho proteins.

For the study, researchers analyzed these proteins in peripheral blood mononuclear cells of elderly subjects. In addition, they utilized supercomputer simulation to predict the key proteins associated with these biomolecules underlying physiological adaptations to exercise. They collected blood samples approximately five to six days before and after the training period and just before training intervention in young subjects who were included in basal assessments. Researchers also analyzed various oxidative stress biomarkers in peripheral blood mononuclear cells. 

Results of the study, published in the journal Antioxidants, demonstrated that the levels of the inflammatory proteins (pIRAK1, TLR4, and TRAF6), as well as different markers of the redox balance (catalase, GSH, LP, NRF2, PC, ROS, SOD1, and SOD2), remained unchanged with training. Importantly, untrained elderly subjects showed a significant reduction in pIRE1/IRE1 ratio when compared to trained elderly subjects. Such a finding was further confirmed by a gene ontology analysis, showing that endoplasmic reticulum stress is a key mechanism modulated by IRE1. Additionally, the analysis did not show a training effect on the expression of HSP60 and Klotho or on their relationships with other outcome variables. Although elderly male and female subjects were included in the training program, researchers did not find any sex effects in the study. These findings might partially support the modulatory effect of resistance training on the endoplasmic reticulum in the elderly.

“Regular physical activity is suggested to be an effective intervention in improving age-related diseases such as osteoporosis, sarcopenia or muscle loss and dynapenia or loss of muscle strength, cardiovascular diseases, and type 2 diabetes,” said Chun-Jung “Phil” Huang, Ph.D., co-author and a professor in the Department of Exercise Science and Health Promotion within FAU’s Charles E. Schmidt College of Science. “Although the beneficial effects of regular physical exercise to alleviate inflammation and oxidative stress are well-established, the processes of these physiological adaptations with regard to protein folding or UPR remains to be explored. That is why we used a systems biology approach for our study.”

The resistance training protocol for the study consisted of 16 sessions over eight weeks (two sessions per week), with a minimum of 48 hours between sessions. The participants started with a 10-minute warm-up on a cycle ergometer. Subsequently, eight different resistance exercises (leg press, ankle extension, bench press, leg extension, bicep curl, pec deck, high pulley traction, and dumbbell lateral lift) were performed using the exercise device. For each exercise, participants performed three sets of 12-8-12 repetitions. There was a two-to-three-minute rest between each repetition and a three-minute rest between each exercise.

“We know how very important physical activity is for older adults and our study takes research one step further in helping to elucidate the benefits of exercise in this population,” said Huang.

Study co-authors are senior author Brisamar Estébanez, Ph.D.; Marta Rivera-Viloria; and José A. de Paz, M.D., all with the University of León; José E. Vargas, Ph.D., Universidade Federal do Paraná, Curitiba; and Nishant P. Visavadiay, Ph.D., and Andy V. Kahmoui, Ph.D., both with FAU’s Department of Exercise Science and Health Promotion.

Two images of a solar active region (NOAA AR 2109) taken by SDO/AIA show extreme-ultraviolet light produced by million-degree-hot coronal gas (top images) on the day before the region flared (left) and the day before it stayed quiet and did not flare (right). The changes in brightness (bottom images) at these two times show different patterns, with patches of intense variation (black & white areas) before the flare (bottom left) and mostly gray (indicating low variability) before the quiet period (bottom right). Credits: NASA/SDO/AIA/Dissauer et al. 2022

NWRA team's new database makes it easier for scientists to predict solar flares

In the blazing upper atmosphere of the Sun, a team of scientists has found new clues that could help predict when and where the Sun’s next flare might explode.

Using data from NASA’s Solar Dynamics Observatory, or SDO, researchers from NorthWest Research Associates, or NWRA, identified small signals in the upper layers of the solar atmosphere, the corona, that can help identify which regions on the Sun are more likely to produce solar flares – energetic bursts of light and particles released from the Sun.

They found that above the regions about to flare, the corona produced small-scale flashes – like small sparklers before the big fireworks.

This information could eventually help improve predictions of flares and space weather storms – the disrupted conditions in space caused by the Sun’s activity. Space weather can affect Earth in many ways: producing auroras, endangering astronauts, disrupting radio communications, and even causing large electrical blackouts.

Scientists have previously studied how activity in lower layers of the Sun’s atmosphere – such as the photosphere and chromosphere – can indicate impending flare activity in active regions, which are often marked by groups of sunspots, or strong magnetic regions on the surface of the Sun that are darker and cooler compared to their surroundings. The new findings, published in The Astrophysical Journal, add to that picture.

“We can get some very different information in the corona than we get from the photosphere, or ‘surface’ of the Sun,” said KD Leka, lead author on the new study who is also a designated foreign professor at Nagoya University in Japan. “Our results may give us a new marker to distinguish which active regions are likely to flare soon and which will stay quiet over an upcoming period.”

For their research, the scientists used a newly created image database of the Sun’s active regions captured by SDO. The publicly available resource, described in a companion paper also in The Astrophysical Journal, combines over eight years of images taken of active regions in ultraviolet and extreme-ultraviolet light. Led by Karin Dissauer and engineered by Eric L. Wagner, the NWRA team’s new database makes it easier for scientists to use data from the Atmospheric Imaging Assembly (AIA) on SDO for large statistical studies.

“It's the first time a database like this is readily available for the scientific community, and it will be very useful for studying many topics, not just flare-ready active regions,” Dissauer said.

The NWRA team studied a large sample of active regions from the database, using statistical methods developed by team member Graham Barnes. The analysis revealed that small flashes in the corona preceded each flare. These and other new insights will give researchers a better understanding of the physics taking place in these magnetically active regions, which could help them develop new tools to predict solar flares.

“With this research, we are starting to dig deeper,” Dissauer said. “Down the road, combining all this information from the surface up through the corona should allow forecasters to make better predictions about when and where solar flares will happen.”

A black hole repeatedly destroying a star

Animation describing the scientific result published here: Wevers, Coughlin, Pasham et al. (2022), https://ui.adsabs.harvard.edu/abs/202... A mysterious flash of electromagnetic radiation from the center of a galaxy some 800 million light years away was first detected in 2018. The flare lasted for about 2 years and then it disappeared. This behavior is consistent with the supermassive black hole a...

Joachim Kock, Associate Professor at the Department of Mathematics, University of Copenhagen. Photo: Jim Høyer

University of Copenhagen prof Kock's COVID computations trigger a solution to an old problem in computer science

A mathematician from the University of Copenhagen was keen to forecast the evolution of the COVID epidemic. Instead, he ended up solving a problem that had troubled computer scientists for decades.

During the corona epidemic, many of us became amateur mathematicians. How quickly would the number of hospitalized patients rise, and when would herd immunity be achieved? Professional mathematicians were challenged as well, and a researcher at the University of Copenhagen became inspired to solve a 30-year-old problem in computer science.

“Like many others, I was out to calculate how the epidemic would develop. I wanted to investigate certain ideas from theoretical computer science in this context. However, I realized that the lack of a solution to the old problem was a showstopper,” says Joachim Kock, Associate Professor at the Department of Mathematics, University of Copenhagen.

His solution to the problem can be of use in epidemiology and computer science, and potentially in other fields as well. A common feature for these fields is the presence of systems where the various components exhibit mutual influence. For instance, when a healthy person meets a person infected with COVID, the result can be two people infected.

The smart method invented by a German teenager

To understand the breakthrough, one needs to know that such complex systems can be described mathematically through so-called Petri nets. The method was invented in 1939 by the German Carl Adam Petri, at the age of only 13, for chemistry applications. Just as a healthy person meeting a person infected with COVID can trigger a change, the same may happen when two chemical substances mix and react.

In a Petri net, the various components are drawn as circles while events such as a chemical reaction or an infection are drawn as squares. Next, circles and squares are connected by arrows which show the interdependencies in the system.
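The circles-and-squares picture can be sketched in a few lines of Python. This is only a minimal illustration, not code from Kock's work: the place names S, I and R, the "recovery" event, and the helper functions are all assumptions made for the example. Only the "infection" event (a healthy person meets an infected one, giving two infected people) comes from the article.

```python
from collections import Counter

# Places (the circles) hold tokens; transitions (the squares) consume and
# produce them. Arrows are stored as counts per place, the classical way.
# "recovery" is a hypothetical extra event added for illustration.
TRANSITIONS = {
    "infection": {"inputs": Counter({"S": 1, "I": 1}), "outputs": Counter({"I": 2})},
    "recovery": {"inputs": Counter({"I": 1}), "outputs": Counter({"R": 1})},
}


def is_enabled(marking, name):
    """A transition can fire only if every input place holds enough tokens."""
    inputs = TRANSITIONS[name]["inputs"]
    return all(marking[place] >= count for place, count in inputs.items())


def fire(marking, name):
    """Return the new marking after firing one transition once."""
    if not is_enabled(marking, name):
        raise ValueError(f"transition {name!r} is not enabled")
    new_marking = Counter(marking)
    new_marking.subtract(TRANSITIONS[name]["inputs"])   # tokens leave along input arrows
    new_marking.update(TRANSITIONS[name]["outputs"])    # tokens arrive along output arrows
    return new_marking


# Three healthy people, one infected person, nobody recovered yet.
marking = Counter({"S": 3, "I": 1, "R": 0})
after_infection = fire(marking, "infection")
after_recovery = fire(after_infection, "recovery")
print(dict(after_infection))
print(dict(after_recovery))
```

Each call to `fire` moves tokens along the arrows of one square, so the state of the whole system is just the number of tokens sitting in each circle.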

Computer scientists regarded the problem as unsolvable

In chemistry, Petri nets are applied for calculating how the concentrations of various chemical substances in a mixture will evolve. This manner of thinking has influenced the use of Petri nets in other fields such as epidemiology: we are starting with a high “concentration” of un-infected people, whereafter the “concentration” of infected starts to rise. In computer science, the use of Petri nets is somewhat different: the focus is on individuals rather than concentrations, and the development happens in steps rather than continuously.
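The difference between the two uses can be made concrete with a small, hedged sketch. The function names, the rate `beta`, the probability `p_infect` and the step counts below are illustrative assumptions, not values from the research; the first function treats infection as a smoothly changing concentration, the second as whole individuals changing state one step at a time.

```python
import random


def continuous_infection(s, i, beta, dt, steps):
    """Chemistry-style view: s and i are concentrations that change smoothly.

    Uses simple Euler steps for ds/dt = -beta*s*i and di/dt = +beta*s*i.
    """
    for _ in range(steps):
        flow = beta * s * i * dt
        s, i = s - flow, i + flow
    return s, i


def discrete_infection(susceptible, infected, p_infect, steps, rng):
    """Computer-science-style view: whole individuals, one possible event per step."""
    for _ in range(steps):
        can_fire = susceptible > 0 and infected > 0
        if can_fire and rng.random() < p_infect:
            susceptible -= 1   # one healthy person is consumed by the event
            infected += 1      # and reappears as an infected person
    return susceptible, infected


s, i = continuous_infection(0.99, 0.01, beta=0.5, dt=0.1, steps=100)
S, I = discrete_infection(99, 1, p_infect=0.3, steps=100, rng=random.Random(0))
print(f"continuous: s={s:.3f}, i={i:.3f}")
print(f"discrete:   S={S}, I={I}")
```

In both versions the total population is conserved; what differs is whether the state is a pair of real numbers or a pair of head counts.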

What Joachim Kock had in mind was to apply the more individual-oriented Petri nets from computer science for COVID calculations. This was when he encountered the old problem:

“The processes in a Petri net can be described through two separate approaches. The first approach regards a process as a series of events, while the second approach sees the net as a graphical expression of the interdependencies between components and events,” says Joachim Kock, adding:

“The serial approach is well suited for performing calculations. However, it has a downside since it describes causalities less accurately than the graphical approach. Further, the serial approach tends to fall short when dealing with events that take place simultaneously.”

“The problem was that nobody had been able to unify the two approaches. The computer scientists had more or less resigned, regarding the problem as unsolvable. This was because no one had realized that you need to go back and revise the very definition of a Petri net,” says Joachim Kock.

Small modifications with a large impact

The Danish mathematician realized that a minor modification to the definition of a Petri net would enable a solution to the problem:

“By allowing parallel arrows rather than just counting them and writing a number, additional information is made available. Things work out and the two approaches can be unified.”

The exact mathematical reason why this additional information matters is complex, but can be illustrated by an analogy:

“Assigning numbers to objects has helped humanity greatly. For instance, it is highly practical that I can arrange the right number of chairs in advance for a dinner party instead of having to experiment with different combinations of chairs and guests after they have arrived. However, the number of chairs and guests does not reveal who will be sitting where. Some information is lost when we consider numbers instead of real objects.”

Similarly, information is lost when the individual arrows of the Petri net are replaced by a number.

“It takes a bit more effort to treat the parallel arrows individually, but one is amply rewarded as it becomes possible to combine the two approaches so that the advantages of both can be obtained simultaneously.”
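The loss of information can be illustrated with a toy Python example. It is a loose analogy in the spirit of the chairs-and-guests picture, not Kock's formal construction; the arrow labels `out_a` and `out_b` and both helper functions are hypothetical.

```python
from collections import Counter
from math import factorial

# Number-based description: the infection event has "2" arrows into place I.
counted_outputs = Counter({"I": 2})

# Arrow-based description: each parallel arrow is an object in its own right.
# The labels are made up for this sketch.
individual_outputs = [("out_a", "I"), ("out_b", "I")]


def collapse(arrows):
    """Forget arrow identities and keep only how many arrows reach each place."""
    return Counter(place for _label, place in arrows)


def distinct_assignments(arrows):
    """Count the ways distinguishable tokens can be matched to parallel arrows."""
    total = 1
    for count in collapse(arrows).values():
        total *= factorial(count)
    return total


# Collapsing gives back the classical numbers...
print(collapse(individual_outputs) == counted_outputs)
# ...but the numbers alone cannot say which token travelled along which arrow.
print(distinct_assignments(individual_outputs))
```

Going from individual arrows to a number is always possible, while the reverse direction requires a choice, which is the kind of information the counted description discards.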

The circle to COVID has been closed

The solution helps our mathematical understanding of how to describe complex systems with many interdependencies, but will not have much practical effect on the daily work of computer scientists using Petri nets, according to Joachim Kock:

“This is because the necessary modifications are mostly back-compatible and can be applied without the need for revision of the entire Petri net theory.”

“Somewhat surprisingly, some epidemiologists have started using the revised Petri nets. So, one might say the circle has been closed!”

Joachim Kock does see a further point to the story:

“I wasn’t out to find a solution to the old problem in computer science at all. I just wanted to do COVID calculations. This was a bit like looking for your pen but realizing that you must find your glasses first. So, I would like to take the opportunity to advocate the importance of research that does not have a predefined goal. Sometimes research driven by curiosity will lead to breakthroughs.”