tRNAs: The ancient, multi-functional non-coding RNAs with new forms and uses

Transfer RNAs (tRNAs) are the largest and most complex family of non-coding RNAs and are found in all living organisms. They are best known for their pivotal role in translating genetic information from messenger RNA into protein. For over two decades, our research group has played a leading role in accurately identifying complete sets of tRNA genes in genomes with tRNAscan-SE. The recent major update of this most widely used tRNA gene prediction tool further improves sensitivity and performance, distinguishes canonical tRNAs from repetitive elements, and enables better functional classification of the genes. We also provide the most cited global reference for tRNAs, the Genomic tRNA Database (GtRNAdb), which includes almost 400,000 tRNA genes in 5,000 genomes. Our collaboration with genomics groups such as the HUGO Gene Nomenclature Committee and RNAcentral ensures consistent tRNA gene annotations and gives researchers the resources they need through a global network.

If tRNAs were discovered so long ago, why are we still interested in studying them? The identification of non-canonical introns in archaeal tRNAs, and the discovery of atypical split and circularized permuted tRNAs in archaea and eukaryotes, illustrate the different adaptations that are possible through evolution. Renewed focus on tRNA biology has further revealed their involvement in regulatory pathways outside their fundamental role in mRNA translation. Recent evidence suggests that tRNAs undergo tissue-specific expression, processing, and modification, resulting in a constellation of tRNAs and tDRNAs (tRNA-derived small RNAs) with the potential to regulate a multitude of pathways in response to changes in the cellular or extracellular environment. Dysregulated expression of both tRNAs and tDRNAs has also been shown in a variety of diseases, such as neurodegeneration. However, these findings cover only a small portion of the tRNAs found in the most studied organisms. Many unknown features and regulatory mechanisms await discovery.

Emerging roles and regulation of non-coding RNAs

Figure: ARM-seq facilitates sequencing of m1A-, m3C- or m1G-modified RNAs

A key focus of our lab is to study the function and regulation of tRNAs and tDRNAs across a variety of species, including human, mouse, and yeast. We use tRNA-specific sequencing methods, including ARM-seq (developed by our lab) and DM-tRNA-seq (developed by the Pan Lab at the University of Chicago), in combination with our data analysis software tRAX to obtain expression profiles of tDRNAs and mature tRNAs in multiple tissue samples. Our aim is to generate a comparative atlas of tRNA expression that complements the gene expression data for protein-coding genes in the Genotype-Tissue Expression (GTEx) project. Additionally, we use cutting-edge tools such as CRISPR/Cas9 technology to identify molecular and cellular phenotypes associated with inhibition or overexpression of individual tRNAs and tDRNAs. Our continuous engagement in examining and designing methods and tools enables the discovery of novel characteristics of tRNAs and related genes.

Figure: tRAX tRNA-sequencing data analysis workflow

RNA Modifications on tRNA processing, regulation, and function

tRNAs are the most highly modified molecules within the cell, containing on average 12 modifications per tRNA species. Previous studies have shown that RNA modifications can regulate the structure, stability, and function of tRNAs. Our ARM-seq data also reveal that tDRNAs carry the same modifications as the mature tRNAs. Yet only a small number of modifications on specific tRNAs have been experimentally confirmed. By combining chemical treatments with sequencing techniques and data analysis methods, we work on identifying the unknown modification states of tRNAs and tDRNAs to understand their roles in tRNA regulation and function.

Figure: ARM-seq expression profiles show the presence of tRNA modifications

Evolution of tRNA genes across phylogenetic clades

Figure: tRNA flanking region variations in multiple species

Although tRNAs can be found in all living organisms, their primary nucleotide sequence varies to a surprising degree between species across the domains of life, attesting to the continual evolution of this molecule. In many species, particularly multicellular eukaryotes, there are often multiple different versions of tRNAs with the same anticodon (isodecoders) that nominally carry out the same translation function. However, differences in the regulation, processing, and specific biological roles of these diverse isodecoders are largely unexplored. We therefore developed tRNAviz to facilitate the study of conservation patterns of tRNA sequence features across different phylogenetic clades, based on a collection of over 150,000 tRNAs from over 1,500 unique species. Examining the genomic context of tRNA genes provided further insights into their activity state. We found that the flanking regions of active tRNA genes are highly variable even though the mature tRNA sequences are conserved. This transcription-associated mutagenesis in cytosolic tRNAs was observed in many multicellular eukaryotes, including human, mouse, fruit fly, and Arabidopsis. Moreover, multiple genomic elements, such as CpG islands in the upstream region and transcription termination sequences, contribute to the identification of active genes. This led us to develop a computational method, tRNA gene Activity Predictor (tRAP), to classify the activity state of tRNA genes in eukaryotes for which experimental results may not be available. Using state-of-the-art genome alignment algorithms, we also established tRNA ortholog sets in mammals that enable evolutionary analysis of tRNA gene activity and regulation.
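The notion of isodecoders mentioned above can be made concrete with a minimal Python sketch. It groups tRNA gene records by isotype and anticodon, then counts how many distinct mature sequences share each anticodon. The record layout, the gene names, and the short sequences are hypothetical and invented purely for illustration. They are not taken from GtRNAdb or tRNAviz, and this is not how those tools are implemented.

```python
from collections import defaultdict

# Hypothetical toy records: (gene_name, isotype, anticodon, mature_sequence).
# Names and sequences are invented for illustration; real tRNAs are ~70-90 nt.
TOY_TRNAS = [
    ("tRNA-Ala-AGC-1", "Ala", "AGC", "GGGGAUGUAGCUCAG"),
    ("tRNA-Ala-AGC-2", "Ala", "AGC", "GGGGAUGUAGCUCAG"),  # identical copy of gene 1
    ("tRNA-Ala-AGC-3", "Ala", "AGC", "GGGGGUGUAGCUCAG"),  # same anticodon, different body
    ("tRNA-Ala-CGC-1", "Ala", "CGC", "GGGGAUGUAGCUCAU"),
    ("tRNA-Gly-GCC-1", "Gly", "GCC", "GCAUUGGUGGUUCAG"),
]


def group_isodecoders(records):
    """Map (isotype, anticodon) -> {mature_sequence: [gene names]}."""
    groups = defaultdict(lambda: defaultdict(list))
    for name, isotype, anticodon, seq in records:
        groups[(isotype, anticodon)][seq].append(name)
    # Convert nested defaultdicts to plain dicts for predictable behaviour.
    return {key: dict(seqs) for key, seqs in groups.items()}


def isodecoder_counts(records):
    """Count distinct mature sequences (isodecoders) per anticodon."""
    return {key: len(seqs) for key, seqs in group_isodecoders(records).items()}


if __name__ == "__main__":
    for (isotype, anticodon), n in sorted(isodecoder_counts(TOY_TRNAS).items()):
        print(f"{isotype}-{anticodon}: {n} distinct sequence(s)")
```

In this toy set, the three Ala-AGC genes collapse into two distinct sequences, which is the kind of within-anticodon diversity the paragraph describes. Identical gene copies are kept together under one sequence so that gene copy number and sequence diversity can be examined separately.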

Figure: Consensus features of tRNAs in primates shown in tRNAviz