I downloaded tRNAscan-SE to look for tRNAs in my genome. Why do I get many more tRNA predictions than are included in GtRNAdb?
tRNA-derived repetitive elements, whose primary sequences closely resemble real tRNA genes, are common in many vertebrates, some worms, and some plants. The current version of tRNAscan-SE does not effectively discriminate between atypical tRNAs and tRNA-derived SINEs, so these elements can be reported as tRNA predictions. We plan to add a feature to tRNAscan-SE that identifies this specific type of element.
To address this problem, we currently apply a post-filtering step to the tRNAscan-SE predictions for selected genomes (listed below). A tRNA candidate is considered a potential tRNA-derived SINE, and is removed from the final prediction results displayed in GtRNAdb, if it meets both of the following criteria: (1) its final bit score is less than 55 bits (50 bits for tRNA-SeC-TCA), and (2) it is identified by only one of the two pre-filter scanners (trnascan 1.4 and EufindtRNA). Because of this ad hoc post-filtering step, the tRNA genes listed in GtRNAdb will differ from the results of a stand-alone tRNAscan-SE analysis. The full set of unfiltered predictions (tRNAs plus likely tRNA-like SINEs) can be provided upon request.
The genomes to which we have applied this post-filtering step include: mouse, rat, mouse lemur, opossum, naked mole-rat, Chinese hamster, northern white-cheeked gibbon, baboon, chimp, marmoset, gorilla, orangutan, rhesus, tarsier, guinea pig, horse, elephant, cow, pig, sheep, dog, cat, panda, rabbit, zebrafish, fugu, medaka, stickleback, tetraodon, platypus, lamprey, frog, chicken, turkey, zebra finch, lizard, Caenorhabditis brenneri, Caenorhabditis japonica, and soybean.
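If you want to approximate this filter on your own predictions, the two criteria described above can be sketched in a few lines of Python. This is only an illustration, not the script used for GtRNAdb: the Candidate record and its field names are hypothetical, and it assumes you have already extracted, for each prediction, the final bit score, the isotype and anticodon, and which of the two pre-filter scanners reported the hit.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative, not tRNAscan-SE output columns.
    name: str
    isotype: str        # e.g. "Ala", "SeC"
    anticodon: str      # e.g. "AGC", "TCA"
    score: float        # final bit score
    hit_trnascan: bool  # reported by the trnascan 1.4 pre-filter
    hit_eufind: bool    # reported by the EufindtRNA pre-filter


def score_cutoff(c: Candidate) -> float:
    """Criterion 1 threshold: 50 bits for tRNA-SeC-TCA, 55 bits otherwise."""
    if c.isotype == "SeC" and c.anticodon == "TCA":
        return 50.0
    return 55.0


def is_likely_sine(c: Candidate) -> bool:
    """True only if BOTH criteria hold: low score and a single pre-filter hit."""
    low_score = c.score < score_cutoff(c)
    # Exactly one of the two scanners reported the candidate.
    single_scanner = c.hit_trnascan != c.hit_eufind
    return low_score and single_scanner


def post_filter(candidates):
    """Split candidates into (kept, removed) lists, preserving input order."""
    kept = [c for c in candidates if not is_likely_sine(c)]
    removed = [c for c in candidates if is_likely_sine(c)]
    return kept, removed


# Made-up example records, for illustration only.
demo = [
    Candidate("cand1", "Ala", "AGC", 72.1, True, True),
    Candidate("cand2", "Ala", "AGC", 41.3, False, True),
    Candidate("cand3", "SeC", "TCA", 52.0, True, False),
    Candidate("cand4", "Gly", "GCC", 40.0, True, True),
]
kept, removed = post_filter(demo)
print("kept:", [c.name for c in kept])
print("removed:", [c.name for c in removed])
```

Note that a low score alone is not enough to remove a candidate: cand4 scores below 55 bits but is found by both scanners, so it is kept. Likewise, cand3 is found by only one scanner but clears the 50-bit threshold for tRNA-SeC-TCA.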