CSRDB is a bioinformatics resource for cereal crops consisting of large-scale datasets of maize and rice small RNA sequences. The sequences were generated by 454 Life Sciences sequencing. The small RNA sequences have been mapped to the rice genome and the available maize genome sequence and are presented in two genome browser datasets using the Generic Genome Browser. Potential target sequences representing mature mRNA sequences have been predicted using the FASTH software from the Zuker lab, and access to the resulting small RNA target pair (SRTP) dataset is available through a MySQL-based relational database. Within the genome browser, the small RNAs have links to the SRTP database that return a list of potential targets. The SRTP database may also be searched independently using both small RNA and target transcript queries. Data linking and integration is the main focus of this interface, and to this end the SRTP results pages link back to the browser and the SRTP database as well as to external sites.
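The two search directions described above (small RNA to targets, and target transcript to small RNAs) can be illustrated with a minimal relational sketch. The table names, column names, identifiers and sequences below are hypothetical and are not taken from CSRDB; SQLite stands in for MySQL so that the example is self-contained.

```python
import sqlite3

def build_toy_srtp_db():
    """Create an in-memory database with a hypothetical SRTP-style schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE small_rna (id TEXT PRIMARY KEY, sequence TEXT NOT NULL);
        CREATE TABLE transcript (id TEXT PRIMARY KEY, description TEXT);
        -- One row per predicted small RNA / target pair.
        CREATE TABLE srtp (
            small_rna_id TEXT REFERENCES small_rna(id),
            transcript_id TEXT REFERENCES transcript(id),
            PRIMARY KEY (small_rna_id, transcript_id)
        );
    """)
    # Toy rows for illustration only; not real CSRDB records.
    conn.executemany("INSERT INTO small_rna VALUES (?, ?)", [
        ("smRNA_1", "ACGUACGUACGUACGUACGUA"),
        ("smRNA_2", "UGCAUGCAUGCAUGCAUGCAU"),
    ])
    conn.executemany("INSERT INTO transcript VALUES (?, ?)", [
        ("tx_A", "toy transcript A"),
        ("tx_B", "toy transcript B"),
    ])
    conn.executemany("INSERT INTO srtp VALUES (?, ?)", [
        ("smRNA_1", "tx_A"), ("smRNA_1", "tx_B"), ("smRNA_2", "tx_B"),
    ])
    return conn

def targets_of(conn, small_rna_id):
    """Small RNA query: list the predicted target transcripts."""
    rows = conn.execute(
        "SELECT transcript_id FROM srtp WHERE small_rna_id = ? "
        "ORDER BY transcript_id", (small_rna_id,))
    return [row[0] for row in rows]

def small_rnas_targeting(conn, transcript_id):
    """Target transcript query: list the small RNAs predicted to target it."""
    rows = conn.execute(
        "SELECT small_rna_id FROM srtp WHERE transcript_id = ? "
        "ORDER BY small_rna_id", (transcript_id,))
    return [row[0] for row in rows]

conn = build_toy_srtp_db()
print(targets_of(conn, "smRNA_1"))
print(small_rnas_targeting(conn, "tx_B"))
```

Storing the pairs in a separate link table lets the same data answer both query directions without duplication.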
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a public resource highlighting efforts at ARS in developing small RNA genome information for the potato genome. Updates and progress are reported here. Resources in this dataset: Resource Title: Web Page. File Name: Web Page, url: https://potato.pw.usda.gov
The Subviral RNA database facilitates the research and analysis of viroids, satellite RNAs, satellite viruses, the human hepatitis delta virus, and related RNA sequences. It integrates a large number of Subviral RNA sequences, their respective RNA motifs, analysis tools, related publication links and additional pertinent information to allow users to efficiently retrieve and analyze relevant information about these small RNA agents. The Subviral RNA Database contains 2877 sequences indexed in 83 species and 4 main groups.
DASHR reports the annotation, expression and evidence for specific RNA processing (cleavage specificity scores/entropy) of human sncRNA genes, precursor and mature sncRNA products across different human tissues and cell types. DASHR integrates information from multiple existing annotation resources for small non-coding RNAs, including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear (snRNAs), nucleolar (snoRNAs), cytoplasmic (scRNAs), transfer (tRNAs), tRNA fragments (tRFs), and ribosomal RNAs (rRNAs). These datasets were obtained from non-diseased human tissues and cell types and were generated for studying or profiling small non-coding RNAs. This collection references RNA records.
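As a rough illustration of an entropy-based processing specificity measure, the sketch below computes the Shannon entropy of read 5' start positions within a locus. DASHR's exact scoring formula is not given here, so the choice of Shannon entropy over start positions, the function name and the toy inputs are all assumptions.

```python
import math
from collections import Counter

def position_entropy(start_positions):
    """Shannon entropy (in bits) of read start positions.

    A value near 0 means reads start at one position (precise cleavage);
    larger values mean start sites are spread over many positions.
    This is an assumed definition, not DASHR's documented formula.
    """
    if not start_positions:
        raise ValueError("at least one read position is required")
    counts = Counter(start_positions)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Toy positions for illustration only.
print(position_entropy([105, 105, 105, 105]))  # all reads share one start
print(position_entropy([105, 105, 106, 106]))  # reads split over two starts
```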
Summary of small RNA sequencing data analysis.
The role of microRNAs in gene regulation has been well established. The extent of miRNA regulation also increases with increasing genome complexity. Though the number of genes appears to be equal between human and zebrafish, substantially fewer microRNAs have been discovered in zebrafish than in human (Release 19). It appears that most of the miRNAs in zebrafish are yet to be discovered. We sequenced small RNAs from brain, gut, liver, ovary, testis, eye, heart and embryo of zebrafish. In brain, gut and liver, sequencing was done in males and females separately. The majority of the sequenced reads (16-62%) mapped to known miRNAs, with the exception of ovary (5.7%) and testis (7.8%). Using the miRNA discovery tool miRDeep2, we discovered novel miRNAs from the unannotated reads, which ranged from 7.6 to 23.0%, with the exceptions of ovary (51.4%) and testis (55.2%). The prediction tool identified a total of 459 novel pre-miRNAs. We compared expression of miRNAs between different tissues and between males and females to identify tissue-associated and sex-associated miRNAs, respectively. These miRNAs could serve as putative biomarkers for these tissues. The brain had the highest number of tissue-associated miRNAs (22) and the liver the highest number of sex-associated miRNAs (34). This study comprehensively identifies tissue-associated and sex-associated miRNAs in zebrafish. Further, we have discovered 459 novel pre-miRNAs (~30% seed homology to human miRNA) as a genomic resource that can facilitate further investigations of miRNA-mRNA gene regulatory networks in zebrafish, with implications for understanding the function of human homologs. Known miRNA profiling, novel miRNA discovery and identification of tissue-associated and sex-associated miRNAs from sRNA deep sequencing data of different tissues and embryo of zebrafish (in triplicate) were carried out using the Illumina HiSeq 2000 platform.
A database of animal, plant and virus microRNA data maintained at the University of Poznan. The database provides:
* 9980 miRNA candidates from 420 animal and plant species predicted in Expressed Sequence Tags
* predicted targets for plant candidates
* RNA-seq reads mapped to candidates from 29 species
* external data from 12 databases, including sequences, polymorphism, expression and regulation
miRNEST 1.0 contains miRNAs from 563 animal, plant and virus species.
It is intended to provide information on the sequences and functions of transcripts which do not code for proteins, but perform regulatory roles in the cell. Currently, the database includes over 30,000 individual sequences from 99 species of Bacteria, Archaea and Eukaryota. The primary source of sequences included in the database was GenBank. Additional annotation information for mouse and human ncRNAs was derived from the FANTOM3 database and the H-Invitational Integrated Database of Annotated Human Genes version 3.4, respectively. Genome mapping information was derived from the data available at the UCSC Genome Browser site. The sequences and annotations of small cytoplasmic RNAs from bacteria, for which annotation is lacking in the genome sequences, were derived from the Rfam database. The microRNAs and snoRNAs that were available in previous editions, as well as other housekeeping (infrastructural) RNAs (e.g. rRNA, tRNA, snRNA, SRP RNA), are not included in our database, to avoid redundancy with the more specialized databases that have emerged in recent years.
sRNAMap is a collection of sRNAs, regulators, and targets in microbial genomes. It provides valuable information on sRNAs, such as their secondary structures, expression conditions, expression profiles, transcriptional start sites, and cross-links to other biological databases. Various textual and graphical interfaces were also designed and implemented to facilitate data access in sRNAMap. Overall, this work presents an integrated database, sRNAMap, that collects sRNA genes, the transcriptional regulators of sRNAs and the sRNA target genes by integrating a variety of biological databases and by surveying the literature. It currently contains 397 sRNAs, 62 regulators/sRNAs and 60 sRNAs/targets in seventy microbial genomes.
The database compiles all complete or nearly complete SSU (small subunit) and LSU (large subunit) ribosomal RNA sequences. Sequences are provided in aligned format. The alignment takes into account secondary structure information derived by comparative sequence analysis of thousands of sequences. Additional information, such as literature references, taxonomy, secondary structure models and nucleotide variability maps, is also available.
The data provided here are part of a Galaxy Training Network tutorial that analyzes small RNA-seq (sRNA-seq) data from a study published by Harrington et al. (DOI:10.1186/s12864-017-3692-8) to detect differential abundance of various classes of endogenous short interfering RNAs (esiRNAs). The goal of this study was to investigate "connections between differential retroTn and hp-derived esiRNA processing and cellular location, and to investigate the potential link between mRNA 3' end cleavage and esiRNA biogenesis." To this end, sRNA-seq libraries were constructed from triplicate Drosophila tissue culture samples under conditions of either control RNAi or RNAi knockdown of a factor involved in mRNA 3' end processing, Symplekin. This dataset (GEO Accession: GSE82128) consists of single-end, size-selected, non-rRNA-depleted sRNA-seq libraries. Because of the long processing time for the large original files, we have downsampled the original raw data files to include only reads that align to a subset of interesting transcript features including: (1) transposable elements, (2) Drosophila piRNA clusters, (3) Symplekin, and (4) genes encoding mass spectrometry-defined protein binding partners of Symplekin from Additional File 2 in the indicated paper by Harrington et al. More details on features 1 and 2 can be found here: https://github.com/bowhan/piPipes/blob/master/common/dm3/genomic_features (piRNA_Cluster, Trn). All features are from the Drosophila genome Apr. 2006 (BDGP R5/dm3) release.
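The downsampling idea (keep only reads that align to selected features) can be sketched as an interval-overlap filter. The tutorial's actual procedure is not described here, so the tuple layout, the half-open coordinate convention and the function names are assumptions for illustration.

```python
from collections import defaultdict

def build_feature_index(features):
    """Group (chrom, start, end) features by chromosome."""
    index = defaultdict(list)
    for chrom, start, end in features:
        index[chrom].append((start, end))
    return index

def overlaps_any(read, index):
    """True if the aligned read overlaps at least one feature.

    Intervals are treated as half-open [start, end): two intervals
    overlap when each one starts before the other ends.
    """
    chrom, start, end = read
    return any(start < f_end and f_start < end
               for f_start, f_end in index.get(chrom, []))

def downsample(reads, features):
    """Keep only the aligned reads that overlap a feature of interest."""
    index = build_feature_index(features)
    return [read for read in reads if overlaps_any(read, index)]

# Toy coordinates for illustration only; not taken from the dataset.
features = [("chr2L", 100, 200)]
reads = [("chr2L", 90, 110), ("chr2L", 200, 220), ("chrX", 100, 120)]
print(downsample(reads, features))
```

A linear scan per read is fine for a handful of features; a sorted index or interval tree would be the usual choice for genome-scale feature sets.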
Small RNA sequencing was performed by Illumina on eight commonly used breast cancer cell lines as part of the Illumina iDEA Challenge 2011. Total RNA of the cell lines was isolated and the sequencing library was prepared using Illumina's Small RNA v1.5 library prep protocol. The data were analyzed with Flicker, an add-on to the Illumina Genome Analyzer Pipeline software that allows for the preliminary analysis of small RNA runs on the GA. Flicker was used in this case for:
* Trimming: Attempts to trim the adaptor sequence from small RNA reads. Trimming is accomplished by aligning the adaptor with each read, both internally (the adaptor is fully contained within the read) and as a suffix (the adaptor overlaps the 3' end of the read). A similarity matrix is used to generate the matching score, taking into account both substitutions and indels. The alignment with the best matching score is returned, and that best alignment is trimmed from the read. A trimming threshold may be specified, and the adaptor is not trimmed unless the best score exceeds the threshold.
* Aligning with Iterated ELAND: Attempts to align trimmed reads to targets (squashed genomes or databases such as miRBase) using the ELAND short read aligner. Each read is aligned by ELAND against the specified alignment targets. ELAND was designed to align reads of a uniform length, whereas the trimmed reads from a small RNA run have several different lengths. We currently address this by running ELAND separately for each read length: reads are binned by length, and each bin is aligned against each specified target. Once all bins are aligned against all targets (in this case, gDNA and miRBase), the results are recombined into a single file with all the annotation information.
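The trimming and length-binning steps above can be sketched as follows. This is not Flicker's implementation: the sketch scores substitutions only (no indels), the match and mismatch scores are assumed values, the adaptor sequence is a made-up placeholder, and everything from the best adaptor alignment to the 3' end of the read is removed.

```python
from collections import defaultdict

# Hypothetical adaptor sequence, used only for this illustration.
ADAPTOR = "TCGTATGCCG"

def trim_adaptor(read, adaptor, threshold, match=1, mismatch=-1):
    """Trim the best-scoring adaptor alignment from the 3' side of a read.

    Every start position is tried, which covers both the internal case
    (adaptor fully inside the read) and the suffix case (adaptor runs
    past the 3' end). Simplification: substitutions only, no indels.
    """
    best_score, best_start = None, None
    for start in range(len(read)):
        overlap = min(len(adaptor), len(read) - start)
        score = sum(match if read[start + i] == adaptor[i] else mismatch
                    for i in range(overlap))
        if best_score is None or score > best_score:
            best_score, best_start = score, start
    # Trim only if the best score exceeds the threshold.
    if best_score is not None and best_score > threshold:
        return read[:best_start]
    return read

def bin_by_length(reads):
    """Group trimmed reads by length so each bin can be aligned separately."""
    bins = defaultdict(list)
    for read in reads:
        bins[len(read)].append(read)
    return dict(bins)

raw_reads = [
    "A" * 20 + ADAPTOR + "AA",   # adaptor fully contained in the read
    "A" * 20 + ADAPTOR[:5],      # adaptor overlaps the 3' end
    "A" * 22,                    # no adaptor present
]
trimmed = [trim_adaptor(r, ADAPTOR, threshold=3) for r in raw_reads]
print({length: len(group) for length, group in bin_by_length(trimmed).items()})
```

Each length bin would then be passed to the aligner once per target, and the per-bin results concatenated afterwards.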
===== Key to Experiments
HCT ID   | Lane   | Sample ID | Cell Line  | ER +/- Status | Reads
HCT20061 | Lane 1 | HS378     | MCF7       | +             | 14,555,390
HCT20062 | Lane 2 | HS379     | MDA-MB-231 | -             | 15,147,037
HCT20063 | Lane 3 | HS381     | T47D       | +             | 14,701,254
HCT20064 | Lane 4 | HS377     | BT-20      | -             | 16,529,823
HCT20065 | Lane 5 | HS375     | BT-474     | +             | 16,443,361
HCT20066 | Lane 6 | HS380     | MDA-MB-468 | -             | 16,746,981
HCT20067 | Lane 7 | HS382     | ZR-75-1    | +             | 17,631,005
HCT20068 | Lane 8 | HS376     | MCF10A     | normal        | 19,155,607
MicroRNAs (miRNAs) are processed from longer precursors with fold-back structures. While animal MIRNA precursors have homogeneous structures, plant precursors comprise a collection of fold-backs with variable size and shape. Here, we design an approach (SPARE) to systematically analyze miRNA processing intermediates and characterize the biogenesis of most of the evolutionarily conserved miRNAs present in Arabidopsis thaliana. We found that plant MIRNAs are processed by four mechanisms, depending on the sequential direction of the processing machinery and the number of cuts required to release the miRNA. Classification of the precursors according to their processing mechanism revealed specific structural determinants for each group. We found that the complexity of the miRNA processing pathways occurs in both ancient and evolutionarily young sequences, and that members of the same family can be processed in different ways. We observed that different structural determinants compete for the processing machinery and that alternative miRNAs can be generated from a single precursor. The results provide a mechanistic explanation for the structural diversity of MIRNA precursors in plants and new insights into the biogenesis of small RNAs. Approach to systematically analyze miRNA processing intermediates and characterize the biogenesis of conserved and young miRNAs present in Arabidopsis thaliana. miRNA processing intermediate profiles of wild-type and fiery mutant Arabidopsis plants were analyzed using Illumina GAIIx.
Title of data: Table S1. Description of data: Detailed results for adapter prediction and trimming. (XLSX 12 kb)
Summary of P. xylostella small RNA data analysis.
https://ega-archive.org/dacs/EGAC00001001467
This data set contains small RNA-sequencing and RNA-sequencing data from subependymal giant cell astrocytomas (SEGA) resected from tuberous sclerosis complex patients. Small RNA-sequencing and RNA-sequencing were performed on the same set of SEGAs (n=19) and periventricular controls (n=8). For full details on library preparation and patients please refer to the paper "The coding and non-coding transcriptional landscape of subependymal giant cell astrocytomas." (PMID: 31834371 DOI: 10.1093/brain/awz370).
This study was conducted to explore the mechanism of gemcitabine resistance in bladder cancer. The dataset contains small RNA sequencing data comparing a gemcitabine-resistant cell line with its parental cell line, BOY, a bladder cancer cell line.
http://exrna.org/resources/data/data-access-policy
The role of non-coding RNAs in different biological processes and diseases is continuously expanding. Next-generation sequencing, together with the parallel improvement of bioinformatics analyses, allows the accurate detection and quantification of an increasing number of RNA species. With the aim of exploring new potential biomarkers for disease classification, a clear overview of the expression levels of common and unique small RNA species among different biospecimens is necessary. However, except for miRNAs in plasma, there are no substantial indications about the pattern of expression of various small RNAs in multiple specimens among healthy humans. By analysing small RNA-sequencing data from 243 samples, we have identified and compared the most abundantly and uniformly expressed miRNAs and non-miRNA species of comparable size with the library preparation in four different specimens: plasma exosomes (n=125), stool (n=39), urine (n=48), and cervical scrapes (n=31). Eleven miRNAs were commonly detected among all different specimens, while 231 miRNAs were globally unique across the specimens. Classification analysis using these miRNAs recognized the sample types with an accuracy of 99.6%. piRNAs and tRNAs were the most represented non-miRNA small RNAs detected in all specimen types that were analysed, particularly in urine samples. With the present data, the most uniformly expressed small RNAs in each sample type were also identified. A signature of small RNAs for each specimen could represent a reference gene set in validation studies by RT-qPCR. Overall, the data reported here provide insight into the constitution of the human miRNome and other small non-coding RNAs in various specimens of healthy individuals.
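The distinction between commonly detected and unique miRNAs can be expressed with set operations. In this sketch, "unique" is interpreted as detected in exactly one specimen type, which is an assumption about the study's definition; the miRNA names are placeholders, not results from the study.

```python
def common_and_unique(detected):
    """Split detected miRNAs into those shared by all specimens and
    those found in exactly one specimen (assumed meaning of "unique").

    `detected` maps a specimen name to the set of miRNAs detected in it
    and must contain at least one specimen.
    """
    common = set.intersection(*detected.values())
    unique = {}
    for name, mirnas in detected.items():
        # Union of everything detected in the other specimen types.
        others = set().union(
            *(s for other, s in detected.items() if other != name))
        unique[name] = mirnas - others
    return common, unique

# Placeholder names for illustration only.
detected = {
    "plasma": {"miR-a", "miR-b", "miR-c"},
    "stool": {"miR-a", "miR-d"},
    "urine": {"miR-a", "miR-b"},
}
common, unique = common_and_unique(detected)
print(sorted(common))
print({name: sorted(found) for name, found in unique.items()})
```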
The genomic revolution and subsequent advances in large-scale genomic and transcriptomic technologies highlighted hidden genomic treasures. Prominent among them are non-coding small RNAs (sRNAs), shown to play important roles in post-transcriptional regulation of gene expression in both pro- and eukaryotes. Bacterial sRNA-encoding genes were initially identified in intergenic regions, but recent evidence suggests that they can be encoded within other, well-defined, genomic elements. This notion was strongly supported by data generated by RIL-seq, an RNA-seq-based methodology we recently developed for deciphering chaperone-dependent sRNA-target networks in bacteria. Applying RIL-seq to Hfq-bound RNAs in Escherichia coli, we found that ∼64% of the detected RNA pairs involved known sRNAs, suggesting that yet unknown sRNAs may be included in the ∼36% remaining pairs. To identify the latter, we first tested and refined a set of quantitative features derived from RIL-seq data, which distinguish between Hfq-dependent sRNAs and “other RNAs”. We then incorporated these features in a machine learning-based algorithm that predicts novel sRNAs from RIL-seq data, and identified high-scoring candidates encoded in various genomic regions, mostly intergenic regions and 3′ untranslated regions, but also 5′ untranslated regions and coding sequences. Several candidates were further tested and verified by northern blot analysis as Hfq-dependent sRNAs. Our study reinforces the emerging concept that sRNAs are encoded within various genomic elements, and provides a computational framework for the detection of additional sRNAs in Hfq RIL-seq data of E. coli grown under different conditions and of other bacteria manifesting Hfq-mediated sRNA-target interactions.
The wild grass Brachypodium distachyon has emerged as a model system for temperate grasses and biofuel plants. However, the global analysis of miRNAs, molecules known to be key for eukaryotic gene regulation, has been limited in B. distachyon to studies that examine a few samples or rely on computational predictions. Similarly, an in-depth global analysis of miRNA-mediated target cleavage using Parallel Analysis of RNA Ends (PARE) data is lacking in B. distachyon. B. distachyon small RNAs were cloned and deeply sequenced from 17 libraries that represent different tissues and stresses. Using a computational pipeline, we identified 116 miRNAs, including not only conserved miRNAs that have not been reported in B. distachyon, but also non-conserved miRNAs that were not found in other plants. To investigate miRNA-mediated cleavage function, four PARE libraries were constructed from key tissues and sequenced to a total depth of approximately 70 million sequences. The roughly 5 million distinct genome-matched sequences that resulted represent an extensive dataset for analyzing small RNA-guided cleavage events. Analysis of the PARE and miRNA data provided experimental evidence for miRNA-mediated cleavage of 264 sites in predicted miRNA targets. In addition, PARE analysis revealed that differentially expressed miRNAs in the same family guide specific target RNA cleavage in a correspondingly tissue-preferential manner. B. distachyon miRNAs and target RNAs were experimentally identified and analyzed. Knowledge gained from this study should provide insights into the roles of miRNAs and the regulation of their targets in B. distachyon and related plants. Examination of various tissues and stresses in Brachypodium by high-throughput sequencing for small RNA profiling and PARE (Parallel Analysis of RNA Ends).