CSRDB is a bioinformatics resource for cereal crops consisting of large-scale datasets of maize and rice small RNA sequences. The sequences were generated by 454 Life Sciences sequencing. The small RNA sequences have been mapped to the rice genome and the available maize genome sequence and are presented in two genome browser datasets using the Generic Genome Browser. Potential target sequences representing mature mRNA sequences have been predicted using the FASTH software from the Zuker lab, and access to the resulting small RNA target pair (SRTP) dataset has been made available through a MySQL-based relational database. Within the genome browser, the small RNAs have links to the SRTP database that return a list of potential targets. The SRTP database may also be searched independently using both small RNA and target transcript queries. Data linking and integration is the main focus of this interface, and to this end links are present in the SRTP results pages back to the browser and the SRTP database as well as to external sites.
This is a public resource highlighting efforts at ARS in developing small RNA genome information for the potato genome. Updates and progress are reported here. Resources in this dataset: Resource Title: Web Page. File Name: Web Page, url: https://potato.pw.usda.gov
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Small RNA-seq: Libraries of small RNAs were constructed using TruSeq Small RNA Library Preparation Kits (Illumina) according to the manufacturer’s protocols and then sequenced on an Illumina HiSeq 2000 at Bioacme (Wuhan, China). Quality control and adapter trimming were performed using Trim Galore (v0.6.4) with default parameters, and reads between 17 and 70 nt in length were used for subsequent analyses. The sequenced reads were aligned to the human genome (GRCh38) using Bowtie2 (v2.3.5.1). Reads that could not be aligned to the human genome were then aligned to agshRNAs. The sequences and lengths of reads aligned to agshRNAs were then determined using an in-house script. Reads mapped to the genome were further categorized as miRNA, tRNA, snoRNA, etc. using htseq-count (v0.11.2). The annotation file was downloaded from DASHR 2.0 (https://dashr2.lisanwanglab.org/). RNA-seq: Libraries of total RNAs were constructed using MGIEasy RNA Library Preparation Kits (BGI) according to the manufacturer’s protocols and then sequenced on an MGISEQ2000 (BGI) at the Wuhan Institute of Virology, CAS. Quality control and adapter trimming were performed using Trim Galore (v0.6.4) with default parameters. All reads were mapped to the human genome (GRCh38) using Hisat2 (v2.1.0) and then annotated using htseq-count (v0.11.2). The annotation file was downloaded from NCBI. Annotated read counts were converted to counts per million reads (CPM). Statistical significance was evaluated using an unpaired t test and adjusted using Bonferroni correction. Differentially expressed genes (DEGs) were defined by |log2FC| > 1 and P.adj < 0.05.
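The last steps of this description (CPM conversion, Bonferroni adjustment, and the DEG thresholds) can be illustrated with a minimal Python sketch. It is not the authors' script: the function names, the gene names, and the example numbers are hypothetical, and the raw p-values are assumed to come from an unpaired t test computed elsewhere.

```python
def cpm(counts):
    """Convert raw per-gene read counts to counts per million reads (CPM)."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return {gene: min(1.0, p * m) for gene, p in pvalues.items()}


def call_degs(log2fc, pvalues, fc_cutoff=1.0, alpha=0.05):
    """Return genes with |log2FC| > fc_cutoff and Bonferroni-adjusted p < alpha."""
    padj = bonferroni(pvalues)
    return sorted(
        gene for gene in log2fc
        if abs(log2fc[gene]) > fc_cutoff and padj[gene] < alpha
    )


# Hypothetical toy data, for illustration only.
raw_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
example_log2fc = {"geneA": 2.0, "geneB": -0.5, "geneC": -1.5}
example_pvalues = {"geneA": 0.001, "geneB": 0.0001, "geneC": 0.03}

example_cpm = cpm(raw_counts)
example_degs = call_degs(example_log2fc, example_pvalues)
```

In this toy example only geneA passes both thresholds: geneB has a fold change that is too small, and geneC has an adjusted p-value of about 0.09 after multiplying by the three tests.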
This project has developed a sequence dataset of plant small RNAs based on the hypothesis that most if not all plants utilize important small RNA signaling networks. Different plant families are likely to have both common and lineage-specific miRNAs or other small RNAs with important biological roles. Comparative genomics approaches can be applied to distinguish potential miRNAs from siRNAs and to match the miRNAs to the target sequences. This project develops an unparalleled resource of millions of plant small RNAs for comparative analyses. The project includes sequencing of small RNAs from a diverse and agronomically relevant set of plant species, focused analyses of important members of the Solanaceae and Poaceae, and development of a small RNA database and web interface for public access and analysis of data. These data will allow the experimental characterization of the majority of biologically important small RNAs for a range of plant species, and will be tremendously useful to a broad set of plant biologists interested in development, stress responses, epigenetics, evolution, RNA biology and other traits impacted by small RNAs. We offer a variety of tools to query the small RNA data set, with options to identify sequences based on homology, expression levels, conservation, or potential function: 1. Small RNA mapping tool: searches for small RNAs perfectly matching a genomic sequence provided by the user. 2. Small RNA mismatch tool: searches the database for small RNAs or other short sequences provided by the user, allowing mismatches. 3. Library-comparison tool to identify conserved small RNAs. 4. Library-comparison tool to identify differentially regulated small RNAs. 5. Reverse Target Prediction.
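As a rough illustration of what a perfect-match mapping query like tool 1 involves, the following Python sketch searches a user-supplied genomic sequence for exact occurrences of small RNAs on both strands. It is a sketch under stated assumptions, not the project's implementation: the function names and example sequences are hypothetical, and sequences are assumed to be written in the DNA alphabet (A, C, G, T).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_perfect_matches(genomic_seq, small_rnas):
    """List (name, 0-based start, strand) for every exact small RNA match.

    Minus-strand hits are found by searching for the reverse complement,
    and their start is reported in plus-strand coordinates.
    """
    hits = []
    for name, srna in small_rnas.items():
        for strand, query in (("+", srna), ("-", reverse_complement(srna))):
            pos = genomic_seq.find(query)
            while pos != -1:
                hits.append((name, pos, strand))
                # Continue one base further so overlapping matches are kept.
                pos = genomic_seq.find(query, pos + 1)
    return hits


# Hypothetical toy sequences, for illustration only.
example_genome = "AAACGTTGCATTT"
example_srnas = {"s1": "CGTTG", "s2": "ATGCA"}
example_hits = find_perfect_matches(example_genome, example_srnas)
# example_hits == [("s1", 3, "+"), ("s2", 6, "-")]
```

A production tool would normally use an index over the small RNA database rather than repeated substring scans; the linear search here only keeps the example short.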
The Subviral RNA database facilitates the research and analysis of viroids, satellite RNAs, satellite viruses, the human hepatitis delta virus, and related RNA sequences. It integrates a large number of Subviral RNA sequences, their respective RNA motifs, analysis tools, related publication links and additional pertinent information to allow users to efficiently retrieve and analyze relevant information about these small RNA agents. The Subviral RNA Database contains 2877 sequences indexed in 83 species and 4 main groups.
Characteristics of small RNA sequencing data of the eight samples. (a) Percentage among reads mapped by Subread to transcribed regions of the genome.
Sequencing data of six human pathogen RNA viruses
A database of animal, plant and virus microRNA data maintained at the University of Poznan. The database provides 9980 miRNA candidates from 420 animal and plant species predicted in Expressed Sequence Tags; predicted targets for plant candidates; RNA-seq reads mapped to candidates from 29 species; and external data from 12 databases, including sequences, polymorphism, expression and regulation. miRNEST 1.0 contains miRNAs from 563 animal, plant and virus species.
Summary of small RNA sequencing data analysis.
We sought to determine whether the spaceflight environment can induce alterations in the small RNA content of small extracellular vesicles (sEV) and to assess their utility as biomarkers. Using small RNA sequencing (sRNAseq), we evaluated the impact of the spaceflight environment on sEV miRNA content in peripheral blood (PB) plasma of 14 astronauts who flew STS missions between 1998 and 2001. Samples were collected at three time points: 10 days before launch (L-10), the day of return (R-0), and three days post-landing (R+3).
This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Jonathan Preall jpreall@cshl.edu (Generation 0 Data from Hannon Lab), Carrie Davis davisc@cshl.edu (experimental), Alex Dobin dobin@cshl.edu (computational), Wei Lin wlin@cshl.edu (computational), Tom Gingeras gingeras@cshl.edu (primary investigator)). If you have questions about the Genome Browser track associated with this data, contact ENCODE (genome@soe.ucsc.edu). hg18: This data was produced by the Hannon lab at Cold Spring Harbor as part of the ENCODE Project. The series depicts NextGen sequencing information for RNAs of 20-200 nt isolated from RNA samples from tissues or subcellular compartments of cell lines. hg19: This track depicts NextGen sequencing information for RNAs of 20-200 nt isolated from RNA samples from tissues or subcellular compartments of ENCODE cell lines. The overall goal of the ENCODE project is to identify and characterize all functional elements in the sequence of the human genome. hg19: This cloning protocol generates directional libraries that are read from the 5' ends of the inserts, which should largely correspond to the 5' ends of the mature RNAs. The libraries were sequenced on a Solexa platform for a total of 36, 50 or 76 cycles; however, the reads undergo post-processing that trims their 3' ends. Consequently, the mapped read lengths are variable. For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf hg18: Small RNAs between 20-200 nt were ribominus treated according to the manufacturer's protocol (Invitrogen) using custom LNA probes targeting ribosomal RNAs (some datasets are also depleted of U snRNAs and highly abundant microRNAs). The RNA was treated with Tobacco Alkaline Pyrophosphatase to eliminate any 5' cap structure.
Poly-A Polymerase was used to catalyze the addition of C's to the 3' end. The 5' ends were phosphorylated using T4 PNK and an RNA linker was ligated onto the 5' end. Reverse transcription was carried out using a poly-G oligo with a defined 5' extension. The inserts were then amplified using oligos targeting the 5' linker and poly-G extension and containing sequencing adapters. The library was sequenced on an Illumina GA machine for a total of 36, 50 or 76 cycles. Initially, one lane is run. If an appreciable number of mappable reads are obtained, additional lanes are run. Sequence reads underwent quality filtration using the Illumina standard pipeline (GERALD). The read lengths may exceed the insert sizes and consequently introduce 3' adaptor sequence into the 3' end of the reads. The 3' sequencing adaptor was removed from the reads using a custom clipper program, which aligned the adaptor sequence to the short reads, allowing up to 2 mismatches and no indels. Regions that aligned were "clipped" off from the read. The trimmed portions were collapsed into identical reads, their count noted and aligned to the human genome (NCBI build 36, hg18 unmasked) using Nexalign (Lassmann et al., not published). The alignment parameters are tuned to tolerate up to 2 mismatches with no indels and will allow for trimmed portions as small as 5 nucleotides to be mapped. We report reads that mapped 10 or fewer times. Data obtained from each lane is processed and mapped independently. The processed/mapped data from each lane is then compiled as a single track without additional processing and submitted to UCSC. Consequently, identical reads within a lane were collapsed and their value is reported as the "transfrag" signal value. However, the redundancy between lanes has not been eliminated, so the same transfrag may appear multiple times within a signal.
hg19: Small RNAs between 20-200 nt were ribominus treated according to the manufacturer's protocol (Invitrogen) using custom LNA probes targeting ribosomal RNAs (some datasets are also depleted of U snRNAs and highly abundant microRNAs). The RNA was treated with Tobacco Alkaline Pyrophosphatase to eliminate any 5' cap structures. Poly-A Polymerase was used to catalyze the addition of C's to the 3' end. The 5' ends were phosphorylated using T4 PNK and an RNA linker was ligated onto the 5' end. Reverse transcription was carried out using a poly-G oligo with a defined 5' extension. The inserts were then amplified using oligos targeting the 5' linker and poly-G extension and containing sequencing adapters. The library was sequenced on an Illumina GA machine for a total of 36, 50 or 76 cycles. Initially, one lane was run. If an appreciable number of mappable reads were obtained, additional lanes were run. Sequence reads underwent quality filtration using the Illumina standard pipeline (GERALD). The Illumina reads were initially trimmed to discard any bases following a quality score less than or equal to 20 and converted into FASTA format, thereby discarding quality information for the rest of the pipeline. As a result, the sequence quality scores in the BAM output are all displayed as "40" to indicate no quality information. The read lengths may exceed the insert sizes and consequently introduce 3' adapter sequence into the 3' end of the reads. The 3' sequencing adapter was removed from the reads using a custom clipper program (available at http://hannonlab.cshl.edu/fastx_toolkit/), which aligned the adapter sequence to the short reads using up to 2 mismatches and no indels. Regions that aligned were "clipped" off from the read. Terminal C nucleotides introduced at the 3' end of the RNA via the cloning procedure are also trimmed.
The trimmed portions were collapsed into identical reads, their count noted and aligned to the human genome (version hg19, using the gender build appropriate to the sample in question - female/male) using Bowtie (Langmead B. et al.). The alignment parameter allowed 0, 1, or 2 mismatches iteratively. We report reads that mapped 20 or fewer times. Discrepancies between hg18 and hg19 versions of CSHL small RNA data: The alignment pipeline for the CSHL small RNA data was updated upon the release of the human genome version hg19, resulting in a few noteworthy discrepancies with the hg18 dataset. First, mapping was conducted with the open-source Bowtie algorithm (http://bowtie-bio.sourceforge.net/index.shtml) rather than the custom NexAlign software. As each algorithm uses different strategies to perform alignments, the mapping results may vary even in genomic regions that do not differ between builds. The read processing pipeline also varies slightly, in that we no longer retain information regarding whether a read was 'clipped' off adapter sequence.
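The clipping and collapsing steps described for both builds can be sketched in Python as follows. This is a minimal illustration, not the custom clipper program itself: the function names, the `min_overlap` parameter, the adapter string, and the example reads are all hypothetical. The sketch performs an ungapped comparison of the adapter against every possible start position in the read, accepts the leftmost position with at most 2 mismatches (no indels), and then collapses identical clipped reads into counts. It does not cover quality trimming or the removal of terminal C nucleotides.

```python
from collections import Counter


def clip_adapter(read, adapter, max_mismatches=2, min_overlap=8):
    """Remove a 3' adapter from a read using an ungapped comparison.

    min_overlap is an assumption of this sketch (not taken from the
    pipeline description); it avoids clipping on very short chance matches.
    """
    for start in range(len(read)):
        overlap = min(len(read) - start, len(adapter))
        if overlap < min_overlap:
            break
        mismatches = sum(
            1 for a, b in zip(read[start:start + overlap], adapter) if a != b
        )
        if mismatches <= max_mismatches:
            return read[:start]  # keep only the insert upstream of the adapter
    return read  # no adapter found; return the read unchanged


def collapse_reads(reads):
    """Collapse identical sequences and record how often each was seen."""
    return Counter(reads)


# Hypothetical adapter and reads, for illustration only.
example_adapter = "TGGAATTCTCGG"
example_reads = [
    "ACGTACGTACGTTGGAATTCTCGG",  # insert followed by an exact adapter
    "ACGTACGTACGTTGGAATTATCGG",  # same insert, one mismatch in the adapter
    "GATTACAGATTACA",            # no adapter present
]
clipped = [clip_adapter(r, example_adapter) for r in example_reads]
collapsed = collapse_reads(clipped)
# collapsed == Counter({"ACGTACGTACGT": 2, "GATTACAGATTACA": 1})
```

Collapsing before alignment means each distinct sequence is aligned once and its count is carried along, which matches the per-lane collapsed counts described above.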
sRNAMap is a collection of sRNAs, regulators, and targets in microbial genomes. It provides valuable information on sRNAs, such as their secondary structure, expressed conditions, the expression profiles, the transcriptional start sites, and cross-links to other biological databases. Various textual and graphical interfaces were also designed and implemented to facilitate the data access in sRNAMap. Overall, this work presents an integrated database, namely sRNAMap, to collect the sRNA genes, the transcriptional regulators of sRNAs and the sRNA target genes by integrating a variety of biological databases and by surveying literature. It currently contains 397 sRNAs, 62 regulators/sRNAs and 60 sRNAs/targets in seventy microbial genomes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data provided here are part of a Galaxy Training Network tutorial that analyzes small RNA-seq (sRNA-seq) data from a study published by Harrington et al. (DOI:10.1186/s12864-017-3692-8) to detect differential abundance of various classes of endogenous short interfering RNAs (esiRNAs). The goal of this study was to investigate "connections between differential retroTn and hp-derived esiRNA processing and cellular location, and to investigate the potential link between mRNA 3’ end cleavage and esiRNA biogenesis." To this end, sRNA-seq libraries were constructed from triplicate Drosophila tissue culture samples under conditions of either control RNAi or RNAi knockdown of a factor involved in mRNA 3’ end processing, Symplekin. This dataset (GEO Accession: GSE82128) consists of single-end, size-selected, non-rRNA-depleted sRNA-seq libraries. Because of the long processing time for the large original files, we have downsampled the original raw data files to include only reads that align to a subset of interesting transcript features including: (1) transposable elements, (2) Drosophila piRNA clusters, (3) Symplekin, and (4) genes encoding mass spectrometry-defined protein binding partners of Symplekin from Additional File 2 in the indicated paper by Harrington et al. More details on features 1 and 2 can be found here: https://github.com/bowhan/piPipes/blob/master/common/dm3/genomic_features (piRNA_Cluster, Trn). All features are from the Drosophila genome Apr. 2006 (BDGP R5/dm3) release.
It is intended to provide information on the sequences and functions of transcripts which do not code for proteins, but perform regulatory roles in the cell. Currently, the database includes over 30,000 individual sequences from 99 species of Bacteria, Archaea and Eukaryota. The primary source of sequences included in the database was GenBank. Additional annotation information for mouse and human ncRNAs was derived from the FANTOM3 database and the H-Invitational Integrated Database of Annotated Human Genes version 3.4, respectively. Genome mapping information was derived from the data available at the UCSC Genome Browser site. The sequences and annotations of small cytoplasmic RNAs from bacteria, for which annotation is lacking in the genome sequences, were derived from the Rfam database. The microRNAs and snoRNAs which were available in previous editions, as well as other housekeeping (infrastructural) RNAs (e.g. rRNA, tRNA, snRNA, SRP RNA), are not included in our database to avoid redundancy with more specialized databases which have emerged in recent years.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rhizomania in sugar beet causes significant yield and sucrose loss worldwide. The disease is caused by Beet necrotic yellow vein virus (BNYVV) and vectored by the plasmodiophorid Polymyxa betae. Resistance to rhizomania in commercial cultivars is currently dependent upon the use of the Rz1 and Rz2 resistance genes in sugar beet. We have developed an ethyl methanesulfonate (EMS) mutant breeding line (KEMS12; PI672570) that is highly resistant to rhizomania. Using rhizomania resistant (R) and susceptible (S) sugar beet breeding lines, natural infection, and comprehensive RNA sequencing, we have identified the accumulation of a unique set of sncRNAs derived from both the sugar beet plant and the BNYVV virus during active infection that may have regulatory roles in resistance and/or susceptibility to rhizomania. Examples of target genes that are differentially expressed in the roots and leaves at early and late infection stages in sugar beet by plant-derived miRNAs include Bevul.9G209500 (cytoplasm related catalytic activity), Bevul.2G095700 (potassium transporter) and Bevul.9G160600 (zinc finger), which were up-regulated in the R line (vs. S). Viral-derived sncRNAs predominantly originated from RNA1 and RNA2 and targeted a subset of 69 sugar beet genes with overall expression that showed a strong negative correlation with higher sncRNA abundance. The results presented here demonstrate, for the first time, putative roles of sugar beet miRNAs in rhizomania resistance, and of BNYVV-derived sncRNAs and small peptides as potential pathogenicity factors.
Overview of small-RNA sequencing information and subsequent data analysis.
DASHR reports the annotation, expression and evidence for specific RNA processing (cleavage specificity scores/entropy) of human sncRNA genes, precursor and mature sncRNA products across different human tissues and cell types. DASHR integrates information from multiple existing annotation resources for small non-coding RNAs, including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear (snRNAs), nucleolar (snoRNAs), cytoplasmic (scRNAs), transfer (tRNAs), tRNA fragments (tRFs), and ribosomal RNAs (rRNAs). These datasets were obtained from non-diseased human tissues and cell types and were generated for studying or profiling small non-coding RNAs. This collection references RNA records.
The database compiles all complete or nearly complete SSU (small subunit) and LSU (large subunit) ribosomal RNA sequences. Sequences are provided in aligned format. The alignment takes into account secondary structure information derived by comparative sequence analysis of thousands of sequences. Additional information, such as literature references, taxonomy, secondary structure models and nucleotide variability maps, is also available.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Title of data: Table S7. Description of data: Detailed description of the artificial miRNA test dataset. (XLSX 1478 kb)