92 datasets found
  1. Data from: Impact of Library Preparation on Downstream Analysis and...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 19, 2013
    Cite
    Sun, Zhifu; Zhang, Yuji; Wang, Liguo; Bhagwate, Aditya V.; Asmann, Yan W.; Kalari, Krishna R.; Perez, Edith A.; Baker, Tiffany R.; Thompson, E. Aubrey; Kocher, Jean-Pierre A.; Carr, Jennifer M.; Nair, Asha (2013). Impact of Library Preparation on Downstream Analysis and Interpretation of RNA-Seq Data: Comparison between Illumina PolyA and NuGEN Ovation Protocol [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001650929
    Authors
    Sun, Zhifu; Zhang, Yuji; Wang, Liguo; Bhagwate, Aditya V.; Asmann, Yan W.; Kalari, Krishna R.; Perez, Edith A.; Baker, Tiffany R.; Thompson, E. Aubrey; Kocher, Jean-Pierre A.; Carr, Jennifer M.; Nair, Asha
    Description

    Objectives: The sequencing by the PolyA selection is the most common approach for library preparation. With limited amount or degraded RNA, alternative protocols such as the NuGEN have been developed. However, it is not yet clear how the different library preparations affect the downstream analyses of the broad applications of RNA sequencing.

    Methods and Materials: Eight human mammary epithelial cell (HMEC) lines with high quality RNA were sequenced by Illumina’s mRNA-Seq PolyA selection and NuGEN ENCORE library preparation. The following analyses and comparisons were conducted: 1) the numbers of genes captured by each protocol; 2) the impact of protocols on differentially expressed gene detection between biological replicates; 3) expressed single nucleotide variant (SNV) detection; 4) non-coding RNAs, particularly lincRNA detection; and 5) intragenic gene expression.

    Results: Sequences from the NuGEN protocol had lower (75%) alignment rate than the PolyA (over 90%). The NuGEN protocol detected fewer genes (12–20% less) with a significant portion of reads mapped to non-coding regions. A large number of genes were differentially detected between the two protocols. About 17–20% of the differentially expressed genes between biological replicates were commonly detected between the two protocols. Significantly higher numbers of SNVs (5–6 times) were detected in the NuGEN samples, which were largely from intragenic and intergenic regions. The NuGEN captured fewer exons (25% less) and had higher base level coverage variance. While 6.3% of reads were mapped to intragenic regions in the PolyA samples, the percentages were much higher (20–25%) for the NuGEN samples. The NuGEN protocol did not detect more known non-coding RNAs such as lincRNAs, but targeted small and “novel” lincRNAs.

    Conclusion: Different library preparations can have significant impacts on downstream analysis and interpretation of RNA-seq data. The NuGEN provides an alternative for limited or degraded RNA but it has limitations for some RNA-seq applications.

  2. A Low-Cost Library Construction Protocol and Data Analysis Pipeline for...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Lin Wang; Yaqing Si; Lauren K. Dedow; Ying Shao; Peng Liu; Thomas P. Brutnell (2023). A Low-Cost Library Construction Protocol and Data Analysis Pipeline for Illumina-Based Strand-Specific Multiplex RNA-Seq [Dataset]. http://doi.org/10.1371/journal.pone.0026426
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lin Wang; Yaqing Si; Lauren K. Dedow; Ying Shao; Peng Liu; Thomas P. Brutnell
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The emergence of NextGen sequencing technology has generated much interest in the exploration of transcriptomes. Currently, Illumina Inc. (San Diego, CA) provides one of the most widely utilized sequencing platforms for gene expression analysis. While Illumina reagents and protocols perform adequately in RNA-sequencing (RNA-seq), alternative reagents and protocols promise a higher throughput at a much lower cost. We have developed a low-cost and robust protocol to produce Illumina-compatible (GAIIx and HiSeq2000 platforms) RNA-seq libraries by combining several recent improvements. First, we designed balanced adapter sequences for multiplexing of samples; second, dUTP incorporation in 2nd strand synthesis was used to enforce strand-specificity; third, we simplified RNA purification, fragmentation and library size-selection steps thus drastically reducing the time and increasing throughput of library construction; fourth, we included an RNA spike-in control for validation and normalization purposes. To streamline informatics analysis for the community, we established a pipeline within the iPlant Collaborative. These scripts are easily customized to meet specific research needs and improve on existing informatics and statistical treatments of RNA-seq data. In particular, we apply significance tests for determining differential gene expression and intron retention events. To demonstrate the potential of both the library-construction protocol and data-analysis pipeline, we characterized the transcriptome of the rice leaf. Our data supports novel gene models and can be used to improve current rice genome annotation. Additionally, using the rice transcriptome data, we compared different methods of calculating gene expression and discuss the advantages of a strand-specific approach to detect bona-fide anti-sense transcripts and to detect intron retention events. 
    Our results demonstrate the potential of this low cost and robust method for RNA-seq library construction and data analysis.

  3. Data from: Automated workflow for the cell cycle analysis of (non-)adherent...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Oct 18, 2024
    Cite
    Simona Rodighiero; Elena Ceccacci; Kourosh Hayatigolkhatmi; Chiara Soriani; Oualid El Menna; Emanuel Soda (2024). Automated workflow for the cell cycle analysis of (non-)adherent cells using a machine learning approach [Dataset]. http://doi.org/10.5061/dryad.cvdncjtcx
    Dataset provided by
    Human Technopole Foundation
    European Institute of Oncology
    Authors
    Simona Rodighiero; Elena Ceccacci; Kourosh Hayatigolkhatmi; Chiara Soriani; Oualid El Menna; Emanuel Soda
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Understanding the details of the cell cycle at the level of individual cells is critical for both cellular biology and cancer research. While existing methods using specific fluorescent markers have advanced our ability to study the cell cycle in cells that adhere to surfaces, there is a clear gap when it comes to non-adherent cells. In this study, we combine a specialized surface to improve cell attachment, the genetically-encoded FUCCI(CA)2 sensor, an automated image processing and analysis pipeline, and a custom machine-learning algorithm. This combined approach allowed us to precisely measure the duration of different cell cycle phases in non-adherent cells. Our method provided detailed information from hundreds of cells under different experimental conditions in a fully automated manner. We validated this approach in two different Acute Myeloid Leukemia (AML) cell lines, NB4 and Kasumi-1, which have unique cell cycle characteristics. Additionally, we tested the impact of drugs affecting the cell cycle in NB4 cells. Importantly, our cell cycle analysis system is freely available and has also been validated for use with adherent cells. In summary, this report introduces a comprehensive, automated method for studying the cell cycle in both adherent and non-adherent cells, offering a valuable tool for cancer research and drug development.

    Methods

    RNA extraction, RNA-seq protocol and data analysis

    Total RNA was extracted from dry pellets of cells collected before and after acquisition and purified using the Zymo Research Quick-RNA Miniprep (W/O directzol). Reverse transcription was performed with the SuperScript II Kit (Invitrogen), according to the manufacturer’s protocol. RNA-seq was performed according to the TruSeq Low Sample protocol, selecting only polyadenylated transcripts. In brief, before starting mRNA isolation and library preparation, the integrity of the total RNA was evaluated by running samples on a Bioanalyzer instrument by picoRNA Chip (Agilent); the RNA was then converted into libraries of double-stranded cDNA appropriate for next generation sequencing on the Illumina platform. The Illumina TruSeq v.2 RNA Sample Preparation Kit was used following the manufacturer’s recommendations. Briefly, 0.1-1 μg of total RNA was subjected to two rounds of mRNA purification by denaturing and letting the RNA bind to poly-T oligo-attached magnetic beads. Fragmentation was then performed using the divalent cations contained in the Illumina fragmentation buffer and high temperature. First and second strand cDNA was generated from the fragmented RNA: first strand cDNA was synthesized by SuperScript II (Invitrogen) reverse transcriptase with random primers, and second strand cDNA was synthesized by DNA polymerase I and RNase H. The cDNA was subsequently isolated using AMPure XP beads (depending on the concentration used, these beads can efficiently recover PCR products of different sizes). The product recovered contained overhanging strands of various lengths due to the fragmentation procedure. The 5’ and 3’ ends of the cDNA were repaired by the 3’-5’ exonuclease activity and the polymerase activity and adenylated at the 3’ extremities before ligating specific Illumina oligonucleotide adapters, followed by 15 cycles of PCR using the proprietary Illumina primer mix to enrich the DNA fragments. Prepared libraries were quality checked and quantified using the Agilent high sensitivity DNA assay on a Bioanalyzer 2100 instrument (Agilent Technologies). Raw reads (51 bp PE) for NB4 and Kasumi-1 cells were quality-filtered and aligned to the hg18 reference genome with the nf-core/rnaseq v3.9 pipeline, using STAR as the aligner and Salmon for quantification with default parameters. Gene counts for each sample were log1p-transformed, and the mean value across the two replicates was used to compute the Pearson correlation between gene expression pre- and post-time-lapse acquisition.

    EdU incorporation and assessment by Flow Cytometry

    A two-hour EdU pulse was performed by replacing half of the total media volume of the cells with 2X concentrated EdU in the corresponding growth medium, followed by subsequent fixation. Click-iT™ EdU Alexa Fluor™ 647 Flow Cytometry Assay Kit (CN: C10419, Thermo-Fisher Scientific, Waltham, MA, USA) and Click-iT™ EdU Cell Proliferation Kit for Imaging, Alexa Fluor™ 647 dye (CN: C10340, Thermo-Fisher Scientific, Waltham, MA, USA) were used for flow cytometry and imaging, respectively. The experiments were performed according to the manufacturer’s protocols for the mentioned kits. DNA staining with DAPI using 500 μL of 5 μg/mL DAPI in PBS for 10^6 cells, followed by overnight incubation at 4°C, was additionally performed for cell cycle profiling by flow cytometry. Alternatively, 5 μg/mL Hoechst® 33342 (Thermo-Fisher Scientific, Waltham, MA, USA) was used to stain DNA for imaging purposes.

    Fluorescence Time-Lapse Imaging

    Images were acquired with a Leica Thunder Imager (Leica Microsystems, Wetzlar, Germany), equipped with a Lumencor Spectra X Light Engine (Lumencor, Beaverton, USA) for fluorescence excitation, a motorized stage and a Leica DFC9000 GTC camera. For non-adherent cells, images were acquired with LAS X software (Leica Microsystems, Wetzlar, Germany, version 3.7.5.24914) using a 20X/0.75NA air objective, and 2x2 binning was applied to increase the SNR. The mCherry and mVenus signals were detected respectively with 540-580 nm and 460-500 nm excitation filters, 585 and 505 nm dichroic mirrors and 592-668 nm and 512-542 nm emission filters. The brightfield channel was also acquired for representation purposes. We imaged 20 to 25 fields of view per well; focal points were manually set in each position before starting the acquisition and kept constant during the whole time-lapse thanks to the Adaptive Focus Control (AFC, Leica Microsystems). The total duration of the time-lapse on non-adherent cells was 72 hours, and the time interval was set to 1 hour to prevent cell phototoxicity. The total duration of the time-lapse on MDA-MB-231 adherent cells was 120 hours, and the time interval was set to 30 minutes.
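The pre/post comparison described in the Methods above (log1p-transform the gene counts, average the two replicates, then compute a Pearson correlation) could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the toy counts are hypothetical, and the dataset's own scripts may order or implement the steps differently.

```python
import math

def mean_log1p(replicates):
    """Log1p-transform each replicate, then average per gene across replicates.

    `replicates` is a list of equal-length count lists, one per replicate.
    """
    n = len(replicates)
    n_genes = len(replicates[0])
    return [sum(math.log1p(rep[g]) for rep in replicates) / n
            for g in range(n_genes)]

def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy counts: two replicates x four genes, before and after imaging.
pre = [[10, 0, 250, 31], [12, 1, 240, 29]]
post = [[11, 0, 260, 30], [9, 2, 255, 33]]

r = pearson(mean_log1p(pre), mean_log1p(post))
print(f"Pearson r (pre vs post): {r:.3f}")
```

A value of r close to 1 would indicate that expression profiles are similar before and after the time-lapse acquisition.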

  4. Data from:...

    • osdr.nasa.gov
    • catalog.data.gov
    Updated Apr 26, 2022
    Cite
    Amanda Saravia-Butler; Jonathan Galazka; Medaya Torres; Yi-Chun Chen; Sungshin Choi; Rebecca Klotz; Stephanie Perreau; Dennis Leveson Gower; Yasaman Shirazi; San-Huei Lai Polo; Amanda Saravia-Butler; Homer Fogle; Medaya Torres; Yi-Chun Chen; Valery Boyko; Marie Dinh; Sungshin Choi; America Reyes Wang; Christina Lim; Tyler Marsh; Vandana Verma; Rebecca Klotz; Amanda Saravia-Butler; Sylvain Costes; Samrawit Gebre; Afshin Beheshti (2022). Transcriptional-analysis-of-spleens-from-mice-preserved-with-the-Rapid-Freeze-hardware [Dataset]. https://osdr.nasa.gov/bio/repo/data/studies/OSD-272
    Dataset provided by
    NASA (http://nasa.gov/)
    Authors
    Amanda Saravia-Butler; Jonathan Galazka; Medaya Torres; Yi-Chun Chen; Sungshin Choi; Rebecca Klotz; Stephanie Perreau; Dennis Leveson Gower; Yasaman Shirazi; San-Huei Lai Polo; Amanda Saravia-Butler; Homer Fogle; Medaya Torres; Yi-Chun Chen; Valery Boyko; Marie Dinh; Sungshin Choi; America Reyes Wang; Christina Lim; Tyler Marsh; Vandana Verma; Rebecca Klotz; Amanda Saravia-Butler; Sylvain Costes; Samrawit Gebre; Afshin Beheshti
    License

    Attribution 1.0 (CC BY 1.0), https://creativecommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    Data from the NASA Rodent Research-1 (RR-1) mission showed that gene-expression levels in mouse livers differ depending on which tissue preservation protocol is used and that slow freezing is not an effective method for preserving signals in gene-expression data. In response to these and other observations, the Rapid Freeze hardware was built for use on the International Space Station. The Rapid Freeze hardware freezes mouse tissues (Glovebox freezer) and whole carcasses (Cryochiller) at rates closely mimicking those attained with immersion in liquid nitrogen. Because this hardware will be used extensively on future rodent research missions, it is crucial to understand whether or not it preserves signals in gene expression data in order to maximize the value of these rare and expensive spaceflight experiments. Therefore, this study was designed with three goals: 1) To evaluate the temperature profile of the Cryochiller and Glovebox freezer cartridges (Rapid Freeze hardware) over time during mock on-orbit procedures; 2) To determine the freezing profiles of tissues and carcasses using Rapid Freeze hardware at both optimal and sub-optimal temperatures (to mimic on-orbit operations), compared with those frozen in liquid nitrogen (the laboratory gold standard) or frozen at -80 C (the current standard method); 3) To identify gene expression changes in a) tissues that were frozen via the Glovebox freezer and b) tissues dissected from whole or partial carcasses that were frozen via the Cryochiller versus tissues that were frozen via control methods (liquid nitrogen or -80 C slow freeze) to assess how the Rapid Freeze hardware compares with laboratory gold standard practices and our current standard methods.

  5. RNA-Seq Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 10, 2025
    Cite
    Data Insights Market (2025). RNA-Seq Report [Dataset]. https://www.datainsightsmarket.com/reports/rna-seq-1981173
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The RNA sequencing (RNA-Seq) market is experiencing robust growth, driven by the increasing adoption of next-generation sequencing (NGS) technologies in various life science applications. The market's expansion is fueled by several factors, including the rising prevalence of chronic diseases necessitating advanced diagnostic tools, the accelerating demand for personalized medicine approaches, and the growing investments in research and development within the pharmaceutical and biotechnology sectors. Technological advancements, such as improved sequencing accuracy and reduced costs, are further stimulating market growth. Furthermore, the expanding applications of RNA-Seq in oncology, infectious disease research, and agriculture are contributing to its significant market value. The competitive landscape is characterized by a mix of large established players and emerging innovative companies, leading to continuous product development and market diversification. While challenges remain, such as the complexity of data analysis and the need for skilled professionals, the overall outlook for the RNA-Seq market remains highly positive, with substantial growth potential in the coming years. Despite the positive trajectory, market penetration in developing nations remains limited due to factors such as high costs and infrastructure limitations. Furthermore, stringent regulatory approvals and the need for standardized data analysis protocols pose some hurdles. Nevertheless, the continuous innovation in sequencing technologies, coupled with declining costs, is likely to overcome these challenges. The increasing accessibility of bioinformatics tools and the emergence of cloud-based data analysis platforms are also expected to accelerate market growth. Segmentation by technology (e.g., Illumina, PacBio), application (e.g., oncology, transcriptomics), and end-user (e.g., research institutions, pharmaceutical companies) reveals specific opportunities and market niches. 
    Focusing on these segment-specific needs will be crucial for market players aiming to capitalize on the market's future potential. The long-term forecast projects a sustained high growth rate, indicating RNA-Seq's pivotal role in advancing biomedical research and healthcare.

  6. Data from: TARGET-Seq: A Protocol For High-Sensitivity Single-Cell...

    • data.mendeley.com
    Updated Sep 10, 2020
    Cite
    Alba Rodriguez-Meira (2020). TARGET-Seq: A Protocol For High-Sensitivity Single-Cell Mutational Analysis and Parallel RNA Sequencing [Dataset]. http://doi.org/10.17632/k92cnf2fph.1
    Authors
    Alba Rodriguez-Meira
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TARGET-Seq is a protocol for TARGETed high-sensitivity single-cell mutational analysis, parallel RNA SEQuencing and cell-surface proteomics, with extremely low allelic-dropout rates. In this manuscript, we present a detailed step-by-step protocol for the application of TARGET-seq (Rodriguez-Meira et al, Molecular Cell, 2019), including troubleshooting tips, approaches for automation and methods for high-throughput multiplexing of libraries. In this dataset, we provide program files for specific liquid handlers corresponding to each step of the protocol.

  7. Data from:...

    • osdr.nasa.gov
    Updated Jul 30, 2024
    Cite
    Jorge Ruas; Jorge C. Correia; Paulo R. Jannig; Maya L. Gosztyla; Igor Cervenka; Serge Ducommun; Stine M. Præstholm; José M. Dias; Kyle D. Dumont; Zhengye Liu; Qishan Liang; Daniel Edsgärd; Olof Emanuelsson; Paul Gregorevic; Håkan Westerblad; Tomas Venckunas; Marius Brazaitis; Sigitas Kamandulis; Johanna T. Lanner; Ana I. Teixeira; Gene W. Yeo (2024). Zfp697-is-an-RNA-binding-protein-that-regulates-skeletal-muscle-inflammation-and-remodeling---Zfp697-skeletal-muscle-knockout-RNA-Seq-of-unloading-reloading-protocol [Dataset]. https://osdr.nasa.gov/bio/repo/data/studies/OSD-880
    Dataset provided by
    NASA (http://nasa.gov/)
    Authors
    Jorge Ruas; Jorge C. Correia; Paulo R. Jannig; Maya L. Gosztyla; Igor Cervenka; Serge Ducommun; Stine M. Præstholm; José M. Dias; Kyle D. Dumont; Zhengye Liu; Qishan Liang; Daniel Edsgärd; Olof Emanuelsson; Paul Gregorevic; Håkan Westerblad; Tomas Venckunas; Marius Brazaitis; Sigitas Kamandulis; Johanna T. Lanner; Ana I. Teixeira; Gene W. Yeo
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Muscle atrophy is a morbidity and mortality risk factor that happens with disuse, chronic disease, and ageing. Recovery from atrophy involves changes in protein synthesis and different cell types such as muscle fibers, and satellite and immune cells. Here we show that the previously uncharacterized gene and protein Zfp697 is a damage-induced regulator of muscle regeneration. Zfp697/ZNF697 expression is transiently elevated during recovery from muscle atrophy or injury in mice and humans. Sustained Zfp697 expression in mouse muscle leads to a gene expression signature of chemokine secretion, immune cell recruitment, and extracellular matrix remodeling. Myofiber-specific Zfp697 ablation hinders the inflammatory and regenerative response to muscle injury, compromising functional recovery. We uncover Zfp697 as an essential mediator of the interferon gamma response in muscle cells that functions primarily as an ncRNA-binding protein, most notably the pro-regenerative miR-206. This work identifies Zfp697 as an integrator of cell-cell communication necessary for tissue regeneration. Overall design: Comparative gene expression profiling analysis of RNA-seq data for mouse gastrocnemius muscle from model of hindlimb unloading/reloading comparing wild-type and Zfp697 conditional skeletal muscle knockout (mKO) animals. Control mice were kept in conventional cages throughout the entire protocol. Experiment was performed in three replicates per condition.

  8. RNA-Seq differential expression analysis: An extended review and a software...

    • plos.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Juliana Costa-Silva; Douglas Domingues; Fabricio Martins Lopes (2023). RNA-Seq differential expression analysis: An extended review and a software tool [Dataset]. http://doi.org/10.1371/journal.pone.0190152
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Juliana Costa-Silva; Douglas Domingues; Fabricio Martins Lopes
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The correct identification of differentially expressed genes (DEGs) between specific conditions is key to understanding phenotypic variation. High-throughput transcriptome sequencing (RNA-Seq) has become the main option for these studies. Thus, the number of methods and software tools for differential expression analysis from RNA-Seq data has also increased rapidly. However, there is no consensus about the most appropriate pipeline or protocol for identifying differentially expressed genes from RNA-Seq data. This work presents an extended review on the topic that includes the evaluation of six methods of mapping reads, including pseudo-alignment and quasi-mapping, and nine methods of differential expression analysis from RNA-Seq data. The adopted methods were evaluated based on real RNA-Seq data, using qRT-PCR data as reference (gold standard). As part of the results, we developed software that performs all the analyses presented in this work, which is freely available at https://github.com/costasilvati/consexpression. The results indicated that mapping methods have minimal impact on the final DEG analysis, considering that the adopted data have an annotated reference genome. Regarding the adopted experimental model, the DEG identification methods with the most consistent results were limma+voom, NOIseq and DESeq2. Additionally, the consensus among five DEG identification methods guarantees a list of DEGs with great accuracy, indicating that the combination of different methods can produce more suitable results. The consensus option is also included for use in the available software.
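The consensus idea mentioned in the description above (keeping genes that several DEG identification methods agree on) can be illustrated with a small sketch. This is not the consexpression implementation: the function name, the gene identifiers, and the placeholder entries `method_4` and `method_5` are hypothetical, and only limma+voom, NOIseq and DESeq2 are named in the description.

```python
from collections import Counter

def consensus_degs(calls, min_methods):
    """Return genes reported as DEGs by at least `min_methods` methods.

    `calls` maps a method name to the collection of genes it flagged.
    """
    counts = Counter(gene for genes in calls.values() for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_methods)

# Hypothetical DEG calls from five methods on the same comparison.
calls = {
    "limma+voom": {"geneA", "geneB", "geneC"},
    "NOIseq": {"geneA", "geneB"},
    "DESeq2": {"geneA", "geneB", "geneD"},
    "method_4": {"geneA", "geneC"},           # placeholder name
    "method_5": {"geneA", "geneB", "geneC"},  # placeholder name
}

strict = consensus_degs(calls, min_methods=5)    # agreed on by all five
majority = consensus_degs(calls, min_methods=3)  # agreed on by at least three
print(strict, majority)
```

Raising `min_methods` trades sensitivity for agreement: a stricter threshold returns fewer genes, each supported by more methods.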

  9. scRNA-seq + scATAC-seq Challenge at NeurIPS 2021

    • kaggle.com
    zip
    Updated Sep 16, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq + scATAC-seq Challenge at NeurIPS 2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021
    Download size: 2917180928 bytes
    Authors
    Alexander Chervov
    Description

    Context

    Dataset from NeurIPS2021 challenge similar to Kaggle 2022 competition: https://www.kaggle.com/competitions/open-problems-multimodal "Open Problems - Multimodal Single-Cell Integration Predict how DNA, RNA & protein measurements co-vary in single cells"

    It contains single-cell ATAC-seq data (https://en.wikipedia.org/wiki/ATAC-seq#Single-cell_ATAC-seq) and single-cell RNA-seq data (https://en.wikipedia.org/wiki/Single-cell_transcriptomics#Single-cell_RNA-seq).

    Single-cell RNA sequencing data form a matrix: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strong the "expression" of the corresponding gene is in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
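As a rough illustration of that cells-by-genes layout, the sketch below uses plain Python lists. The cell names, gene names, counts, and helper functions are all hypothetical and are not taken from the dataset; real analyses would typically use a package such as Scanpy or Seurat.

```python
# Hypothetical toy matrix: rows are cells, columns are genes.
cells = ["cell_1", "cell_2", "cell_3"]
genes = ["GENE_A", "GENE_B"]
counts = [
    [5, 0],  # cell_1
    [0, 3],  # cell_2
    [2, 7],  # cell_3
]

def expression(matrix, row_names, col_names, row, col):
    """Look up one matrix entry by row and column name."""
    return matrix[row_names.index(row)][col_names.index(col)]

def transpose(matrix):
    """Swap rows and columns (cells x genes <-> genes x cells)."""
    return [list(column) for column in zip(*matrix)]

# Expression of GENE_B in cell_3, read from the cells x genes layout.
value = expression(counts, cells, genes, "cell_3", "GENE_B")

# The "vice versa" layout: genes as rows, cells as columns.
genes_by_cells = transpose(counts)
same = expression(genes_by_cells, genes, cells, "GENE_B", "cell_3")
print(value, same)
```

Both lookups return the same number, since transposing only changes which axis holds the cells.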

    See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data). Or https://satijalab.org/seurat/ "Seurat" - "R" package

    (For companion dataset on CITE-seq = scRNA-seq + Proteomics, see: https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021)

    Particular data

    https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

    Expression profiling by high throughput sequencing
    Genome binding/occupancy profiling by high throughput sequencing

    Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

    Overall design: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

    Contributor(s): Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM

    Citation: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

    Related datasets:

    Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

  10. RNA sequencing data from patients with heart disease and controls

    • researchdata.se
    • demo.researchdata.se
    Updated Mar 18, 2025
    Cite
    Victoria Rotter Sopasakis; Lillemor Mattsson Hultén (2025). RNA sequencing data from patients with heart disease and controls [Dataset]. http://doi.org/10.5878/e48r-gn02
    Dataset provided by
    University of Gothenburg
    Authors
    Victoria Rotter Sopasakis; Lillemor Mattsson Hultén
    Area covered
    Sweden
    Description

    RNA sequencing analysis of atrial cardiac tissue from patients undergoing coronary artery bypass grafting (CABG) or aortic valve replacement (AVR) was performed and compared with atrial cardiac tissue from organ donors and purchased human atrial cardiac RNA.

    RNA sequencing analysis was performed at the Genomics Core Facility at University of Gothenburg, Sweden. All samples (ten controls, ten CABG patients and ten AVR patients) were quality checked by the RNA integrity number (RIN) using Tapestation 2200 RNA screenTape (Agilent Technologies, Santa Clara, CA) and RNA concentration was measured by NanoDrop (Thermo Fisher, Waltham, MA). RIN values ranged between 6.6 and 9.0 for all samples. TruSeq Total Stranded RNA kit with RiboZero (Gold) Sample Preparation Guide (15,031,048 Rev. E) was used for RNA sample preparation (Illumina, San Diego, CA). A total of 10 μl (~1 μg) RNA from each sample was used for library preparation. Directly after depletion, a cleanup step was performed using 110 μl of the RNAClean XP beads (Beckman Coulter, USA) for each sample. The fragmentation step was performed for 8 min. 12 PCR cycles were run for all samples. Libraries were quantified and normalized with Qubit DNA HS Assay kit (Life Technologies, Carlsbad, CA) and fragment size was determined by Tapestation 2200 (Agilent Technologies, Santa Clara, CA). The libraries were pooled using the Illumina protocol for pooling and sequenced with NovaSeq 6000 S1 (Illumina, San Diego, CA) at a read length of 2 × 100 bp.

  11. Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Jun 1, 2020
    Cite
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow (2020). Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection [Dataset]. http://doi.org/10.5061/dryad.vt4b8gtp6
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2020
    Dataset provided by
    Dartmouth College
    Dartmouth–Hitchcock Medical Center
    Authors
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

    Methods

    Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)

    Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.

    Preprocessing of single cell RNA sequencing (scRNA-seq) data

    Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs: low quality and doublets), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying) or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat V3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
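
    The cell-filtering thresholds quoted above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Seurat code: the per-cell field names (n_umi, n_features, pct_mito, pct_hb) and the example cells are hypothetical, and only the numeric cutoffs come from the description.

```python
# Cutoffs taken from the description; all field names below are hypothetical.
MIN_UMI, MAX_UMI = 1000, 20000   # outside this range: low quality or doublet
MIN_FEATURES = 300               # fewer detected features: low quality
MAX_PCT_MITO = 10.0              # above this: likely dying cell
MAX_PCT_HB = 1.0                 # above this: likely red blood cell


def passes_qc(cell):
    """Return True if a cell survives all four filters."""
    if cell["n_umi"] < MIN_UMI or cell["n_umi"] > MAX_UMI:
        return False
    if cell["n_features"] < MIN_FEATURES:
        return False
    if cell["pct_mito"] > MAX_PCT_MITO:
        return False
    if cell["pct_hb"] > MAX_PCT_HB:
        return False
    return True


# Made-up cells, one per failure mode plus one that passes.
cells = [
    {"id": "cell_a", "n_umi": 5200, "n_features": 1800, "pct_mito": 3.1, "pct_hb": 0.0},
    {"id": "cell_b", "n_umi": 640, "n_features": 410, "pct_mito": 2.0, "pct_hb": 0.0},
    {"id": "cell_c", "n_umi": 26000, "n_features": 5200, "pct_mito": 4.0, "pct_hb": 0.0},
    {"id": "cell_d", "n_umi": 3000, "n_features": 900, "pct_mito": 14.5, "pct_hb": 0.0},
    {"id": "cell_e", "n_umi": 2500, "n_features": 700, "pct_mito": 1.0, "pct_hb": 6.0},
]

kept = [c["id"] for c in cells if passes_qc(c)]
print(kept)
```

    Only cell_a passes here; each of the other four trips exactly one filter, which makes it easy to check that every threshold is wired in the intended direction.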

    Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

    Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the Immgen compendium (Aran D. et al., 2019; Heng et al., 2008). We manually updated the Immgen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort Immgen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2; cells that were not labeled confidently after label pruning were assigned “Unknown”.

    Differential gene expression by immune cells

    Differential gene expression analysis within individual cell types was performed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets, and can result in spurious identification of these genes in cell types where they aren’t actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type specific manner. Genes expressed in fewer than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < .05, log2FC >1.
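
    The pseudo-bulk step described above (summing raw counts over all cells of one type within one sample, and keeping only cell types with more than 10 cells in every sample) can be sketched as follows. This is an illustration under assumptions, not the authors' R code: the input layout (a list of sample, cell type, counts tuples), the function name pseudobulk, and the toy data are hypothetical. The toy call uses min_cells=1 so that a handful of cells is enough to show the behaviour.

```python
from collections import Counter, defaultdict


def pseudobulk(cells, min_cells=10):
    """Sum raw counts per (cell type, sample).

    cells: list of (sample, cell_type, {gene: raw_count}) tuples.
    A cell type is kept only if it has more than min_cells cells
    in every sample present in the input.
    """
    samples = {sample for sample, _, _ in cells}
    n_cells = Counter((cell_type, sample) for sample, cell_type, _ in cells)

    totals = defaultdict(lambda: defaultdict(Counter))
    for sample, cell_type, counts in cells:
        totals[cell_type][sample].update(counts)  # adds counts gene by gene

    return {
        cell_type: {s: dict(c) for s, c in by_sample.items()}
        for cell_type, by_sample in totals.items()
        if all(n_cells[(cell_type, s)] > min_cells for s in samples)
    }


# Toy input: NK cells occur twice in both samples, neutrophils only in s1.
toy_cells = [
    ("s1", "NK", {"Gzma": 3, "Ccl5": 1}),
    ("s1", "NK", {"Gzma": 2}),
    ("s2", "NK", {"Gzma": 4}),
    ("s2", "NK", {"Ccl5": 5}),
    ("s1", "Neutrophil", {"S100a8": 9}),
    ("s1", "Neutrophil", {"S100a8": 1}),
]

table = pseudobulk(toy_cells, min_cells=1)
print(table)
```

    In the toy data the neutrophils are dropped because they are absent from sample s2, which mirrors the "sufficiently represented in each sample" rule. The resulting per-sample totals are what a bulk RNA-seq tool would then receive as input.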

    Analysis of arsenic effect on immune cell gene expression by scRNA-seq.

    Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
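
    The gene-level count filter stated above can be written as a one-line predicate. The sketch below is illustrative only: the function name keep_gene and the two-sample toy counts are hypothetical, and only the cutoffs (20 counts per sample, 60 counts in total) come from the description.

```python
def keep_gene(counts_per_sample, min_per_sample=20, min_total=60):
    """Keep a gene only if every sample has at least min_per_sample counts
    and the counts summed over samples reach min_total."""
    return (
        min(counts_per_sample) >= min_per_sample
        and sum(counts_per_sample) >= min_total
    )


# Hypothetical pseudo-bulk counts for three genes across two samples.
toy_counts = {
    "GeneA": [40, 35],   # passes both cutoffs
    "GeneB": [25, 30],   # every sample >= 20, but total 55 < 60
    "GeneC": [10, 200],  # large total, but one sample < 20
}

retained = [gene for gene, counts in toy_counts.items() if keep_gene(counts)]
print(retained)
```

    With three or more samples the per-sample cutoff already implies the total cutoff (3 × 20 = 60), so in a six-sample design like the one described the first condition does the work; the toy example uses two samples only to exercise both conditions.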

  12. Data from: Benchmarking computational doublet-detection methods for...

    • zenodo.org
    bin, zip
    Updated Apr 1, 2022
    + more versions
    Cite
    Nan Miles Xi; Jingyi Jessica Li (2022). Benchmarking computational doublet-detection methods for single-cell RNA sequencing data [Dataset]. http://doi.org/10.5281/zenodo.4444303
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Apr 1, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nan Miles Xi; Jingyi Jessica Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the real and simulation datasets used in the paper "Benchmarking Computational Doublet-Detection Methods for Single-Cell RNA Sequencing Data". Please check the full text published in Cell Systems or our preprint.

    1. real_datasets.zip: 16 real scRNA-seq datasets with experimentally annotated doublets. The name of each file corresponds to the names in the benchmark paper.

    2. simulation_datasets.zip: simulation datasets used in the benchmark, including different experimental conditions, scalability, stability, running time, and the effects of doublet detection on DE gene analysis, highly variable gene identification, cell clustering, and trajectory inference.

    3. result.xlsx: a tabular file that saves benchmarking results, including AUPRC, AUROC, precision, recall, TNR, and cell clustering. It is also the data source for drawing figures in the paper "Protocol for Benchmarking Computational Doublet-Detection Methods in Single-Cell RNA Sequencing Data Analysis".
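
    For readers unfamiliar with the threshold-based metrics listed for result.xlsx, the sketch below shows how precision, recall, and TNR follow from true and predicted doublet labels. It is a generic illustration, not code from the benchmark: the function name and the label vectors are hypothetical, and AUPRC and AUROC (which need continuous scores) are not covered.

```python
def confusion_metrics(truth, predicted):
    """Precision, recall and true negative rate for binary labels.

    truth / predicted: equal-length sequences where 1 marks a doublet
    and 0 marks a singlet.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "tnr": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up annotations for eight droplets.
truth = [1, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 0, 0]

metrics = confusion_metrics(truth, predicted)
print(metrics)
```

    Here two of the three true doublets are found (recall 2/3), one of the three calls is wrong (precision 2/3), and four of the five singlets are left alone (TNR 0.8).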

  13. Data from: RNA Sequencing-Based Single Sample Predictors of Molecular...

    • data.mendeley.com
    Updated Jun 1, 2022
    + more versions
    Cite
    Johan Vallon-Christersson (2022). RNA Sequencing-Based Single Sample Predictors of Molecular Subtype and Risk of Recurrence for Clinical Assessment of Early-Stage Breast Cancer [Dataset]. http://doi.org/10.17632/yzxtxn4nmd.1
    Explore at:
    Dataset updated
    Jun 1, 2022
    Authors
    Johan Vallon-Christersson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Gene expression data and associated supplementary files from RNAseq of breast cancer samples from Staaf et al. (source reference below). Library preparation for mRNA-sequencing was done by a stranded dUTP mRNA protocol or by Illumina stranded TruSeq mRNA protocol. Expression data (Fragments Per Kilobase per Million reads, FPKM) was generated by an analysis pipeline utilizing Hisat/StringTie with GRCh38 human genome primary assembly and GENCODE Release 27 transcripts/genes. Gene expression data is summarized on GENCODE gene identifier. Gene and transcript definitions and gene annotations are from GENCODE Release 27.
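
    As a reminder of what the FPKM unit means, the sketch below applies the textbook definition: fragments assigned to a gene, divided by the gene length in kilobases and by the library size in millions of fragments. It is a simplified illustration, not the Hisat/StringTie pipeline itself; the function name and the example numbers are hypothetical, and the pipeline's actual length and library-size conventions are not reproduced here.

```python
def fpkm(fragments, length_bp, total_fragments):
    """Fragments Per Kilobase per Million, using the textbook formula.

    fragments:        fragments assigned to the gene or transcript
    length_bp:        feature length in base pairs
    total_fragments:  total mapped fragments in the library
    """
    return fragments * 1e9 / (length_bp * total_fragments)


# Hypothetical gene: 500 fragments, 2 kb long, in a 20-million-fragment library.
value = fpkm(500, 2000, 20_000_000)
print(value)  # 12.5
```

    Equivalently: 500 fragments over 2 kb gives 250 fragments per kilobase, and dividing by 20 (million fragments) gives 12.5.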

    Detailed description including material and methods for RNAseq, Hisat/StringTie analysis pipeline, and the development of the Single Sample Predictor (SSP) models for Breast Cancer is available in Staaf et al. (source reference below).

    The developed SSP models are available as an R package available at GitHub (reference below).

  14. Data from: RNAseq analysis of the response of Arabidopsis thaliana to...

    • datasets.ai
    • s.cnmilf.com
    • +1more
    21
    Updated Jan 31, 2023
    + more versions
    Cite
    National Aeronautics and Space Administration (2023). RNAseq analysis of the response of Arabidopsis thaliana to fractional gravity under blue-light stimulation during spaceflight [Dataset]. https://datasets.ai/datasets/rnaseq-analysis-of-the-response-of-arabidopsis-thaliana-to-fractional-gravity-under-blue-l-5c55d
    Explore at:
    21Available download formats
    Dataset updated
    Jan 31, 2023
    Dataset provided by
    NASA (http://nasa.gov/)
    Authors
    National Aeronautics and Space Administration
    Description

    Traveling to nearby extraterrestrial objects having a reduced gravity level (partial gravity) compared to Earth's gravity is becoming a realistic objective for space agencies. The use of plants as part of life support systems will require a better understanding of the interactions among plant growth responses including tropisms, under partial gravity conditions. Here, we present results from our latest space experiments on the ISS, in which seeds of Arabidopsis thaliana were germinated, and seedlings grew for six days under different gravity levels, namely micro-g, several intermediate partial-g levels, and 1g, and were subjected to irradiation with blue light for the last 48 hours. RNA was extracted from 20 samples for subsequent RNAseq analysis. Transcriptomic analysis was performed using the HISAT2-Stringtie-DESeq pipeline. Differentially expressed genes were further characterized for global responses using the GEDI tool, gene networks and for Gene Ontology (GO) enrichment.

  15. U87 cell which overexpresdsed control vector or tau were used for RNA-Seq...

    • scidb.cn
    Updated Jun 12, 2025
    Cite
    Han Wanhong; zhao wu jie; He Jiawei; Zhao Wenpeng; Lu Hanwen; Lu Zhenwei; Qiu xiansheng; Chang Chen; Zhang Yaya; Xie Yuanyuan; Geng Yanyan; Zhang Bingchang; Wang zhanxiang (2025). U87 cell which overexpresdsed control vector or tau were used for RNA-Seq analysis [Dataset]. http://doi.org/10.57760/sciencedb.26421
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Han Wanhong; zhao wu jie; He Jiawei; Zhao Wenpeng; Lu Hanwen; Lu Zhenwei; Qiu xiansheng; Chang Chen; Zhang Yaya; Xie Yuanyuan; Geng Yanyan; Zhang Bingchang; Wang zhanxiang
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset originates from RNA sequencing analysis of the human glioblastoma U87 cell line, aiming to investigate the impact of Tau protein overexpression on the transcriptome. In this experiment, U87 cells were infected with either Tau-overexpressing plasmids or control plasmids (3 × 10⁶ cells per sample), yielding three independent samples for each group (Tau and control). Cells were lysed using TRIzol reagent, and total RNA was extracted. The concentration and purity of RNA were measured using the NanoDrop ND-1000 (NanoDrop, USA), and RNA integrity was assessed using the Agilent Bioanalyzer 2100, with all RIN values above 7.0, further confirmed by denaturing agarose gel electrophoresis.

    From each sample, 1 μg of total RNA was used for poly(A) RNA purification using Dynabeads Oligo (dT)25 (Thermo Fisher, USA) in two rounds. Fragmentation of the poly(A) RNA was then performed at 94°C for 5–7 minutes using the Magnesium RNA Fragmentation Module (NEB, USA). The fragmented RNA was reverse-transcribed into cDNA using SuperScript II reverse transcriptase (Invitrogen, USA), and the second-strand DNA was synthesized using E. coli DNA polymerase I, RNase H, and dUTP to produce U-labeled double-stranded DNA. After adenylation of 3' ends, sequencing adapters with T-overhangs were ligated to the A-tailed DNA fragments. Size selection was performed using AMPure XP magnetic beads, with an average insert size of 300 ± 50 bp. The U-labeled second-strand DNA was treated with UDG enzyme, and the libraries were amplified via PCR under the following conditions: 95°C for 3 min; 8 cycles of 98°C for 15 s, 60°C for 15 s, and 72°C for 30 s; followed by a final extension at 72°C for 5 min, generating the final sequencing libraries.

    Paired-end 150 bp sequencing (PE150) was performed using the Illumina NovaSeq™ 6000 platform, following the manufacturer's protocol.

    Raw sequencing data were processed using fastp to remove adapter contamination, low-quality bases, and reads with undetermined bases. Clean reads were aligned to the human reference genome (GRCh38) using HISAT2, followed by transcript assembly using StringTie. All transcriptomes were merged using gffcompare to generate a comprehensive transcript annotation. Gene expression levels were calculated by StringTie and normalized as FPKM (Fragments Per Kilobase of exon model per Million mapped reads). Differential gene expression analysis between groups was conducted using DESeq2, with differentially expressed genes (DEGs) defined as those with FDR < 0.05 and |fold change| ≥ 2. edgeR was also applied in specific pairwise comparisons. Enrichment analysis of DEGs was conducted using the KEGG database.
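
    The DEG definition above (FDR < 0.05 and |fold change| ≥ 2) can be illustrated with a small, self-contained sketch. Two assumptions are made and should be read as such: the FDR is computed here with the Benjamini-Hochberg procedure, and |fold change| ≥ 2 is interpreted as a linear fold change of at least 2 in either direction (that is, |log2 fold change| ≥ 1). The gene names, p-values, and fold changes are made up; this is not DESeq2 output.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, pvalues, fold_changes, fdr_cutoff=0.05, min_fold=2.0):
    """Select genes with adjusted p < fdr_cutoff and |log2 FC| >= log2(min_fold)."""
    fdr = benjamini_hochberg(pvalues)
    return [
        g
        for g, q, fc in zip(genes, fdr, fold_changes)
        if q < fdr_cutoff and abs(math.log2(fc)) >= math.log2(min_fold)
    ]


# Hypothetical results for four genes (linear fold change, Tau vs. control).
genes = ["G1", "G2", "G3", "G4"]
pvalues = [0.001, 0.02, 0.03, 0.5]
fold_changes = [4.0, 0.4, 1.5, 3.0]

degs = call_degs(genes, pvalues, fold_changes)
print(degs)
```

    In this toy input G1 (up four-fold) and G2 (down 2.5-fold) are called; G3 is significant but changes too little, and G4 changes three-fold but is not significant.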

  16. Data from: Whole-transcriptome sequencing identifies neuroinflammation,...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Jun 24, 2022
    Cite
    Cheng Ni; Zizheng Suo; Jing Yang; Bowen Zhou; Yinyin Qu; Wenjie Xu; Min Li; Ting Xiao; Hui Zheng (2022). Whole-transcriptome sequencing identifies neuroinflammation, metabolism and blood-brain barrier related processes in the hippocampus of aged mice during perioperative period [Dataset]. http://doi.org/10.5061/dryad.jsxksn0cj
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 24, 2022
    Dataset provided by
    Chinese Academy of Medical Sciences & Peking Union Medical College
    Peking University Third Hospital
    Authors
    Cheng Ni; Zizheng Suo; Jing Yang; Bowen Zhou; Yinyin Qu; Wenjie Xu; Min Li; Ting Xiao; Hui Zheng
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Aim: Perioperative neurocognitive disorders (PND) occur frequently after surgery and anesthesia, especially in aged patients. Previous studies have shown multiple PND related mechanisms in the hippocampus; however, their relationships remain unclear. Meanwhile, the perioperative neuropathological processes are sophisticated and changeable, and a single-period study could not reveal the accurate mechanisms. Thus, a multiperiod whole-transcriptome study is necessary to elucidate the gene expression patterns during the perioperative period.

    Methods: Aged C57BL/6 mice were subjected to exploratory laparotomy under sevoflurane anesthesia. Whole-transcriptome sequencing (RNA-seq analysis) was performed on the hippocampi from control condition (Con), 30 minutes (Day0), 2 days (Day2) and 7 days (Day7) after surgery. Gene Ontology/Kyoto Encyclopedia of Genes and Genomes analyses, quantitative Real-Time PCR, immunofluorescence and fear conditioning test were also performed to elucidate the pathological processes and modulation networks during the period.

    Results: Through RNA-seq analysis, 328, 3597 and 4179 differentially expressed genes (DEGs) were screened out in intraoperative period (Day0 vs Con), early postoperative period (Day2 vs Day0) and late postoperative period (Day7 vs Day2). The involved GO biological processes were divided into 9 categories, and positively regulated processes outnumbered negatively regulated ones. Seventy-four transcription factors were highlighted. The potential synaptic and neuroinflammatory pathways were constructed for Neurotransmitter, Synapse and Neuronal alteration categories with 9 DEGs (Htr1a, Rims1, Ezh2, etc.). The metabolic and mitochondrial pathways were constructed for Metabolism, Oxidative stress and Biological rhythm categories with 9 DEGs (Gpld1, Sirt1, Cry2, etc.). The blood-brain barrier and neurotoxicity related pathways were constructed for Blood-brain barrier, Neurotoxicity and Cognitive function categories with 10 DEGs (Mmp2, Itpr1, Nrf1, etc.).

    Conclusion: The results revealed gene expression patterns and modulation networks in the aged hippocampus during the perioperative period, which provide insights into overall mechanisms and potential therapeutic targets for prevention and treatment of perioperative central nervous system diseases, such as PND, from the genetic level.

    Methods

    Animals

    Female C57BL/6 mice, 18-month-old, weighing between 23 and 34 g were used. The mice were housed in cages and maintained on a standard housing condition with food and water ad libitum for 2 weeks. Four study time points were chosen: control condition (Con, preoperative time point), 30 minutes after surgery (Day0, postoperative day 0, the time point between intraoperative period and postoperative period), postoperative day 2 (Day2) and postoperative day 7 (Day7). The perioperative period was divided into intraoperative period (between Con and Day0), early postoperative period (between Day0 and Day2), and late postoperative period (between Day2 and Day7). Mice were randomly assigned to Con, Day0, Day2 and Day7 groups (n=6).

    Surgery and Anesthesia

    Minimum alveolar concentration of sevoflurane for mice has been reported as 2.4 - 2.7%. In the present study, mice in Day0, Day2 and Day7 groups received 2.5% sevoflurane in 50% oxygen for 30 min through breathing masks, and the control group received 50% oxygen for 30 min. The mice breathed spontaneously, and the sevoflurane concentration was monitored continuously with an anesthetic monitor (Datex, Tewksbury, MA, USA). The surgical procedure (exploratory laparotomy) was performed for the 3 groups. A longitudinal midline incision was made from xiphoid to 0.5 cm proximal pubic symphysis on the skin. The abdominal muscles and peritoneum, then approximately 10 cm of the intestine were exteriorized. The bowel loops remained outside the abdominal cavity for 1 minute and were then replaced into the abdominal cavity. The incision was finally sutured layer by layer with 5-0 Vicryl thread. The entire procedure was completed under sevoflurane anesthesia. The rectal temperature was maintained at 37 ± 0.5 °C, and this surgical protocol has been shown not to significantly alter values of blood pressure and blood gas in the preliminary studies. Then the mice were put into a chamber containing 50% oxygen until 10 minutes after the recovery of consciousness. Mice in Day0, Day2 and Day7 groups were sacrificed by decapitation 30 min, 2 days and 7 days after surgery respectively. The brain tissue was removed rapidly, and the hippocampus was dissected out and frozen in liquid nitrogen.

    RNA-Seq Library Preparation and Sequencing Analysis

    Total RNAs were isolated from the hippocampus using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), then digested with RNase-Free DNase to remove residual DNAs. The quantity and purity were measured with Nanodrop 2000 (ThermoFisher, Wilmington, DE, USA) and Qubit Fluorometer (Invitrogen, Carlsbad, CA, USA). Library construction was performed according to the Illumina sample preparation for RNA-seq protocol. The mRNA was enriched by magnetic beads with Oligo (dT) after the samples were qualified. When the enrichment was complete, the mRNA was interrupted into short segments with the addition of a fragmentation buffer. Subsequently, double-stranded cDNA was synthesized by reverse transcription using 6-base random primers. The purified double-stranded cDNA was subjected to terminal reparation, single nucleotide A (adenine) addition and serial sequencing. The fragment size of double-stranded cDNA was selected by AMPure XP beads (Beckman Coulter, Shanghai, China), and the selected double-stranded cDNA was subjected to PCR enrichment to construct a cDNA library. Construction and sequencing of the RNA-seq library for each sample were conducted by Compass Biotechnology (Beijing, China) based on the protocols of Illumina HiSeq™ 2500/MiSeq™ to generate paired-end reads (150 bp in length). The quality of RNA-seq reads from all the brain tissues was checked using FastQC (v0.11.5, Babraham Institute, Cambridge, UK).

  17. Raw and processed (filtered and annotated) scRNAseq data

    • figshare.com
    zip
    Updated Jun 12, 2023
    Cite
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac (2023). Raw and processed (filtered and annotated) scRNAseq data [Dataset]. http://doi.org/10.6084/m9.figshare.23499192.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single cell RNA-seq data generated and reported as part of the manuscript entitled "Dissecting the mechanisms underlying the Cytokine Release Syndrome (CRS) mediated by T Cell Bispecific Antibodies" by Leclercq-Cohen et al. 2023. Raw and processed (filtered and annotated) data are provided as AnnData objects which can be directly ingested to reproduce the findings of the paper or for ab initio data reuse:

    1. raw.zip: concatenated raw/unfiltered counts for the 20 samples in the standard Market Exchange (MEX) format.

    2. 230330_sw_besca2_LowFil_raw.h5ad: filtered cells and raw counts in the HDF5 format.

    3. 221124_sw_besca2_LowFil.annotated.h5ad: filtered cells and log normalized counts, along with cell type annotation, in the HDF5 format.

    scRNAseq data generation: Whole blood from 4 donors was treated with 0.2 μg/mL CD20-TCB, or incubated in the absence of CD20-TCB. At baseline (before addition of TCB) and assay endpoints (2, 4, 6, and 20 hrs), blood was collected for total leukocyte isolation using EasySep™ red blood cell depletion reagent (Stemcell). Briefly, cells were counted and processed for single cell RNA sequencing using the BD Rhapsody platform. To load several samples on a single BD Rhapsody cartridge, sample cells were labelled with sample tags (BD Human Single-Cell Multiplexing Kit) following the manufacturer’s protocol prior to pooling. Briefly, 1 × 10⁶ cells from each sample were re-suspended in 180 μL FBS Stain Buffer (BD, PharMingen) and sample tags were added to the respective samples and incubated for 20 min at RT. After incubation, 2 successive washes were performed by addition of 2 mL stain buffer and centrifugation for 5 min at 300 g. Cells were then re-suspended in 620 μL cold BD Sample Buffer, stained with 3.1 μL of both 2 mM Calcein AM (Thermo Fisher Scientific) and 0.3 mM Draq7 (BD Biosciences) and finally counted on the BD Rhapsody scanner. Samples were then diluted and/or pooled equally in 650 μL cold BD Sample Buffer. The BD Rhapsody cartridges were then loaded with up to 40 000 – 50 000 cells. Single cells were isolated using Single-Cell Capture and cDNA Synthesis with the BD Rhapsody Express Single-Cell Analysis System according to the manufacturer’s recommendations (BD Biosciences). cDNA libraries were prepared using the Whole Transcriptome Analysis Amplification Kit following the BD Rhapsody System mRNA Whole Transcriptome Analysis (WTA) and Sample Tag Library Preparation Protocol (BD Biosciences). Indexed WTA and sample tags libraries were quantified and quality controlled on the Qubit Fluorometer using the Qubit dsDNA HS Assay, and on the Agilent 2100 Bioanalyzer system using the Agilent High Sensitivity DNA Kit. Sequencing was performed on a NovaSeq 6000 (Illumina) in paired-end mode (64-8-58) with NovaSeq 6000 S2 v1 or NovaSeq 6000 SP v1.5 reagents kits (100 cycles).

    scRNAseq data analysis: Sequencing data was processed using the BD Rhapsody Analysis pipeline (v 1.0 https://www.bd.com/documents/guides/user-guides/GMX_BD-Rhapsody-genomics-informatics_UG_EN.pdf) on the Seven Bridges Genomics platform. Briefly, read pairs with low sequencing quality were first removed and the cell label and UMI identified for further quality check and filtering. Valid reads were then mapped to the human reference genome (GRCh38-PhiX-gencodev29) using the aligner Bowtie2 v2.2.9, and reads with the same cell label, same UMI sequence and same gene were collapsed into a single raw molecule while undergoing further error correction and quality checks. Cell labels were filtered with a multi-step algorithm to distinguish those associated with putative cells from those associated with noise. After determining the putative cells, each cell was assigned to the sample of origin through the sample tag (only for cartridges with multiplex loading). Finally, the single-cell gene expression matrices were generated and a metrics summary was provided. After pre-processing with BD’s pipeline, the count matrices and metadata of each sample were aggregated into a single adata object and loaded into the besca v2.3 pipeline for the single cell RNA sequencing analysis (43). First, we filtered low quality cells with fewer than 200 genes, fewer than 500 counts or more than 30% of mitochondrial reads. This permissive filtering was used in order to preserve the neutrophils. We further excluded potential multiplets (cells with more than 5,000 genes or 20,000 counts), and genes expressed in fewer than 30 cells. Normalization, log-transformed UMI counts per 10,000 reads [log(CP10K+1)], was applied before downstream analysis. After normalization, technical variance was removed by regressing out the effects of total UMI counts and percentage of mitochondrial reads, and gene expression was scaled. The 2,507 most variable genes (having a minimum mean expression of 0.0125, a maximum mean expression of 3 and a minimum dispersion of 0.5) were used for principal component analysis. Finally, the first 50 PCs were used as input for calculating the 10 nearest neighbours and the neighbourhood graph was then embedded into the two-dimensional space using the UMAP algorithm at a resolution of 2. Cell type annotation was performed using the Sig-annot semi-automated besca module, which is a signature-based hierarchical cell annotation method. The used signatures, configuration and nomenclature files can be found at https://github.com/bedapub/besca/tree/master/besca/datasets. For more details, please refer to the publication.
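
    The log(CP10K+1) normalization mentioned above can be sketched for a single cell as follows. This is a minimal illustration and not the besca implementation: the function name, the gene names, and the counts are hypothetical, and the use of the natural logarithm is an assumption.

```python
import math


def log_cp10k(counts):
    """Scale a cell's UMI counts to counts per 10,000, then apply log(x + 1).

    counts: {gene: raw UMI count} for one cell. The natural logarithm is
    assumed here; the description does not state the base.
    """
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c * 1e4 / total) for gene, c in counts.items()}


# Hypothetical cell with 20 UMIs in total.
cell = {"CD3E": 5, "LYZ": 0, "MS4A1": 15}
normalized = log_cp10k(cell)
print(normalized)
```

    A gene with zero counts stays at exactly zero, and CD3E (5 of 20 UMIs) is scaled to 2,500 counts per 10,000 before the log transform, so cells with very different sequencing depths end up on a comparable scale.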

  18. Data from: Multiple insert size paired-end sequencing for deconvolution of...

    • omicsdi.org
    xml
    Updated Nov 4, 2015
    Cite
    Gunnar Rätsch; Lisa Hartmann; Andre Kahles; Philipp Drewe; Regina Bohnert; Lisa Smith; Christa Lanz (2015). Multiple insert size paired-end sequencing for deconvolution of complex transcriptomes [Dataset]. https://www.omicsdi.org/dataset/arrayexpress-repository/E-GEOD-40507
    Explore at:
    Available download formats: xml
    Dataset updated
    Nov 4, 2015
    Authors
    Gunnar Rätsch; Lisa Hartmann; Andre Kahles; Philipp Drewe; Regina Bohnert; Lisa Smith; Christa Lanz
    Variables measured
    Transcriptomics,Multiomics
    Description

    Deep sequencing of transcriptomes allows quantitative and qualitative analysis of many RNA species in a sample, with parallel comparison of expression levels, splicing variants, natural antisense transcripts, RNA editing and transcriptional start and stop sites the ideal goal. By computational modeling, we show how libraries of multiple insert sizes combined with strand-specific, paired-end (SS-PE) sequencing can increase the information gained on alternative splicing, especially in higher eukaryotes. Despite the benefits of gaining SS-PE data with paired ends of varying distance, the standard Illumina protocol allows only non-strand-specific, paired-end sequencing with a single insert size. Here, we modify the Illumina RNA ligation protocol to allow SS-PE sequencing by using a custom pre-adenylated 3’ adaptor. We generate parallel libraries with differing insert sizes to aid deconvolution of alternative splicing events and to characterize the extent and distribution of natural antisense transcription in C. elegans. Despite stringent requirements for detection of alternative splicing, our data increases the number of intron retention and exon skipping events annotated in the Wormbase genome annotations by 127 % and 121 %, respectively. We show that parallel libraries with a range of insert sizes increase transcriptomic information gained by sequencing and that by current established benchmarks our protocol gives competitive results with respect to library quality. Sequencing of mRNA from C. elegans with libraries of four differing insert sizes

  19. f

    RNA-seq analysis dataset for Pseudomonas aeruginosa PAO1 37°C vs. 43°C

    • plus.figshare.com
    application/gzip
    Updated Oct 18, 2024
    Cite
    Esther Shmidov; Alexis Villani; Joseph Bondy-Denomy; Ehud Banin (2024). RNA-seq analysis dataset for Pseudomonas aeruginosa PAO1 37°C vs. 43°C [Dataset]. http://doi.org/10.25452/figshare.plus.27241997.v1
    Available download formats: application/gzip
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    Figshare+
    Authors
    Esther Shmidov; Alexis Villani; Joseph Bondy-Denomy; Ehud Banin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data in this item includes raw RNA-sequencing data from post-43°C exposure and during the recovery period to assess transcript-level effect.

    Sample preparation: Overall, 18 samples included 3 replicates of Pseudomonas aeruginosa after exposure to 37 °C (control) or 43 °C (heat shock), at 3 time points (T=0hr, T=18hr, and T=54hr). Samples from the T=0 groups are simply labeled “37” or “43”. For RNA sequencing, 2 µg of total RNA was used for the RiboMinus™ Bacteria Transcriptome Isolation Kit (Invitrogen). The library was constructed with NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (NEB) according to the manufacturer's instructions using 30 ng of depleted RNA. The final quality was evaluated by TapeStation High Sensitivity D1000 Assay (Agilent Technologies, CA, USA). Sequencing was performed based on Qubit values and loaded onto an Illumina MiSeq using the MiSeq V2 (50-cycles) Kit (Illumina, CA, USA). Paired-end RNA-seq protocol was used, yielding about 3.4-6.5 million paired-end reads per sample. FastQC (v0.11.2) was used to assess the quality of raw reads.

    Analysis: Reads were aligned to Pseudomonas aeruginosa PAO1 strain (assembly GCF_000006765.1) using the bowtie2 aligner software (v2.3.2) with default parameters. GTF annotation file for the PAO1 strain was downloaded from NCBI Pseudomonas Genome DB (www.pseudomonas.com). Raw read counts for gene-level features were determined using HTSeq-count with the intersection-strict mode. Differentially expressed genes were determined with the R Bioconductor package DESeq2 (Release 3.14). The p-values were corrected with the Benjamini-Hochberg FDR procedure. Genes with adjusted p-values < 0.05.
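The Benjamini-Hochberg correction mentioned in the analysis can be sketched in a few lines of NumPy. This is a textbook version of the procedure, not the DESeq2 code path (DESeq2 applies the adjustment internally, together with its own filtering), and the function name `benjamini_hochberg` is chosen here for illustration.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)                       # indices that sort p ascending
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity: each value is the minimum of itself and all larger ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(scaled, 1.0)   # cap at 1 and restore input order
    return adjusted
```

A gene would then be called significant at a 5% false discovery rate when its adjusted p-value falls below 0.05.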

  20. f

    Data from: Deciphering the RNA landscape by RNAome sequencing

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Oct 10, 2015
    Cite
    van den Hout, Mirjam CGN; Brouwer, Rutger WW; Derks, Kasper WJ; Hoeijmakers, Jan HJ; Pothof, Joris; Gomez, Cesar Payan; Vrieling, Harry; van IJcken, Wilfred FJ; Kockx, Christel EM; Misovic, Branislav (2015). Deciphering the RNA landscape by RNAome sequencing [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001901420
    Dataset updated
    Oct 10, 2015
    Authors
    van den Hout, Mirjam CGN; Brouwer, Rutger WW; Derks, Kasper WJ; Hoeijmakers, Jan HJ; Pothof, Joris; Gomez, Cesar Payan; Vrieling, Harry; van IJcken, Wilfred FJ; Kockx, Christel EM; Misovic, Branislav
    Description

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitates the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment whereas all other classes such as mRNAs were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods.

Cite
Sun, Zhifu; Zhang, Yuji; Wang, Liguo; Bhagwate, Aditya V.; Asmann, Yan W.; Kalari, Krishna R.; Perez, Edith A.; Baker, Tiffany R.; Thompson, E. Aubrey; Kocher, Jean-Pierre A.; Carr, Jennifer M.; Nair, Asha (2013). Impact of Library Preparation on Downstream Analysis and Interpretation of RNA-Seq Data: Comparison between Illumina PolyA and NuGEN Ovation Protocol [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001650929

Data from: Impact of Library Preparation on Downstream Analysis and Interpretation of RNA-Seq Data: Comparison between Illumina PolyA and NuGEN Ovation Protocol

Dataset updated
Aug 19, 2013
Authors
Sun, Zhifu; Zhang, Yuji; Wang, Liguo; Bhagwate, Aditya V.; Asmann, Yan W.; Kalari, Krishna R.; Perez, Edith A.; Baker, Tiffany R.; Thompson, E. Aubrey; Kocher, Jean-Pierre A.; Carr, Jennifer M.; Nair, Asha
Description

Objectives: The sequencing by the PolyA selection is the most common approach for library preparation. With limited amount or degraded RNA, alternative protocols such as the NuGEN have been developed. However, it is not yet clear how the different library preparations affect the downstream analyses of the broad applications of RNA sequencing.

Methods and Materials: Eight human mammary epithelial cell (HMEC) lines with high quality RNA were sequenced by Illumina’s mRNA-Seq PolyA selection and NuGEN ENCORE library preparation. The following analyses and comparisons were conducted: 1) the numbers of genes captured by each protocol; 2) the impact of protocols on differentially expressed gene detection between biological replicates; 3) expressed single nucleotide variant (SNV) detection; 4) non-coding RNAs, particularly lincRNA detection; and 5) intragenic gene expression.

Results: Sequences from the NuGEN protocol had lower (75%) alignment rate than the PolyA (over 90%). The NuGEN protocol detected fewer genes (12–20% less) with a significant portion of reads mapped to non-coding regions. A large number of genes were differentially detected between the two protocols. About 17–20% of the differentially expressed genes between biological replicates were commonly detected between the two protocols. Significantly higher numbers of SNVs (5–6 times) were detected in the NuGEN samples, which were largely from intragenic and intergenic regions. The NuGEN captured fewer exons (25% less) and had higher base level coverage variance. While 6.3% of reads were mapped to intragenic regions in the PolyA samples, the percentages were much higher (20–25%) for the NuGEN samples. The NuGEN protocol did not detect more known non-coding RNAs such as lincRNAs, but targeted small and “novel” lincRNAs.

Conclusion: Different library preparations can have significant impacts on downstream analysis and interpretation of RNA-seq data. The NuGEN provides an alternative for limited or degraded RNA but it has limitations for some RNA-seq applications.
