100+ datasets found
  1. Additional file 2 of scTyper: a comprehensive pipeline for the cell typing...

    • springernature.figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo (2023). Additional file 2 of scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data [Dataset]. http://doi.org/10.6084/m9.figshare.12762703.v1
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Supplementary Table 2–3. This file contains the list of cell markers in each of scTyper.db (Table S2) and CellMarker DB (Table S3) and detailed information such as identifier, study name, species, cell type, gene symbol, and PMID.

  2. Data from: SCDevDB: A Database for Insights Into Single-Cell Gene Expression...

    • omicsdi.org
    Updated Jan 14, 2021
    + more versions
    Cite
    (2021). SCDevDB: A Database for Insights Into Single-Cell Gene Expression Profiles During Human Developmental Processes. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC6775478
    Dataset updated
    Jan 14, 2021
    Variables measured
    Unknown
    Description

    Single-cell RNA-seq studies profile thousands of cells in developmental processes. Current databases for human single-cell expression atlas only provide search and visualize functions for a selected gene in specific cell types or subpopulations. These databases are limited to technical properties or visualization of single-cell RNA-seq data without considering the biological relations of their collected cell groups. Here, we developed a database to investigate single-cell gene expression profiling during different developmental pathways (SCDevDB). In this database, we collected 10 human single-cell RNA-seq datasets, split these datasets into 176 developmental cell groups, and constructed 24 different developmental pathways. SCDevDB allows users to search the expression profiles of the interested genes across different developmental pathways. It also provides lists of differentially expressed genes during each developmental pathway, T-distributed stochastic neighbor embedding maps showing the relationships between developmental stages based on these differentially expressed genes, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes analysis results of these differentially expressed genes. This database is freely available at https://scdevdb.deepomics.org.

  3. scRNA-seq + scATAC-seq Challenge at NeurIPS 2021

    • kaggle.com
    zip
    Updated Sep 16, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq + scATAC-seq Challenge at NeurIPS 2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021
    Available download formats: zip (2917180928 bytes)
    Dataset updated
    Sep 16, 2022
    Authors
    Alexander Chervov
    Description

    Context

    Dataset from the NeurIPS 2021 challenge, which is similar to the Kaggle 2022 competition "Open Problems - Multimodal Single-Cell Integration: Predict how DNA, RNA & protein measurements co-vary in single cells": https://www.kaggle.com/competitions/open-problems-multimodal

    The data comprise single-cell ATAC-seq measurements (https://en.wikipedia.org/wiki/ATAC-seq#Single-cell_ATAC-seq) and single-cell RNA-seq measurements (https://en.wikipedia.org/wiki/Single-cell_transcriptomics#Single-cell_RNA-seq).

    Single-cell RNA sequencing data form a matrix in which rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
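
    A minimal sketch of that matrix layout, using plain Python lists. The cell names, gene names, and counts are invented for illustration and are not taken from the dataset; real data would normally be loaded with a package such as Scanpy.

```python
# Toy cell-by-gene count matrix: one row per cell, one column per gene.
# All names and values here are hypothetical.
cells = ["cell_1", "cell_2", "cell_3"]
genes = ["GENE_A", "GENE_B"]
counts = [
    [5, 0],  # cell_1
    [2, 7],  # cell_2
    [0, 1],  # cell_3
]


def expression(cell, gene):
    """Return the count of one gene in one cell (cells-by-genes layout)."""
    return counts[cells.index(cell)][genes.index(gene)]


def transpose(matrix):
    """Switch between cells-by-genes and genes-by-cells orientation."""
    return [list(column) for column in zip(*matrix)]


genes_by_cells = transpose(counts)
print(expression("cell_2", "GENE_B"))
print(genes_by_cells)
```

    Whether cells sit on rows or columns differs between tools, so it is worth checking the orientation before indexing into a matrix.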

    See the tutorials for Scanpy, the main Python package for working with scRNA-seq data (https://scanpy.readthedocs.io/en/stable/tutorials.html), or for Seurat, an R package (https://satijalab.org/seurat/).

    (For companion dataset on CITE-seq = scRNA-seq + Proteomics, see: https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021)

    Particular data

    https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

    Expression profiling by high throughput sequencing; genome binding/occupancy profiling by high throughput sequencing.

    Summary: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessibility kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites and some donors were measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

    Overall design: Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

    Contributor(s): Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM

    Citation: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

    Related datasets:

    Other single-cell RNA-seq datasets can be found on Kaggle at https://www.kaggle.com/alexandervc/datasets or by searching Kaggle for "scRNA-seq".

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles, for example:

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

  4. Investigating Highly Variable Genes in Single-cell RNA-seq Data across...

    • data.mendeley.com
    Updated May 16, 2023
    Cite
    Jantarika Kumar Arora (2023). Investigating Highly Variable Genes in Single-cell RNA-seq Data across Multiple Cell Types and Conditions [Dataset]. http://doi.org/10.17632/6ry3x7r8hf.3
    Dataset updated
    May 16, 2023
    Authors
    Jantarika Kumar Arora
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The peripheral blood mononuclear cell (PBMC) samples were collected from patients infected with dengue virus (DENV) at four time points: two days and one day before defervescence (febrile phase), at defervescence (critical phase), and at two-week convalescence. The raw and filtered matrix files were generated using CellRanger version 3.0.2 (10x Genomics, USA) with the reference human genome GRCh38 1.2.0. Potential contamination by ambient RNA was corrected using SoupX. Low-quality cells, including cells in which mitochondrial genes accounted for more than 10% of expression and doublets/multiplets, were excluded using Seurat and DoubletFinder, respectively. The individual samples were then integrated using the SCTransform method with 3,000 gene features. Principal component analysis (PCA) was performed, and cells were clustered using the Louvain algorithm with multi-level refinement. The gene expression level of each cell was normalized using the LogNormalize method in Seurat. Cell types were annotated using the canonical marker genes described in the original paper; see the related link below.
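
    The 10% mitochondrial cutoff mentioned above can be illustrated with a short, hedged Python sketch. It is not the authors' Seurat code: the function names are hypothetical, the "MT-" gene-symbol prefix is an assumed convention for human mitochondrial genes, and the example cells are invented.

```python
def percent_mito(gene_counts, prefix="MT-"):
    """Percentage of a cell's counts that come from mitochondrial genes.

    gene_counts maps gene symbol -> count for a single cell. The "MT-"
    prefix is an assumption about how mitochondrial genes are named.
    """
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mito = sum(c for gene, c in gene_counts.items() if gene.startswith(prefix))
    return 100.0 * mito / total


def keep_cell(gene_counts, max_percent=10.0):
    """Keep a cell unless its mitochondrial percentage exceeds the cutoff."""
    return percent_mito(gene_counts) <= max_percent


# Two invented cells: one mostly nuclear transcripts, one mostly mitochondrial.
healthy = {"MT-CO1": 5, "CD3E": 95}
stressed = {"MT-CO1": 30, "MT-ND1": 10, "CD3E": 60}
print(percent_mito(healthy), keep_cell(healthy))
print(percent_mito(stressed), keep_cell(stressed))
```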

  5. Cell Centered Database

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Nov 5, 2025
    Cite
    (2025). Cell Centered Database [Dataset]. http://identifiers.org/RRID:SCR_002168
    Dataset updated
    Nov 5, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented June 5, 2017. It has been merged with Cell Image Library. Database for sharing and mining cellular and subcellular high resolution 2D, 3D and 4D data from light and electron microscopy, including correlated imaging that makes unique and valuable datasets available to the scientific community for visualization, reuse and reanalysis. Techniques range from wide field mosaics taken with multiphoton microscopy to 3D reconstructions of cellular ultrastructure using electron tomography. Contributions from the community are welcome. The CCDB was designed around the process of reconstruction from 2D micrographs, capturing key steps in the process from experiment to analysis. The CCDB refers to the set of images taken from the microscope as the Microscopy Product. The microscopy product refers to a set of related 2D images taken by light (epifluorescence, transmitted light, confocal or multiphoton) or electron microscopy (conventional or high voltage transmission electron microscopy). These image sets may comprise a tilt series, optical section series, through focus series, serial sections, mosaics, time series or a set of survey sections taken in a single microscopy session that are not related in any systematic way. A given set of data may be more than one product, for example, it is possible for a set of images to be both a mosaic and a tilt series. The Microscopy Product ID serves as the accession number for the CCDB. All microscopy products must belong to a project and be stored along with key specimen preparation details. Each project receives a unique Project ID that groups together related microscopy products. Many of the datasets come from published literature, but publication is not a prerequisite for inclusion in the CCDB. Any datasets that are of high quality and interest to the scientific community can be included in the CCDB.

  6. Allen Cell Types Database

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Dec 16, 2022
    Cite
    (2022). Allen Cell Types Database [Dataset]. http://identifiers.org/RRID:SCR_014806
    Dataset updated
    Dec 16, 2022
    Description

    Database of neuronal cell types based on multimodal characterization of single cells to enable data-driven approaches to classification. It includes data such as electrophysiology recordings, imaging data, morphological reconstructions, and RNA and DNA sequencing data.

  7. Meta data for single cells

    • figshare.com
    bin
    Updated May 7, 2024
    Cite
    Andrew Leduc (2024). Meta data for single cells [Dataset]. http://doi.org/10.6084/m9.figshare.25282663.v1
    Available download formats: bin
    Dataset updated
    May 7, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Andrew Leduc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information such as label, run, diameter, etc. The ID column corresponds to the columns of the single-cell data matrix.

  8. Raw and processed (filtered and annotated) scRNAseq data

    • figshare.com
    zip
    Updated Jun 12, 2023
    Cite
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac (2023). Raw and processed (filtered and annotated) scRNAseq data [Dataset]. http://doi.org/10.6084/m9.figshare.23499192.v1
    Available download formats: zip
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single cell RNA-seq data generated and reported as part of the manuscript entitled "Dissecting the mechanisms underlying the Cytokine Release Syndrome (CRS) mediated by T Cell Bispecific Antibodies" by Leclercq-Cohen et al. 2023. Raw and processed (filtered and annotated) data are provided as AnnData objects which can be directly ingested to reproduce the findings of the paper or for ab initio data reuse:

    1. raw.zip provides concatenated raw/unfiltered counts for the 20 samples in the standard Market Exchange Format (MEX).
    2. 230330_sw_besca2_LowFil_raw.h5ad contains filtered cells and raw counts in the HDF5 format.
    3. 221124_sw_besca2_LowFil.annotated.h5ad contains filtered cells and log normalized counts, along with cell type annotation in the HDF5 format.

    scRNAseq data generation: Whole blood from 4 donors was treated with 0.2 μg/mL CD20-TCB, or incubated in the absence of CD20-TCB. At baseline (before addition of TCB) and assay endpoints (2, 4, 6, and 20 hrs), blood was collected for total leukocyte isolation using EasySep™ red blood cell depletion reagent (Stemcell). Briefly, cells were counted and processed for single cell RNA sequencing using the BD Rhapsody platform. To load several samples on a single BD Rhapsody cartridge, sample cells were labelled with sample tags (BD Human Single-Cell Multiplexing Kit) following the manufacturer’s protocol prior to pooling. Briefly, 1 × 10^6 cells from each sample were re-suspended in 180 μL FBS Stain Buffer (BD, PharMingen) and sample tags were added to the respective samples and incubated for 20 min at RT. After incubation, 2 successive washes were performed by addition of 2 mL stain buffer and centrifugation for 5 min at 300 g. Cells were then re-suspended in 620 μL cold BD Sample Buffer, stained with 3.1 μL of both 2 mM Calcein AM (Thermo Fisher Scientific) and 0.3 mM Draq7 (BD Biosciences) and finally counted on the BD Rhapsody scanner. Samples were then diluted and/or pooled equally in 650 μL cold BD Sample Buffer. The BD Rhapsody cartridges were then loaded with up to 40 000 – 50 000 cells. Single cells were isolated using Single-Cell Capture and cDNA Synthesis with the BD Rhapsody Express Single-Cell Analysis System according to the manufacturer’s recommendations (BD Biosciences). cDNA libraries were prepared using the Whole Transcriptome Analysis Amplification Kit following the BD Rhapsody System mRNA Whole Transcriptome Analysis (WTA) and Sample Tag Library Preparation Protocol (BD Biosciences). Indexed WTA and sample tags libraries were quantified and quality controlled on the Qubit Fluorometer using the Qubit dsDNA HS Assay, and on the Agilent 2100 Bioanalyzer system using the Agilent High Sensitivity DNA Kit.
Sequencing was performed on a Novaseq 6000 (Illumina) in paired-end mode (64-8-58) with Novaseq6000 S2 v1 or Novaseq6000 SP v1.5 reagents kits (100 cycles).

    scRNAseq data analysis: Sequencing data was processed using the BD Rhapsody Analysis pipeline (v 1.0 https://www.bd.com/documents/guides/user-guides/GMX_BD-Rhapsody-genomics-informatics_UG_EN.pdf) on the Seven Bridges Genomics platform. Briefly, read pairs with low sequencing quality were first removed and the cell label and UMI identified for further quality check and filtering. Valid reads were then mapped to the human reference genome (GRCh38-PhiX-gencodev29) using the aligner Bowtie2 v2.2.9, and reads with the same cell label, same UMI sequence and same gene were collapsed into a single raw molecule while undergoing further error correction and quality checks. Cell labels were filtered with a multi-step algorithm to distinguish those associated with putative cells from those associated with noise. After determining the putative cells, each cell was assigned to the sample of origin through the sample tag (only for cartridges with multiplex loading). Finally, the single-cell gene expression matrices were generated and a metrics summary was provided. After pre-processing with BD’s pipeline, the count matrices and metadata of each sample were aggregated into a single adata object and loaded into the besca v2.3 pipeline for the single cell RNA sequencing analysis (43). First, we filtered low quality cells with less than 200 genes, less than 500 counts or more than 30% of mitochondrial reads. This permissive filtering was used in order to preserve the neutrophils. We further excluded potential multiplets (cells with more than 5,000 genes or 20,000 counts), and genes expressed in less than 30 cells. Normalization, log-transformed UMI counts per 10,000 reads [log(CP10K+1)], was applied before downstream analysis.
After normalization, technical variance was removed by regressing out the effects of total UMI counts and percentage of mitochondrial reads, and gene expression was scaled. The 2,507 most variable genes (having a minimum mean expression of 0.0125, a maximum mean expression of 3 and a minimum dispersion of 0.5) were used for principal component analysis. Finally, the first 50 PCs were used as input for calculating the 10 nearest neighbours and the neighbourhood graph was then embedded into the two-dimensional space using the UMAP algorithm at a resolution of 2. Cell type annotation was performed using the Sig-annot semi-automated besca module, which is a signature-based hierarchical cell annotation method. The signatures, configuration and nomenclature files used can be found at https://github.com/bedapub/besca/tree/master/besca/datasets. For more details, please refer to the publication.
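
    The cell-level filtering thresholds and the log(CP10K+1) normalization described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the besca pipeline: the function names are hypothetical, per-cell metrics are assumed to be precomputed, and the natural logarithm is assumed because the description does not state the log base.

```python
import math


def is_low_quality(n_genes, n_counts, pct_mito):
    """Fewer than 200 genes, fewer than 500 counts, or more than 30% mito reads."""
    return n_genes < 200 or n_counts < 500 or pct_mito > 30.0


def is_potential_multiplet(n_genes, n_counts):
    """More than 5,000 genes or more than 20,000 counts."""
    return n_genes > 5000 or n_counts > 20000


def keep_cell(n_genes, n_counts, pct_mito):
    """A cell is kept only if it passes both filters."""
    return not is_low_quality(n_genes, n_counts, pct_mito) and not is_potential_multiplet(
        n_genes, n_counts
    )


def log_cp10k(cell_counts):
    """Scale one cell's counts to 10,000 total, then take log(x + 1).

    The natural log is an assumption made for this sketch.
    """
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]
    return [math.log(c / total * 10_000 + 1) for c in cell_counts]


print(keep_cell(n_genes=1500, n_counts=4000, pct_mito=5.0))
print(keep_cell(n_genes=150, n_counts=4000, pct_mito=5.0))
print(log_cp10k([0, 3, 7]))
```

    The gene-level filter (genes expressed in fewer than 30 cells) is omitted here because it needs the full matrix rather than per-cell summaries.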

  9. Human Gene Expression Database Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Human Gene Expression Database Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/human-gene-expression-database-data-package/
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Description

    This data package contains expression profiles for proteins in normal and cancer tissues. It also contains data on sequence based RNA levels in human tissue and cell line.

  10. Cell Properties Database

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Feb 1, 2001
    Cite
    (2001). Cell Properties Database [Dataset]. http://identifiers.org/RRID:SCR_007285
    Dataset updated
    Feb 1, 2001
    Description

    A repository for data regarding membrane channels, receptors and neurotransmitters that are expressed in specific types of cells. The database is presently focused on neurons but will eventually include other cell types, such as glia, muscle, and gland cells. This resource is intended to:

    * Serve as a repository for data on gene products expressed in different brain regions
    * Support research on cellular properties in the nervous system
    * Provide a gateway for entering data into the canonical neuron forms in NeuronDB
    * Identify receptors across neuron types to aid in drug development
    * Serve as a first step toward a functional genomics of nerve cells
    * Serve as a teaching aid

  11. Data from: Cerebellum cell type collaboration database

    • rdr.ucl.ac.uk
    • datasetcatalog.nlm.nih.gov
    • +1more
    bin
    Updated Mar 4, 2025
    Cite
    Maxime Beau; David Herzfeld; Francisco Naveros; Marie Hemelt; Federico D'Agostino; Marlies Oostland; Alvaro Sánchez-López; Young Yoon Chung; Michael Maibach; Stephen Kyranakis; Hannah N. Stabb; Gabriela Martínez Lopera; Agoston Lajko; Marie Zedler; Shogo Ohmae; Nathan Hall; Beverley Clark; Dana Cohen; Stephen Lisberger; Dimitar Kostadinov; Court Hull; Michael Hausser; Javier Medina (2025). Cerebellum cell type collaboration database [Dataset]. http://doi.org/10.5522/04/23702850.v1
    Available download formats: bin
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    University College London
    Authors
    Maxime Beau; David Herzfeld; Francisco Naveros; Marie Hemelt; Federico D'Agostino; Marlies Oostland; Alvaro Sánchez-López; Young Yoon Chung; Michael Maibach; Stephen Kyranakis; Hannah N. Stabb; Gabriela Martínez Lopera; Agoston Lajko; Marie Zedler; Shogo Ohmae; Nathan Hall; Beverley Clark; Dana Cohen; Stephen Lisberger; Dimitar Kostadinov; Court Hull; Michael Hausser; Javier Medina
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The C4 Database

    This is the official repository for the hdf5 datasets of the cerebellar cell-type classification collaboration (C4), published as a companion to the paper "A deep-learning strategy to identify cell types across species from high-density extracellular recordings" published in Cell (https://doi.org/10.1016/j.cell.2025.01.041).

    Instructions to use the cell-type classifier, links to download these datasets, and a data explorer can be found at https://www.c4-database.com.

    The specifications of the fields, data types and data formats stored in the hdf5 binary files can be found at https://www.tinyurl.com/c4database. Hdf5 files can be easily opened with Python, MATLAB and many other programming languages.

    Using and Citing the C4 Database

    The data and visualizations on this website are intended to be freely available for use by the scientific community. The C4 dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, while our classifier is licensed under the GNU General Public License v3.0 as part of NeuroPyxels. If you download and use our data for a publication, and/or if you would like to refer to the database, please cite Beau et al., 2025, Cell together with the NeuroPyxels repository (Beau et al., 2021, Zenodo), and include the link to the C4 online portal https://www.c4-database.com in your methods section. Thank you!

  12. Database of Immune Cell Epigenomes

    • rrid.site
    • neuinfo.org
    Updated May 20, 2018
    Cite
    (2018). Database of Immune Cell Epigenomes [Dataset]. http://identifiers.org/RRID:SCR_018259/resolver?q=&i=rrid
    Dataset updated
    May 20, 2018
    Description

    Database of Immune Cell Expression, Expression quantitative trait loci (eQTLs) and Epigenomics. Collection of identified cis-eQTLs for 12,254 unique genes, which represent 61% of all protein-coding genes expressed in human cell types. Datasets to help reveal effects of disease risk associated genetic polymorphisms on specific immune cell types, providing mechanistic insights into how they might influence pathogenesis.

  13. Data and metadata supporting the published article: Development and...

    • datasetcatalog.nlm.nih.gov
    • springernature.figshare.com
    Updated Jun 25, 2020
    Cite
    Ethier, Stephen; Kappler, Chistiana; Couch, Daniel; Guest, Stephen T.; Duchinski, Kathryn; Armeson, Kent; Wilson, Robert C.; Gray, Joe W.; Garrett-Mayer, Elizabeth (2020). Data and metadata supporting the published article: Development and implementation of the SUM breast cancer cell line functional genomics knowledge base. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000577286
    Dataset updated
    Jun 25, 2020
    Authors
    Ethier, Stephen; Kappler, Chistiana; Couch, Daniel; Guest, Stephen T.; Duchinski, Kathryn; Armeson, Kent; Wilson, Robert C.; Gray, Joe W.; Garrett-Mayer, Elizabeth
    Description

    The SUM human breast cancer cell lines have been used by many labs around the world to develop extensive data sets derived from comparative genomic hybridization analysis, gene expression profiling, whole exome sequencing, and reverse phase protein array analysis. In a previous study, the authors of this paper performed genome-scale shRNA essentiality screens on the entire SUM line panel, as well as on MCF10A cells, MCF-7 cells, and MCF-7LTED cells. In this study, the authors have developed the SUM Breast Cancer Cell Line Knowledge Base, to make all of these omics data sets available to users of the SUM lines, and to allow users to mine the data and analyse them with respect to biological pathways enriched by the data in each cell line.

    Data access: All the datasets supporting the findings of this study are publicly available in the SLKBase platform here: https://sumlineknowledgebase.com/. RPPA data, drug sensitivity data, apelisib response data, and data on dose response, are also part of this figshare data record (https://doi.org/10.6084/m9.figshare.12497630).

    Study aims and methodology: This web-based knowledge base provides users with data and information on the derivation of each of the cell lines, provides narrative summaries of the genomics and cell biology of each breast cancer cell line, and provides protocols for the proper maintenance of the cells. The database includes a series of data mining tools that allow rapid identification of the functional oncogene signatures for each line, the enrichment of any KEGG pathway with screen hit and gene expression data for each of the lines, and a rapid analysis of protein and phospho-protein expression for the cell lines. A gene search tool that returns all of the functional genome and functional druggable data for any gene for the entire cell line panel, is included.
Additionally, the authors have expanded the database to include functional genomic data for an additional 29 commonly used breast cancer cell lines. The three overarching goals in the original development of the SLKBase are: 1) to provide a rich source of information for anyone working with any of the SUM breast cancer cell lines, 2) to give researchers ready access to the large genomic data sets that have been developed with these cells, and 3) to allow researchers to perform orthogonal analyses of the various genomics data sets that we and others have obtained from the SUM lines. For more information on the development and contents of the database, please read the related article.

    Datasets supporting the paper: The data mining tools accessed the following datasets to generate the figures and tables, and these datasets are downloadable from the Data Download centre on the SLKBase:

    Exome sequencing data: SLKBase.exome_.seq_.sum_.xlsx
    Gene amplification and expression data for the SUM cell lines: SUM44amplificationdata.xls, SUM52.xls, SUM149.xls, SUM159.xls, SUM185.xls, SUM190.xls, SUM225.xls, SUM229.xls, SUM1315.xls
    Cellecta shRNA screen data for the SUM cell lines: SUM44Celectadata.csv, SUM52Cellectadata.csv, SUM102Cellectadata.csv, SUM149Cellectadata.csv, SUM159Cellectadata.csv, SUM185Cellectadata.csv, SUM190Cellectadata.csv, SUM225Cellectadata.csv, SUM229Cellectadata.csv, SUM1315hits.hit.csv, MCF10A.hits_.csv

    Breast cancer cell line data included in this data record (these datasets were used to generate figures 1, 2 and 7 in the article):

    Proteomics data from the Reverse Phase Protein Array (RPPA) assay analysis: Ethier.SUMline.RPPA.xlsx
    Drug sensitivity data: NAVITOCLAX.drugsensitivity.Zscores.xlsx
    Apelisib response data: Apelisib all lines (2).xlsx
    Dose response data: 092614 Dose Response CP 52s.11.15.xlsx

    All the files are either in .xlsx or .csv file format.

  14. Single Cell Smart-Seq 3 RNA-Seq and Bulk Exome Seq from Breast Cancer...

    • figshare.scilifelab.se
    • datasetcatalog.nlm.nih.gov
    • +3more
    Updated Jan 15, 2025
    + more versions
    Cite
    Seong-Hwan Jun; Hosein Toosi; Jeff Mold; Camilla Engblom; Xinsong Chen; Ciara O’Flanagan; Michael Hagemann-Jensen; Rickard Sandberg; Johan Hartman; Samuel Aparicio; Andrew Roth; Jens Lagergren (2025). Single Cell Smart-Seq 3 RNA-Seq and Bulk Exome Seq from Breast Cancer Patients [Dataset]. http://doi.org/10.17044/scilifelab.15082398.v1
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    KTH Royal Institute of Technology
    Authors
    Seong-Hwan Jun; Hosein Toosi; Jeff Mold; Camilla Engblom; Xinsong Chen; Ciara O’Flanagan; Michael Hagemann-Jensen; Rickard Sandberg; Johan Hartman; Samuel Aparicio; Andrew Roth; Jens Lagergren
    License

    https://www.scilifelab.se/data/restricted-access/

    Description

    Data Set Description

    Single cell RNA sequencing (Smart-Seq3) and whole exome sequencing from multiple regions of individual tumors from breast cancer patients, and also single cell RNA-seq for two ovarian cancer cell lines.

    The dataset contains raw sequencing data for various high-throughput molecular tests performed on two sample types: tumor samples from two breast cancer patients and cell lines derived from high-grade serous carcinoma patients. The breast cancer data comes from two patients: patient 1 (BCSA1) has two tumor regions (A-B) and patient 2 (BCSA2) has five regions (A-E). For a normal sample and each region from each patient, Whole Exome Sequencing was performed using Twist Biosciences Human Exome Kit by the SNP&SEQ Technology platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. Also for each patient, EPCAM+ CD45- sorted cells from all the regions were sorted to a 384 well plate, and Smart-Seq3 libraries were prepared at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.

    The HGSOC cell-line data comes from OV2295R2 and TOV2295R cell lines described in Laks et al Cell 2019 Nov 14; 179(5): 1207–1221.e22 doi: 10.1016/j.cell.2019.10.026. The cell line Smart-Seq3 libraries were prepared from two 384 well plates at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.

    Terms for access

    This dataset is to be used for research on intratumor heterogeneity and subclonal evolution of tumors. To apply for conditional access to the dataset in this publication, please contact datacentre@scilifelab.se.

  15. AtT-20 cell line expression data

    • figshare.mq.edu.au
    • researchdata.edu.au
    xlsx
    Updated Nov 10, 2022
    Cite
    Marina Junqueira Santiago; Mark Connor (2022). AtT-20 cell line expression data [Dataset]. http://doi.org/10.25949/21529404.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 10, 2022
    Dataset provided by
    Macquarie University
    Authors
    Marina Junqueira Santiago; Mark Connor
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of transcript sequencing for AtT-20FlpIn cells. mRNA was isolated from AtT-20FlpIn cells using standard procedures, and next-generation sequencing was performed by Macrogen (https://dna.macrogen.com/). A report outlining the workflow and data analysis methods is available from the authors on request.

    Deposited data is in an Excel file, which includes the gene symbol, transcript ID from the reference mouse genome, protein ID and transcript abundance. The AtT-20FlpIn cells were generated by Dr Santiago, and have been used as the 'wild type' cells for generating cell lines stably expressing GPCR and ion channels for most of the molecular pharmacology projects in the Molecular Pharmacodynamics group.

  16. Single-Cell Data Analysis Software Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Single-Cell Data Analysis Software Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/single-cell-data-analysis-software-market
    Explore at:
    Available download formats: pptx, csv, pdf
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Single-Cell Data Analysis Software Market Outlook



    According to our latest research, the global single-cell data analysis software market size reached USD 424.5 million in 2024. The market is demonstrating a robust upward trajectory, driven by technological advancements and expanding applications across life sciences. The market is projected to grow at a CAGR of 15.9% from 2025 to 2033, reaching an estimated USD 1,483.4 million by 2033. This impressive growth is primarily fueled by the increasing adoption of single-cell sequencing technologies in genomics, transcriptomics, and proteomics research, as well as the expanding demand from pharmaceutical and biotechnology companies for advanced data analytics solutions.
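The growth figures above can be cross-checked with the standard compound-growth formula. The sketch below is only an illustration: it assumes 2024 as the base year and nine annual compounding periods to 2033, neither of which the listing states, and the function names are hypothetical. It projects the 2024 value forward at the stated rate and also derives the rate implied by the two stated endpoints, so the two can be compared.

```python
def project(start_value, rate, years):
    """Compound start_value forward at a fixed annual rate."""
    return start_value * (1.0 + rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the description (USD million).
START_2024 = 424.5
END_2033 = 1483.4
STATED_CAGR = 0.159

# Assumption: growth compounds once per year from 2024 to 2033.
YEARS = 2033 - 2024

projected = project(START_2024, STATED_CAGR, YEARS)
implied = implied_cagr(START_2024, END_2033, YEARS)

print(f"Projected 2033 value at the stated CAGR: {projected:.1f}")
print(f"CAGR implied by the two endpoints: {implied:.1%}")
```

If the projected and stated values differ, the report may have used a different base year or period count; the listing does not say which.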




    One of the primary growth factors for the single-cell data analysis software market is the rapid evolution and adoption of high-throughput single-cell sequencing technologies. Over the past decade, there has been a significant shift from bulk cell analysis to single-cell approaches, allowing researchers to unravel cellular heterogeneity with unprecedented resolution. This transition has generated massive volumes of complex data, necessitating sophisticated software tools for effective analysis, visualization, and interpretation. The need to extract actionable insights from these intricate datasets is compelling both academic and commercial entities to invest in advanced single-cell data analysis software, thus propelling market expansion.




    Another major driver is the expanding application scope of single-cell data analysis across various omics fields, including genomics, transcriptomics, proteomics, and epigenomics. The integration of these multi-omics datasets is enabling deeper insights into disease mechanisms, biomarker discovery, and personalized medicine. Pharmaceutical and biotechnology companies are increasingly leveraging single-cell data analysis software to accelerate drug discovery and development processes, optimize clinical trials, and identify novel therapeutic targets. The continuous innovation in algorithms, machine learning, and artificial intelligence is further enhancing the capabilities of these software solutions, making them indispensable tools in modern biomedical research.



    Single-cell Analysis is revolutionizing the field of life sciences by providing unprecedented insights into cellular diversity and function. This cutting-edge approach allows researchers to study individual cells in isolation, revealing intricate details about their genetic, transcriptomic, and proteomic profiles. By focusing on single cells, scientists can uncover rare cell types and understand complex biological processes that were previously masked in bulk analyses. The ability to perform Single-cell Analysis is transforming our understanding of diseases, enabling the identification of novel biomarkers and therapeutic targets, and paving the way for personalized medicine.




    The surge in government and private funding for single-cell research, coupled with the rising prevalence of chronic and infectious diseases, is also contributing to market growth. Governments worldwide are launching initiatives to support precision medicine and genomics research, fostering collaborations between academic institutions and industry players. This supportive ecosystem is not only stimulating the development of new single-cell technologies but also driving the adoption of specialized data analysis software. Moreover, the increasing awareness of the importance of data reproducibility and standardization is prompting the adoption of advanced software platforms that ensure robust, scalable, and reproducible analysis workflows.




    From a regional perspective, North America continues to dominate the single-cell data analysis software market, attributed to its strong research infrastructure, presence of leading biotechnology and pharmaceutical companies, and substantial funding for genomics research. However, the Asia Pacific region is emerging as a significant growth engine, driven by increasing investments in life sciences, growing collaborations between academia and industry, and the rapid adoption of advanced sequencing technologies. Europe also holds a considerable share, supported by robust research activities and supportive regulatory frameworks. The market landscape in Latin America and the Middle East & Africa r

  17. CITE-seq = scRNA-seq + Proteins: Human PBMCs 2019

    • kaggle.com
    zip
    Updated Sep 11, 2022
    Cite
    Alexander Chervov (2022). CITE-seq = scRNA-seq + Proteins: Human PBMCs 2019 [Dataset]. https://www.kaggle.com/datasets/alexandervc/citeseq-scrnaseq-proteins-human-pbmcs-2019
    Explore at:
    Available download formats: zip (44334628 bytes)
    Dataset updated
    Sep 11, 2022
    Authors
    Alexander Chervov
    Description

    Data and Context

    Data - results of single cell RNA sequencing and CD** proteins measurements - so-called CITE-seq technology, which combines single cell RNA sequencing with protein measurements.

    The Kaggle competition https://www.kaggle.com/competitions/open-problems-multimodal/overview uses similar data in one of the subtasks.

    Particular data: Paper: Stuart T, Butler A, Hoffman P, Hafemeister C et al. Comprehensive Integration of Single-Cell Data. Cell 2019 Jun 13;177(7):1888-1902.e21. PMID: 31178118 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6687398/

    Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128639

    Related datasets:

    Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles:

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

    Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

    Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

    Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12

  18. Data and code accompanying the publication titled : "Reconstruction of...

    • entrepot.recherche.data.gouv.fr
    bin, docx, html +4
    Updated Jul 28, 2025
    Cite
    David MORIZET; Isabelle FOUCHER; Alessandro ALUNNI; Laure BALLY-CUIF (2025). Data and code accompanying the publication titled : "Reconstruction of macroglia and adult neurogenesis evolution through cross-species single-cell transcriptomic analyses" [Dataset]. http://doi.org/10.57745/2POJIR
    Explore at:
    Available download formats: docx(4434383), bin(458043432), html(24655843), html(21734703), bin(771178106), txt(1751), bin(1577795436), html(21438133), bin(439071899), bin(216960765), bin(1608940888), text/x-r-notebook(5908), type/x-r-syntax(39346), bin(1723139554), bin(335550400), text/x-r-notebook(8543), text/x-r-notebook(11233), docx(4105943), bin(508079898), bin(3349536138), tsv(948181), text/x-r-notebook(24764), bin(523305093), bin(683208511), html(18483775), bin(1529657884), bin(2366850541), docx(17288114)
    Dataset updated
    Jul 28, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    David MORIZET; Isabelle FOUCHER; Alessandro ALUNNI; Laure BALLY-CUIF
    License

    https://spdx.org/licenses/etalab-2.0.html

    Dataset funded by
    Fondation pour la Recherche Médicale (Foundation for Medical Research in France)
    Ligue Contre le Cancer
    Agence nationale de la recherche
    European Research Council
    Description

    Codes, processed data and examples of analyses used in the publication: Reconstruction of macroglia and adult neurogenesis evolution through cross-species single-cell transcriptomic analyses. This repository contains data generated in the Bally-Cuif lab through scRNA-seq of dissected zebrafish telencephala, as well as code used in their analysis. Included are datasets and illustrations obtained from reanalysis of other studies. Comparative analyses between these datasets were conducted to investigate the evolution of adult neurogenesis and of macroglia. All the data present in this repository are derived from scRNA-seq analysis obtained from various species (zebrafish in our case, and species ranging from cnidarians to humans in the case of reanalyzed datasets). Our dataset was generated by dissecting zebrafish brains, enriching for adult neural stem cells and sequencing individual cells with the 10X Chromium v2 kit. Our aim was to gain insight into the physiology and heterogeneity of adult neural stem cells and compare zebrafish neural stem cells to those found in other species. Retrieved datasets were generated for various purposes but were used in this study to serve as basis for comparative evolution analyses.

  19. Human Protein Atlas - Subcellular

    • proteinatlas.org
    Updated Sep 26, 2008
    + more versions
    Cite
    Human Protein Atlas (2008). Human Protein Atlas - Subcellular [Dataset]. https://www.proteinatlas.org/humanproteome/subcellular
    Explore at:
    Dataset updated
    Sep 26, 2008
    Dataset authored and provided by
    Human Protein Atlas
    License

    https://www.proteinatlas.org/about/licence

    Description

    Subcellular methods

    The subcellular resource of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13603 genes (67% of the human protein-coding genes), as well as predictions for an additional 3459 secreted- or membrane proteins, covering a total of 17062 genes (85% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different standard cell lines, selected from a panel of 42 cell lines used in the subcellular resource. For some genes, the protein has also been stained in up to three ciliated cell lines, induced pluripotent stem cells (iPSCs) and/or in human sperm cells. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 49 different organelles and subcellular structures. In addition, the resource includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of cell cycle dependency of such variations. 
    

    The subcellular resource offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the resource, as well as the generation and analysis of the data, can be found in the Methods summary. Learn about:

    The subcellular distribution of proteins in standard human cell lines, including ciliated cells and iPSCs. The subcellular distribution of proteins in human sperm. The proteomes of different organelles and subcellular structures. Single-cell variability in the expression levels and/or localizations of proteins.

  20. scRNA-seq "Tabula sapiens" - human, Part 2

    • kaggle.com
    zip
    Updated Feb 5, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexander Chervov (2022). scRNA-seq "Tabula sapiens" - human, Part 2 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-tabula-sapiens-human-part-2
    Explore at:
    Available download formats: zip (7504637468 bytes)
    Dataset updated
    Feb 5, 2022
    Authors
    Alexander Chervov
    Description

    Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Remark 2: For the first part of the data, see https://www.kaggle.com/alexandervc/scrnaseq-tabula-sapiens-human-500-000-cells

    Data and Context

    Data: results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
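As a minimal sketch of the cells-by-genes layout described above, the toy example below builds a tiny expression matrix in plain Python. The cell names, gene names, and counts are made up for illustration and are not taken from the dataset; the helper names are hypothetical.

```python
# Toy expression matrix: rows are cells, columns are genes.
# All names and counts here are invented for illustration.
cells = ["cell_1", "cell_2", "cell_3"]
genes = ["GENE_A", "GENE_B"]
counts = [
    [5, 0],  # cell_1
    [0, 3],  # cell_2
    [2, 7],  # cell_3
]


def expression(cell, gene):
    """Look up how strongly a gene is expressed in a given cell."""
    return counts[cells.index(cell)][genes.index(gene)]


def transpose(matrix):
    """Switch between the cells-by-genes and genes-by-cells orientations."""
    return [list(column) for column in zip(*matrix)]


genes_by_cells = transpose(counts)

print(expression("cell_3", "GENE_B"))
print(genes_by_cells)
```

Because files may store either orientation, it is worth checking the matrix shape against the number of cells and genes before analysis.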

    Particular data: "Tabula Sapiens" project: https://tabula-sapiens-portal.ds.czbiohub.org/ Data section for download: https://figshare.com/articles/dataset/Tabula_Sapiens_release_1_0/14267219 Paper: https://www.science.org/doi/10.1126/science.abl4896 https://www.biorxiv.org/content/10.1101/2021.07.19.452956v2

    Tabula Sapiens is a benchmark, first-draft human cell atlas of nearly 500,000 cells from 24 organs of 15 normal human subjects. This work is the product of the Tabula Sapiens Consortium. Special thanks to the Chan Zuckerberg Initiative for funding this project and to the CZI Science Technology team for creating cellxgene, the tool that makes the visualization of this research possible.

    See also tutorials:

    Course at the Sanger Institute: https://scrnaseq-course.cog.sanger.ac.uk/website/tabula-muris.html

    Course at CZ-hub: https://chanzuckerberg.github.io/scRNA-python-workshop/intro/about

    On kaggle - copies of the notebooks and data from the course above https://www.kaggle.com/aayush9753/singlecell-rnaseq-data-from-mouse-brain

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x
