100+ datasets found
  1. Cellosaurus

    • scicrunch.org
    • neuinfo.org
    • +2more
    Updated May 6, 2015
    Cite
    (2015). Cellosaurus [Dataset]. http://identifiers.org/RRID:SCR_013869
    Explore at:
    Dataset updated
    May 6, 2015
    Description

    Database of all cell lines used in biomedical research which include immortalized cell lines, naturally immortal cell lines (stem cells), widely used and distributed finite life cell lines, vertebrate cell lines (majority being human, mouse, and rat), and invertebrate (insects and ticks) cell lines, as well as cell line synonyms. Each cell line is provided with the following information: the recommended name (the name which appears in the original publication), a list of synonyms, a unique accession number, comments on a number of topics including misspellings and gene transfection, information on the tissue/organ origin with the UBERON code, the NCI Thesaurus or Orphanet ORDO code for the disease(s) the individual suffered from (for cancer and human genetic disease lines only), the species of origin, the parent cell line, cross-references of sister cell lines, the sex of the individual, the category in which the cell line belongs (Adult stem cell; Cancer cell line; Embryonic stem cell; Factor-dependent cell line; Finite cell line; Hybrid cell line; Hybridoma; Induced pluripotent stem cell; Spontaneously immortalized cell line; Stromal cell line; Telomerase immortalized cell line; Transformed cell line; Undefined cell line type), web links, publication references, and/or cross-references to cell line catalogs/collections, ontologies, cell lines databases/resources, and to databases that list cell lines as samples.

  2. Data and metadata supporting the published article: Development and...

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 4, 2023
    Cite
    Stephen Ethier; Stephen T. Guest; Elizabeth Garrett-Mayer; Kent Armeson; Robert C. Wilson; Kathryn Duchinski; Daniel Couch; Joe W. Gray; Chistiana Kappler (2023). Data and metadata supporting the published article: Development and implementation of the SUM breast cancer cell line functional genomics knowledge base. [Dataset]. http://doi.org/10.6084/m9.figshare.12497630.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Stephen Ethier; Stephen T. Guest; Elizabeth Garrett-Mayer; Kent Armeson; Robert C. Wilson; Kathryn Duchinski; Daniel Couch; Joe W. Gray; Chistiana Kappler
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The SUM human breast cancer cell lines have been used by many labs around the world to develop extensive data sets derived from comparative genomic hybridization analysis, gene expression profiling, whole exome sequencing, and reverse phase protein array analysis. In a previous study, the authors of this paper performed genome-scale shRNA essentiality screens on the entire SUM line panel, as well as on MCF10A cells, MCF-7 cells, and MCF-7LTED cells. In this study, the authors have developed the SUM Breast Cancer Cell Line Knowledge Base, to make all of these omics data sets available to users of the SUM lines, and to allow users to mine the data and analyse them with respect to biological pathways enriched by the data in each cell line.

    Data access: All the datasets supporting the findings of this study are publicly available in the SLKBase platform here: https://sumlineknowledgebase.com/. RPPA data, drug sensitivity data, apelisib response data, and data on dose response, are also part of this figshare data record (https://doi.org/10.6084/m9.figshare.12497630).

    Study aims and methodology: This web-based knowledge base provides users with data and information on the derivation of each of the cell lines, provides narrative summaries of the genomics and cell biology of each breast cancer cell line, and provides protocols for the proper maintenance of the cells. The database includes a series of data mining tools that allow rapid identification of the functional oncogene signatures for each line, the enrichment of any KEGG pathway with screen hit and gene expression data for each of the lines, and a rapid analysis of protein and phospho-protein expression for the cell lines. A gene search tool that returns all of the functional genome and functional druggable data for any gene for the entire cell line panel, is included. Additionally, the authors have expanded the database to include functional genomic data for an additional 29 commonly used breast cancer cell lines. The three overarching goals in the original development of the SLKBase are: 1) to provide a rich source of information for anyone working with any of the SUM breast cancer cell lines, 2) to give researchers ready access to the large genomic data sets that have been developed with these cells, and 3) to allow researchers to perform orthogonal analyses of the various genomics data sets that we and others have obtained from the SUM lines. For more information on the development and contents of the database, please read the related article.

    Datasets supporting the paper: The data mining tools accessed the following datasets to generate the figures and tables, and these datasets are downloadable from the Data Download centre on the SLKBase:

    • Exome sequencing data: SLKBase.exome_.seq_.sum_.xlsx
    • Gene amplification and expression data for the SUM cell lines: SUM44amplificationdata.xls, SUM52.xls, SUM149.xls, SUM159.xls, SUM185.xls, SUM190.xls, SUM225.xls, SUM229.xls, SUM1315.xls
    • Cellecta shRNA screen data for the SUM cell lines: SUM44Celectadata.csv, SUM52Cellectadata.csv, SUM102Cellectadata.csv, SUM149Cellectadata.csv, SUM159Cellectadata.csv, SUM185Cellectadata.csv, SUM190Cellectadata.csv, SUM225Cellectadata.csv, SUM229Cellectadata.csv, SUM1315hits.hit.csv, MCF10A.hits_.csv

    Breast cancer cell line data included in this data record (these datasets were used to generate figures 1, 2 and 7 in the article):

    • Proteomics data from the Reverse Phase Protein Array (RPPA) assay analysis: Ethier.SUMline.RPPA.xlsx
    • Drug sensitivity data: NAVITOCLAX.drugsensitivity.Zscores.xlsx
    • Apelisib response data: Apelisib all lines (2).xlsx
    • Dose response data: 092614 Dose Response CP 52s.11.15.xlsx

    All the files are either in .xlsx or .csv file format.

  3. Cell Line Database

    • bioregistry.io
    Updated Dec 28, 2021
    Cite
    (2021). Cell Line Database [Dataset]. https://bioregistry.io/cldb
    Explore at:
    Dataset updated
    Dec 28, 2021
    Description

    The Cell Line Data Base (CLDB) is a reference information source for human and animal cell lines. It provides the characteristics of the cell lines and their availability through distributors, allowing cell line requests to be made from collections and laboratories.

  4. CCLE Cell Line Gene Expression Profiles

    • maayanlab.cloud
    gz
    Updated Apr 6, 2015
    Cite
    Ma'ayan Laboratory of Computational Systems Biology (2015). CCLE Cell Line Gene Expression Profiles [Dataset]. https://maayanlab.cloud/Harmonizome/dataset/CCLE+Cell+Line+Gene+Expression+Profiles
    Explore at:
    Available download formats: gz
    Dataset updated
    Apr 6, 2015
    Dataset provided by
    Harmonizome
    Ma'ayan Laboratory of Computational Systems Biology
    Authors
    Ma'ayan Laboratory of Computational Systems Biology
    Description

    mRNA microarray expression profiles for cancer cell lines

  5. NCI-60 Cancer Cell Lines

    • bigomics.ch
    Updated Nov 8, 2024
    Cite
    National Cancer Institute (NCI) (2024). NCI-60 Cancer Cell Lines [Dataset]. https://bigomics.ch/blog/top-databases-for-drug-discovery/
    Explore at:
    Dataset updated
    Nov 8, 2024
    Dataset authored and provided by
    National Cancer Institute (NCI)
    Description

    A panel of 60 human cancer cell lines used for screening anticancer drugs.

  6. RNA sequencing data for 30 bladder cancer cell lines

    • datacatalog.mskcc.org
    Updated Nov 18, 2019
    Cite
    Lee, I-Ling; McConkey, David J.; Su, Xiaoping; Choi, Woonyoung (2019). RNA sequencing data for 30 bladder cancer cell lines [Dataset]. https://datacatalog.mskcc.org/dataset/10401
    Explore at:
    Dataset updated
    Nov 18, 2019
    Authors
    Lee, I-Ling; McConkey, David J.; Su, Xiaoping; Choi, Woonyoung
    Description

    Summary from the GEO: "RNA-sequencing of a panel of urothelial cancer cells. The goal of the study is to examine the genome-wide expression profile in each of the 30 urothelial cancer cells tested in our laboratory."

    "Overall design: Each of the 30 cell lines was DNA fingerprinted to confirm its real identity. Total RNA was obtained from each cell line and subjected to Illumina RNA sequencing."

    The data was from a study on comprehensive molecular characterization of muscle-invasive bladder cancer.

  7. Investigation of Cross-Contamination and Misidentification of 278 Widely...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    tiff
    Updated May 31, 2023
    Cite
    Yaqing Huang; Yuehong Liu; Congyi Zheng; Chao Shen (2023). Investigation of Cross-Contamination and Misidentification of 278 Widely Used Tumor Cell Lines [Dataset]. http://doi.org/10.1371/journal.pone.0170384
    Explore at:
    Available download formats: tiff
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yaqing Huang; Yuehong Liu; Congyi Zheng; Chao Shen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In recent years, biological research involving human cell lines has been rapidly developing in China. However, some of the cell lines are not authenticated before use. Therefore, misidentified and/or cross-contaminated cell lines are unfortunately commonplace. In this study, we present a comprehensive investigation of cross-contamination and misidentification for a panel of 278 cell lines from 28 institutes in China by using short tandem repeat profiling method. By comparing the DNA profiles with the cell bank databases of ATCC and DSMZ, a total of 46.0% (128/278) cases with cross-contamination/misidentification were uncovered coming from 22 institutes. Notably, 73.2% (52 out of 71) of the cell lines established by the Chinese researchers were misidentified and accounted for 40.6% of total misidentification (52/128). Further, 67.3% (35/52) of the misidentified cell lines established in laboratories of China were HeLa cells or a possible hybrid of HeLa with another kind of cell line. Furthermore, the bile duct cancer cell line HCCC-9810 and degenerative lung cancer Calu-6 exhibited 88.9% match in the ATCC database (9-loci), indicating that they were from the same origin. However, when we used 21-loci to compare these two cell lines with the same algorithm, the percent match was only 48.2%, indicating that these two cell lines were different. The SNP profiles of HCCC-9810 and Calu-6 also revealed that they were different cell lines. 150 cell lines with unique profiles demonstrated a wide range of in vitro phenotypes. This panel of 150 genomically validated cancer cell lines represents a valuable resource for the cancer research community and will advance our understanding of the disease by providing a standard reference for cell lines that can be used for biological as well as preclinical studies.
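    The percent-match figures quoted above (88.9% on 9 loci versus 48.2% on 21 loci) depend on which loci are compared. The description does not say which matching algorithm was used, so the following is only a minimal sketch assuming the Dice-style (Tanabe) formula, 2 × shared alleles / (alleles in query + alleles in reference). The function name `tanabe_match`, the `loci` parameter, and both toy profiles are hypothetical and are not taken from the dataset.

```python
def tanabe_match(query, reference, loci=None):
    """Percent match between two STR profiles (assumed Tanabe/Dice formula).

    Each profile maps a locus name to the set of alleles called at that locus.
    Only loci present in both profiles (and in `loci`, if given) are compared.
    """
    common = query.keys() & reference.keys()
    if loci is not None:
        common &= set(loci)
    shared = 0
    total = 0
    for locus in common:
        q_alleles = set(query[locus])
        r_alleles = set(reference[locus])
        shared += len(q_alleles & r_alleles)
        total += len(q_alleles) + len(r_alleles)
    if total == 0:
        raise ValueError("no comparable loci between the two profiles")
    return 100.0 * 2 * shared / total


# Hypothetical toy profiles, for illustration only.
profile_a = {"D5S818": {"11", "12"}, "TH01": {"7", "9.3"}, "TPOX": {"8"}}
profile_b = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "11"}}

all_loci_match = tanabe_match(profile_a, profile_b)
one_locus_match = tanabe_match(profile_a, profile_b, loci={"D5S818"})
print(all_loci_match, one_locus_match)
```

    In this toy case, restricting the comparison to a single locus yields a higher score than comparing all three, which mirrors why a smaller locus panel can overstate the similarity of two distinct cell lines.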

  8. ATCC STR database

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Apr 28, 2021
    Cite
    (2021). ATCC STR database [Dataset]. http://identifiers.org/RRID:SCR_019203
    Explore at:
    Dataset updated
    Apr 28, 2021
    Description

    Comprehensive database of Short Tandem Repeat (STR) DNA profiles for all ATCC human cell lines. ATCC collects these data as part of its continuing efforts to characterize and authenticate the cell lines in its Cell Biology collection.

  9. International Cell Line Authentication Committee

    • rrid.site
    Updated Jul 3, 2024
    Cite
    (2024). International Cell Line Authentication Committee [Dataset]. http://identifiers.org/RRID:SCR_014414
    Explore at:
    Dataset updated
    Jul 3, 2024
    Description

    An independent committee established to improve visibility of cell lines and promote awareness and authentication testing to combat false or misidentified cell lines. It maintains a database of cross-contaminated or otherwise misidentified cell lines, as well as resources to familiarize users of cell lines with the problem of misidentification. Its Terms of Reference define false or misidentified cell lines and other commonly used terms, and set out the committee's goals and ground rules.

  10. hPSCreg dataset, continuously updated

    • hpscreg.eu
    Updated Jul 15, 2015
    Cite
    (2015). hPSCreg dataset, continuously updated [Dataset]. https://hpscreg.eu/
    Explore at:
    Dataset updated
    Jul 15, 2015
    Variables measured
    usage, ethics, derivation, genotyping, characterisation, donor information, culture conditions, general information, genetic modification
    Description

    hPSCreg is a global registry of human pluripotent stem cell (hPSC) lines containing manually validated information, including ethical provenance, procurement, derivation process, genetic and expression data, other biological and molecular characteristics, use, and quality of the line. Current status: 1123 hESC lines, 7670 hiPSC lines, 205 clinical studies, and 2402 certificates.

  11. Genomics of Drug Sensitivity in Cancer (GDSC)

    • kaggle.com
    zip
    Updated Aug 13, 2024
    Cite
    Samira Alipour (2024). Genomics of Drug Sensitivity in Cancer (GDSC) [Dataset]. https://www.kaggle.com/datasets/samiraalipour/genomics-of-drug-sensitivity-in-cancer-gdsc/discussion
    Explore at:
    Available download formats: zip (15094344 bytes)
    Dataset updated
    Aug 13, 2024
    Authors
    Samira Alipour
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    The Genomics of Drug Sensitivity in Cancer (GDSC) dataset is a valuable resource for therapeutic biomarker discovery in cancer research. This dataset combines drug response data with genomic profiles of cancer cell lines, allowing researchers to investigate the relationship between genetic features and drug sensitivity.

    Task:

    The primary task associated with this dataset is to predict drug sensitivity (measured as IC50 values) based on genomic features of cancer cell lines. This can involve regression tasks to predict exact IC50 values or classification tasks to categorize cell lines as sensitive or resistant to specific drugs. The dataset also allows for the identification of genomic markers that correlate with drug response.

    Files:

    1. GDSC2-dataset.csv: Contains drug sensitivity data, including IC50 values, for various drugs tested against cancer cell lines. (Original source file)
    2. Cell_Lines_Details.xlsx: Provides detailed information about the cancer cell lines, including genomic features such as mutations, copy number alterations, and gene expression. (Original source file)
    3. Compounds-annotation.csv: Offers information about the drugs used in the screening, including their targets and pathways. (Original source file)
    4. GDSC_DATASET.csv: This is the main dataset file for analysis. It's a merged file combining key information from the above three files, created to facilitate easier analysis. This consolidated dataset includes all necessary features for drug sensitivity prediction and is recommended for use in your analysis.

    Detailed Column Descriptions:

    1. GDSC2-dataset.csv:

    • DATASET: Identifier for the specific GDSC dataset version.
    • NLME_RESULT_ID: Unique identifier for the non-linear mixed effects model result.
    • NLME_CURVE_ID: Identifier for the dose-response curve fitted by NLME.
    • COSMIC_ID: Unique identifier for the cell line from the COSMIC database.
    • CELL_LINE_NAME: Name of the cancer cell line used in the experiment.
    • SANGER_MODEL_ID: Identifier used by the Sanger Institute for the cell line model.
    • TCGA_DESC: Description of the cancer type according to The Cancer Genome Atlas.
    • DRUG_ID: Unique identifier for the drug used in the experiment.
    • DRUG_NAME: Name of the drug used in the experiment.
    • PUTATIVE_TARGET: The presumed molecular target of the drug.
    • PATHWAY_NAME: The biological pathway affected by the drug.
    • COMPANY_ID: Identifier for the company that provided the drug.
    • WEBRELEASE: Date or version of web release for this data.
    • MIN_CONC: Minimum concentration of the drug used in the experiment.
    • MAX_CONC: Maximum concentration of the drug used in the experiment.
    • LN_IC50: Natural log of the half-maximal inhibitory concentration (IC50).
    • AUC: Area Under the Curve, a measure of drug effectiveness.
    • RMSE: Root Mean Square Error, indicating the fit quality of the dose-response curve.
    • Z_SCORE: Standardized score of the drug response, allowing comparison across different drugs and cell lines.

    2. Cell_Lines_Details.xlsx:

    • Sample Name: Unique identifier for the cell line sample.
    • COSMIC identifier: Unique ID from the COSMIC database for the cell line.
    • Whole Exome Sequencing (WES): Genetic mutation data from whole exome sequencing.
    • Copy Number Alterations (CNA): Data on gene copy number changes in the cell line.
    • Gene Expression: Information on gene expression levels in the cell line.
    • Methylation: Data on DNA methylation patterns in the cell line.
    • Drug Response: Information on how the cell line responds to various drugs.
    • GDSC Tissue descriptor 1: Primary tissue type classification.
    • GDSC Tissue descriptor 2: Secondary tissue type classification.
    • Cancer Type (matching TCGA label): Cancer type according to TCGA classification.
    • Microsatellite instability Status (MSI): Indicates the cell line's MSI status.
    • Screen Medium: The growth medium used for culturing the cell line.
    • Growth Properties: Characteristics of how the cell line grows in culture.

    3. Compounds-annotation.csv:

    • DRUG_ID: Unique identifier for the drug.
    • SCREENING_SITE: Location where the drug screening was performed.
    • DRUG_NAME: Name of the drug compound.
    • SYNONYMS: Alternative names for the drug.
    • TARGET: The molecular target(s) of the drug.
    • TARGET_PATHWAY: The biological pathway(s) targeted by the drug.
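
    The description says GDSC_DATASET.csv merges key information from the three source files but does not say how. One plausible reading, given the columns listed above, is a join on the cell line identifier (COSMIC_ID / COSMIC identifier) and on DRUG_ID. The sketch below illustrates that assumption using only the Python standard library and in-memory rows. The function name `merge_gdsc` and all row values are hypothetical placeholders, not real GDSC data.

```python
def merge_gdsc(responses, cell_lines, compounds):
    """Inner-join drug responses with cell line and compound annotations.

    Assumes responses are keyed by COSMIC_ID and DRUG_ID, cell lines by
    "COSMIC identifier", and compounds by DRUG_ID (column names from the
    listing above; the join itself is an assumption).
    """
    cells_by_id = {row["COSMIC identifier"]: row for row in cell_lines}
    drugs_by_id = {row["DRUG_ID"]: row for row in compounds}
    merged = []
    for row in responses:
        cell = cells_by_id.get(row["COSMIC_ID"])
        drug = drugs_by_id.get(row["DRUG_ID"])
        if cell is None or drug is None:
            continue  # drop responses that lack either annotation
        merged.append({
            **row,
            "Cancer Type (matching TCGA label)": cell["Cancer Type (matching TCGA label)"],
            "TARGET": drug["TARGET"],
            "TARGET_PATHWAY": drug["TARGET_PATHWAY"],
        })
    return merged


# Hypothetical placeholder rows, for illustration only.
responses = [
    {"COSMIC_ID": 1, "DRUG_ID": 10, "LN_IC50": -1.2},
    {"COSMIC_ID": 2, "DRUG_ID": 10, "LN_IC50": 2.5},
    {"COSMIC_ID": 3, "DRUG_ID": 99, "LN_IC50": 0.4},  # no matching annotations
]
cell_lines = [
    {"COSMIC identifier": 1, "Cancer Type (matching TCGA label)": "TYPE_A"},
    {"COSMIC identifier": 2, "Cancer Type (matching TCGA label)": "TYPE_B"},
]
compounds = [
    {"DRUG_ID": 10, "TARGET": "TARGET_X", "TARGET_PATHWAY": "PATHWAY_X"},
]

merged_rows = merge_gdsc(responses, cell_lines, compounds)
print(len(merged_rows))
```

    An inner join is used here so that every merged row carries both annotations; a left join would be the alternative if unannotated responses should be kept.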

    Target Variable:

    The primary target variable in this dataset is LN_IC50 (Natural log of the half-maximal inhibitory concentration). This variable represents the concentration of a drug that inhibits cell viability by 50%, measured on a logarithmic scale. Lower LN_IC50 values indicate higher drug sensitivity, making it a crucial metric for evaluating the effectiveness of anti-ca...
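
    To make the target variable concrete, here is a minimal sketch of working with LN_IC50 values for a single drug: converting back to IC50 with the exponential, standardizing across cell lines, and labelling lines as sensitive or resistant. How the dataset's own Z_SCORE column was computed is not stated here, so the per-drug standardization and the cutoff of -1.0 are assumptions, and the input values are made up for illustration.

```python
import math
from statistics import mean, pstdev


def ic50_from_ln(ln_ic50):
    """Undo the natural-log transform to recover the IC50 concentration."""
    return math.exp(ln_ic50)


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def label_sensitivity(ln_ic50_values, threshold=-1.0):
    """Label each cell line for one drug; lower LN_IC50 means more sensitive.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    return [
        "sensitive" if z <= threshold else "resistant"
        for z in z_scores(ln_ic50_values)
    ]


# Made-up LN_IC50 values for four cell lines tested against one drug.
example_ln_ic50 = [-2.0, 0.0, 1.0, 1.0]
example_labels = label_sensitivity(example_ln_ic50)
print(example_labels)
```

    Turning the regression target into a two-class label this way discards information, so the threshold should be treated as a modelling choice rather than a property of the data.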

  12. NCI-60 Cell Lines (NCI, Cancer Res 2012): Whole-exome sequencing of 67...

    • datacatalog.mskcc.org
    Updated May 20, 2020
    Cite
    National Cancer Institute (U.S.) (2020). NCI-60 Cell Lines (NCI, Cancer Res 2012): Whole-exome sequencing of 67 samples by NCI-60 cell line project [Dataset]. https://datacatalog.mskcc.org/dataset/10453
    Explore at:
    Dataset updated
    May 20, 2020
    Dataset provided by
    National Cancer Institute (http://www.cancer.gov/)
    MSK Library
    Description

    This dataset contains summary data visualizations and clinical data from 67 samples from 67 patients as part of an NCI-60 cell line project to compile NCI-60 cell line high-throughput and high-content data into CellMiner, a genomic and pharmacologic database created by the National Cancer Institute. The clinical data includes deidentified patient and sample IDs, mutation counts, detailed cancer type information, patient demographics, and past modality. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  13. Careful Selection of Reference Genes Is Required for Reliable Performance of...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated Jan 18, 2016
    Cite
    Francis Jacob; Rea Guertler; Stephanie Naim; Sheri Nixdorf; André Fedier; Neville F. Hacker; Viola Heinzelmann-Schwarz (2016). Careful Selection of Reference Genes Is Required for Reliable Performance of RT-qPCR in Human Normal and Cancer Cell Lines [Dataset]. http://doi.org/10.1371/journal.pone.0059180
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    PLOS ONE
    Authors
    Francis Jacob; Rea Guertler; Stephanie Naim; Sheri Nixdorf; André Fedier; Neville F. Hacker; Viola Heinzelmann-Schwarz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Reverse Transcription - quantitative Polymerase Chain Reaction (RT-qPCR) is a standard technique in most laboratories. The selection of reference genes is essential for data normalization and the selection of suitable reference genes remains critical. Our aim was to 1) review the literature since implementation of the MIQE guidelines in order to identify the degree of acceptance; 2) compare various algorithms in their expression stability; 3) identify a set of suitable and most reliable reference genes for a variety of human cancer cell lines. A PubMed database review was performed and publications since 2009 were selected. Twelve putative reference genes were profiled in normal and various cancer cell lines (n = 25) using 2-step RT-qPCR. Investigated reference genes were ranked according to their expression stability by five algorithms (geNorm, Normfinder, BestKeeper, comparative ΔCt, and RefFinder). Our review revealed 37 publications, two thirds using patient samples and one third cell lines. qPCR efficiency was given in 68.4% of all publications, but only 28.9% of all studies provided RNA/cDNA amount and standard curves. GeNorm and Normfinder algorithms were used in combination in 60.5%. In our selection of 25 cancer cell lines, we identified HSPCB, RRN18S, and RPS13 as the most stably expressed reference genes. In the subset of ovarian cancer cell lines, the reference genes were PPIA, RPS13 and SDHA, clearly demonstrating the necessity to select genes depending on the research focus. Moreover, a cohort of at least three suitable reference genes needs to be established in advance of the experiments, according to the guidelines. For establishing a set of reference genes for gene normalization we recommend the use of ideally three reference genes selected by at least three stability algorithms. The unfortunate lack of compliance with the MIQE guidelines reflects that these need to be further established in the research community.

  14. Integrated Cell Lines

    • dknet.org
    • neuinfo.org
    • +2more
    Updated Jan 29, 2022
    + more versions
    Cite
    (2022). Integrated Cell Lines [Dataset]. http://identifiers.org/RRID:SCR_008994
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    A virtual database currently indexing available cell lines from: Coriell Cell Repositories, International Mouse Strain Resource (IMSR), ATCC, NIH Human Pluripotent Stem Cell Registry, NIGMS Human Genetic Cell Repository, and Developmental Therapeutics Program.

  15. Cancer Cell Line Encyclopedia

    • rrid.site
    Updated Aug 21, 2010
    Cite
    (2010). Cancer Cell Line Encyclopedia [Dataset]. http://identifiers.org/RRID:SCR_013836
    Explore at:
    Dataset updated
    Aug 21, 2010
    Description

    A collaborative project between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation, with the goal of conducting a detailed genetic and pharmacologic characterization of a large panel of human cancer models. The CCLE also works to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.

  16. Pharmacogenomics Datasets for Cancer Cell Lines from CellMiner...

    • zenodo.org
    application/gzip
    Updated Sep 11, 2025
    Cite
    Augustin Luna; Fathi Elloumi; Vinodh Rajapakse (2025). Pharmacogenomics Datasets for Cancer Cell Lines from CellMiner Cross-Database (CellMinerCDB) [Dataset]. http://doi.org/10.5281/zenodo.17088217
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Sep 11, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Augustin Luna; Fathi Elloumi; Vinodh Rajapakse
    License

    https://www.gnu.org/licenses/lgpl-3.0-standalone.html

    Time period covered
    Sep 2025
    Description

    If you use this data, please cite: Luna A, Elloumi F, Varma S et al. NAR. 2021. PMID: 33196823

    Cell line pharmacogenomics datasets for cancer biology and machine learning studies. The datasets are compatible with rcellminer and CellMinerCDB (see publications for details) and data can be extracted for use with Python-based projects.

    An example for extracting data from the rcellminer and CellMinerCDB compatible packages:

    # INSTALL ----
    if (!require("BiocManager", quietly = TRUE))
      install.packages("BiocManager")
    
    BiocManager::install("rcellminer")
    
    # Replace path_to_file with the data package filename
    install.packages(path_to_file, repos = NULL, type="source")
    
    # GET DATA ----
    ## Replace nciSarcomaData with the name of your dataset throughout the code
    library(rcellminer)  # accessor functions such as getAct() and getFeatureAnnot()
    library(nciSarcomaData)
    
    ## DRUG DATA ----
    drugAct <- exprs(getAct(nciSarcomaData::drugData))
    drugAnnot <- getFeatureAnnot(nciSarcomaData::drugData)[["drug"]]
    
    ## MOLECULAR DATA ----
    ### List available datasets
    names(getAllFeatureData(nciSarcomaData::molData))
    
    ### Extract data and annotations
    expData <- exprs(nciSarcomaData::molData[["exp"]])
    mirData <- exprs(nciSarcomaData::molData[["mir"]])
    
    expAnnot <- getFeatureAnnot(nciSarcomaData::molData)[["exp"]]
    mirAnnot <- getFeatureAnnot(nciSarcomaData::molData)[["mir"]]
    
    ## SAMPLE DATA ----
    sampleAnnot <- getSampleData(nciSarcomaData::molData)

  17. Human Protein Atlas - Cell Atlas

    • v19.proteinatlas.org
    Updated Sep 5, 2019
    + more versions
    Cite
    Human Protein Atlas (2019). Human Protein Atlas - Cell Atlas [Dataset]. https://v19.proteinatlas.org/humanproteome/cell
    Explore at:
    Dataset updated
    Sep 5, 2019
    Dataset provided by
    Human Protein Atlas
    License

    https://www.proteinatlas.org/about/licence

    Description

    The Cell Atlas provides high-resolution insights into the expression and spatio-temporal distribution of proteins within human cells. Using a panel of 64 cell lines to represent various cell populations in different organs and tissues of the human body, the mRNA expression of all human genes is characterized by deep RNA-sequencing. The subcellular distribution of each protein is investigated in a subset of cell lines selected based on corresponding gene expression. The protein localization data is derived from antibody-based profiling by immunofluorescence confocal microscopy, and classified into 32 different organelles and fine subcellular structures. The Cell Atlas currently covers 12390 genes (63%) for which there are available antibodies. It offers a database for exploring details of individual genes and proteins of interest, as well as systematically analyzing transcriptomes and proteomes in broader contexts, in order to increase our understanding of human cells.

  18. Human Protein Atlas - Subcellular

    • proteinatlas.org
    Updated Sep 26, 2008
    Cite
    Human Protein Atlas (2008). Human Protein Atlas - Subcellular [Dataset]. https://www.proteinatlas.org/humanproteome/subcellular
    Explore at:
    Dataset updated
    Sep 26, 2008
    Dataset authored and provided by
    Human Protein Atlas
    License

    https://www.proteinatlas.org/about/licence

    Description

    Subcellular methods

    The subcellular resource of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13603 genes (67% of the human protein-coding genes), as well as predictions for an additional 3459 secreted or membrane proteins, covering a total of 17062 genes (85% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different standard cell lines, selected from a panel of 42 cell lines used in the subcellular resource. For some genes, the protein has also been stained in up to three ciliated cell lines, induced pluripotent stem cells (iPSCs), and/or human sperm cells. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 49 different organelles and subcellular structures. In addition, the resource includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of the cell cycle dependency of such variations.

    The subcellular resource offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the resource, as well as the generation and analysis of the data, can be found in the Methods summary. Learn about:

    • The subcellular distribution of proteins in standard human cell lines, including ciliated cells and iPSCs.
    • The subcellular distribution of proteins in human sperm.
    • The proteomes of different organelles and subcellular structures.
    • Single-cell variability in the expression levels and/or localizations of proteins.

  19. Additional file 6: of A map of gene expression in neutrophil-like cell lines...

    • springernature.figshare.com
    • search.datacite.org
    xlsx
    Updated May 31, 2023
    Esther Rincón; Briana Rocha-Gregg; Sean Collins (2023). Additional file 6: of A map of gene expression in neutrophil-like cell lines [Dataset]. http://doi.org/10.6084/m9.figshare.6891509.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Esther Rincón; Briana Rocha-Gregg; Sean Collins
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Full tables of reanalyzed gene expression data for primary neutrophils and HL-60 cells from previously published studies. This Excel file contains four sheets. The first sheet contains FPKM gene expression values generated by Cufflinks for all primary human neutrophil and HL-60 samples reanalyzed in this study. The second sheet contains the corresponding log10-transformed normalized expression values. The third sheet contains FPKM gene expression values for all primary mouse neutrophil samples reanalyzed in this study, and the fourth sheet contains the corresponding log10-transformed normalized values. (XLSX 21707 kb)

  20. SBM DB

    • neuinfo.org
    • dknet.org
    • +1more
    Updated Jan 29, 2022
    (2022). SBM DB [Dataset]. http://identifiers.org/RRID:SCR_013491/resolver?q=&i=rrid
    Description

    A comprehensive database of gene expression profiles that enables comparison of the transcriptomes of various tissues, organs, and experiments. mRNA expression levels of thousands of genes are measured with the oligonucleotide DNA microarray "GeneChip". All gene expression data in this database are produced by LSBM (Laboratory for Systems Biology and Medicine) and its collaborators. SBM DB provides two different databases: a reference database for gene expression analysis (RefEXA) and LSMB GeNet, a database of various organisms, tissues, and experiments. RefEXA provides a comprehensive gene expression database of human normal tissues, normal cultured cells, and cancer cell lines measured with GeneChip HG-U133A, which can aid the investigation of human disease. LSMB provides

(2015). Cellosaurus [Dataset]. http://identifiers.org/RRID:SCR_013869

Cellosaurus

RRID:SCR_013869, nif-0000-30108, r3d100010875, Cellosaurus (RRID:SCR_013869)

12 scholarly articles cite this dataset (View in Google Scholar)
