Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cellosaurus is a knowledge resource on cell lines.
The Cell Line Data Base (CLDB) is a reference information source for human and animal cell lines. It provides the characteristics of the cell lines and their availability through distributors, allowing cell line requests to be made from collections and laboratories.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The SUM human breast cancer cell lines have been used by many labs around the world to develop extensive data sets derived from comparative genomic hybridization analysis, gene expression profiling, whole exome sequencing, and reverse phase protein array analysis. In a previous study, the authors of this paper performed genome-scale shRNA essentiality screens on the entire SUM line panel, as well as on MCF10A cells, MCF-7 cells, and MCF-7LTED cells. In this study, the authors have developed the SUM Breast Cancer Cell Line Knowledge Base (SLKBase) to make all of these omics data sets available to users of the SUM lines, and to allow users to mine the data and analyse them with respect to biological pathways enriched by the data in each cell line.
Data access: All the datasets supporting the findings of this study are publicly available in the SLKBase platform here: https://sumlineknowledgebase.com/. RPPA data, drug sensitivity data, apelisib response data, and dose response data are also part of this figshare data record (https://doi.org/10.6084/m9.figshare.12497630).
Study aims and methodology: This web-based knowledge base provides users with data and information on the derivation of each of the cell lines, narrative summaries of the genomics and cell biology of each breast cancer cell line, and protocols for the proper maintenance of the cells. The database includes a series of data mining tools that allow rapid identification of the functional oncogene signatures for each line, the enrichment of any KEGG pathway with screen hit and gene expression data for each of the lines, and a rapid analysis of protein and phospho-protein expression for the cell lines. Also included is a gene search tool that returns all of the functional genome and functional druggable data for any gene across the entire cell line panel. Additionally, the authors have expanded the database to include functional genomic data for an additional 29 commonly used breast cancer cell lines.
The three overarching goals in the original development of the SLKBase are: 1) to provide a rich source of information for anyone working with any of the SUM breast cancer cell lines, 2) to give researchers ready access to the large genomic data sets that have been developed with these cells, and 3) to allow researchers to perform orthogonal analyses of the various genomics data sets that we and others have obtained from the SUM lines. For more information on the development and contents of the database, please read the related article.
Datasets supporting the paper: The data mining tools accessed the following datasets to generate the figures and tables, and these datasets are downloadable from the Data Download centre on the SLKBase:
Exome sequencing data: SLKBase.exome_.seq_.sum_.xlsx
Gene amplification and expression data for the SUM cell lines: SUM44amplificationdata.xls, SUM52.xls, SUM149.xls, SUM159.xls, SUM185.xls, SUM190.xls, SUM225.xls, SUM229.xls, SUM1315.xls
Cellecta shRNA screen data for the SUM cell lines: SUM44Celectadata.csv, SUM52Cellectadata.csv, SUM102Cellectadata.csv, SUM149Cellectadata.csv, SUM159Cellectadata.csv, SUM185Cellectadata.csv, SUM190Cellectadata.csv, SUM225Cellectadata.csv, SUM229Cellectadata.csv, SUM1315hits.hit.csv, MCF10A.hits_.csv
Breast cancer cell line data included in this data record (these datasets were used to generate figures 1, 2 and 7 in the article):
Proteomics data from the Reverse Phase Protein Array (RPPA) assay analysis: Ethier.SUMline.RPPA.xlsx
Drug sensitivity data: NAVITOCLAX.drugsensitivity.Zscores.xlsx
Apelisib response data: Apelisib all lines (2).xlsx
Dose response data: 092614 Dose Response CP 52s.11.15.xlsx
All the files are in .xlsx, .xls, or .csv file format.
Results of transcript sequencing for AtT-20FlpIn cells. mRNA was isolated from AtT-20FlpIn cells using standard procedures, and next-generation sequencing was performed by Macrogen (https://dna.macrogen.com/). A report outlining the workflow and data analysis methods is available from the authors on request.
The deposited data are in an Excel file, which includes the gene symbol, transcript ID from the reference mouse genome, protein ID, and transcript abundance. The AtT-20FlpIn cells were generated by Dr Santiago and have been used as the 'wild type' cells for generating cell lines stably expressing GPCRs and ion channels for most of the molecular pharmacology projects in the Molecular Pharmacodynamics group.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Originally a reproduction of the EFO/Cellosaurus/DepMap/CCLE scenario posed in the Biomappings paper, this configuration imports several different cell and cell line resources and identifies mappings between them.
We have developed an online database describing the known cell lines from Coleoptera, Diptera, Hemiptera, Hymenoptera, and Lepidoptera that originated from crop pest insects. Cell line information has been primarily obtained from previous compilations of insect cell lines. (from homepage)
A virtual database currently indexing available cell lines from: Coriell Cell Repositories, International Mouse Strain Resource (IMSR), ATCC, NIH Human Pluripotent Stem Cell Registry, NIGMS Human Genetic Cell Repository, and Developmental Therapeutics Program.
Database of all cell lines used in biomedical research which include immortalized cell lines, naturally immortal cell lines (stem cells), widely used and distributed finite life cell lines, vertebrate cell lines (majority being human, mouse, and rat), and invertebrate (insects and ticks) cell lines, as well as cell line synonyms. Each cell line is provided with the following information: the recommended name (the name which appears in the original publication), a list of synonyms, a unique accession number, comments on a number of topics including misspellings and gene transfection, information on the tissue/organ origin with the UBERON code, the NCI Thesaurus or Orphanet ORDO code for the disease(s) the individual suffered from (for cancer and human genetic disease lines only), the species of origin, the parent cell line, cross-references of sister cell lines, the sex of the individual, the category in which the cell line belongs (Adult stem cell; Cancer cell line; Embryonic stem cell; Factor-dependent cell line; Finite cell line; Hybrid cell line; Hybridoma; Induced pluripotent stem cell; Spontaneously immortalized cell line; Stromal cell line; Telomerase immortalized cell line; Transformed cell line; Undefined cell line type), web links, publication references, and/or cross-references to cell line catalogs/collections, ontologies, cell lines databases/resources, and to databases that list cell lines as samples.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
HTLV-1, HHV-8, and SMRV specific read numbers of cell lines ordered by CCLE file names.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Journals that include instructions or requirements for cell line authentication in their author guidelines [28].
https://ega-archive.org/dacs/EGAC00001001606
RNA-seq data for 54 glioblastoma stem cell (GSC) lines. FASTQ files of the strand-specific paired-end RNA-seq data are available.
Cell Line Adenosine-to-Inosine RNA editing database. Searchable catalogue of RNA editing levels across cell lines. Used to facilitate rational choice of appropriate cell lines for future work on A-to-I RNA editing.
Database that aggregates sample information for reference samples (e.g. Coriell cell lines) and samples for which data exist in one of the EBI's assay databases, such as ArrayExpress, the European Nucleotide Archive, or the PRoteomics IDEntifications database (PRIDE). It provides links to assays for specific samples and accepts direct submissions of sample information. The goals of the BioSample Database include: 1) recording and linking sample information consistently within EBI databases such as ENA, ArrayExpress, and PRIDE; 2) minimizing data entry effort for EBI database submitters by enabling them to submit sample descriptions once and reference them later in data submissions to assay databases; and 3) supporting cross-database queries by sample characteristics. The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names, or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples (at) ebi.ac.uk.
Current data sources:
* European Nucleotide Archive (424,811 samples)
* PRIDE (17,001 samples)
* ArrayExpress (1,187,884 samples)
* ENCODE cell lines (119 samples)
* CORIELL cell lines (27,002 samples)
* Thousand Genome (2,628 samples)
* HapMap (1,417 samples)
* IMSR (248,660 samples)
https://www.gnu.org/licenses/lgpl-3.0-standalone.html
Cell line pharmacogenomics datasets for cancer biology and machine learning studies. The datasets are compatible with rcellminer and CellMinerCDB (see publications for details) and data can be extracted for use with Python-based projects.
An example for extracting data from the rcellminer and CellMinerCDB compatible packages:
# INSTALL ----
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("rcellminer")
# Replace path_to_file with the data package filename
install.packages(path_to_file, repos = NULL, type = "source")

# GET DATA ----
## rcellminer provides getAct(), getFeatureAnnot(), getAllFeatureData() and getSampleData()
library(rcellminer)
## Replace nciSarcomaData with the name of the dataset throughout the code
library(nciSarcomaData)

## DRUG DATA ----
drugAct <- exprs(getAct(nciSarcomaData::drugData))
drugAnnot <- getFeatureAnnot(nciSarcomaData::drugData)[["drug"]]

## MOLECULAR DATA ----
### List available datasets
names(getAllFeatureData(nciSarcomaData::molData))
### Extract data and annotations
expData <- exprs(nciSarcomaData::molData[["exp"]])
mirData <- exprs(nciSarcomaData::molData[["mir"]])
expAnnot <- getFeatureAnnot(nciSarcomaData::molData)[["exp"]]
mirAnnot <- getFeatureAnnot(nciSarcomaData::molData)[["mir"]]

## SAMPLE DATA ----
sampleAnnot <- getSampleData(nciSarcomaData::molData)
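Because the extracted objects are ordinary R matrices and data frames, one way to pass them to a Python-based project is to write them to CSV. The following is a minimal sketch, not part of the data packages themselves: the small matrix stands in for an object such as drugAct, and its values, row and column names, and the output file name are all hypothetical.

```r
# Stand-in for an extracted matrix such as drugAct (rows: drugs, columns: cell lines).
# The names and values here are hypothetical, for illustration only.
drug_mat <- matrix(
  c(1.2, -0.3, 0.8, 2.1),
  nrow = 2,
  dimnames = list(c("drugA", "drugB"), c("line1", "line2"))
)

# Write to CSV in a temporary directory; row names become the first column.
out_file <- file.path(tempdir(), "drugAct.csv")
write.csv(drug_mat, out_file)

# Read the file back to confirm the round trip preserves values and labels.
round_trip <- as.matrix(read.csv(out_file, row.names = 1, check.names = FALSE))
```

On the Python side, a file written this way can typically be loaded with pandas.read_csv(path, index_col=0), which restores the first column as row labels.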
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset contains the results of Avana library CRISPR-Cas9 genome-scale knockout screens (prefixed with Achilles) as well as mutation, copy number, and gene expression data (prefixed with CCLE) for cancer cell lines as part of the Broad Institute’s Cancer Dependency Map project. We have repackaged our fileset to include all quarterly-updating datasets produced by DepMap.
The Avana CRISPR-Cas9 genome-scale knockout data has expanded to include 625 cell lines, the RNAseq data includes 1,210 cell lines, and the copy number data includes 1,657 cell lines. Please see the README files for details regarding updates to the data processing pipeline procedures.
As our screening efforts continue, we will be releasing additional cancer dependency data on a quarterly basis for unrestricted use. For the latest datasets available, further analyses, and to subscribe to our mailing list, visit https://depmap.org.
Descriptions of the experimental methods and the CERES algorithm are published in http://dx.doi.org/10.1038/ng.3984. Additional Achilles processing information is published here: https://www.biorxiv.org/content/10.1101/720243v1.full. Some cell lines were processed using copy number data based on the Sanger Institute whole exome sequencing data (COSMIC: http://cancer.sanger.ac.uk.cell_lines, EGA accession number: EGAD00001001039) reprocessed using CCLE pipelines. A detailed description of the pipelines and tool versions for CCLE expression can be found here: https://github.com/broadinstitute/gtex-pipeline/blob/v9/TOPMed_RNAseq_pipeline.md.
Version 2: uploaded a new version of CCLE_gene_cn.csv to correctly reflect released cell lines.
Version 3: uploaded a new version of CCLE_gene_cn.csv that has the log2 transform correctly applied to it and removes duplicate cell lines.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
BL-70 RNA-Seq reads from the Taxonomer virus bin assigned to virus species or genera.
This data package contains expression profiles for proteins in normal and cancer tissues. It also contains data on sequence-based RNA levels in human tissues and cell lines.
Thousands of protein post-translational modifications (PTMs) dynamically impact nearly all cellular functions. Mass spectrometry is well suited to PTM identification, but proteome-scale analyses are biased towards PTMs with existing enrichment methods. To measure the full landscape of PTM regulation, software must overcome two fundamental challenges: intractably large search spaces and difficulty distinguishing correct from incorrect identifications. Here, we describe TagGraph, software that overcomes both challenges with a string-based search method orders of magnitude faster than current approaches, and a probabilistic validation model optimized for PTM assignments. When applied to a human proteome map, TagGraph tripled confident identifications while revealing thousands of modification types on nearly one million sites spanning the proteome. We expand known sites by orders of magnitude for highly abundant yet understudied PTMs such as proline hydroxylation, and derive tissue-specific insight into these PTMs’ roles. TagGraph expands our ability to survey the full landscape of PTM function and regulation.
hPSCreg is a global registry of human pluripotent stem cell (hPSC) lines containing manually validated information, including ethical provenance, procurement, derivation process, genetic and expression data, other biological and molecular characteristics, use, and quality of the line. Current status: 1092 hESC lines, 7212 hiPSC lines, 182 clinical studies, and 2394 certificates.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
893 United States import shipment records for cell lines, with prices, volumes, and current buyer-supplier relationships, based on the United States import trade database.