Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.
https://ega-archive.org/dacs/EGAC00001001974
Single-cell RNA sequencing of 26 primary breast cancers from the Wu et al. (2021) study. Data were generated using the Chromium controller (10x Genomics) and sequenced on the NextSeq 500 platform.
https://www.scilifelab.se/data/restricted-access/
Data Set Description
Single-cell RNA sequencing (Smart-Seq3) and whole exome sequencing from multiple regions of individual tumors from breast cancer patients, plus single-cell RNA-seq for two ovarian cancer cell lines.
The dataset contains raw sequencing data for various high-throughput molecular tests performed on two sample types: tumor samples from two breast cancer patients and cell lines derived from high-grade serous carcinoma patients. The breast cancer data comes from two patients: patient 1 (BCSA1) has two tumor regions (A-B) and patient 2 (BCSA2) has five regions (A-E). For a normal sample and each region from each patient, whole exome sequencing was performed using the Twist Biosciences Human Exome Kit by the SNP&SEQ Technology Platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. In addition, for each patient, EPCAM+ CD45- cells from all regions were sorted into a 384-well plate, and Smart-Seq3 libraries were prepared at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.
The HGSOC cell-line data comes from the OV2295R2 and TOV2295R cell lines described in Laks et al., Cell 2019 Nov 14; 179(5): 1207–1221.e22, doi: 10.1016/j.cell.2019.10.026. The cell-line Smart-Seq3 libraries were prepared from two 384-well plates at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.
Terms for access
This dataset is to be used for research on intratumor heterogeneity and subclonal evolution of tumors. To apply for conditional access to the dataset in this publication, please contact datacentre@scilifelab.se.
https://ega-archive.org/dacs/EGAC00001001974
Single-cell RNA sequencing of five TNBC primary breast cancers from the Wu et al. (2020) EMBO J study. Data were generated using the Chromium controller (10x Genomics) and sequenced on the NextSeq 500 platform.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We collected 40 tumor and adjacent normal tissue samples from 19 pathologically diagnosed NSCLC patients (10 LUAD and 9 LUSC) during surgical resections, rapidly digested the tissues to obtain single-cell suspensions, and constructed cDNA libraries from these samples within 24 hours using the 10x Genomics protocol. These libraries were sequenced on the Illumina NovaSeq 6000 platform. The raw gene expression matrices were generated using CellRanger (version 3.0.1). The data were then processed in R (version 3.6.0) using the Seurat R package (version 2.3.4).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
"*.csv" files contain the single-cell gene expression values (log2(tpm+1)) for all genes in each cell from melanoma and head and neck squamous cell carcinoma (HNSCC) tumors. The cell type and tumor of origin for each cell are also included in the "*.csv" files. The "MalignantCellSubtypes.xlsx" file defines the tumor subtype. "CCLE_RNAseq_rsem_genes_tpm_20180929.zip" was downloaded from the CCLE database.
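Because the expression values are stored as log2(tpm+1), recovering linear TPM means inverting that transform. The following is a minimal sketch, assuming the values have already been parsed into floats; the function names `log2tpm_to_tpm` and `tpm_to_log2tpm` are hypothetical helpers and are not part of the dataset.

```python
import math


def log2tpm_to_tpm(values):
    """Invert log2(TPM + 1) for a list of floats: TPM = 2**x - 1."""
    return [2.0 ** x - 1.0 for x in values]


def tpm_to_log2tpm(values):
    """Apply the forward transform log2(TPM + 1) to a list of floats."""
    return [math.log2(v + 1.0) for v in values]


# Illustrative values only, not taken from the dataset.
example = [0.0, 1.0, 3.0]
print(log2tpm_to_tpm(example))  # [0.0, 1.0, 7.0]
```

A value of 0 in the files therefore corresponds to a TPM of 0, which is why the +1 pseudocount is used before taking the logarithm.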
There is a growing need for integration of “Big Data” into undergraduate biology curricula. Transcriptomics is one avenue to examine biology from an informatics perspective. RNA sequencing has largely replaced the use of microarrays for whole genome gene expression studies. Recently, single cell RNA sequencing (scRNAseq) has unmasked population heterogeneity, offering unprecedented views into the inner workings of individual cells. scRNAseq is transforming our understanding of development, cellular identity, cell function, and disease. As a form of ‘Big Data,’ scRNAseq can be intimidating for students to conceptualize and analyze, yet it plays an increasingly important role in modern biology. To address these challenges, we created an engaging case study that guides students through an exploration of scRNAseq technologies. Students work in groups to explore external resources, manipulate authentic data and experience how single cell RNA transcriptomics can be used for personalized cancer treatment. This five-part case study is intended for upper-level life science majors and graduate students in genetics, bioinformatics, molecular biology, cell biology, biochemistry, biology, and medical genomics courses. The case modules can be completed sequentially, or individual parts can be separately adapted. The first module can also be used as a stand-alone exercise in an introductory biology course. Students need an intermediate mastery of Microsoft Excel but do not need programming skills. Assessment includes both students’ self-assessment of their learning, as answers to previous questions are used to progress through the case study, and instructor assessment of final answers. This case provides a practical exercise in the use of high-throughput data analysis to explore the molecular basis of cancer at the level of single cells.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We developed a single-cell transcriptomics pipeline for high-throughput pharmacotranscriptomic screening. We explored the transcriptional landscape of three HGSOC models (JHOS2, a representative cell line; PDC2 and PDC3, two patient-derived samples) after treating their cells for 24 hours with 45 drugs representing 13 distinct classes of mechanism of action. Our work establishes a new precision oncology framework for the study of molecular mechanisms activated by a broad array of drug responses in cancer.
.
├── 3D UMAPs/ → Interactive 3D UMAPs of cells treated with the 45 drugs used for multiplexed scRNA-seq. Related to Figure 4. Coordinates: x = UMAP 1; y = UMAP 2; z = UMAP 3. Legend: green = PDC1; blue = PDC2; red = JHOS2.
│   ├── DMSO_3D_UMAP_Dini.et.al.html → 3D UMAP of untreated cells.
│   └── drug_3D_UMAP_Dini.et.al.html → 3D UMAP of cells treated with (drug).
├── QC_plots/ → Diagnostic plots. Related to Figures 2–4.
│   ├── model_QC_violin_plot_2023.pdf → Violin plots of the QC metrics used to filter the data.
│   ├── model_col_HTO or model_row_HTO before and after filt → Heatmaps of the row or column HTO expression in each cell.
│   └── model_counts_histogram_2023.pdf → Histogram of the distribution of the total counts per cell after filtering for high-quality cells.
├── scRNAseq/ → scRNA-seq data. Related to Figures 2–4.
│   ├── AllData_subsampled_DGE_edgeR.csv.gz → Differential gene expression analysis results between treated and untreated cells via pseudobulk of aggregate subsamples, for each of the three models. Related to Figure 3.
│   └── All_vs_all_RNAclusters_DEG_signif.txt → Differential gene expression analysis results (p.adj < 0.05) of FindAllMarkers for the Leiden/RNA clusters.
├── PDCs.transcript.counts.tsv → Bulk RNA-seq count data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
└── PDCs.transcript.TPM.tsv → Bulk RNA-seq TPM data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
List of tumor microenvironment scRNA-seq datasets included in TMExplorer.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
Purpose: Triple-negative breast cancer presents a significant clinical challenge due to its aggressive nature and limited treatment options. This subtype is notorious for a poorer prognosis compared to other breast cancer forms, primarily due to the lack of identifiable treatment targets.
Methods: In our study, we delve deep into the molecular landscape of TNBC using public single-cell RNA sequencing datasets. Our integrative analysis aims to identify unique markers specific to TNBC, unravel the intricate gene mechanisms they are involved in, and explore new avenues for potential therapeutic interventions.
Results: Employing three comprehensive datasets, our study offers a novel perspective on the tumor microenvironment of TNBC. Specifically, we found 12 marker genes, including DSC2 and CDKN2A, uniquely expressed in TNBC cells, marking an advancement in understanding this cancer subtype. A comparative analysis of these markers across various components of the tumor microenvironment, including both cancerous and normal cells, highlights a distinctive feature. A key discovery of our study is the interaction between DSC2 and DSG2 genes within TNBC cells, suggesting a novel pathway of intercellular communication exclusive to this cancer type.
Conclusion: This finding not only corroborates previous hypotheses but also lays the foundation for a new structural understanding of triple-negative breast cancer, as revealed through our single-cell analysis workflow.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Metadata and counts matrix (barcode and genes files also provided) for the colorectal cancer (CRC) spatial transcriptomics and scRNA-seq dataset used in the Crescendo manuscript published by Millard et al. (2025). The batch column indicates whether a cell is from the scRNA-seq data or, if not, which spatial transcriptomics slice it comes from. The sample_id column indicates the sample the cell is from. The center_x and center_y columns give the center of the cell in space (scRNA-seq cells have 0 in these columns). The orig_publication_type column gives the fine-grained cell type labels from the original publication of the CRC scRNA-seq dataset, while the cresc_publication_type column gives the more coarse-grained cell type labels from the Crescendo publication.
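As an illustration of the coordinate convention described above, the sketch below separates scRNA-seq cells from spatially profiled cells by checking whether both center_x and center_y are 0. The column names `cell_id` and `batch`, the batch labels, and all values in the sample are assumptions made for the example; only center_x, center_y, and sample_id are named in the description.

```python
import csv
import io

# Hypothetical metadata excerpt; every value here is invented for illustration.
SAMPLE = """cell_id,batch,sample_id,center_x,center_y
c1,scRNA,s1,0,0
c2,slice1,s2,1532.5,880.1
c3,slice2,s3,410.0,92.7
"""


def split_by_modality(handle):
    """Return (scRNA-seq cell ids, spatial cell ids) from a metadata CSV handle."""
    scrna, spatial = [], []
    for row in csv.DictReader(handle):
        # Per the description, scRNA-seq cells carry 0 in both coordinate columns.
        is_scrna = float(row["center_x"]) == 0.0 and float(row["center_y"]) == 0.0
        (scrna if is_scrna else spatial).append(row["cell_id"])
    return scrna, spatial


scrna_cells, spatial_cells = split_by_modality(io.StringIO(SAMPLE))
print(scrna_cells, spatial_cells)
```

A spatial cell whose center happened to lie exactly at the origin would be misclassified by this check, so the batch column is likely the safer key when it is available.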
Raw gene expression counts are provided for:
all viable cells from 4T1 tumors (4T1_vaibale_raw_counts.txt),
CAFs from 4T1-Thy1.1 tumor (4T1_Thy1.1_CAF_raw.txt),
CAFs from mT3 tumor (MT3_CAF_raw.txt) and
normal mammary fibroblasts (Normal_mammary_fibroblasts_raw.txt).
Table of Contents
1. Main Description
---------------------------
This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled `marengo_code_for_paper_jan_2023.R` was used to generate the figures from the single-cell RNA sequencing data.
The following libraries are required for script execution:
File Descriptions
---------------------------
Linked Files
---------------------
This repository contains code for the analysis of a single-cell RNA-seq dataset. The raw FASTQ files and the aligned files were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:
Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)
Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719
Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)
Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:
> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.
1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
You can use the following code to set the working directory in R:
> setwd(directory)
4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies, setting up an environment in which the code in "marengo_code_for_paper_jan_2023.R" can be executed.
5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.
6. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.
7. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
https://ega-archive.org/dacs/EGAC00001001380
This dataset contains single-cell RNA sequencing data of PBMC samples from 10 bladder cancer patients. cDNAs and single-cell RNA libraries were prepared following the manufacturer’s user guide (10x Genomics). Each library was sequenced on a HiSeq 4000 (Illumina) to achieve ~300 million reads, following the manufacturer’s sequencing specification.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Beyondcell is a methodology for the identification of drug vulnerabilities in single cell RNA-seq data. To this end, Beyondcell focuses on the analysis of drug-related commonalities between cells by classifying them into distinct therapeutic clusters. We have validated the tool in a population of MCF7-AA cells exposed to 500nM of bortezomib and collected at different time points: t0 (before treatment), t12, t48 and t96 (72h treatment followed by drug wash and 24h of recovery) obtained from Ben-David U, et al., Nature, 2018. Here, you can find the integrated Seurat object obtained from this analysis. This object is meant to help users follow Beyondcell's analysis workflow.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data accompanying the manuscript describing MIX-Seq, a method for transcriptional profiling of mixtures of cancer cell lines treated with small molecule and genetic perturbations (McFarland and Paolella et al., Nat Commun, 2020). Data consists of single-cell RNA-sequencing (UMI count matrices), and associated drug sensitivity and genomic features of the cancer cell lines. See README file for more information on dataset contents.
Attribution 1.0 (CC BY 1.0) https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
All data supporting the findings of the publication "The Single-Cell Pathology Landscape of Breast Cancer", including high-dimension tiff images, single-cell and tumor/stroma masks, single-cell and patient data. The code used to produce the results of this study is available at https://github.com/BodenmillerGroup/SCPathology_publication.
OMEandSingleCellMasks.zip contains the ome-tiff stacks and the single-cell masks.
TumorStroma_masks.zip contains masks for tumor and stromal regions.
SingleCell_and_Metadata.zip contains the single-cell and patient data as well as all other input data for the R pipelines on the linked Github repository.
Important notes when working with the data:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Gene expression (counts) scRNA-seq of co-cultured cancer and immune cells treated with trifluridine or DMSO control, assayed at two time points (12 h and 72 h).
HCT116 cells were seeded in 6-well Nunc plates (50,000 cells/3mL/well) and precultured for 24 h before PBMCs were added at a 1:8 ratio. Co-cultures were treated with DMSO vehicle (0.1%) or FTD (3mM) for 12 h or 72 h. Dead cell removal with the MACS Dead Cell Removal Kit (Miltenyi Biotec, Gladbach, DEU) was performed according to the manufacturer’s instructions on cells treated for 72 h to increase the viability of the samples before RNA-sequencing. Samples treated for 12 h were not subjected to dead cell removal, as their viability was already sufficient. All samples were washed in PBS with 0.04% BSA (2x1mL). Chromium Next GEM Single Cell 3’ library preparation and RNA-sequencing were performed by the SNP&SEQ Technology Platform (National Genomics Infrastructure (NGI), Science for Life Laboratory, Uppsala University, Sweden).
This data set contains data processed with the Cell Ranger toolkit version 5.0.1 (10x Genomics) for demultiplexing, aligning reads to the human reference genome GRCh38, and generating gene-cell unique molecular identifiers
Publication version of the Single-Cell Tumor Immune Atlas.
This upload contains:
TICAtlas.rds: an rds file containing a Seurat object with the whole Atlas
TICAtlas.h5ad: an h5ad file with the whole Atlas
TICAtlas_downsampled.rds: an rds file containing a downsampled version of the Seurat object of the whole Atlas
TICAtlas_downsampled.h5ad: an h5ad file containing a downsampled version of the whole Atlas
TICAtlas_metadata.csv: a comma-separated text file with the metadata for each of the cells
All the files contain the following patient/sample metadata variables:
patient: assigned patient identifiers
nCountRNA and nFeatureRNA: number of UMIs and genes per cell
percent.mt: percentage of mitochondrial genes
gender: the patient's gender (male/female/unknown)
source: dataset of origin
subtype: cancer type (abbreviations as indicated in the preprint)
kmeans_cluster: patient clusters, NA if filtered out before clustering
lv1 and lv2: annotated cell type for each of the cells, two-level annotation (lv2 has more cell types)
If you have any issues with the metadata (e.g. unexpected factors, NA values...) you can use the TICAtlas_metadata.csv file. For more information, read our paper, check our GitHub and our ShinyApp.
h5ad files can be read with Python using Scanpy; rds files can be read in R using Seurat. For format conversion between AnnData and Seurat we recommend SeuratDisk. For other single-cell data formats you can use sceasy.
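The metadata table can be summarised without loading the full Seurat or AnnData object. Below is a minimal sketch, using only the Python standard library, that counts cells per lv1 annotation. The column names patient, source, subtype, lv1 and lv2 come from the description above; the `cell` column, the helper name `count_by`, and every value in the sample are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt in the shape of TICAtlas_metadata.csv; values are invented.
SAMPLE = """cell,patient,source,subtype,lv1,lv2
a,P1,study1,BC,T cells,CD8 T
b,P1,study1,BC,T cells,CD4 T
c,P2,study2,CRC,B cells,B cells
"""


def count_by(handle, column):
    """Count rows of a metadata CSV per distinct value of one column."""
    return Counter(row[column] for row in csv.DictReader(handle))


counts = count_by(io.StringIO(SAMPLE), "lv1")
print(counts.most_common())
```

To run this against the real file, the `io.StringIO(SAMPLE)` handle would be replaced with something like `open("TICAtlas_metadata.csv", newline="")`, assuming the file sits in the working directory and its header matches the variable names listed above.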
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains spatial transcriptomics data related to the Wu et al. 2021 study "A single-cell and spatially resolved atlas of human breast cancers". It provides processed count matrices, brightfield H&E images (plain and annotated) and metadata (containing clinical information and spot pathological details) for 6 primary breast cancers profiled using the Visium assay (10x Genomics). If you use this dataset in your research, please consider citing the above study.
The contents of the files are:
raw_count_matrices.tar.gz - spaceranger processed raw count matrices.
spatial.tar.gz - spaceranger processed spatial files (images, scalefactors, aligned fiducials, position lists)
filtered_count_matrices.tar.gz - filtered count matrices.
metadata.tar.gz - metadata for tissues and spots of filtered count matrices, including clinical subtype and pathological annotation of each spot.
images.pdf - pdf detailing the H&E and annotation images.