Table of Contents
- Main Description
- File Descriptions
- Linked Files
- Installation and Instructions
This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity". The code in marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data.
The following libraries are required for script execution:
Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap
The code can be downloaded and opened in RStudio. The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper. The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the differential gene expression (DGE) analysis, which were also used to create the DGE heatmap and the volcano plots. The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.
This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains the raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:
Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
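As a rough illustration of how the three files fit together, the sketch below parses a Matrix Market coordinate file with the standard library only and sums counts per cell. It assumes the usual layout in which rows are genes and columns are cell barcodes. The gene identifiers, barcodes, and counts are hypothetical stand-ins, not values from GSE223311.

```python
import csv
import io


def read_lines(handle):
    """Return the non-empty lines of a text file without trailing newlines."""
    return [line.rstrip("\n") for line in handle if line.strip()]


def read_mtx(handle):
    """Parse a Matrix Market coordinate file.

    Returns (n_rows, n_cols, entries), where entries maps
    (row_index, col_index) -> value using 0-based indices.
    """
    size_seen = False
    n_rows = n_cols = 0
    entries = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip the banner and comment lines
        fields = line.split()
        if not size_seen:
            # First data line: number of rows, columns, and stored entries
            n_rows, n_cols = int(fields[0]), int(fields[1])
            size_seen = True
            continue
        row, col, value = int(fields[0]), int(fields[1]), float(fields[2])
        entries[(row - 1, col - 1)] = value  # indices in the file are 1-based
    return n_rows, n_cols, entries


# Tiny in-memory stand-ins for the three files (hypothetical content).
mtx_text = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 5
3 1 2
2 2 7
"""
genes_text = "GENE_ID_1\tCd4\nGENE_ID_2\tCd8a\nGENE_ID_3\tGzmb\n"
barcodes_text = "AAACCTGA-1\nAAACGGGT-1\n"

n_genes, n_cells, counts = read_mtx(io.StringIO(mtx_text))
genes = [row[1] for row in csv.reader(io.StringIO(genes_text), delimiter="\t")]
barcodes = read_lines(io.StringIO(barcodes_text))

# Total counts per cell, i.e. column sums of the sparse matrix
totals = {barcode: 0.0 for barcode in barcodes}
for (gene_index, cell_index), value in counts.items():
    totals[barcodes[cell_index]] += value
```

For real files, the same functions can be given handles opened with `open(path)` (or `gzip.open(path, "rt")` if the files are compressed).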
Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files that store each read as four lines.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
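For orientation, a .fastq.gz file can be read with the Python standard library alone. The sketch below assumes the common four-line FASTQ layout (header, sequence, separator, quality) and uses a small temporary file with made-up reads. It is not part of the deposited code.

```python
import gzip
import os
import tempfile


def read_fastq(path):
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # separator line starting with '+'
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality  # drop the leading '@'


# Write two made-up reads to a temporary file and read them back.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.fastq.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nII#I\n")

records = list(read_fastq(demo_path))
```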
Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)
Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:
Ensure you have R version 4.1.2 or higher for compatibility.
Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.
Place the following files in your working directory: marengo_code_for_paper_jan_2023.R, Install_Packages.R, Marengo_newID_March242023.rds, genes_for_heatmap_fig5F.xlsx, all_res_deg_for_heat_updated_march2023.txt
You can use the following code to set the working directory in R:
setwd(directory)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analyzed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository. Our work puts an upper bound of 62% to field-wide reproducibility, based on the types of files submitted to GEO.
The archived dataset contains the following files:
- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0).
- output/document_summaries.csv, document summaries of NCBI GEO series
- output/publications.csv, publication info of NCBI GEO series
- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series
- output/single-cell.csv, single cell experiments
- spots.csv, NCBI SRA sequencing run metadata
- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.
- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
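To make the p-value histogram and pi0 columns more concrete, the sketch below bins a set of p values and estimates pi0, the proportion of true null hypotheses. It uses Storey's fixed-threshold estimator as a simple stand-in and does not claim to be the estimator the workflow itself used. The p values are synthetic and the function names are illustrative.

```python
def p_value_histogram(p_values, n_bins=20):
    """Count p values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        counts[index] += 1
    return counts


def storey_pi0(p_values, lam=0.5):
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam))."""
    m = len(p_values)
    if m == 0:
        raise ValueError("no p values supplied")
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


# Synthetic example: 40 small p values (signal) plus 60 evenly spread ones (nulls).
signal = [0.001 * (i + 1) for i in range(40)]
nulls = [(i + 0.5) / 60 for i in range(60)]
p_values = signal + nulls

hist = p_value_histogram(p_values)
pi0 = storey_pi0(p_values)  # 30 of 100 values exceed 0.5, so pi0 = 30 / 50 = 0.6
```

A peak near zero on top of a flat background, as in this synthetic set, is the histogram shape expected when some tests carry a real effect. The flat part of the histogram is what drives the pi0 estimate.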
This is the GitHub repository for the single-cell RNA sequencing data analysis for the human manuscript.

The following essential libraries are required for script execution:
- Seurat
- scRepertoire
- ggplot2
- dplyr
- ggridges
- ggrepel
- ComplexHeatmap

Linked Files
--------------------------------------
This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains the raw FASTQ files as well as the aligned files that were deposited in GEO. Descriptions of the linked datasets are provided below:

1. Gene Expression Omnibus (GEO) ID: GSE229626
   - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
   - Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
   - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
2. Sequence read archive (SRA) repository
   - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
   - Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files that store each read as four lines.
   - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

The following code can be used to set the working directory in R:

> setwd(directory)

Steps:
1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
   - Human_code_April2023.R
   - Install_Packages.R
4. Open the file titled Install_Packages.R and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies.
5. Open the Human_code_April2023.R script and execute commands as necessary.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.
The geo-htseq.tar.gz archive contains the following files:
output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0).
output/document_summaries.csv, document summaries of NCBI GEO series.
output/suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions.
output/suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO.
output/publications.csv, publication info of NCBI GEO series.
output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series
output/spots.csv, NCBI SRA sequencing run metadata.
output/cancer.csv, cancer related experiment accessions.
output/transcription_factor.csv, TF related experiment accessions.
output/single-cell.csv, single cell experiment accessions.
blacklist.txt, list of supplementary files that were either too large to import or caused the computing environment to crash during import.
The workflow used to produce this dataset is available on GitHub at rstats-tartu/geo-htseq.
The geo-htseq-updates.tar.gz archive contains the following files:
results/detools_from_pmc.csv, differential expression analysis programs inferred from published articles
results/n_data.csv, manually curated sample size info for NCBI GEO HT-seq series
results/simres_df_parsed.csv, pi0 values estimated from differential expression results obtained from simulated RNA-seq data
results/data/parsed_suppfiles_rerun.csv, pi0 values estimated using smoother method from anti-conservative p-value sets
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.
- This release includes GEO series up to Dec-31, 2020;
- Fixed the missing optional dependency xlrd, which affected the import of some xls files; previously only openpyxl was used (thanks to an anonymous reviewer);
- All files in supplementary _RAW.tar files were checked for p values; previously, _RAW.tar files were omitted entirely (thanks to an anonymous reviewer).
The archived dataset contains the following files:
- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0).
- output/document_summaries.csv, document summaries of NCBI GEO series
- output/publications.csv, publication info of NCBI GEO series
- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series
- output/single-cell.csv, single cell experiments
- spots.csv, NCBI SRA sequencing run metadata
- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.
- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
Normalization of RNA-sequencing data is essential for accurate downstream inference, but the assumptions upon which most methods are based do not hold in the single-cell setting. Consequently, applying existing normalization methods to single-cell RNA-seq data introduces artifacts that bias downstream analyses. To address this, we introduce SCnorm for accurate and efficient normalization of scRNA-seq data. A total of 183 single cells (92 H1 cells, 91 H9 cells), each sequenced twice, were used to evaluate SCnorm in normalizing single-cell RNA-seq experiments. A total of 48 bulk H1 samples were used to compare bulk and single-cell properties. For single-cell RNA-seq, the identical single-cell indexed and fragmented cDNA was pooled at 96 cells per lane or at 24 cells per lane to test the effects of sequencing depth, resulting in approximately 1 million and 4 million mapped reads per cell in the two pooling groups, respectively.
Single-cell RNA-seq analysis of HSCs and HPCs during fetal, neonatal and adult stages of development. HSCs and HPCs were isolated at the stated ages for single-cell sequencing. The strain background for all experiments is B6.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Colorectal cancer (CRC) is the second most common cancer in China. Autophagy plays an important role in the initiation and development of CRC. Here, we assessed the prognostic value and potential functions of autophagy-related genes (ARGs) through an integrated analysis of single-cell RNA sequencing (scRNA-seq) data from the Gene Expression Omnibus (GEO) and RNA sequencing (RNA-seq) data from The Cancer Genome Atlas (TCGA).
Methods: We analyzed scRNA-seq data from GEO using various single-cell analysis techniques, including cell clustering and identification of differentially expressed genes (DEGs) in different cell types. Additionally, we performed gene set variation analysis (GSVA). The differentially expressed ARGs among different cell types and those between CRC and normal tissues were identified using TCGA RNA-seq data, and the hub ARGs were screened. Finally, a prognostic model based on the hub ARGs was constructed and validated. Patients with CRC in the TCGA datasets were divided into high- and low-risk groups based on their risk score, and immune cell infiltration and drug sensitivity were compared between the two groups.
Results: We obtained single-cell expression profiles of 16,270 cells and clustered them into seven cell types. GSVA revealed that the DEGs among the seven cell types were enriched in many signaling pathways associated with cancer development. We screened 55 differentially expressed ARGs and identified 11 hub ARGs. Our prognostic model revealed that the 11 hub ARGs, including CTSB, ITGA6, and S100A8, had good predictive ability. Moreover, immune cell infiltration in CRC tissues differed between the two groups, and the hub ARGs were significantly correlated with the enrichment of immune cell infiltration. The drug sensitivity analysis revealed that patients in the two risk groups differed in their response to anti-cancer drugs.
Conclusion: We developed a novel prognostic 11-hub ARG risk model, and these hub genes may act as potential therapeutic targets for CRC.
We report scRNA-seq of cells isolated from the brains of young and old mice. We isolated single cells from the brains of young (6 months, n=2) and old (24 months, n=4) mice and performed scRNA-seq using the 10X Genomics system.
🧬 GSE120180 – Single-Cell Transcriptomics of Aging Human Skin
This dataset contains single-cell RNA-seq profiles from aging human skin, originally published as part of the GEO Series GSE120180. The dataset has been converted to .parquet format for faster I/O and compatibility with machine learning pipelines.
📂 Dataset Overview
Original Source: GEO: GSE120180
Species: Homo sapiens
Tissue: Human skin
Technique: 10x Genomics scRNA-seq
Format: .parquet (converted… See the full description on the dataset page: https://huggingface.co/datasets/Iris8090/GSE120180.
This repository contains a collection of three datasets that we use to introduce the Gene Mover's Distance (GMD) in [1], described below. The datasets are exported in a basic text-based format (.csv files), like other public datasets widely used in the machine learning community. They are extracted from the Gene Expression Omnibus (GEO) database [2], where they appear under accession numbers GSE116256 (blood leukemia, [3]), GSE84133 (human pancreas, [4]), and GSE67835 (human brain, [5]). In GEO, the datasets are split across several files that contain many more details than are reported in this version; the format proposed here should make the data easier for other researchers to use.
The Gene Mover's Distance is a measure of similarity between a pair of cells based on their gene expression profiles obtained via single-cell RNA sequencing. The underlying idea of GMD is to interpret the gene expression array of a single cell as a discrete probability measure. The distance between two cells is then computed by solving an Optimal Transport problem between the two corresponding discrete measures. The Gene Mover's Distance can be used, for instance, to solve two classification problems: the classification of cells according to their condition and according to their type.
The repository contains a Python script to check the basic statistics of the data.
[1] Bellazzi, R., Codegoni, A., Gualandi, S., Nicora, G., Vercesi, E. The Gene Mover's Distance: Single-cell similarity via Optimal Transport. https://arxiv.org/abs/2102.01218
[2] Gene Expression Omnibus (GEO) database, http://www.ncbi.nlm.nih.gov/geo
[3] van Galen, P., Hovestadt, V., Wadsworth II, M.H., Hughes, T.K., Griffin, G.K., Battaglia, S., Verga, J.A., Stephansky, J., Pastika, T.J., Story, J.L. and Pinkus, G.S., 2019. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell, 176(6), pp.1265-1281.
[4] Baron, M., Veres, A., Wolock, S.L., Faust, A.L., Gaujoux, R., Vetere, A., Ryu, J.H., Wagner, B.K., Shen-Orr, S.S., Klein, A.M. and Melton, D.A., 2016. A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure. Cell Systems, 3(4), pp.346-360.
[5] Darmanis, S., Sloan, S.A., Zhang, Y., Enge, M., Caneda, C., Shuer, L.M., Gephart, M.G.H., Barres, B.A. and Quake, S.R., 2015. A survey of human brain transcriptome diversity at the single cell level. Proceedings of the National Academy of Sciences, 112(23), pp.7285-7290.
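The idea of treating each cell as a discrete probability measure and comparing two cells by optimal transport can be sketched on a toy example. The code below normalizes expression vectors to sum to one and approximates the transport cost with Sinkhorn iterations, an entropy-regularized solver swapped in here for brevity. It is not the exact solver or the gene-to-gene ground distance used in [1]; the three-gene cost matrix and the expression vectors are hypothetical.

```python
import math


def to_measure(expression):
    """Normalize a non-negative expression vector so it sums to one."""
    total = sum(expression)
    return [x / total for x in expression]


def sinkhorn_cost(a, b, cost, reg=0.05, n_iter=500):
    """Approximate the optimal transport cost between measures a and b.

    cost[i][j] is the ground distance between gene i and gene j;
    reg is the entropic regularization strength (smaller is closer to exact).
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan entry (i, j) is u[i] * kernel[i][j] * v[j]
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


# Hypothetical ground distances: genes 0 and 1 are similar, gene 2 is distant.
gene_cost = [
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

cell_a = to_measure([8, 2, 0])
cell_b = to_measure([2, 8, 0])
cell_c = to_measure([0, 2, 8])

d_ab = sinkhorn_cost(cell_a, cell_b, gene_cost)
d_ac = sinkhorn_cost(cell_a, cell_c, gene_cost)

# Sanity check with point masses: all mass must move from gene 0 to gene 1.
d_point = sinkhorn_cost([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], gene_cost)
```

Cells A and B differ only in how expression is split between two similar genes, so their distance comes out much smaller than the distance between A and C, whose expression sits mostly on a distant gene. A plain Euclidean comparison of the raw vectors would not use the gene-to-gene similarity in this way.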
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analyzed the field of expression profiling by high throughput sequencing, or RNA-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository. Our work puts an upper bound of 56% to field-wide reproducibility, based on the types of files submitted to GEO.
The archived dataset contains the following files:
- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0).
- output/document_summaries.csv, document summaries of GEO series
- output/publications.csv, publication info of GEO series
- output/scopus_citedbycount.csv, Scopus citation info of GEO series
- output/single-cell.csv, single cell experiments
- spots.csv, sequencing run metadata: number of spots and bases
- suppfilenames.txt, list of all supplementary file names of GEO submissions. One filename per row.
- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
GSE208154 – Complete Cell Atlas of Aging C. elegans
📄 Description
This dataset provides a comprehensive single-cell RNA sequencing (scRNA-seq) profile of C. elegans across six adult time points. It is a highly curated developmental and aging atlas, useful for longevity, developmental biology, and transcriptomic studies.
📊 Dataset Summary
Species: Caenorhabditis elegans
Type: Single-cell RNA-seq (scRNA-seq)
Samples: 11 GSM samples from GEO
Format:… See the full description on the dataset page: https://huggingface.co/datasets/longevity-db/GSE208154.
Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev
Remark 2: The same data, split into smaller pieces extracted from the huge file here, are available at https://www.kaggle.com/datasets/alexandervc/scrnaseq-exposed-to-multiple-compounds and are easier to load and work with.
Data: results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
Data - scRNA expressions for several cell lines affected by drugs with different doses/durations.
The data are from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944
Status: Public on Dec 05, 2019
Title: Massively multiplex chemical transcriptomics at single cell resolution
Organisms: Homo sapiens; Mus musculus
Experiment type: Expression profiling by high throughput sequencing
Summary: Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatments were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode-containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.
The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell
The authors split the data into 4 sub-datasets: see sciPlex1, sciPlex2, sciPlex3, and sciPlex4 in the filenames. The main dataset is sciPlex3, which contains about 600K cells.
The data split into small parts, each of which can easily be loaded into memory, can be found at https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds
Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"
A collection of some bioinformatics related resources on kaggle: https://www.kaggle.com/general/203136
Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6
Also see the review in Nature Methods by P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x
Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles
(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)
Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)
Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Raw sequencing data for "Comparative Analysis of Single-Cell RNA Sequencing Methods".
https://www.ncbi.nlm.nih.gov/pubmed/28212749
In addition to the GEO submission https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75790, you can find here raw bam files for UMI-methods tagged with cell barcode and UMI sequences.
MD5 checksum: f10825509952fffd9c4dc0c1dcb9eb8e
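A downloaded file can be checked against a published MD5 checksum before use. The sketch below streams a file through `hashlib.md5` and compares the result with an expected hex string. The demo file and its content are placeholders, since the record does not say which file the checksum above belongs to.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with a published checksum string."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Demo on a small temporary file (placeholder content, not the real download).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as handle:
    handle.write(b"hello")

demo_digest = md5_of_file(demo_path)
demo_ok = matches_checksum(demo_path, "5D41402ABC4B2A76B9719D911017C592")
```

For the real download, call `matches_checksum(path_to_download, "f10825509952fffd9c4dc0c1dcb9eb8e")`. A result of `False` suggests an incomplete or corrupted transfer.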
This SuperSeries is composed of the SubSeries listed below. Refer to individual Series
We performed single-cell RNA sequencing (scRNA-Seq) on nasal wash cells freshly collected from adults with COVID-19, influenza A, or no disease (healthy). Major cell types and subtypes were defined using cluster analysis and classic transcriptional markers. Seq-Well single-cell RNA-Seq analysis of cells taken from nasal wash samples from healthy donors and patients diagnosed with either COVID-19 or influenza A
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes the raw data of graphs shown in figures 1, 2, 3, 5, 6, and 7 and in supplementary figures 1, 2, 3, 4, 5, 7, and 8 of the paper: Human iPSC-derived neural stem cells displaying radial glia signature exhibit long-term safety in mice. Luciani M, Garsia C, Beretta S, Cifola I, Peano C, Merelli I, Petiti L, Miccio A, Meneghini V, Gritti A. Nat Commun. 2024 Nov 1;15(1):9433. doi: 10.1038/s41467-024-53613-7. PMID: 39487141 The bulk RNA-seq, SREBF1-deficient RNA-seq, and ChIP-seq data generated in this study have been deposited at GEO under accession number GSE239446. The single-cell RNA-seq data generated in this study have been deposited at GEO under accession number GSE238206. The processed RNA-seq, ChIP-seq, and scRNA-seq data (list of DEGs and GO terms) are available in Supplementary Data 1-4 files.
Human induced pluripotent stem cell-derived neural stem/progenitor cells (hiPSC-NSCs) hold promise for treating neurodegenerative and demyelinating disorders. However, comprehensive studies on their identity and safety remain limited. In this study, we demonstrate that hiPSC-NSCs adopt a radial glia-associated signature, sharing key epigenetic and transcriptional characteristics with human fetal neural stem cells (hfNSCs) while exhibiting divergent profiles from glioblastoma stem cells. Long-term transplantation studies in mice showed robust and stable engraftment of hiPSC-NSCs, with predominant differentiation into glial cells and no evidence of tumor formation. Additionally, we identified the Sterol Regulatory Element Binding Transcription Factor 1 (SREBF1) as a regulator of astroglial differentiation in hiPSC-NSCs. These findings provide valuable transcriptional and epigenetic reference datasets to prospectively define the maturation stage of NSCs derived from different hiPSC sources and demonstrate the long-term safety of hiPSC-NSCs, reinforcing their potential as a viable alternative to hfNSCs for clinical applications.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Upon metastatic seeding in the liver, liver macrophages, including Kupffer cells, acquire a transcriptional profile typical of tumor-associated macrophages (TAMs), which support tumor progression. MicroRNAs (miRNAs) fine-tune TAM pro-tumoral functions, making their modulation a promising strategy for macrophage reprogramming into an anti-tumoral phenotype. Here, we analyze the transcriptomic profiles of liver and splenic macrophages, identifying miR-342-3p as a key regulator of liver macrophage function. miR-342-3p is highly active in healthy liver macrophages but significantly downregulated in colorectal cancer liver metastases (CRLMs). Lentiviral vector-engineered liver macrophages enforcing miR-342-3p expression acquire a pro-inflammatory phenotype and reduce CRLM growth. We identify Slc7a11, a cysteine-glutamate antiporter linked to pro-tumoral activity, as a direct miR-342-3p target, which may be at least partially responsible for TAM phenotypic reprogramming. Our findings highlight the potential of in vivo miRNA modulation as a therapeutic strategy for TAM reprogramming, offering an approach to enhance cancer immunotherapy.
Data and code availability: Next-generation sequencing data are deposited at the Gene Expression Omnibus (GEO) with the following accession numbers: GEO: GSE274043 (scRNA-seq), GSE274044 (RNA-seq on iKCs), GSE274045 (small RNA-seq on splenic and hepatic cell populations), and GSE274046 (bulk RNA-seq on splenic and hepatic cell populations). The code is available at GitLab: http://www.bioinfotiget.it/gitlab/custom/Bresesti_Cell_Reports_2025.
Single-cell suspensions enriched for epithelial cells were obtained from the duodenum, jejunum, and ileum of a 7.5-week-old pig and subjected to single-cell RNA sequencing (scRNA-seq). scRNA-seq was performed to provide transcriptomic profiles of epithelial cells, with 695 cells annotated into 6 cell types. Deeper interrogation of the data revealed previously undescribed cells in the porcine intestine, and region-specific gene expression profiles within specific cell subsets. The data herein include an .h5Seurat file of the epithelial cell subsets analyzed. The file may be used to reconstruct different analyses and perform further data queries. Scripts for the original data analyses are found at https://github.com/USDA-FSEPRU/scRNAseqEpSI_Pilot. Raw data are available at GEO accession GSE208613. Data are available for online query at https://singlecell.broadinstitute.org/single_cell/study/SCP1936/regional-epithelial-cell-diversity-in-the-small-intestine-of-pigs.
Resource Title: .h5Seurat object - epithelial cells.
File Name: EpithelialCells.tar
Resource Description: Epithelial cells used for data analysis, available in .h5Seurat file format. Untar file before use.
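The untar step can be done from a shell or, as sketched below, with Python's standard `tarfile` module. The demo builds a small archive with a placeholder member and extracts it. The member name `demo_object.h5seurat` is hypothetical, since the record does not list the contents of EpithelialCells.tar.

```python
import os
import tarfile
import tempfile


def untar(archive_path, destination):
    """Extract a tar archive into destination and return the member names."""
    os.makedirs(destination, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters
            archive.extractall(destination, filter="data")
        else:
            archive.extractall(destination)
        return archive.getnames()


# Build a small demo archive with a placeholder member, then extract it.
work_dir = tempfile.mkdtemp()
member_path = os.path.join(work_dir, "demo_object.h5seurat")
with open(member_path, "wb") as handle:
    handle.write(b"placeholder")

archive_path = os.path.join(work_dir, "demo.tar")
with tarfile.open(archive_path, "w") as archive:
    archive.add(member_path, arcname="demo_object.h5seurat")

out_dir = os.path.join(work_dir, "extracted")
names = untar(archive_path, out_dir)
```

For the real archive, `untar("EpithelialCells.tar", "EpithelialCells")` would place the extracted file or files in an `EpithelialCells` directory next to the download.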