Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Processed data for MASI manuscript
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Data-driven omics approaches have rapidly advanced our understanding of the molecular heterogeneity of Alzheimer’s disease (AD). However, because brain tissue is rarely available, there is an urgent need for a non-invasive tool to detect alterations in the AD brain. Cell-free RNA (cfRNA), which crosses the blood-brain barrier, could reflect AD brain pathology and serve as a diagnostic biomarker.
Methods: We integrated plasma-derived cfRNA-seq data from 337 samples (172 AD patients and 165 age-matched controls) with brain-derived single-cell RNA-seq (scRNA-seq) data from 88 samples (46 AD patients and 42 controls) to explore the potential of cfRNA profiling for AD diagnosis. A systematic comparative analysis of the cfRNA and brain scRNA-seq datasets was conducted to identify dysregulated genes linked to AD pathology. Machine learning models (support vector machine, random forest, and logistic regression) were trained on the cfRNA expression patterns of the identified gene set to predict AD diagnosis and classify disease progression stages. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), with robustness assessed through cross-validation and independent validation cohorts.
Results: We identified 34 dysregulated genes with consistent expression changes in both cfRNA and scRNA-seq. Machine learning models based on the cfRNA expression patterns of these 34 genes accurately identified AD patients (highest AUC = 89%) and distinguished patients at an early stage of AD. Furthermore, classifiers based on the expression of the 34 genes in brain transcriptome data showed robust predictive performance for assessing the risk of AD in the population (highest AUC = 94%).
Discussion: This multi-omics approach overcomes the limitations of invasive brain biomarkers and noisy blood-based signatures. The 34-gene panel provides non-invasive molecular insights into AD pathogenesis and supports early screening. Although cfRNA stability remains a challenge for clinical translation, the framework highlights the potential for precision diagnostics and personalized therapeutic monitoring in AD.
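The AUC metric used to evaluate the classifiers above can be illustrated with a small sketch. This is not the authors' evaluation code; it is a minimal, dependency-free illustration that assumes binary labels (1 = AD, 0 = control) and one predicted score per sample. The function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a cross-validation setting, a function like this would be applied to the held-out predictions of each fold and the per-fold values averaged.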
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We collected 40 tumor and adjacent normal tissue samples from 19 pathologically diagnosed NSCLC patients (10 LUAD and 9 LUSC) during surgical resections. The tissues were rapidly digested to obtain single-cell suspensions, and cDNA libraries were constructed within 24 hours using the 10x Genomics protocol. These libraries were sequenced on the Illumina NovaSeq 6000 platform. Raw gene expression matrices were generated using CellRanger (version 3.0.1), and the data were processed in R (version 3.6.0) using the Seurat R package (version 2.3.4).
https://ega-archive.org/dacs/EGAC00001003458
This dataset contains 10 samples from 9 patients with chronic graft-versus-host disease (GVHD). Each sample is analysed with Chromium V(D)J and 5' Gene Expression Platform v1.1 (10X Genomics). The raw data includes fastq files for Gene expression and fastq files for V(D)J Expression. The processed data have been deposited in the ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-13419.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This RDS file contains processed single-cell RNA sequencing (scRNA-seq) data comparing immune cell populations from germ-free (GF) and specific-pathogen-free (SPF) mice. The dataset includes:
Samples: Peripheral blood (PB) and bone marrow (BM) from GF and SPF mice
Cell counts (raw): 21,827 cells (PB) and 19,940 cells (BM)
Cell counts (quality-filtered): 18,344 high-quality cells (PB) and 16,537 high-quality cells (BM)
Gene coverage: Median 1,426 genes per cell (PB) and 1,391 genes per cell (BM)
Cell classifications: 18 major cell identities further divided into 25 subpopulations
Annotation: Cells identified using established marker genes for blood cells
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Developing robot perception systems for handling objects in the real-world requires computer vision algorithms to be carefully scrutinized with respect to the expected operating domain. This demands large quantities of ground truth data to rigorously evaluate the performance of algorithms.
The Object Cluttered Indoor Dataset (OCID) is an RGBD dataset containing point-wise labeled point clouds for each object. The data was captured using two ASUS-PRO Xtion cameras positioned at different heights. It captures diverse settings of objects, background, context, sensor-to-scene distance, viewpoint angle and lighting conditions. The main purpose of OCID is to allow systematic comparison of existing object segmentation methods in scenes with an increasing amount of clutter. In addition, OCID also provides ground-truth data for other vision tasks such as object classification and recognition.
OCID comprises 96 fully built-up cluttered scenes. Each scene is a sequence of labeled point clouds created by building an increasingly cluttered scene incrementally, adding one object after another. The first item in a sequence contains no objects, the second contains one object, and so on up to the final count of added objects.
The dataset uses 89 different objects chosen as representatives of the Autonomous Robot Indoor Dataset (ARID) [1] classes and the YCB Object and Model Set (YCB) [2] dataset objects.
The ARID20 subset contains scenes including up to 20 objects from ARID. The ARID10 and YCB10 subsets include cluttered scenes with up to 10 objects from ARID and the YCB objects respectively. The scenes in each subset are composed of objects from only one set at a time to maintain separation between datasets. Scene variation includes different floor (plastic, wood, carpet) and table textures (wood, orange striped sheet, green patterned sheet). The complete set of data provides 2346 labeled point-clouds.
OCID subsets are structured so that specific real-world factors can be individually assessed.
You can find all labeled pointclouds of the ARID20 dataset for the first sequence on a table recorded with the lower mounted camera in this directory:
./ARID20/table/bottom/seq01/pcd/
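The directory layout in the example path above can be sketched as a small helper. This is an illustrative assumption rather than part of the OCID tooling: the function name `ocid_pcd_dir` is hypothetical, and the two-digit zero padding of the sequence number is inferred only from `seq01`. Only the values that appear in the example (`ARID20`, `table`, `bottom`) are confirmed by the text; the names of the other subsets' folders, surfaces and camera positions should be checked against the downloaded data.

```python
from pathlib import PurePosixPath


def ocid_pcd_dir(subset, surface, camera, sequence, root="."):
    """Build the relative directory holding the labeled point clouds.

    Follows the pattern <subset>/<surface>/<camera>/seqNN/pcd taken from the
    single example path in the dataset description (layout assumed to hold
    for other sequences as well).
    """
    return PurePosixPath(root) / subset / surface / camera / f"seq{sequence:02d}" / "pcd"


# First ARID20 sequence on a table, recorded with the lower mounted camera.
example_dir = ocid_pcd_dir("ARID20", "table", "bottom", 1)
```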
In addition to labeled organized point-cloud files, corresponding depth, RGB and 2d-label-masks are available:
OCID was created using EasyLabel, a semi-automatic annotation tool for RGBD data. EasyLabel processes recorded sequences of organized point-cloud files and exploits incrementally built-up scenes, where in each take one additional object is placed. The recorded point-cloud data is then accumulated, and the depth difference between two consecutive recordings is used to label new objects. The code is available here.
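The depth-difference idea described above can be sketched in a few lines. This is not the EasyLabel implementation: it is a simplified illustration that assumes depth frames are given as nested lists in meters, that a depth of 0 marks a missing measurement, and that a fixed threshold separates sensor noise from a newly placed object. The function name `label_new_object` and the threshold value are hypothetical.

```python
def label_new_object(prev_depth, curr_depth, prev_labels, new_label, threshold=0.01):
    """Assign new_label to pixels whose depth changed between two takes.

    prev_depth, curr_depth: 2D lists of depth values (0 = no measurement, assumed).
    prev_labels: 2D list of labels accumulated from earlier takes.
    Pixels with a valid depth change larger than threshold get new_label;
    all other pixels keep their previous label.
    """
    labels = []
    for prev_row, curr_row, label_row in zip(prev_depth, curr_depth, prev_labels):
        out_row = []
        for d_prev, d_curr, lab in zip(prev_row, curr_row, label_row):
            valid = d_prev > 0 and d_curr > 0
            if valid and abs(d_prev - d_curr) > threshold:
                out_row.append(new_label)
            else:
                out_row.append(lab)
        labels.append(out_row)
    return labels


# Toy 2x2 frames: an object appears in two pixels; one pixel has no valid depth.
prev = [[1.0, 1.0], [1.0, 0.0]]
curr = [[1.0, 0.8], [0.8, 0.8]]
mask = label_new_object(prev, curr, [[0, 0], [0, 0]], new_label=1)
```

Applying the function take by take, with an increasing `new_label`, accumulates one label per added object, which mirrors the incremental scene construction described in the text.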
OCID data for instance recognition/classification
For ARID10 and ARID20, additional data are available for object recognition and classification tasks. They contain semantically annotated RGB and depth image crops extracted from the OCID dataset.
The structure is as follows:
The data is provided by Mohammad Reza Loghmani.
If you found our dataset useful, please cite the following paper:
@inproceedings{DBLP:conf/icra/SuchiPFV19,
author = {Markus Suchi and
Timothy Patten and
David Fischinger and
Markus Vincze},
title = {EasyLabel: {A} Semi-Automatic Pixel-wise Object Annotation Tool for
Creating Robotic {RGB-D} Datasets},
booktitle = {International Conference on Robotics and Automation, {ICRA} 2019,
Montreal, QC, Canada, May 20-24, 2019},
pages = {6678--6684},
year = {2019},
crossref = {DBLP:conf/icra/2019},
url = {https://doi.org/10.1109/ICRA.2019.8793917},
doi = {10.1109/ICRA.2019.8793917},
timestamp = {Tue, 13 Aug 2019 20:25:20 +0200},
biburl = {https://dblp.org/rec/bib/conf/icra/SuchiPFV19},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@proceedings{DBLP:conf/icra/2019,
title = {International Conference on Robotics and Automation, {ICRA} 2019,
Montreal, QC, Canada, May 20-24, 2019},
publisher = {{IEEE}},
year = {2019},
url = {http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8780387},
isbn = {978-1-5386-6027-0},
timestamp = {Tue, 13 Aug 2019 20:23:21 +0200},
biburl = {https://dblp.org/rec/bib/conf/icra/2019},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
For any questions or issues with the OCID-dataset, feel free to contact the author:
For specific questions about the OCID-semantic crops data please contact:
[1] Loghmani, Mohammad Reza et al. "Recognizing Objects in-the-Wild: Where do we Stand?" 2018 IEEE International Conference on Robotics and Automation (ICRA) (2018): 2170-2177.
[2] Berk Calli, Arjun Singh, James Bruce, Aaron Walsman, Kurt Konolige, Siddhartha Srinivasa, Pieter Abbeel, Aaron M Dollar, Yale-CMU-Berkeley dataset for robotic manipulation research, The International Journal of Robotics Research, vol. 36, Issue 3, pp. 261 – 268, April 2017.
Here we employed single-cell RNA sequencing to identify the transcriptional program of Nanos- and Vasa-positive cells and their changes during development. Our single-cell sequencing analysis of six developmental stages in P. miniata revealed cell types derived from the three germ layers and expression of the germ cell genes Nanos and Vasa. We used these datasets to parse out 20 cell lineages of the embryo identified by this approach, to focus on the key transitions of germ cell gene expression, and to test their coexpression with key signaling components. Overall design: Adult Patiria miniata animals were collected by either Peter Halmay (PeterHalmay@gmail.com) or Josh Ross (info@scbiomarine.com) off the California coast. Embryos were cultured essentially as described previously (Fresques et al., 2016). Embryos were cultured in filtered (0.2 micron) sea water collected at the Marine Biological laboratories in Woods Hole, MA, until the appropriate stage for dissociation. All embryos used in the study resulted from the mating of one male and one female. Multiple fertilizations were initiated and timed such that the appropriate stages of embryonic development were reached at a common endpoint. The embryos were then collected, washed twice with calcium-free sea water, and suspended in hyalin-extraction media (HEM) for 10-15 minutes, depending on the stage of dissociation. When cells began to dissociate, the embryos were collected and washed in 0.5 M NaCl, gently sheared with a pipette, run through a 40 micron Nitex mesh, counted on a hemocytometer, and diluted to reach the appropriate concentration for the scRNA-seq protocol. Equal numbers of embryos were used at each time point, and at no time were cells or embryos pelleted in a centrifuge (Oulhen et al., 2019).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: T cells induced from induced pluripotent stem cells (iPSCs) derived from antigen-specific T cells (T-iPS-T cells) are an attractive tool for T cell immunotherapy. The induction of cytotoxic T-iPS-T cells is well established under feeder-free conditions for off-the-shelf production; however, the induction of helper T-iPS-T cells remains challenging.
Methods: We analyzed T-iPS-T cells matured in 3D organoid culture at different steps in the culture process at the single-cell level. T-iPS-T cell datasets were merged with an available human thymocyte dataset based on single-cell RNA sequencing (scRNA-seq). In particular, we searched for genes crucial for generating CD4+ T-iPS-T cells by comparing T-iPS-T cells established in 2D feeder-free or 3D organoid culture.
Results: The scRNA-seq data indicated that T-iPS-T cells are similar to T cells transitioning to human thymocytes, with SELENOW, GIMAP4, 7, SATB1, SALMF1, IL7R, SYTL2, S100A11, STAT1, IFITM1, LZTFL1 and SOX4 identified as candidate genes for the 2D feeder-free induction of CD4+ T-iPS-T cells.
Discussion: This study provides single-cell transcriptome datasets of iPS-T cells and supports further analysis of CD4+ T cell generation from T-iPSCs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Non-small cell lung cancer (NSCLC) metastatic to the brain leptomeninges (LMD) is rapidly fatal and cannot be biopsied, and cancer cells in the cerebrospinal fluid (CSF) are few; the tissue samples available for research and for the development of effective treatments are therefore severely limited. We overcame these obstacles by using LMD patient CSF to perform massively parallel qPCR to analyze cell-free RNA signatures (n=14) and to perform single-cell RNA sequencing (scRNAseq; n=197 cells from 4 patients).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains single-cell RNA sequencing (scRNA-seq) data of 3,000 peripheral blood mononuclear cells (PBMCs) from a healthy donor, processed using the 10x Genomics Chromium platform. The raw data was obtained from 10x Genomics and subsequently aligned using Cell Ranger 8.0.1 with the GENCODE Release 47 (GRCh38.p14) reference genome.
The dataset includes the following output files from the Cell Ranger pipeline:
filtered_feature_bc_matrix.h5: Filtered count matrix in HDF5 format
filtered_feature_bc_matrix: Filtered gene-barcode matrix in directory format
raw_feature_bc_matrix: Raw gene-barcode matrix in directory format
raw_feature_bc_matrix.h5: Raw count matrix in HDF5 format
This dataset is valuable for researchers studying single-cell transcriptomics, immune cell profiling, and bioinformatics pipeline benchmarking.
File format: HDF5 and Matrix Market (MTX)
Reference Genome: GENCODE Release 47 (GRCh38.p14)
Processing Pipeline: Cell Ranger 8.0.1
For any questions or collaborations, please feel free to contact the uploader.
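For readers unfamiliar with the Matrix Market (MTX) format used by the directory-format matrices, the following is a minimal sketch of how its coordinate text layout can be read. It assumes an already decompressed file with integer counts, where rows are features and columns are barcodes, as is usual for Cell Ranger output. The function names `parse_mtx` and `counts_per_barcode` and the toy matrix are hypothetical; in practice an established reader from a library such as SciPy or Scanpy would normally be preferred.

```python
def parse_mtx(lines):
    """Parse Matrix Market coordinate text.

    Returns (n_rows, n_cols, entries), where entries maps 0-based
    (row, col) index pairs to integer counts. Lines starting with '%'
    (the header and comments) are skipped; the first remaining line
    holds the matrix size and the number of stored entries.
    """
    shape = None
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        parts = line.split()
        if shape is None:
            shape = (int(parts[0]), int(parts[1]), int(parts[2]))
            continue
        # Matrix Market indices are 1-based; convert to 0-based.
        entries[(int(parts[0]) - 1, int(parts[1]) - 1)] = int(parts[2])
    if shape is None:
        raise ValueError("no size line found")
    n_rows, n_cols, nnz = shape
    if len(entries) != nnz:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


def counts_per_barcode(n_cols, entries):
    """Sum counts over features for each barcode (column)."""
    totals = [0] * n_cols
    for (_, col), value in entries.items():
        totals[col] += value
    return totals


# Toy matrix with 3 features and 2 barcodes (invented for illustration).
toy_lines = [
    "%%MatrixMarket matrix coordinate integer general",
    "3 2 4",
    "1 1 5",
    "3 1 2",
    "2 2 7",
    "3 2 1",
]
n_features, n_barcodes, toy_entries = parse_mtx(toy_lines)
toy_totals = counts_per_barcode(n_barcodes, toy_entries)
```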
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objectives: The aim of the present study was to construct a polygenic risk score (PRS) for poor survival among patients with stomach adenocarcinoma (STAD) based on expression of malignant cell markers.
Methods: Integrated analyses of bulk and single-cell RNA sequencing (scRNA-seq) of STAD and normal stomach tissues were conducted to identify malignant and non-malignant markers. Analyses of the scRNA-seq profile from early STAD were used to explore intratumoral heterogeneity (ITH) of the malignant cell subpopulations. Dimension reduction, cell clustering, pseudotime, and gene set enrichment analyses were performed. The marker genes of each malignant tissue and cell cluster were screened to create a PRS using Cox regression analyses. Combined with the PRS and routine clinicopathological characteristics, a nomogram tool was generated to predict prognosis of patients with STAD. The prognostic power of the PRS was validated in two independent external datasets.
Results: The malignant and non-malignant cells were identified according to 50 malignant and non-malignant cell markers. The malignant cells were divided into nine clusters with different marker genes and biological characteristics. Pseudotime analysis showed the potential differentiation trajectory of these nine malignant cell clusters and identified genes that affect cell differentiation. Ten malignant cell markers were selected to generate a PRS: RGS1, AADAC, NPC2, COL10A1, PRKCSH, RAMP1, PRR15L, TUBA1A, CXCR6, and UPP1. The PRS was associated with both overall and progression-free survival (PFS) and proved to be a prognostic factor independent of routine clinicopathological characteristics. PRS could successfully divide patients with STAD in three datasets into high- or low-risk groups. In addition, we combined PRS and the tumor clinicopathological characteristics into a nomogram tool to help predict the survival of patients with STAD.
Conclusion: We revealed limited but significant intratumoral heterogeneity in STAD and proposed a malignant cell subset marker-based PRS through integrated analysis of bulk sequencing and scRNA-seq data.
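A risk score built from Cox regression is commonly a weighted sum of marker expression values, with patients then split into risk groups at a cutoff. The sketch below illustrates that general pattern only. The coefficient values are invented, only two of the ten genes are used, and the median cutoff is an assumption; the abstract does not state the fitted coefficients or the cutoff the authors used. The function names are hypothetical.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum over genes of coefficient * expression."""
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label patients 'high' if their score is above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {patient: ("high" if score > cutoff else "low") for patient, score in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
toy_coefficients = {"RGS1": 0.5, "UPP1": -0.25}
toy_expression = {
    "patient_1": {"RGS1": 2.0, "UPP1": 4.0},
    "patient_2": {"RGS1": 4.0, "UPP1": 0.0},
    "patient_3": {"RGS1": 2.0, "UPP1": 0.0},
}
toy_scores = risk_scores(toy_expression, toy_coefficients)
toy_groups = split_by_median(toy_scores)
```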
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Global patterns of immune cell communication in the immune microenvironment of skin cutaneous melanoma (SKCM) are not well understood. Here we characterized the signaling roles of immune cell populations and the main contributing signals. We explored how multiple immune cells and signaling pathways coordinate with each other and established a prognostic signature based on key biomarkers specific to cellular communication.
Methods: The single-cell RNA sequencing (scRNA-seq) dataset was downloaded from the Gene Expression Omnibus (GEO) database, and the immune cells were extracted and re-annotated according to cell markers defined in the original study to identify their specific signatures. We computed immune-cell communication networks by counting the number of links or summarizing the communication probability to visualize cross-talk tendencies among immune cells. By combining analyses of communication networks with the identification of communication modes, all networks were quantitatively characterized and compared. Based on bulk RNA sequencing data, we trained on the specific markers of hub communication cells using an integrated set of machine learning programs to develop new immune-related prognostic combinations.
Results: An eight-gene monocyte-related signature (MRS) was built and confirmed as an independent risk factor for disease-specific survival (DSS). MRS has strong predictive value for progression-free survival (PFS) and is more accurate than traditional clinical variables and molecular features. The low-risk group has better immune function, is infiltrated with more lymphocytes and M1 macrophages, and shows higher expression of HLA, immune checkpoints, chemokines and costimulatory molecules. Pathway analysis based on seven databases confirms the biological distinctness of the two risk groups. Additionally, the regulon activity profiles of 18 transcription factors highlight possible differential regulatory patterns between the two risk groups, suggesting that epigenetic event-driven transcriptional networks may be an important distinction. MRS was identified as a powerful tool to benefit SKCM patients. Moreover, the IFITM3 gene was identified as the key gene and was validated by immunohistochemistry as highly expressed at the protein level in SKCM.
Conclusion: MRS is accurate and specific in evaluating the clinical outcomes of SKCM patients. IFITM3 is a potential biomarker. Both are promising for improving the prognosis of SKCM patients.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The development of high-throughput sequencing technologies and targeted therapeutic strategies has significantly improved the prognosis of lung adenocarcinoma (LUAD) patients with sensitive gene mutations. However, patients harboring rare or no actionable mutations rarely benefit from these targeted therapies. This study aimed to identify novel molecular subtypes and construct a prognostic signature to enhance the stratification of LUAD prognosis.
Materials and methods: Novel molecular subtypes of LUAD patients were identified by applying 10 distinct clustering algorithms to multi-omics data. Single-cell RNA-sequencing (scRNA-seq) data were integrated to characterize subtype-specific immune microenvironments. A multi-omics and machine learning-driven prognostic signature (MO-MLPS) was constructed in The Cancer Genome Atlas (TCGA) LUAD dataset using ten machine learning algorithms and subsequently validated across six independent datasets from the Gene Expression Omnibus (GEO) database. The robustness of the model was assessed using the concordance index (C-index), Kaplan-Meier survival analyses, receiver operating characteristic (ROC) curves, and both univariate and multivariate Cox regression analyses. We further confirmed the effects of ANLN knockdown and the expression of a domain-negative anillin protein (dnANLN) via western blotting, cell proliferation assays, flow cytometry, and transwell migration assays in vitro.
Results: Our analysis revealed that the novel molecular subtypes exhibited differences in prognoses, biological functions, and immune infiltration profiles in LUAD. The MO-MLPS was successfully established and validated across TCGA-LUAD cohorts, six independent GEO datasets, and their composite meta-cohort. Higher risk scores from the MO-MLPS correlated with poorer prognosis in LUAD, with AUC values exceeding 0.5 at 1, 3, and 5 years across various cohorts. The signature outperformed 49 previously published prognostic signatures. Furthermore, patients classified as high risk exhibited significantly worse overall and progression-free survival than those classified as low risk. Notably, ANLN knockdown and dnANLN expression significantly inhibited cell proliferation and migration in vitro and enhanced the efficacy of docetaxel.
Conclusion: A comprehensive analysis of multi-omics data redefines the molecular subtypes of LUAD patients. The MO-MLPS derived from subtype characteristics has the potential to serve as a clinically valuable prognostic tool. Furthermore, ANLN emerges as a promising novel therapeutic target in the treatment of LUAD.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation of GraphFP’s performance on quantifying the stochastic dynamics of cell-type frequencies with cell-cell interaction term (W ≠ 0) and without cell-cell interaction term (W = 0) on the murine cerebral cortex dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Idiopathic pulmonary fibrosis (IPF) is an interstitial lung disease (ILD) with poor prognosis. S100 calcium binding protein A12 (S100A12) has been reported as a prognostic serum biomarker in IPF, but its correlation with IPF in lung tissue and bronchoalveolar lavage fluid (BALF) remains unclear.
Methods: Datasets were collected from the Gene Expression Omnibus (GEO) database. Pearson correlation coefficients, Kaplan–Meier analysis, Cox regression analysis, functional enrichment analysis and other methods were used. Single-cell RNA-sequencing (scRNA-seq) analysis was also used to explore the role of S100A12 and related genes in IPF.
Results: S100A12 was mainly and highly expressed in monocytes, and its expression was downregulated in the lungs of patients with IPF according to scRNA-seq and transcriptome analysis. However, S100A12 expression was upregulated in both the blood and BALF of patients with IPF. In addition, 10 genes were found to interact with S100A12 according to the protein–protein interaction (PPI) network, and the first four transcription factors (TFs) targeting these genes were identified using the hTFtarget database. The two most significant co-expressed genes of S100A12 were S100A8 and S100A9. The 3 genes were significantly negatively associated with lung function and positively associated with St. George’s Respiratory Questionnaire (SGRQ) scores in the lungs of patients with IPF. High expression of the 3 genes was also associated with higher mortality in BALF, and with shorter transplant-free survival (TFS) and progression-free survival (PFS) time in blood. The prognostic predictive value of S100A12 was superior to that of S100A8 and S100A9 in patients with IPF, and the composite variable [S100A12 + GAP index (gender, age, and physiological index)] may be a more effective predictive index.
Conclusion: These results imply that S100A12 might be an efficient disease severity and prognostic biomarker in patients with IPF.