100+ datasets found

Bgee gene expression data
bgee.org
Updated May 21, 2024
Cite
The Bgee Team (2024). Bgee gene expression data [Dataset]. https://www.bgee.org
Explore at:
Dataset updated
May 21, 2024
Dataset authored and provided by
The Bgee Team
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Bgee is a database for retrieval and comparison of gene expression patterns across multiple animal species. It provides an intuitive answer to the question "where is a gene expressed?" and supports research in cancer and agriculture, as well as evolutionary biology.
Human Protein Atlas - Tissue
proteinatlas.org
v25.proteinatlas.org
+ more versions
Cite
Human Protein Atlas, Human Protein Atlas - Tissue [Dataset]. https://www.proteinatlas.org/humanproteome/tissue
Explore at:
Dataset authored and provided by
Human Protein Atlas
License
https://www.proteinatlas.org/about/licence
Description
Tissue methods

This resource of the Human Protein Atlas focuses on the expression profiles of genes in human tissues at both the mRNA and protein level. The protein expression data from 45 normal human tissue types is derived from antibody-based protein profiling using conventional and multiplex immunohistochemistry. All underlying images of immunohistochemistry-stained normal tissues are available together with knowledge-based annotation of protein expression levels. The protein data covers 15312 genes (76%) for which antibodies are available. The mRNA expression data is derived from deep sequencing of RNA (RNA-seq) from 51 different normal tissue types.

More information about the specific content and the generation and analysis of the data in the resource can be found on the Methods Summary. Learn about:

protein localization in tissues at a single-cell level

if a gene is enriched in a particular tissue (specificity)

which genes have a similar expression profile across tissues (expression cluster)
Human Gene Expression Database Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Human Gene Expression Database Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/human-gene-expression-database-data-package/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains expression profiles for proteins in normal and cancer tissues. It also contains data on sequence-based RNA levels in human tissues and cell lines.
Bgee: dataBase for Gene Expression Evolution
dknet.org
neuinfo.org
+2more
Updated Jan 29, 2022
Cite
(2022). Bgee: dataBase for Gene Expression Evolution [Dataset]. http://identifiers.org/RRID:SCR_002028
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_002028
Dataset updated
Jan 29, 2022
Description
Database to retrieve and compare gene expression patterns between animal species. Bgee first maps heterogeneous expression data (currently bulk RNA-Seq, scRNA-Seq, Affymetrix, in situ hybridization, and EST data) to anatomy and development of different species. Bgee is based exclusively on curated healthy wild-type expression data (e.g., no gene knock-out, no treatment, no disease), to provide a comparable reference of gene expression.
Human Protein Atlas - Brain
proteinatlas.org
v25.proteinatlas.org
Updated Sep 18, 2017
+ more versions
Cite
Human Protein Atlas (2017). Human Protein Atlas - Brain [Dataset]. https://www.proteinatlas.org/humanproteome/brain
Explore at:
Dataset updated
Sep 18, 2017
Dataset authored and provided by
Human Protein Atlas
License
https://www.proteinatlas.org/about/licence
Description
Brain methods

This resource provides comprehensive spatial profiling of the brain, including an overview of protein expression in the mammalian brain based on integration of data from human, pig and mouse. Transcriptomics data combined with affinity-based protein in situ localization down to single-cell detail is available in this brain-centric sub-atlas of the Human Protein Atlas. The data presented are for human genes and their one-to-one orthologues in pig and mouse. Gene summary pages provide the hierarchical expression landscape, from 13 main regions of the brain to individual nuclei and subfields, for every protein-coding gene. For selected proteins, high-content images are available to explore the cellular and subcellular protein distribution. In addition, the Brain resource contains lists of genes with elevated expression in one or a group of regions to help the user identify unique protein expression profiles linked to physiology and function.

More information about the specific content and the generation and analysis of the data in this resource can be found on the Methods Summary. Learn about:

Expression levels for all human proteins in regions and subregions of the human brain

Expression levels for all proteins with human orthologs in regions and subregions of the pig and mouse brain

Brain enriched genes with higher expression in any of the regions of the brain compared to peripheral organs

Regional enriched genes with higher expression in a single or few regions of the brain

Cell-type and cell-compartment distribution of selected proteins in the human and mouse brain

Differences in gene expression between mammalian species

Additional information: In addition to the data provided in the brain resource there is also data on human retina and single cell data containing information on protein expression in human neuronal and non-neuronal cell-types in the central nervous system.
Data from: Plant Expression Database
agdatacommons.nal.usda.gov
datasetcatalog.nlm.nih.gov
+2more
bin
Updated Feb 9, 2024
+ more versions
Cite
Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson (2024). Plant Expression Database [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Plant_Expression_Database/24661179
Explore at:
Available download formats: bin
Dataset updated
Feb 9, 2024
Dataset provided by
PLEXdb
Authors
Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Description
[NOTE: PLEXdb is no longer available online. Oct 2019.] PLEXdb (Plant Expression Database) is a unified gene expression resource for plants and plant pathogens. PLEXdb is a genotype-to-phenotype, hypothesis-building information warehouse, leveraging highly parallel expression data with seamless portals to related genetic, physical, and pathway data. PLEXdb (http://www.plexdb.org), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, encouraging individuals and/or consortia to upload genome-scale data sets and contrast them with previously archived data. These analyses facilitate the interpretation of structure, function and regulation of genes in economically important plants.

A list of Gene Atlas experiments highlights data sets that give responses across different developmental stages, conditions and tissues. Tools at PLEXdb allow users to perform complex analyses quickly and easily. The Model Genome Interrogator (MGI) tool supports mapping gene lists onto corresponding genes from model plant organisms, including rice and Arabidopsis. MGI predicts homologies and displays gene structures and supporting information for annotated genes and full-length cDNAs. The gene list-processing wizard guides users through PLEXdb functions for creating, analyzing, annotating and managing gene lists. Users can upload their own lists or create them from the output of PLEXdb tools, and then apply diverse higher-level analyses, such as ANOVA and clustering. PLEXdb also provides methods for users to track how gene expression changes across many different experiments using the Gene OscilloScope. This tool can identify interesting expression patterns, such as up-regulation under diverse conditions, and can be used to check any gene's suitability as a steady-state control.

Resources in this dataset:

Resource Title: Website Pointer for Plant Expression Database, Iowa State University.

File Name: Web Page, url: https://www.bcb.iastate.edu/plant-expression-database [NOTE: PLEXdb is no longer available online. Oct 2019.] Project description for the Plant Expression Database (PLEXdb) and integrated tools.
Data from: Comparing RNA-Seq and microarray gene expression data in two...
data.nasa.gov
osdr.nasa.gov
Updated Apr 1, 2025
+ more versions
Cite
nasa.gov (2025). Comparing RNA-Seq and microarray gene expression data in two zones of the Arabidopsis root apex relevant to spaceflight. [Dataset]. https://data.nasa.gov/dataset/comparing-rna-seq-and-microarray-gene-expression-data-in-two-zones-of-the-arabidopsis-root
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA: http://nasa.gov/
Description
Premise of the study: The root apex is an important region involved in environmental sensing, but comprises a very small part of the root. Obtaining root apex transcriptomes is therefore challenging when the samples are limited. The feasibility of using tiny root sections for transcriptome analysis was examined, comparing RNA sequencing (RNA-Seq) to microarrays in characterizing genes that are relevant to spaceflight.

Methods: Arabidopsis thaliana Columbia ecotype (Col-0) roots were sectioned into Zone 1 (0.5 mm; root cap and meristematic zone) and Zone 2 (1.5 mm; transition, elongation, and growth-terminating zone). Differential gene expression in each was compared.

Results: Both microarrays and RNA-Seq proved applicable to the small samples. A total of 4180 genes were differentially expressed (with fold changes of 2 or greater) between Zone 1 and Zone 2. In addition, 771 unique genes and 19 novel transcriptionally active regions were identified by RNA-Seq that were not detected in microarrays. However, microarrays detected spaceflight-relevant genes that were missed in RNA-Seq.

Discussion: Single root tip subsections can be used for transcriptome analysis using either RNA-Seq or microarrays. Both RNA-Seq and microarrays provided novel information. These data suggest that techniques for dealing with small, rare samples from spaceflight can be further enhanced, and that RNA-Seq may miss some spaceflight-relevant changes in gene expression.
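The fold-change cutoff quoted in the description (fold changes of 2 or greater between Zone 1 and Zone 2) can be illustrated with a minimal sketch. The gene names and expression values below are invented for illustration and are not taken from the dataset; the function name is hypothetical, and the study's own statistical pipeline is not described in this listing.

```python
import math


def differentially_expressed(zone1, zone2, min_fold=2.0):
    """Return genes whose expression differs by at least min_fold between zones.

    zone1 and zone2 map gene IDs to positive expression values
    (hypothetical inputs; a real analysis would also apply statistical tests).
    """
    threshold = math.log2(min_fold)
    hits = {}
    for gene in zone1.keys() & zone2.keys():
        a, b = zone1[gene], zone2[gene]
        if a <= 0 or b <= 0:
            continue  # the log ratio is undefined for non-positive values
        log2_fc = math.log2(b / a)
        # Keep the gene if it changes by min_fold in either direction.
        if abs(log2_fc) >= threshold:
            hits[gene] = log2_fc
    return hits


# Made-up values purely for illustration.
zone1 = {"geneA": 10.0, "geneB": 8.0, "geneC": 5.0}
zone2 = {"geneA": 40.0, "geneB": 9.0, "geneC": 1.0}
result = differentially_expressed(zone1, zone2)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a four-fold increase and a four-fold decrease are equally far from zero.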
Gene Expression Database
rrid.site
dknet.org
+2more
Updated Jan 29, 2022
Cite
(2022). Gene Expression Database [Dataset]. http://identifiers.org/RRID:SCR_006539
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006539
Dataset updated
Jan 29, 2022
Description
Community database that collects and integrates the gene expression information in MGI with a primary emphasis on endogenous gene expression during mouse development. The data in GXD are obtained from the literature, from individual laboratories, and from large-scale data providers. All data are annotated and reviewed by GXD curators. GXD stores and integrates different types of expression data (RNA in situ hybridization; Immunohistochemistry; in situ reporter (knock in); RT-PCR; Northern and Western blots; and RNase and Nuclease s1 protection assays) and makes these data freely available in formats appropriate for comprehensive analysis. There is particular emphasis on endogenous gene expression during mouse development. GXD also maintains an index of the literature examining gene expression in the embryonic mouse. It is comprehensive and up-to-date, containing all pertinent journal articles from 1993 to the present and articles from major developmental journals from 1990 to the present. GXD stores primary data from different types of expression assays and by integrating these data, as data accumulate, GXD provides increasingly complete information about the expression profiles of transcripts and proteins in different mouse strains and mutants. GXD describes expression patterns using an extensive, hierarchically-structured dictionary of anatomical terms. In this way, expression results from assays with differing spatial resolution are recorded in a standardized and integrated manner and expression patterns can be queried at different levels of detail. The records are complemented with digitized images of the original expression data. The Anatomical Dictionary for Mouse Development has been developed by our Edinburgh colleagues, as part of the joint Mouse Gene Expression Information Resource project. GXD places the gene expression data in the larger biological context by establishing and maintaining interconnections with many other resources. 
Integration with MGD enables a combined analysis of genotype, sequence, expression, and phenotype data. Links to PubMed, Online Mendelian Inheritance in Man (OMIM), sequence databases, and databases from other species further enhance the utility of GXD. GXD accepts both published and unpublished data.
ncRNA Expression Database
neuinfo.org
scicrunch.org
+2more
Updated Oct 15, 2024
Cite
(2024). ncRNA Expression Database [Dataset]. http://identifiers.org/RRID:SCR_008630
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008630
Dataset updated
Oct 15, 2024
Description
Database of long noncoding RNA expression that integrates annotated expression data from various sources in human and mouse. The database contains both microarray and in situ hybridization data, and supplies a rich tapestry of ancillary information for featured ncRNAs, including evolutionary conservation, secondary structure evidence, genomic context links and antisense relationships.
Table1_Preclinical species gene expression database: Development and...
datasetcatalog.nlm.nih.gov
Updated Jan 17, 2023
Cite
Vo, Andy; Krause, Caitlin; Liguori, Michael J.; Kowalkowski, Kenneth; Van Vleet, Terry R.; Suwada, Kinga; Mittelstadt, Scott; Rendino, Lauren; Mahalingaiah, Prathap Kumar; Peterson, Richard; Blomme, Eric A. G. (2023). Table1_Preclinical species gene expression database: Development and meta-analysis.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001025224
Explore at:
Dataset updated
Jan 17, 2023
Authors
Vo, Andy; Krause, Caitlin; Liguori, Michael J.; Kowalkowski, Kenneth; Van Vleet, Terry R.; Suwada, Kinga; Mittelstadt, Scott; Rendino, Lauren; Mahalingaiah, Prathap Kumar; Peterson, Richard; Blomme, Eric A. G.
Description
The evaluation of toxicity in preclinical species is important for identifying potential safety liabilities of experimental medicines. Toxicology studies provide translational insight into potential adverse clinical findings, but data interpretation may be limited due to our understanding of cross-species biological differences. With the recent technological advances in sequencing and analyzing omics data, gene expression data can be used to predict cross species biological differences and improve experimental design and toxicology data interpretation. However, interpreting the translational significance of toxicogenomics analyses can pose a challenge due to the lack of comprehensive preclinical gene expression datasets. In this work, we performed RNA-sequencing across four preclinical species/strains widely used for safety assessment (CD1 mouse, Sprague Dawley rat, Beagle dog, and Cynomolgus monkey) in ∼50 relevant tissues/organs to establish a comprehensive preclinical gene expression body atlas for both males and females. In addition, we performed a meta-analysis across the large dataset to highlight species and tissue differences that may be relevant for drug safety analyses. Further, we made these databases available to the scientific community. This multi-species, tissue-, and sex-specific transcriptomic database should serve as a valuable resource to enable informed safety decision-making not only during drug development, but also in a variety of disciplines that use these preclinical species.
RNA-seq example data
kaggle.com
zip
Updated Jun 16, 2023
Cite
Tuhin Rana (2023). RNA-seq example data [Dataset]. https://www.kaggle.com/datasets/rana2hin/rna-seq-example-data
Explore at:
Available download formats: zip (2193914798 bytes)
Dataset updated
Jun 16, 2023
Authors
Tuhin Rana
License
https://cdla.io/sharing-1-0/
Description
Dataset Description

This dataset contains RNA-seq data from human cells. The data was collected using the Illumina HiSeq 2500 platform. The data includes raw sequencing reads, gene annotations, and phenotypic data for the samples.

Files and Folders

Files can be downloaded using the following command:

wget ftp://ftp.ccb.jhu.edu/pub/RNAseq_protocol/chrX_data.tar.gz

Once the file has been downloaded, it can be extracted using the following command:

tar xvzf chrX_data.tar.gz

This will create a directory called chrX_data containing the following files:

genes/chrX.gtf
genome/chrX.fa
geuvadis_phenodata.csv
indexes/
mergelist.txt
samples/
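For readers who prefer to script the download and extraction steps, the two commands above can be approximated with Python's standard library. This is a sketch under assumptions: the function names are hypothetical, the FTP server may not always be reachable, and nothing here is part of the dataset itself.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the dataset description.
ARCHIVE_URL = "ftp://ftp.ccb.jhu.edu/pub/RNAseq_protocol/chrX_data.tar.gz"


def download(url, dest):
    """Fetch url to dest, skipping the transfer if dest already exists."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir="."):
    """Unpack a .tar.gz archive into out_dir and return its top-level names."""
    with tarfile.open(archive, "r:gz") as tar:
        members = tar.getnames()
        tar.extractall(Path(out_dir))
    return sorted({name.split("/")[0] for name in members})
```

Calling `extract(download(ARCHIVE_URL, "chrX_data.tar.gz"))` stands in for the `wget` and `tar xvzf` steps. Note that the archive is roughly 2 GB according to the listing, so the download may take a while.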

Here are some additional details about the files in the chrX_data directory:

genes/chrX.gtf - This file contains gene annotations for the human X chromosome. It is in the GTF format, which is a standard format for gene annotations. The GTF file contains information about the start and end positions of genes, as well as their transcripts.

genome/chrX.fa - This file contains the reference genome sequence for the human X chromosome. It is in the FASTA format, which is a standard format for storing DNA sequences.

geuvadis_phenodata.csv - This file contains phenotypic data for the samples in the dataset. The phenotypic data includes information such as the age, sex, and disease status of the samples.

indexes/ - This directory contains index files for HISAT2. Index files are used to speed up the alignment of sequencing reads to a reference genome.

mergelist.txt - This file lists the per-sample files to be merged. In the HISAT2/StringTie/Ballgown workflow mentioned under Usage, such a list is passed to StringTie's merge mode (stringtie --merge) to combine the per-sample transcript assemblies.

samples/ - This directory contains the raw sequencing data. The raw sequencing data is in the FASTQ format, which is a standard format for storing sequencing reads.

Usage

This dataset can be used to perform RNA-seq analysis using a variety of tools, such as HISAT2, StringTie, and Ballgown.

Here are some examples of how this dataset can be used:

To identify differentially expressed genes between two groups of samples.

To build a gene expression atlas for a particular tissue or cell type.

To study the expression of genes involved in a particular disease.

source: ftp://ftp.ccb.jhu.edu/pub/RNAseq_protocol/chrX_data.tar.gz
DataSheet2_Preclinical species gene expression database: Development and...
frontiersin.figshare.com
xlsx
Updated Jun 3, 2023
+ more versions
Cite
Caitlin Krause; Kinga Suwada; Eric A. G. Blomme; Kenneth Kowalkowski; Michael J. Liguori; Prathap Kumar Mahalingaiah; Scott Mittelstadt; Richard Peterson; Lauren Rendino; Andy Vo; Terry R. Van Vleet (2023). DataSheet2_Preclinical species gene expression database: Development and meta-analysis.xlsx [Dataset]. http://doi.org/10.3389/fgene.2022.1078050.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fgene.2022.1078050.s002
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers Media: http://www.frontiersin.org/
Authors
Caitlin Krause; Kinga Suwada; Eric A. G. Blomme; Kenneth Kowalkowski; Michael J. Liguori; Prathap Kumar Mahalingaiah; Scott Mittelstadt; Richard Peterson; Lauren Rendino; Andy Vo; Terry R. Van Vleet
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The evaluation of toxicity in preclinical species is important for identifying potential safety liabilities of experimental medicines. Toxicology studies provide translational insight into potential adverse clinical findings, but data interpretation may be limited due to our understanding of cross-species biological differences. With the recent technological advances in sequencing and analyzing omics data, gene expression data can be used to predict cross species biological differences and improve experimental design and toxicology data interpretation. However, interpreting the translational significance of toxicogenomics analyses can pose a challenge due to the lack of comprehensive preclinical gene expression datasets. In this work, we performed RNA-sequencing across four preclinical species/strains widely used for safety assessment (CD1 mouse, Sprague Dawley rat, Beagle dog, and Cynomolgus monkey) in ∼50 relevant tissues/organs to establish a comprehensive preclinical gene expression body atlas for both males and females. In addition, we performed a meta-analysis across the large dataset to highlight species and tissue differences that may be relevant for drug safety analyses. Further, we made these databases available to the scientific community. This multi-species, tissue-, and sex-specific transcriptomic database should serve as a valuable resource to enable informed safety decision-making not only during drug development, but also in a variety of disciplines that use these preclinical species.
HUDSEN Human Gene Expression Spatial Database
rrid.site
Updated Jan 29, 2022
Cite
(2022). HUDSEN Human Gene Expression Spatial Database [Dataset]. http://identifiers.org/RRID:SCR_006325
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006325
Dataset updated
Jan 29, 2022
Description
Database of a set of standard 3D virtual models at different stages of development from Carnegie Stages (CS) 12-23 (approximately 26-56 days post conception) in which various anatomical regions have been defined with a set of anatomical terms at various stages of development (known as an ontology). Experimental data is captured and converted to digital format and then mapped to the appropriate 3D model. The ontology is used to define sites of gene expression using a set of standard descriptions and to link the expression data to an "anatomical tree". Human data from stages CS12 to CS23 can be submitted to the HUDSEN Gene Expression Database. The anatomy ontology currently being used is based on the Edinburgh Human Developmental Anatomy Database, which encompasses all developing structures from CS1 to CS20 but is not detailed for developing brain structures. The ontology is being extended and refined (by Prof Luis Puelles, University of Murcia, Spain) and will be incorporated into the HUDSEN database as it is developed. Expression data is annotated using two methods to denote sites of expression in the embryo: spatial annotation and text annotation. Additionally, many aspects of the detection reagent and specimen are also annotated during this process (assignment of IDs, nucleotide sequences for probes, etc.). There are currently two main ways to search HUDSEN: using a gene/protein name or a named anatomical structure as the query term. The entire contents of the database can be browsed using the data browser. Results may be saved. The data in HUDSEN is generated both by researchers within the HUDSEN project and by the wider scientific community.
The HUDSEN human gene expression spatial database is a collaboration between the Institute of Human Genetics in Newcastle, UK, and the MRC Human Genetics Unit in Edinburgh, UK, and was developed as part of the Electronic Atlas of the Developing Human Brain (EADHB) project (funded by the NIH Human Brain Project). The database is based on the Edinburgh Mouse Atlas gene expression database (EMAGE), and is designed to be an openly available resource to the research community holding gene expression patterns during early human development.
Data, R code and output Seurat Objects for single cell RNA-seq analysis of...
figshare.com
application/gzip
Updated May 31, 2023
Cite
Yunshun Chen; Gordon Smyth (2023). Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues [Dataset]. http://doi.org/10.6084/m9.figshare.17058077.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.17058077.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Yunshun Chen; Gordon Smyth
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.
R data set: The Cancer Genome Atlas Gene Expression data
zenodo.org
bin
Updated Jan 24, 2020
Cite
Christian Diener (2020). R data set: The Cancer Genome Atlas Gene Expression data [Dataset]. http://doi.org/10.5281/zenodo.61982
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.61982
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Christian Diener
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This compound data set comprises the following information from The Cancer Genome Atlas:

RNA-Seq counts for 60483 genes across 11093 samples

HuEx 1.0 ST gene expression data for 18632 genes across 1211 samples

clinical indicators for 11160 patients

All gene expression data is annotated across ENSEMBL, ENTREZ and symbols. Samples are annotated by TCGA barcodes.

To read the data set into R (requires 6 GB of RAM) use:

tcga <- readRDS("tcga.rds")
Breast Cancer Gene Expression Dataset
kaggle.com
mubashirali.vercel.app
zip
Updated Jan 23, 2026
Cite
Mubashir Ali (2026). Breast Cancer Gene Expression Dataset [Dataset]. https://www.kaggle.com/datasets/mubashir1837/breast-cancer-gene-expression-dataset
Explore at:
Available download formats: zip (1911603 bytes)
Dataset updated
Jan 23, 2026
Authors
Mubashir Ali
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Breast Cancer Gene Expression Dataset

This dataset contains RNA-seq gene expression data from 58 breast cancer patients treated with neoadjuvant chemotherapy (NAC). The data is derived from GSE280902 on NCBI GEO.

Files

cleaned_expression.csv: Gene expression matrix with 58 samples (rows) and 28,278 genes (columns). The last column is 'Response' (1 for responder, 0 for non-responder).

labels.csv: Sample labels with response to NAC.

Data Description

Samples: 58 breast cancer patients (29 responders, 29 non-responders to NAC).

Genes: 28,278 protein-coding genes.

Response: 1 = Pathological Complete Response (pCR), 0 = No Response.
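A file laid out as described (one row per sample, gene columns, and a final 'Response' column) could be split into features and labels along the following lines. This is a sketch under assumptions: it presumes a header row, purely numeric gene columns and no separate sample-ID column, none of which the listing confirms; the function name and the tiny in-memory table are invented for illustration.

```python
import csv
import io


def load_expression(handle, label_column="Response"):
    """Split an expression CSV into gene names, feature rows and 0/1 labels.

    Assumes a header row, numeric gene columns, and the label stored in
    label_column, as the file description suggests.
    """
    reader = csv.reader(handle)
    header = next(reader)
    label_idx = header.index(label_column)
    genes = [name for i, name in enumerate(header) if i != label_idx]
    features, labels = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        labels.append(int(float(row[label_idx])))
        features.append([float(v) for i, v in enumerate(row) if i != label_idx])
    return genes, features, labels


# Tiny invented stand-in for cleaned_expression.csv (the real file has 58 rows).
sample = io.StringIO("GENE1,GENE2,Response\n1.5,0.0,1\n2.0,3.5,0\n")
genes, features, labels = load_expression(sample)
```

For the real file, the handle would come from `open("cleaned_expression.csv", newline="")`. Going by the description, one would expect 58 feature rows and an even split of 29 responders and 29 non-responders in `labels`.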

Source

GEO Accession: GSE280902

Paper: Guevara-Nieto HM et al. Identification of predictive pretreatment biomarkers for neoadjuvant chemotherapy response in Latino invasive breast cancer patients. Mol Med 2025.

GitHub Repository: Breast Cancer Gene Expression Processed Data

Usage

This dataset can be used for machine learning models to predict NAC response in breast cancer based on gene expression profiles.

License

This project is licensed under the MIT License - see the LICENSE file for details.
Data from: Gene Expression Omnibus (GEO)
catalog.data.gov
data.virginia.gov
+2more
Updated Jul 26, 2023
Cite
National Institutes of Health (NIH) (2023). Gene Expression Omnibus (GEO) [Dataset]. https://catalog.data.gov/dataset/gene-expression-omnibus-geo
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
National Institutes of Health (NIH)
Description
Gene Expression Omnibus is a public functional genomics data repository supporting MIAME-compliant submissions of array- and sequence-based data. Tools are provided to help users query and download experiments and curated gene expression profiles.
Comparative gene expression analysis in the Arabidopsis thaliana root apex...
catalog.data.gov
datasets.ai
Updated Apr 24, 2025
+ more versions
Cite
National Aeronautics and Space Administration (2025). Comparative gene expression analysis in the Arabidopsis thaliana root apex using RNA-seq and microarray transcriptome profiles [Dataset]. https://catalog.data.gov/dataset/comparative-gene-expression-analysis-in-the-arabidopsis-thaliana-root-apex-using-rna-seq-a-b73a6
Explore at:
Dataset updated
Apr 24, 2025
Dataset provided by
NASA: http://nasa.gov/
Description
The root apex is an important section of the plant root involved in environmental sensing and cellular development. Analyzing the gene profile of the root apex in diverse environments is important and challenging, especially when the samples are limiting and precious, such as in spaceflight. The feasibility of using tiny root sections for transcriptome analysis was examined in this study. To understand the gene expression profiles of the root apex, Arabidopsis thaliana Col-0 roots were sectioned into Zone-I (0.5 mm; root cap and meristematic zone) and Zone-II (1.5 mm; transition, elongation and growth-terminating zone). Gene expression was analyzed using microarrays and RNA-Seq. Both techniques identified 4180 common genes as differentially expressed (with > two-fold changes) between the zones. In addition, 771 unique genes and 19 novel transcriptionally active regions (TARs) that were not detected in the arrays were identified by RNA-Seq as differentially expressed. Single root tip zones can be used for full transcriptome analysis; further, the root apex zones are functionally very distinct from each other. RNA-Seq provided novel information about the transcripts compared to the arrays. These data will help optimize transcriptome techniques for dealing with small, rare samples.
Gene Expression Data
kaggle.com
zip
Updated May 3, 2021
Cite
Salehin (2021). Gene Expression Data [Dataset]. https://www.kaggle.com/salehin321/gene-expression-data
Explore at:
Available download formats: zip (6624691 bytes)
Dataset updated
May 3, 2021
Authors
Salehin
Description
Context

Gene expression datasets are used in biomedical engineering and survival analysis. This dataset was collected from The Cancer Genome Atlas portal for my master's project.

Content

The dataset contains cell counts for each gene for each patient. There are 4571 columns (the features), and each row represents a sample (patient).

Acknowledgements

TCGA portal

Inspiration

Can we predict the survival of the patients using these gene expression data?
RNA-sequencing data from: The AML cellular state space unveils NPM1 immune...
researchdata.se
figshare.scilifelab.se
Updated Oct 7, 2025
Cite
Henrik Lilljebjörn; Thoas Fioretos (2025). RNA-sequencing data from: The AML cellular state space unveils NPM1 immune evasion subtypes with distinct clinical outcomes, and: The complement receptor C3AR constitutes a novel therapeutic target in NPM1-mutated AML [Dataset]. http://doi.org/10.17044/SCILIFELAB.21557163
Explore at:
Unique identifier
https://doi.org/10.17044/SCILIFELAB.21557163
Dataset updated
Oct 7, 2025
Dataset provided by
Lund University
Authors
Henrik Lilljebjörn; Thoas Fioretos
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset contains bulk RNA-sequencing (RNA-seq) gene expression data from 120 AML samples from the subtypes NPM1 (n=33), AML-MR (n=30), TP53 (n=18), PML::RARA (n=8), CBFB::MYH11 (n=8), AML without class defining mutations (n=8), RUNX1::RUNX1T1 (n=3), KMT2A fusion genes (n=3), AML meeting the criteria for two subtypes (n=2), DEK-NUP214 (n=2), GATA2::MECOM (n=1), and biallelic CEBPA mutation (n=1). The single cell libraries were constructed from bone marrow (n=102) or peripheral blood (n=18) using the TruSeq RNA Library Prep Kit v2 (Illumina) and sequenced on a NextSeq 500. Reads were aligned against human reference genome hg19 and read counts were determined using RSEM v1.2.30 (https://github.com/deweylab/RSEM) with GENCODE v19 as gene reference. Data is available as FPKM values as determined by RSEM. Raw sequencing reads (FASTQ) are available at the European Genome-Phenome Archive (EGA) under accession ID EGAD50000001576: https://ega-archive.org/datasets/EGAD50000001576.
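As background on the FPKM unit mentioned above, the textbook definition (fragments per kilobase of transcript per million mapped fragments) can be written as a short function. This is only an illustration of the unit: RSEM derives its values from estimated fragment counts and effective transcript lengths, so the numbers in this dataset will not be reproduced by this simplified formula. The function name and the example figures are invented.

```python
def fpkm(fragment_count, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments.

    Simplified illustration; RSEM itself uses estimated counts and
    effective lengths rather than raw counts and annotated lengths.
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (length_bp * total_mapped_fragments)


# Invented example: 500 fragments on a 2 kb gene in a 20-million-fragment library.
example = fpkm(500, 2000, 20_000_000)
```

Because the value is normalized by both gene length and library size, FPKM values are more comparable across genes and samples than raw counts are.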

258 scholarly articles cite the Bgee gene expression data dataset (per Google Scholar).
