Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Single-Cell Alzheimer's Disease Data Portal is an aggregated data portal created as part of the Enfield EU-funded program for research on the single-cell Generative Pretrained Transformer (scGPT-AD) model. The portal contains data from the ssREAD data portal, along with single-cell AD data from recent studies (dharsini et al, pan et al, rexach et al). The data from the individual studies were accessed through the cellXgene data portal, a large portal for single-cell data. The data have been uploaded as two separate .zip files (part1, part2).
The single-cell data follow the Annotated Data (AnnData) format. The core data for each sample is the gene-expression matrix, which records the expression level of each gene in each cell. Additionally, the dataset contains the `.obs` attribute, which holds core cell metadata for each sample (cell type, brain region, Braak stage, donor age, disease condition, donor gender, etc.), along with the gene names, accessed via the `.var` attribute.
The source data have been processed to create a unified data portal ready to be used as a training dataset for a Transformer model. The main processing steps were:
| Total Cells | AD Cells | Control Cells | Unique Genes | Donors |
| --- | --- | --- | --- | --- |
| 2.3M | 1.2M | 1.1M | 91k | 166 |

| Data Source | Unique Genes | Total Cells | AD Cells | Control Cells | Donors | Cell Type Label | Brain Region | Tissue Type | Braak Stage | Donor ID | Donor Gender | Donor Age |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| rexach et al | 30k | 217k | 118k | 99k | 20 | ✅ | ✘ | ✅ | ✘ | ✅ | ✅ | ✅ |
| pan et al | 61k | 43k | 11k | 32k | 7 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| dharsini et al | 61k | 425k | 311k | 114k | 46 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ssREAD | 62k | 2.42M | 1.14M | 1.28M | 135 | ✅ | ✅ | ✘ | ✅ | ✅ | ✅ | ✅ |
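As a quick sanity check on the per-source figures, the sketch below verifies that AD cells plus control cells equals the total for each source, and sums the per-source totals. The counts are copied from the table in thousands of cells. The helper name `is_consistent` and its tolerance are illustrative choices, not part of the portal.

```python
# Cell counts in thousands, copied from the per-source table.
sources = {
    "rexach et al": {"total": 217, "ad": 118, "control": 99},
    "pan et al": {"total": 43, "ad": 11, "control": 32},
    "dharsini et al": {"total": 425, "ad": 311, "control": 114},
    "ssREAD": {"total": 2420, "ad": 1140, "control": 1280},
}


def is_consistent(row, tolerance=1):
    """Return True if AD + control matches the total within `tolerance` (in thousands).

    The tolerance is an assumed allowance for rounding in the published figures.
    """
    return abs(row["ad"] + row["control"] - row["total"]) <= tolerance


# One boolean per data source.
checks = {name: is_consistent(row) for name, row in sources.items()}

# Sum of the per-source totals, in thousands of cells.
summed_total = sum(row["total"] for row in sources.values())

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
print(f"sum of per-source totals: {summed_total}k")
```

Every row is internally consistent. The per-source totals sum to 3105k cells, which is more than the 2.3M reported in the summary. The description does not say how the two figures relate.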
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains pre-filtered per-tissue files of Tabula sapiens used to generate the scvi models stored in scvi-hub. The data were filtered because of inconsistencies in cell-type resolution across donors. For the trained scvi models, please refer to the pre-processed files (adata objects), which contain the gene-filtered and minified data for the models.
The data were downloaded from https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5 and then preprocessed. Please refer to their data usage guide before reusing the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset published by Kuppe et al. (2022) on human cardiac remodelling after myocardial infarction. The study combines single-cell gene expression, chromatin accessibility and spatial transcriptomic profiling of multiple physiological zones of the myocardium from patients with myocardial infarction and from controls.
The dataset available under this link contains the single-cell gene expression data from all control patients.
Citation: Kuppe, C., Ramirez Flores, R.O., Li, Z. et al. Spatial multi-omic map of human myocardial infarction. Nature 608, 766–777 (2022).
Manuscript link: https://www.nature.com/articles/s41586-022-05060-x
Original data link: https://cellxgene.cziscience.com/collections/8191c283-0816-424b-9b61-c3e1d6258a77
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are pre-trained models and AnnData datasets based on Tabula sapiens. The models were subsequently uploaded to scvi-hub, and this repository exists to restore the models on Hugging Face.
The data were downloaded from https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5 and then preprocessed. Please refer to their data usage guide before reusing the data.