5 datasets found
  1. Data from: Linking regulatory variants to target genes by integrating...

    • zenodo.org
    application/gzip
    Updated May 23, 2024
    Cite
    Elizabeth Dorans (2024). Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance [Dataset]. http://doi.org/10.5281/zenodo.11211926
    Available download formats
    application/gzip
    Dataset updated
    May 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Elizabeth Dorans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SNP-gene link predictions associated with our paper titled "Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance," generated by pgBoost and constituent methods SCENT (Sakaue et al. 2024 Nat Genet), Signac (Stuart et al. 2021 Nat Methods), ArchR (Granja et al. 2021 Nat Genet), and Cicero (Pliner et al. 2018 Mol Cell).

    pgBoost_scores.tsv.gz contains linking predictions made by pgBoost.

    constituent_method_scores.tsv.gz contains linking predictions made by constituent methods.

    Linking scores and percentiles are reported for each method (pgBoost score, SCENT FDR, Signac correlation, ArchR correlation, Cicero co-accessibility). Rank percentiles are computed as: 1 - (rank / n). When multiple links receive the same score, they are assigned the percentile of the top rank. Links unscored by each method (denoted by zeros* in the linking score column) are assigned a percentile equivalent to the percent of links unscored by the focal method. See the Methods section of the paper for further details on computing linking scores and summarizing scores across cell types and data sets.

    *Candidate links tested and assigned a co-accessibility of zero by the Cicero method are given a score of 1e-100 in the "Cicero" column to distinguish between unscored candidate links and candidate links assigned a co-accessibility of zero (see Pliner et al. 2018 Mol Cell).
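
    The percentile rule described above can be sketched in base R. This is only an illustration, not the authors' code: it assumes that higher scores are better, that rank 1 is the best-scoring link, and that n counts all candidate links, scored or not. The function name rank_percentile and the toy scores are hypothetical.

```r
# Sketch only: assumes higher scores are better, rank 1 is the best link,
# and n is the total number of candidate links (scored or unscored).
rank_percentile <- function(score) {
  n <- length(score)
  unscored <- score == 0
  pct <- numeric(n)
  # Tied links share the best (smallest) rank among them.
  ranks <- rank(-score[!unscored], ties.method = "min")
  pct[!unscored] <- 1 - ranks / n
  # Unscored links receive the fraction of links left unscored by the method.
  pct[unscored] <- sum(unscored) / n
  pct
}

# Toy scores: one unscored link (0) and one Cicero-style tested zero (1e-100).
scores <- c(0.9, 0.5, 0.5, 0, 1e-100)
pct <- rank_percentile(scores)
print(pct)
```

    Under these assumptions, the 1e-100 placeholder keeps a tested link in the ranking, whereas a literal zero sends it to the unscored branch.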

  2. Data from: An atlas of transcribed enhancers across helper T cell diversity...

    • data-staging.niaid.nih.gov
    zip
    Updated Apr 22, 2024
    Cite
    Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto (2024). An atlas of transcribed enhancers across helper T cell diversity for decoding human diseases [Dataset]. http://doi.org/10.5061/dryad.pk0p2ngwx
    Available download formats
    zip
    Dataset updated
    Apr 22, 2024
    Dataset provided by
    RIKEN Center for Integrative Medical Sciences
    Authors
    Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Transcribed enhancer maps can reveal nuclear interactions underpinning each cell type and connect specific cell types to diseases. Using a 5′ single-cell RNA sequencing approach, we defined transcription start sites of enhancer RNAs and other classes of coding and non-coding RNAs in human CD4+ T cells, revealing cellular heterogeneity and differentiation trajectories. Integration of these datasets with single-cell chromatin profiles showed that active enhancers with bidirectional RNA transcription are highly cell type–specific, and disease heritability is strongly enriched in these enhancers. The resulting cell type–resolved multimodal atlas of bidirectionally transcribed enhancers, which we linked with promoters using fine-scale chromatin contact maps, enabled us to systematically interpret genetic variants associated with a range of immune-mediated diseases.

    Methods

    All experiments using human samples were approved by the ethical review committee of RIKEN [approval no. H30-9(13)]. Written informed consent was obtained from all donors. CD4+ T cells were isolated by the immunomagnetic negative selection method. Stained CD4+ T cells were sorted using a FACSAria IIu Cell Sorter (BD Biosciences). Human CD4+ T cells and FACS-sorted heterogenous populations were processed with a Chromium Next GEM Single Cell 5′ kit (10x Genomics). Libraries were sequenced on an Illumina NovaSeq 6000 sequencing platform using 2 × 150 bp paired-end sequencing. Multiome assay (10x Genomics) was performed according to the manufacturer’s instructions. Multiome libraries were pooled and sequenced as above with 10 cycles for i7 index and 24 cycles for i5 index. Micro-C libraries were generated using a Dovetail Micro-C Kit (Cantata Bio, Cat#21006) and were sequenced on an Illumina NovaSeq 6000 platform using 2 × 150 bp paired-end sequencing.

    Chromium scRNA-seq, snRNA-seq, and CITE-seq data were processed using Cell Ranger Software version 5.0.1 (10x Genomics) and R package Seurat version 5 (4.9.9.9067). Multiome data were processed by Cell Ranger ARC version 2.0.0 (10x Genomics), Seurat version 5 (4.9.9.9067), and Signac version 1.10.0. scRNA-seq, snRNA-seq, and Multiome 3′ snRNA-seq data were integrated using canonical correlation analysis. snATAC peaks were identified from fragment files of each cluster using MACS2 version 2.2.6 with default settings as implemented in Signac version 1.10.0.

    For ReapTEC, paired-end reads were mapped again using STAR (STARsolo) to obtain reads with unencoded G, which was tagged as a soft-clipped G by STARsolo. Reads were deduplicated, and those with the barcodes of each cell type were extracted. A count file was generated for each transcription start site (TSS) using the “bamToBed” function in BEDTools version 2.30.0. TSS peaks were generated by merging TSSs located within 10 bp of each other. To identify btcEnhs, TSS peak pairs were detected using scripts provided at https://github.com/anderssonrobin/enhancers/blob/master/scripts/bidir_enhancers with minor modifications.

    Micro-C data were processed with the dovetail_tools pipeline (Cantata Bio). Chromatin loop contacts were identified by the HiCCUPS algorithm using the Juicer Tools package version 2.20.0 and the scale-space representation algorithm using the Mustache package. Loops were called at a 1-kb resolution with SCALE-normalized contact matrices for HiCCUPS and with ICE-normalized contact matrices for Mustache, and were filtered for an FDR < 0.05.
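
    The TSS-merging step in the Methods ("merging TSSs located within 10 bp of each other") can be illustrated with a small base R sketch. It is not the authors' pipeline: it assumes all positions lie on one chromosome and strand, and it reads "within 10 bp" as a gap of at most 10 bp between consecutive sorted TSSs. The function name merge_tss and the toy coordinates are hypothetical.

```r
# Sketch only: pos holds TSS coordinates on a single chromosome and strand.
merge_tss <- function(pos, max_gap = 10) {
  if (length(pos) == 0) {
    return(data.frame(start = numeric(0), end = numeric(0), n_tss = integer(0)))
  }
  pos <- sort(unique(pos))
  # Start a new peak whenever the gap to the previous TSS exceeds max_gap.
  peak_id <- cumsum(c(TRUE, diff(pos) > max_gap))
  data.frame(
    start = as.vector(tapply(pos, peak_id, min)),
    end   = as.vector(tapply(pos, peak_id, max)),
    n_tss = as.vector(tapply(pos, peak_id, length))
  )
}

# Toy coordinates: three clusters of nearby TSSs.
peaks <- merge_tss(c(100, 105, 112, 200, 205, 400))
print(peaks)
```

    With the toy input, the sketch yields three peaks spanning 100 to 112, 200 to 205, and 400 to 400.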

  3. Data from: Single-cell analyses of axolotl forebrain organization,...

    • zenodo.org
    bin, csv
    Updated Mar 28, 2022
    Cite
    Katharina Lust; Ashley Maynard; Tomás Gomes; Jonas Simon Fleck; J. Gray Camp; Elly M. Tanaka; Barbara Treutlein (2022). Single-cell analyses of axolotl forebrain organization, neurogenesis, and regeneration [Dataset]. http://doi.org/10.5281/zenodo.6390083
    Available download formats
    bin, csv
    Dataset updated
    Mar 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katharina Lust; Ashley Maynard; Tomás Gomes; Jonas Simon Fleck; J. Gray Camp; Elly M. Tanaka; Barbara Treutlein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preprint: https://doi.org/10.1101/2022.03.21.485045

    Abstract:

    Salamanders are important tetrapod models to study brain organization and regeneration, however the identity and evolutionary conservation of brain cell types is largely unknown. Here, we delineate cell populations in the axolotl telencephalon during homeostasis and regeneration, representing the first single-cell genomic and spatial profiling of an anamniote tetrapod brain. We identify glutamatergic neurons with similarities to amniote neurons of hippocampus, dorsal and lateral cortex, and conserved GABAergic neuron classes. We infer transcriptional dynamics and gene regulatory relationships of postembryonic, region-specific direct and indirect neurogenesis, and unravel conserved signatures. Following brain injury, ependymoglia activate an injury-specific state before reestablishing lost neuron populations and axonal connections. Together, our analyses yield key insights into the organization, evolution, and regeneration of a tetrapod nervous system.

    File description:

    all_nuclei_clustered_highlevel_anno.rds - Seurat object including all snRNA-seq data from uninjured pallium, both from microdissections and whole pallium multiome.

    pallium_metadata_simp.csv - csv file containing a simplified version of the metadata for the uninjured pallium

    Edu_1_2_4_6_8_12_fil_highvarfeat.rds - Seurat object containing all Div-seq data for the pallium injury time course

    divseq_predicted_metadata.csv - csv file containing a simplified version of the metadata for the pallium injury time course

    ep_wpi_srat.rds - Seurat object containing an integrated version of ependymoglia cells from uninjured and injured pallium (see Fig 6 in the preprint).

    D1_113_sub_b.rds - Seurat object containing Visium data for the axolotl pallium

    multiome_integATAC_SCT.rds - Signac object containing the data used for multiome analysis of the uninjured whole pallium

    predictions_cell2loc.csv - csv file containing cell2location scores for the uninjured pallium cell types in the Visium dataset

  4. Data from: Phospho-seq: Integrated, multi-modal profiling of intracellular...

    • zenodo.org
    application/gzip, bin
    Updated Apr 17, 2023
    Cite
    John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija (2023). Phospho-seq: Integrated, multi-modal profiling of intracellular protein dynamics in single cells [Dataset]. http://doi.org/10.5281/zenodo.7754315
    Available download formats
    application/gzip, bin
    Dataset updated
    Apr 17, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets to go along with the publication listed:

    full_object.rds: Brain Organoid Phospho-Seq dataset with ATAC, Protein and imputed RNA data

    rna_object.rds: Reference whole cell scRNA-Seq object on Brain organoids

    multiome_object.rds: Bridge dataset containing RNA and ATAC modalities for Brain organoids

    metacell_allnorm.rds: Metacell object for finding gene-peak-protein linkages in Brain organoid dataset

    fullobject_fragments.tsv.gz: fragment file to go with the full object

    fullobject_fragments.tsv.gz.tbi: index file for the full object fragment file

    multiome_fragments.tsv.gz: fragment file to go with the multiome object

    multiome_fragments.tsv.gz.tbi: index file for the multiome object fragment file

    K562_Stem.rds: object corresponding to the pilot experiment including K562 cells and iPS cells

    K562_stem_fragments.tsv.gz: fragment file to go with the K562_stem object

    K562_stem_fragments.tsv.gz.tbi: index file for the K562_stem object fragment file

    To use the K562 and multiome datasets provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:

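    # Load Signac (Fragments, CreateFragmentObject) and Seurat (Cells)
    library(Signac)
    library(Seurat)
    # Load the object ("obj.rds" stands for the downloaded .rds file)
    obj <- readRDS("obj.rds")
    # Remove the stored fragment file information
    Fragments(obj) <- NULL
    # Point the object at the downloaded fragment file
    Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))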

    To use the "fullobject" dataset provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:

    # Load Signac, Seurat, and stringr (for str_remove)
    library(Signac)
    library(Seurat)
    library(stringr)
    # Load the object ("obj.rds" stands for the downloaded .rds file)
    obj <- readRDS("obj.rds")
    # Remove the stored fragment file information
    Fragments(obj) <- NULL
    # Remove unwanted residual reductions and strip the "atac_" prefix from cell names
    obj@reductions$norm.adt.pca <- NULL
    obj@reductions$norm.pca <- NULL
    obj <- RenameCells(obj, new.names = str_remove(Cells(obj), "atac_"))
    # Point the object at the downloaded fragment file
    Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))

  5. pHGGmap

    • doi.org
    • zenodo.org
    application/gzip, bin +1
    Updated Jan 30, 2026
    Cite
    Cristian Ruiz-Moreno (2026). pHGGmap [Dataset]. http://doi.org/10.5281/zenodo.17063631
    Available download formats
    bin, png, application/gzip
    Dataset updated
    Jan 30, 2026
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cristian Ruiz-Moreno
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    pHGGmap is a large-scale multimodal map of pediatric-type diffuse high-grade glioma (pHGG) integrating single-cell/single-nucleus RNA sequencing (sc/snRNA-seq), single-nucleus ATAC-seq (snATAC-seq), and single-nucleus multiome (snMultiome). The resource comprises over 800,000 cells from 136 patients and captures malignant, immune, vascular, and neural/glial compartments across disease subtypes and clinical contexts. Our resource resolves in great detail malignant cell hierarchies and myeloid programs that co-structure the tumor microenvironment, and captures conserved multicellular communities that persist across clinical contexts.

    This Zenodo record provides processed gene expression (GEX) and chromatin accessibility (ATAC) data for both the discovery cohort (newly generated samples together with Jessa et al., 2022, and Liu et al., 2022) and a large validation cohort (Filbin et al., 2018; Liu et al., 2022; DeSisto et al., 2024; Sussman et al., 2024; LaBelle et al., 2025) after reference mapping onto the discovery cohort. snMultiome profiles are represented in the GEX and ATAC modalities, while preserving identical cell barcodes to enable direct matching of the assays measured in the same cell. All objects contain quality-controlled cells, harmonized annotations, and clinical metadata.
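
    Because snMultiome cells keep identical barcodes in the GEX and ATAC modalities, the two can in principle be paired by a join on the barcode. The base R sketch below uses toy per-cell metadata tables. The column names (barcode, cell_type, n_fragments) and all values are hypothetical and are not taken from the pHGGmap files.

```r
# Sketch only: toy per-cell metadata tables with hypothetical columns.
gex_meta <- data.frame(
  barcode = c("AAAC-1", "AAAG-1", "AACT-1"),
  cell_type = c("malignant", "myeloid", "vascular")
)
atac_meta <- data.frame(
  barcode = c("AACT-1", "AAAC-1", "TTTG-1"),
  n_fragments = c(5200, 8100, 3000)
)

# Barcodes found in both tables identify cells measured in both assays.
shared <- intersect(gex_meta$barcode, atac_meta$barcode)

# An inner join on barcode keeps one row per cell present in both tables.
paired <- merge(gex_meta, atac_meta, by = "barcode")
print(paired)
```

    Cells found in only one table (here "AAAG-1" and "TTTG-1") drop out of the join, which is the expected outcome for cells profiled with a single assay.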

    Data are shared in multiple formats to support a broad range of users: interactive Loupe Browser files for rapid exploration; Seurat/Signac R objects and AnnData (.h5ad) objects for computational analysis; filtered raw count matrices; per-cell metadata tables; and ATAC fragment files with tabix indices for reprocessing and visualization. Together, these files enable reproducible interrogation of cancer cell programs, myeloid immunomodulatory states, epigenetic regulation, and multicellular communities described in the pHGGmap study.

    If you use these data, please cite:

    Ruiz-Moreno C., Collot R., et al. Cancer-myeloid cell invasive program in pediatric-type diffuse high-grade glioma. bioRxiv (2026)

    (When the peer-reviewed version is available, please cite that version instead.)