4 datasets found
  1. Data from: Linking regulatory variants to target genes by integrating...

    • zenodo.org
    application/gzip, tsv
    Updated Apr 24, 2025
    Cite
    Elizabeth Dorans (2025). Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance [Dataset]. http://doi.org/10.5281/zenodo.14957607
    Available download formats: tsv, application/gzip
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Elizabeth Dorans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The below data are associated with our paper entitled "Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance."

    1) SNP-gene link predictions generated by pgBoost and existing methods SCENT (Sakaue et al. 2024 Nat Genet), Signac (Stuart et al. 2021 Nat Methods), ArchR (Granja et al. 2021 Nat Genet), and Cicero (Pliner et al. 2018 Mol Cell).

    pgBoost_scores.tsv.gz contains linking predictions made by pgBoost.

    constituent_method_scores.tsv.gz contains linking predictions made by constituent methods.

    **NOTE: promoters (+/- 1kb from TSS) and candidate links >500kb are excluded from linking predictions (see manuscript)**

    Linking scores and percentiles are reported for each method (pgBoost score, SCENT FDR, Signac correlation, ArchR correlation, Cicero co-accessibility). Rank percentiles are computed as: 1 - (rank / n). When multiple links receive the same score, they are assigned the percentile of the top rank. Links unscored by each method (denoted by zeros* in the linking score column) are assigned a percentile equivalent to the percent of links unscored by the focal method. See the Methods section of the paper for further details on computing linking scores and summarizing scores across cell types and data sets.

    *Candidate links tested and assigned a co-accessibility of zero by the Cicero method are given a score of 1e-100 in the "Cicero" column to distinguish between unscored candidate links and candidate links assigned a partial correlation of zero (see Pliner et al. 2018 Mol Cell).
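The percentile rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `rank_percentiles` is hypothetical, higher scores are assumed to be better, rank 1 is assumed to be the best-scoring link, and n is assumed to be the total number of candidate links, including unscored ones. The Cicero sentinel of 1e-100 is treated as a scored link because it is not exactly zero.

```python
def rank_percentiles(scores):
    """Return one percentile per link (assumes higher score = stronger link)."""
    n = len(scores)  # assumption: n counts every candidate link, scored or not
    if n == 0:
        return []

    # A score of exactly 0 marks a link the method did not score.
    scored = [s for s in scores if s != 0]
    unscored_fraction = (n - len(scored)) / n

    # Tied scores share the top (smallest) rank among them.
    top_rank = {}
    for position, value in enumerate(sorted(scored, reverse=True), start=1):
        top_rank.setdefault(value, position)

    # Percentile = 1 - rank / n for scored links; unscored links receive
    # the fraction of links that the method left unscored.
    return [
        unscored_fraction if s == 0 else 1 - top_rank[s] / n
        for s in scores
    ]


# Example: two tied links, one unscored link, one Cicero-style sentinel.
example = rank_percentiles([0.9, 0.5, 0.5, 0.0, 1e-100])
```

Under these assumptions the two tied links receive the same percentile, and the sentinel value is ranked last among scored links rather than being treated as unscored. For a method where lower values are better (such as an FDR), the scores would need to be inverted first; see the paper's Methods section for the actual procedure.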

    NOTE: The predictions associated with this release (version 2) were generated using an expanded set of data sets, an expanded training set, and corrected TSS coordinates.

    2) GWAS-derived SNP-gene link evaluation set.

    gwas_evaluation.tsv: GWAS-derived SNP-gene link evaluation set. Column 1 provides SNP coordinates in the format

  2. Data from: An atlas of transcribed enhancers across helper T cell diversity...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Apr 22, 2024
    Cite
    Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto (2024). An atlas of transcribed enhancers across helper T cell diversity for decoding human diseases [Dataset]. http://doi.org/10.5061/dryad.pk0p2ngwx
    Available download formats: zip
    Dataset updated
    Apr 22, 2024
    Dataset provided by
    RIKEN Center for Integrative Medical Sciences
    Authors
    Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Transcribed enhancer maps can reveal nuclear interactions underpinning each cell type and connect specific cell types to diseases. Using a 5′ single-cell RNA sequencing approach, we defined transcription start sites of enhancer RNAs and other classes of coding and non-coding RNAs in human CD4+ T cells, revealing cellular heterogeneity and differentiation trajectories. Integration of these datasets with single-cell chromatin profiles showed that active enhancers with bidirectional RNA transcription are highly cell type–specific, and disease heritability is strongly enriched in these enhancers. The resulting cell type–resolved multimodal atlas of bidirectionally transcribed enhancers, which we linked with promoters using fine-scale chromatin contact maps, enabled us to systematically interpret genetic variants associated with a range of immune-mediated diseases.

    Methods

    All experiments using human samples were approved by the ethical review committee of RIKEN [approval no. H30-9(13)]. Written informed consent was obtained from all donors. CD4+ T cells were isolated by the immunomagnetic negative selection method. Stained CD4+ T cells were sorted using a FACSAria IIu Cell Sorter (BD Biosciences). Human CD4+ T cells and FACS-sorted heterogeneous populations were processed with a Chromium Next GEM Single Cell 5′ kit (10x Genomics). Libraries were sequenced on an Illumina NovaSeq 6000 sequencing platform using 2 × 150 bp paired-end sequencing. Multiome assay (10x Genomics) was performed according to the manufacturer’s instructions. Multiome libraries were pooled and sequenced as above with 10 cycles for i7 index and 24 cycles for i5 index. Micro-C libraries were generated using a Dovetail Micro-C Kit (Cantata Bio, Cat#21006) and were sequenced on an Illumina NovaSeq 6000 platform using 2 × 150 bp paired-end sequencing.

    Chromium scRNA-seq, snRNA-seq, and CITE-seq data were processed using Cell Ranger Software version 5.0.1 (10x Genomics) and R package Seurat version 5 (4.9.9.9067). Multiome data were processed by Cell Ranger ARC version 2.0.0 (10x Genomics), Seurat version 5 (4.9.9.9067), and Signac version 1.10.0. scRNA-seq, snRNA-seq, and Multiome 3′ snRNA-seq data were integrated using canonical correlation analysis. snATAC peaks were identified from fragment files of each cluster using MACS2 version 2.2.6 with default settings as implemented in Signac version 1.10.0. For ReapTEC, paired-end reads were mapped again using STAR (STARsolo) to obtain reads with unencoded G, which was tagged as a soft-clipped G by STARsolo. Reads were deduplicated, and those with the barcodes of each cell type were extracted. A count file was generated for each transcription start site (TSS) using the “bamToBed” function in BEDTools version 2.30.0. TSS peaks were generated by merging TSSs located within 10 bp of each other. To identify btcEnhs, TSS peak pairs were detected using scripts provided at https://github.com/anderssonrobin/enhancers/blob/master/scripts/bidir_enhancers with minor modifications. Micro-C data were processed with the dovetail_tools pipeline (Cantata Bio). Chromatin loop contacts were identified by the HiCCUPS algorithm using the Juicer Tools package version 2.20.0 and the scale-space representation algorithm using the Mustache package. Loops were called at a 1-kb resolution with SCALE-normalized contact matrices for HiCCUPS and with ICE-normalized contact matrices for Mustache, and were filtered for an FDR < 0.05.
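The TSS-peak step in the Methods above (merging TSSs located within 10 bp of each other) can be illustrated with a short sketch. It is not the authors' pipeline: the function name `merge_tss` is hypothetical, the input is assumed to be TSS positions from a single chromosome and strand, and "within 10 bp" is assumed to mean that consecutive sorted positions are chained together whenever the gap between them is at most 10 bp.

```python
def merge_tss(positions, max_gap=10):
    """Chain sorted TSS positions into (start, end) peaks.

    Assumption: a position joins the current peak when it lies at most
    `max_gap` bp from the last position already in that peak.
    """
    peaks = []
    for pos in sorted(positions):
        if peaks and pos - peaks[-1][1] <= max_gap:
            peaks[-1][1] = pos          # extend the current peak
        else:
            peaks.append([pos, pos])    # start a new peak
    return [tuple(peak) for peak in peaks]


# Example: the first three sites chain into one peak; the rest stay separate.
tss_peaks = merge_tss([100, 105, 114, 130, 200, 141])
```

Because merging is chained, a peak can span more than 10 bp in total as long as each neighbouring pair of TSSs is close enough. Whether the original analysis chained sites this way is not stated in the description.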

  3. Data from: Single-cell analyses of axolotl forebrain organization,...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Mar 28, 2022
    Cite
    Barbara Treutlein (2022). Single-cell analyses of axolotl forebrain organization, neurogenesis, and regeneration [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6390082
    Dataset updated
    Mar 28, 2022
    Dataset provided by
    J. Gray Camp
    Ashley Maynard
    Katharina Lust
    Elly M. Tanaka
    Barbara Treutlein
    Jonas Simon Fleck
    Tomás Gomes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preprint: https://doi.org/10.1101/2022.03.21.485045

    Abstract:

    Salamanders are important tetrapod models to study brain organization and regeneration, however the identity and evolutionary conservation of brain cell types is largely unknown. Here, we delineate cell populations in the axolotl telencephalon during homeostasis and regeneration, representing the first single-cell genomic and spatial profiling of an anamniote tetrapod brain. We identify glutamatergic neurons with similarities to amniote neurons of hippocampus, dorsal and lateral cortex, and conserved GABAergic neuron classes. We infer transcriptional dynamics and gene regulatory relationships of postembryonic, region-specific direct and indirect neurogenesis, and unravel conserved signatures. Following brain injury, ependymoglia activate an injury-specific state before reestablishing lost neuron populations and axonal connections. Together, our analyses yield key insights into the organization, evolution, and regeneration of a tetrapod nervous system.

    File description:

    all_nuclei_clustered_highlevel_anno.rds - Seurat object including all snRNA-seq data from uninjured pallium, both from microdissections and whole pallium multiome.

    pallium_metadata_simp.csv - csv file containing a simplified version of the metadata for the uninjured pallium

    Edu_1_2_4_6_8_12_fil_highvarfeat.rds - Seurat object containing all Div-seq data for the pallium injury time course

    divseq_predicted_metadata.csv - csv file containing a simplified version of the metadata for the pallium injury time course

    ep_wpi_srat.rds - Seurat object containing an integrated version of ependymoglia cells from uninjured and injured pallium (see Fig 6 in the preprint).

    D1_113_sub_b.rds - Seurat object containing Visium data for the axolotl pallium

    multiome_integATAC_SCT.rds - Signac object containing the data used for multiome analysis of the uninjured whole pallium

    predictions_cell2loc.csv - csv file containing cell2location scores for the uninjured pallium cell types in the Visium dataset

  4. Data from: Phospho-seq: Integrated, multi-modal profiling of intracellular...

    • zenodo.org
    application/gzip, bin
    Updated Nov 22, 2023
    + more versions
    Cite
    John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija; Philipp Wahle; Giovanna Brancati (2023). Phospho-seq: Integrated, multi-modal profiling of intracellular protein dynamics in single cells [Dataset]. http://doi.org/10.5281/zenodo.10120360
    Available download formats: bin, application/gzip
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija; Philipp Wahle; Giovanna Brancati
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets to go along with the publication listed:

    full_object.rds: Brain Organoid Phospho-Seq dataset with ATAC, Protein and imputed RNA data

    rna_object.rds: Reference whole cell scRNA-Seq object on Brain organoids

    multiome_object.rds: Bridge dataset containing RNA and ATAC modalities for Brain organoids

    metacell_allnorm.rds: Metacell object for finding gene-peak-protein linkages in Brain organoid dataset

    fullobject_fragments.tsv.gz: fragment file to go with the full object

    fullobject_fragments.tsv.gz.tbi: index file for the full object fragment file

    multiome_fragments.tsv.gz: fragment file to go with the multiome object

    multiome_fragments.tsv.gz.tbi: index file for the multiome object fragment file

    K562_Stem.rds: object corresponding to the pilot experiment including K562 cells and iPS cells

    K562_stem_fragments.tsv.gz: fragment file to go with the K562_stem object

    K562_stem_fragments.tsv.gz.tbi: index file for the K562_stem object fragment file

    retina.rds: object corresponding to the retinal organoid phospho-seq experiment

    retina_fragments.tsv.gz: fragment file to go with the retina object

    retina_fragments.tsv.gz.tbi: index file for the retina object fragment file

    retina_multi.rds: object corresponding to the retinal organoid phospho-seq-multiome experiment

    retina_multi_fragments.tsv.gz: fragment file to go with the retina_multi object

    retina_multi_fragments.tsv.gz.tbi: index file for the retina_multi object fragment file

    To use the K562, multiome, retina and retina_multiome datasets provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:

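    # load the packages that provide Fragments(), CreateFragmentObject() and Cells()
    library(Seurat)
    library(Signac)
    # load the object
    obj <- readRDS("obj.rds")
    # remove fragment file information
    Fragments(obj) <- NULL
    # update the path of the fragment file
    Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))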

    To use the K562 and multiome datasets provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:

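    # load the packages that provide Fragments(), CreateFragmentObject() and Cells()
    library(Seurat)
    library(Signac)
    # load the object
    obj <- readRDS("obj.rds")
    # remove fragment file information
    Fragments(obj) <- NULL
    # update the path of the fragment file
    Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))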

    To use the "fullobject" dataset provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:

    # load the stringr package, plus Seurat and Signac for the object functions
    library(stringr)
    library(Seurat)
    library(Signac)
    # load the object
    obj <- readRDS("obj.rds")
    # remove fragment file information
    Fragments(obj) <- NULL
    # remove unwanted residual information and rename cells
    obj@reductions$norm.adt.pca <- NULL
    obj@reductions$norm.pca <- NULL
    obj <- RenameCells(obj, new.names = str_remove(Cells(obj), "atac_"))
    # update the path of the fragment file
    Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))

