100+ datasets found

Data from: Benchmark and integration of resources for the estimation of...
zenodo.org
zip
Updated Jan 24, 2020
Cite
Luz Garcia-Alonso (2020). Benchmark and integration of resources for the estimation of human transcription factor activities [Dataset]. http://doi.org/10.1101/337915
Available download formats: zip
Unique identifier
https://doi.org/10.1101/337915
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Luz Garcia-Alonso
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data used to benchmark human TF-target datasets via TF activities in 3 benchmark datasets. Described in Garcia-Alonso et al. 2019.

Check https://github.com/saezlab/TFbenchmark to access the corresponding code.

Study abstract

Prediction of transcription factor (TF) activities from the gene expression of their targets (i.e. TF regulon) is becoming a widely-used approach to characterize the functional status of transcriptional regulatory circuits. Several strategies and datasets have been proposed to link the target genes likely regulated by a TF, each one providing a different level of evidence. The most established ones are: (i) manually curated repositories, (ii) interactions derived from ChIP-seq binding data, (iii) in silico prediction of TF binding on gene promoters, and (iv) reverse-engineered regulons from large gene expression datasets. However, it is not known how these different sources of regulons affect the TF activity estimations, and thereby downstream analysis and interpretation. Here we compared the accuracy and biases of these strategies to define human TF regulons by means of their ability to predict changes in TF activities in three reference benchmark datasets. We assembled a collection of TF-target interactions among 1,541 TFs and evaluated how the different molecular and regulatory properties of the TFs, such as the DNA-binding domain, specificities or mode of interaction with the chromatin, affect the predictions of TF activity changes. We assessed their coverage and found little overlap on the regulons derived from each strategy and better performance by literature-curated information followed by ChIP-seq data. We provide an integrated resource of all TF-target interactions derived through these strategies with a confidence score, as a resource for enhanced prediction of TF activities.
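
The idea of estimating a TF's activity from the expression of its regulon can be illustrated with a deliberately simplified sketch. This is not the scoring method used in the study; it assumes a hypothetical regulon given as a mapping from target gene to mode of regulation (+1 activation, -1 repression) and simply averages the signed expression changes of the targets that were measured. All gene names and values are invented.

```python
def regulon_activity(expression_change, regulon):
    """Mean signed expression change over regulon targets that were measured.

    expression_change: dict gene -> change in expression (e.g. a log fold change)
    regulon: dict target gene -> +1 (activated) or -1 (repressed); hypothetical layout
    Returns None when no target of the regulon is present in the profile.
    """
    scored = [
        sign * expression_change[gene]
        for gene, sign in regulon.items()
        if gene in expression_change
    ]
    return sum(scored) / len(scored) if scored else None


# Toy data: G1 is activated and goes up, G2 is repressed and goes down,
# G4 is a target that was not measured and is therefore ignored.
expression_change = {"G1": 2.0, "G2": -1.0, "G3": 0.5}
regulon = {"G1": 1, "G2": -1, "G4": 1}
activity = regulon_activity(expression_change, regulon)
```

A positive score suggests the TF is more active in the perturbed condition; regulons from different sources would give different scores for the same profile, which is the effect the benchmark quantifies.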
Data from: Genome-wide Transcription Factor DNA Binding Sites and Gene...
catalog.data.gov
data.openei.org
+1 more
Updated Jan 22, 2025
+ more versions
Cite
National Renewable Energy Laboratory (2025). Genome-wide Transcription Factor DNA Binding Sites and Gene Regulatory Networks in Clostridium thermocellum [Dataset]. https://catalog.data.gov/dataset/genome-wide-transcription-factor-dna-binding-sites-and-gene-regulatory-networks-in-clostri-0fffc
Dataset updated
Jan 22, 2025
Dataset provided by
National Renewable Energy Laboratory
Description
Clostridium thermocellum is a thermophilic bacterium recognized for its natural ability to effectively deconstruct cellulosic biomass. While there is a large body of studies on the genetic engineering of this bacterium and its physiology to date, there is limited knowledge of transcriptional regulation in this organism and in thermophilic bacteria in general. The study herein is the first report of a high-throughput application of DNA-affinity purification sequencing (DAP-seq) to transcription factors (TFs) from a thermophile. We applied DAP-seq to >90 TFs in C. thermocellum and detected genome-wide binding sites for 11 of them. We then compiled and aligned DNA binding sequences from these TFs to deduce the primary DNA-binding sequence motifs for each TF. These binding motifs are further validated with electrophoretic mobility shift assay (EMSA) and are used to identify individual TFs’ regulatory targets in C. thermocellum. Our results led to the discovery of novel, uncharacterized TFs as well as homologues of previously studied TFs including RexA-, LexA- and LacI-type TFs. We then used these data to reconstruct gene regulatory networks for the 11 TFs individually, which resulted in a global network encompassing the TFs with some interconnections. As gene regulation governs and constrains how bacteria behave, our findings shed light on the roles of TFs delineated by their regulons, and potentially provide a means to enable rational, advanced genetic engineering of C. thermocellum and other organisms alike towards a desired phenotype.
Transcriptional Regulatory Element Database
dknet.org
neuinfo.org
+2 more
Updated Jan 29, 2022
Cite
(2022). Transcriptional Regulatory Element Database [Dataset]. http://identifiers.org/RRID:SCR_005661
Unique identifier
https://identifiers.org/RRID:SCR_005661
Dataset updated
Jan 29, 2022
Description
Collects mammalian cis- and trans-regulatory elements together with experimental evidence. Regulatory elements were mapped onto assembled genomes. Resource for gene regulation and function studies. Users can retrieve primers, search TF target genes, retrieve TF motifs, search gene regulatory networks and orthologs, and make use of sequence analysis tools. Uses databases such as GenBank, EPD and DBTSS, and employs the promoter-finding program FirstEF combined with mRNA/EST information and cross-species comparisons. Manually curated.
Additional file 3: Table S7. of Systematic target function annotation of...
springernature.figshare.com
xlsx
Updated Jun 1, 2023
Cite
Yong Li; Russ Altman (2023). Additional file 3: Table S7. of Systematic target function annotation of human transcription factors [Dataset]. http://doi.org/10.6084/m9.figshare.5782743.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.5782743.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yong Li; Russ Altman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The complete transcription factor annotation results. –Log10 (P value) values are provided in parentheses following the target functions. (XLSX 153 kb)
Table_1_Filtering of Data-Driven Gene Regulatory Networks Using Drosophila...
frontiersin.figshare.com
xlsx
Updated Jun 1, 2023
+ more versions
Cite
Yesid Cuesta-Astroz; Guilherme Gischkow Rucatti; Leandro Murgas; Carol D. SanMartín; Mario Sanhueza; Alberto J. M. Martin (2023). Table_1_Filtering of Data-Driven Gene Regulatory Networks Using Drosophila melanogaster as a Case Study.XLSX [Dataset]. http://doi.org/10.3389/fgene.2021.649764.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fgene.2021.649764.s002
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers
Authors
Yesid Cuesta-Astroz; Guilherme Gischkow Rucatti; Leandro Murgas; Carol D. SanMartín; Mario Sanhueza; Alberto J. M. Martin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Gene Regulatory Networks (GRNs) allow the study of regulation of gene expression of whole genomes. Among the most relevant advantages of using networks to depict this key process are the visual representation of large amounts of information and the application of graph theory to generate new knowledge. Nonetheless, despite the many uses of GRNs, it is still difficult and expensive to assign Transcription Factors (TFs) to the regulation of specific genes. ChIP-Seq allows the determination of TF Binding Sites (TFBSs) over whole genomes, but it is still an expensive technique that can only be applied to one TF at a time and requires replicates to reduce its noise. Once TFBSs are determined, the assignment of each TF and its binding sites to the regulation of specific genes is not trivial, and it is often performed by carrying out site-specific experiments that are unfeasible to perform in all possible binding sites. Here, we addressed these relevant issues with a two-step methodology using Drosophila melanogaster as a case study. First, our protocol gathers all TFBSs determined with ChIP-Seq experiments available at ENCODE and FlyBase. Then each TFBS is used to assign TFs to the regulation of likely target genes based on the TFBS proximity to the transcription start site of all genes. In the final step, to try to select the most likely regulatory TF from those previously assigned to each gene, we employ GENIE3, a random forest-based method, and more than 9,000 RNA-seq experiments from D. melanogaster. Next, we employed known TF protein-protein interactions to estimate the feasibility of regulatory events in our filtered networks. Finally, we show how known interactions between co-regulatory TFs of each gene increase after the second step of our approach, and thus, the consistency of the TF-gene assignment.
Also, we employed our methodology to create a network centered on the Drosophila melanogaster gene Hr96 to demonstrate the role of this transcription factor on mitochondrial gene regulation.
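
The first step described above, assigning TFs to likely target genes by the proximity of each TFBS to a transcription start site (TSS), might look roughly like the sketch below. The window size `max_distance`, the use of the binding-site midpoint, the tuple layouts, and all names and coordinates are assumptions made for illustration; the description does not state the thresholds the authors used, and the GENIE3 filtering step is not shown.

```python
def assign_targets(binding_sites, tss_by_gene, max_distance=1000):
    """Pair each TF with genes whose TSS lies near one of its binding sites.

    binding_sites: iterable of (tf, chromosome, start, end)   # hypothetical layout
    tss_by_gene:   dict gene -> (chromosome, tss_position)    # hypothetical layout
    max_distance:  assumed window in bp around the TSS
    """
    assignments = set()
    for tf, chrom, start, end in binding_sites:
        midpoint = (start + end) // 2
        for gene, (gene_chrom, tss) in tss_by_gene.items():
            # Same chromosome and within the window counts as a candidate edge.
            if gene_chrom == chrom and abs(midpoint - tss) <= max_distance:
                assignments.add((tf, gene))
    return assignments


# Invented coordinates for illustration only.
binding_sites = [
    ("TF_A", "chr2L", 900, 1100),
    ("TF_B", "chr2L", 50000, 50200),
    ("TF_A", "chr3R", 950, 1050),
]
tss_by_gene = {
    "gene1": ("chr2L", 1500),
    "gene2": ("chr2L", 49500),
    "gene3": ("chr3R", 9000),
}
candidate_edges = assign_targets(binding_sites, tss_by_gene)
```

The nested loop is quadratic and only suitable for toy inputs; genome-scale data would normally call for an interval index or sorted coordinates.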
Gene regulatory networks for 38 human tissues
zenodo.org
explore.openaire.eu
bin
Updated Jan 24, 2020
Cite
Abhijeet R Sonawane; John Platig; Maud Fagny; Cho-Yi Chen; Joseph N Paulson; Camila M Lopes-Ramos; Dawn L DeMeo; John Quackenbush; Kimberly Glass; Marieke Lydia Kuijjer (2020). Gene regulatory networks for 38 human tissues [Dataset]. http://doi.org/10.5281/zenodo.838734
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.838734
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Abhijeet R Sonawane; John Platig; Maud Fagny; Cho-Yi Chen; Joseph N Paulson; Camila M Lopes-Ramos; Dawn L DeMeo; John Quackenbush; Kimberly Glass; Marieke Lydia Kuijjer
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Description
We reconstructed gene regulatory networks for 38 tissues from the Genotype-Tissue Expression project (GTEx), and used these networks to investigate gene expression and regulation across these tissues. In the RData file, we share the following objects:

- edges: a 19,476,492 by 3 data.frame including three columns: TF (the transcription factor's gene symbol), Gene (Ensembl ID), Prior (whether an edge is canonical (1) or non-canonical (0)).

- exp: a 30,243 by 9,435 matrix including normalized expression data for each sample.

- expTS: a 30,243 by 38 matrix including, for each gene and each tissue, information on whether the gene is expressed in a tissue-specific manner in that tissue (1) or not (0).

- genes: a 30,243 by 4 data.frame that includes annotation information (Symbol) for Ensembl gene IDs (Name). This data.frame also indicates whether genes are also transcription factors (AlsoTF), with options: no, yes/motif (TF with a known DNA-binding motif), and yes/nomotif (TF without a known DNA-binding motif). In addition, the multiplicity of the gene (Multiplicity) is given.

- net: a 19,476,492 by 38 matrix that includes edge weights for each tissue. Edge order corresponds to edge order in the object "edges".

- netTS: a 19,476,492 by 38 matrix that includes information on whether edges are specific to a tissue (1) or not (0).

- samples: a 9,435 by 2 data.frame that includes sample identifiers (matching the identifiers in "exp") and the tissue to which these samples belong.
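
Because "net" and "netTS" share their row order with "edges", an edge's endpoints and its per-tissue values are joined by row index alone. The sketch below shows that lookup on tiny invented stand-ins for the three objects; it does not load the RData file, and the tissue names, gene identifiers and weights are hypothetical.

```python
# Toy stand-ins for the "edges", "net" and "netTS" objects described above.
# All values are invented for illustration; the real objects are far larger.
tissues = ["tissue_A", "tissue_B"]  # hypothetical tissue (column) names

edges = [  # one row per edge: (TF symbol, Ensembl gene ID, Prior)
    ("TF1", "ENSG_toy_1", 1),
    ("TF1", "ENSG_toy_2", 0),
    ("TF2", "ENSG_toy_1", 0),
]
net = [  # edge weights; row i belongs to edges[i], one column per tissue
    [2.5, -0.3],
    [0.1, 1.8],
    [-1.2, 0.4],
]
net_ts = [  # 1 if the edge is specific to that tissue, else 0
    [1, 0],
    [0, 1],
    [0, 0],
]


def tissue_specific_edges(tissue):
    """Return (TF, gene, prior, weight) for edges flagged as specific to `tissue`."""
    col = tissues.index(tissue)
    return [
        (tf, gene, prior, net[i][col])
        for i, (tf, gene, prior) in enumerate(edges)
        if net_ts[i][col] == 1
    ]


specific_to_b = tissue_specific_edges("tissue_B")
```

Any reordering or filtering of "edges" has to be applied to the rows of "net" and "netTS" as well, otherwise the weights no longer line up with their edges.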
BIOGRID CURATED DATA FOR PUBLICATION: Regulation of a transcription factor...
thebiogrid.org
zip
Cite
BioGRID Project, BIOGRID CURATED DATA FOR PUBLICATION: Regulation of a transcription factor network by Cdk1 coordinates late cell cycle gene expression. [Dataset]. https://thebiogrid.org/185012/publication/regulation-of-a-transcription-factor-network-by-cdk1-coordinates-late-cell-cycle-gene-expression.html
Available download formats: zip
Dataset authored and provided by
BioGRID Project
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for Landry BD (2014): Regulation of a transcription factor network by Cdk1 coordinates late cell cycle gene expression. Curated by BioGRID (https://thebiogrid.org). ABSTRACT: To maintain genome stability, regulators of chromosome segregation must be expressed in coordination with mitotic events. Expression of these late cell cycle genes is regulated by cyclin-dependent kinase (Cdk1), which phosphorylates a network of conserved transcription factors (TFs). However, the effects of Cdk1 phosphorylation on many key TFs are not known. We find that elimination of Cdk1-mediated phosphorylation of four S-phase TFs decreases expression of many late cell cycle genes, delays mitotic progression, and reduces fitness in budding yeast. Blocking phosphorylation impairs degradation of all four TFs. Consequently, phosphorylation-deficient mutants of the repressors Yox1 and Yhp1 exhibit increased promoter occupancy and decreased expression of their target genes. Interestingly, although phosphorylation of the transcriptional activator Hcm1 on its N-terminus promotes its degradation, phosphorylation on its C-terminus is required for its activity, indicating that Cdk1 both activates and inhibits a single TF. We conclude that Cdk1 promotes gene expression by both activating transcriptional activators and inactivating transcriptional repressors. Furthermore, our data suggest that coordinated regulation of the TF network by Cdk1 is necessary for faithful cell division.
Systematic Target Function Annotation of Human Transcription Factors
simtk.org
datamed.org
bin +1
Updated Jul 7, 2017
Cite
Yong Li (2017). Systematic Target Function Annotation of Human Transcription Factors [Dataset]. https://simtk.org/frs/?group_id=1054
Available download formats: data/images/video (1 MB), bin (51 MB)
Dataset updated
Jul 7, 2017
Dataset provided by
Stanford University
Authors
Yong Li
Description
Transcription factors (TFs), the key players in transcriptional regulation, have attracted great experimental attention, yet the functions of most human TFs remain poorly understood. Recent capabilities in genome-wide protein binding profiling have stimulated systematic studies of the hierarchical organization of human gene regulatory network and DNA-binding specificity of TFs, shedding light on combinatorial gene regulation. We show here that these data also enable a systematic annotation of the biological functions and functional diversity of TFs. We compiled a human gene regulatory network for 384 TFs covering the 146,096 TF-target gene relationships, extracted from over 850 ChIP-seq experiments as well as the literature. By integrating this network of TF-TF and TF-target gene relationships with 3,715 functional concepts from six sources of gene function annotations, we obtained over 9,000 confident functional annotations for 279 TFs. We observe extensive connectivity between transcription factors and Mendelian diseases, GWAS phenotypes, and pharmacogenetic pathways. Further, we show that transcription factors link apparently unrelated functions, even when the two functions do not share common genes. Finally, we analyze the pleiotropic functions of TFs and suggest that increased number of upstream regulators contributes to the functional pleiotropy of TFs. Our computational approach is complementary to focused experimental studies on TF functions, and the resulting knowledge can guide experimental design for discovering the unknown roles of TFs in human disease and drug response.

This project includes the following software/data packages:

TF functional annotations

TF-Target Gene raw data: The raw TF-TG relationships are available as two GMT-formatted files corresponding to window sizes of 6 kb (TFTG.all.symbol.w6k.qvalue0.01.uniqueTF.gmt) and 20 kb (TFTG.all.symbol.w20k.qvalue0.01.uniqueTF.gmt).
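
GMT is a tab-delimited gene set format in which each line holds a set name, a description, and then the member genes. A minimal reader is sketched below; it assumes, without having inspected the files, that each set name is a TF and the members are its target gene symbols. The function name `parse_gmt` and the example rows are invented.

```python
def parse_gmt(lines):
    """Parse GMT rows into a dict mapping set name -> list of member genes.

    Each row is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    Rows with no member genes are skipped.
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


# Invented rows; for a real file, pass an open file handle instead of a list.
example_rows = [
    "TF_A\tna\tGENE1\tGENE2\n",
    "TF_B\tna\tGENE2\tGENE3\tGENE4\n",
]
regulons = parse_gmt(example_rows)
```

With one of the listed files downloaded, the same function could be called as `parse_gmt(open(path))`, where `path` points to the local copy.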
REDfly Regulatory Element Database for Drosophila
neuinfo.org
dknet.org
+1 more
Updated Jan 29, 2022
Cite
(2022). REDfly Regulatory Element Database for Drosophila [Dataset]. http://identifiers.org/RRID:SCR_006790/resolver?q=&i=rrid
Unique identifier
https://identifiers.org/RRID:SCR_006790
Dataset updated
Jan 29, 2022
Description
Curated collection of known Drosophila transcriptional cis-regulatory modules (CRMs) and transcription factor binding sites (TFBSs). Includes experimentally verified fly regulatory elements along with their DNA sequence, associated genes, and the expression patterns they direct. Submissions of experimentally verified cis-regulatory elements that are not yet included in the REDfly database are welcome.
Data_Sheet_1_CisCross: A gene list enrichment analysis to predict upstream...
frontiersin.figshare.com
xlsx
Updated Jun 13, 2023
Cite
Viktoriya V. Lavrekha; Victor G. Levitsky; Anton V. Tsukanov; Anton G. Bogomolov; Dmitry A. Grigorovich; Nadya Omelyanchuk; Elena V. Ubogoeva; Elena V. Zemlyanskaya; Victoria Mironova (2023). Data_Sheet_1_CisCross: A gene list enrichment analysis to predict upstream regulators in Arabidopsis thaliana.xlsx [Dataset]. http://doi.org/10.3389/fpls.2022.942710.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fpls.2022.942710.s001
Dataset updated
Jun 13, 2023
Dataset provided by
Frontiers
Authors
Viktoriya V. Lavrekha; Victor G. Levitsky; Anton V. Tsukanov; Anton G. Bogomolov; Dmitry A. Grigorovich; Nadya Omelyanchuk; Elena V. Ubogoeva; Elena V. Zemlyanskaya; Victoria Mironova
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Having DNA-binding profiles for a sufficient number of genome-encoded transcription factors (TFs) opens up the perspectives for systematic evaluation of the upstream regulators for the gene lists. Plant Cistrome database, a large collection of TF binding profiles detected using the DAP-seq method, made it possible for Arabidopsis. Here we re-processed raw DAP-seq data with MACS2, the most popular peak caller that leads among other ones according to quality metrics. In the benchmarking study, we confirmed that the improved collection of TF binding profiles supported a more precise gene list enrichment procedure, and resulted in a more relevant ranking of potential upstream regulators. Moreover, we consistently recovered the TF binding profiles that were missing in the previous collection of DAP-seq peak sets. We developed the CisCross web service (https://plamorph.sysbio.ru/ciscross/) that gives more flexibility in the analysis of potential upstream TF regulators for Arabidopsis thaliana genes.
ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis...
data.niaid.nih.gov
data-staging.niaid.nih.gov
+1 more
Updated Jan 19, 2024
Cite
Ballester, Benoit (2024). ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10527087
Dataset updated
Jan 19, 2024
Dataset provided by
INSERM
Authors
Ballester, Benoit
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
ReMap is a large scale integrative analysis of DNA-binding experiments for Homo sapiens, Mus musculus, Drosophila melanogaster and Arabidopsis thaliana transcriptional regulators. The catalogues are the results of the manual curation of ChIP-seq, ChIP-exo, DAP-seq from public sources (GEO, ENCODE, ENA).

ReMap (https://remap.univ-amu.fr) aims to provide manually curated, high-quality catalogs of regulatory regions resulting from a large-scale integrative analysis of DNA-binding experiments in Human, Mouse, Fly and Arabidopsis thaliana for hundreds of transcription factors and regulators. In this 2022 update, we have uniformly processed >11 000 DNA-binding sequencing datasets from public sources across four species. The updated Human regulatory atlas includes 8103 datasets covering a total of 1210 transcriptional regulators (TRs) with a catalog of 182 million (M) peaks, while the updated Arabidopsis atlas reaches 4.8M peaks, 423 TRs across 694 datasets. Also, this ReMap release is enriched by two new regulatory catalogs for Mus musculus and Drosophila melanogaster. First, the Mouse regulatory catalog consists of 123M peaks across 648 TRs as a result of the integration and validation of 5503 ChIP-seq datasets. Second, the Drosophila melanogaster catalog contains 16.6M peaks across 550 TRs from the integration of 1205 datasets. The four regulatory catalogs are browsable through track hubs at UCSC, Ensembl and NCBI genome browsers. Finally, ReMap 2022 comes with a new Cis Regulatory Module identification method, improved quality controls, faster search results, and better user experience with an interactive tour and video tutorials on browsing and filtering ReMap catalogs.

We thank our users for past and future feedback to make ReMap useful for the community. The ReMap team welcomes your feedback on the catalogs, use of the website and use of the downloadable files. Please contact benoit.ballester@inserm.fr for development requests.

Reference:

ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments. Fayrouz Hammal, Pierre de Langen, Aurélie Bergon, Fabrice Lopez, Benoit Ballester. Nucleic Acids Research, Volume 50, Issue D1, 7 January 2022, Pages D316–D325, https://doi.org/10.1093/nar/gkab996
PRISM (Stanford database)
neuinfo.org
dknet.org
Updated Oct 7, 2024
Cite
(2024). PRISM (Stanford database) [Dataset]. http://identifiers.org/RRID:SCR_005375
Unique identifier
https://identifiers.org/RRID:SCR_005375
Dataset updated
Oct 7, 2024
Description
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 5, 2022. Tool that predicts interactions between transcription factors and their regulated genes from binding motifs. Understanding vertebrate development requires unraveling the cis-regulatory architecture of gene regulation. PRISM provides accurate genome-wide computational predictions of transcription factor binding sites for the human and mouse genomes, and integrates the predictions with GREAT to provide functional biological context. Together, accurate computational binding site prediction and GREAT produce for each transcription factor: 1. putative binding sites, 2. putative target genes, 3. putative biological roles of the transcription factor, and 4. putative cis-regulatory elements through which the factor regulates each target in each functional role. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
DBTBS
neuinfo.org
scicrunch.org
+2 more
Updated Jan 19, 2006
+ more versions
Cite
(2006). DBTBS [Dataset]. http://identifiers.org/RRID:SCR_002345
Unique identifier
https://identifiers.org/RRID:SCR_002345
Dataset updated
Jan 19, 2006
Description
Database of experimentally validated gene regulatory relations and the corresponding transcription factor binding sites upstream of Bacillus subtilis genes. The database allows the comparison of systematic experiments with individual experimental results in order to facilitate the elucidation of the complete B. subtilis gene regulatory network. The current version is constructed by surveying 947 references and contains the information of 120 binding factors and 1475 gene regulatory relations. For each promoter, all of its known cis-elements are listed according to their positions, while these cis-elements are aligned to illustrate the consensus sequence for each transcription factor. All probable transcription factors coded in the genome were classified using Pfam motifs. The DBTBS database was reorganized to show operons instead of individual genes as the building blocks of gene regulatory networks. It now contains 463 experimentally known operons, as well as their terminator sequences if identifiable. In addition, 517 transcriptional terminators were identified computationally. (De Hoon, M.J.L. et al., PLoS Comput. Biol. 1, e25 (2005)). A new section was added under "Motif conservation", which presents hexameric motifs found to be conserved to different extents between upstream intergenic regions of genus-specific subgroups of homologous proteins.
Data from: The transcription regulatory code of a plant leaf
data.niaid.nih.gov
resodate.org
+1 more
Updated May 20, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Xiaoyu Tu; María Katherine Mejía-Guerra; Jose A Valdes Franco; David Tzeng; Po-Yu Chu; Xiuru Dai; Pinghua Li; Edward S Buckler; Silin Zhong (2020). The transcription regulatory code of a plant leaf [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3834198
Dataset updated
May 20, 2020
Dataset provided by
State Key Laboratory of Agrobiotechnology, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong, China
Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA
Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA; School of Integrative Plant Sciences, Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA; Agricultural Research Service, United States Department of Agriculture, Ithaca, NY, USA
College of Agronomy, Shandong Agricultural University, China
College of Agronomy, Shandong Agricultural University, China; State Key Laboratory of Agrobiotechnology, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong, China
School of Integrative Plant Sciences, Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
Authors
Xiaoyu Tu; María Katherine Mejía-Guerra; Jose A Valdes Franco; David Tzeng; Po-Yu Chu; Xiuru Dai; Pinghua Li; Edward S Buckler; Silin Zhong
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The transcription regulatory network underlying essential and complex functionalities inside a eukaryotic cell is defined by the combinatorial actions of transcription factors (TFs). However, TF binding studies in plants are too few in number to produce a general and comparative picture of this complex regulatory network. Here, we used ChIP-seq to determine the binding profiles of 104 TFs expressed in the maize leaf. (Data can be downloaded from NCBI SRA under accession number PRJNA518749.)

With this large dataset, we trained machine-learning models to identify TF sequence preferences. A contrast between maize and Arabidopsis TF sequence preferences revealed that DNA binding follows the conservation of TF protein families. Finally, the trained models were used to predict and compare the regulatory networks in other grass species (sorghum and rice), which revealed that the edges between TFs and TF-coding genes are more likely to be maintained (i.e., evolutionarily conserved).

On a practical level, we expect the presented TF binding models to be integrated into pipelines that predict the effects of non-coding variants, both common and rare, on TF binding in order to pinpoint causal sites. The ability to predict and generate novel variation not seen in nature could fundamentally change future plant breeding.

Detail: Each *tar.gz file is a bag-of-k-mer model fitted for a single ZmTF, which can be used for predictions. Information about each ZmTF is included in the table tfids.tsv
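
As background for the term, a bag-of-k-mers representation describes a DNA sequence by how often each length-k substring occurs, ignoring where it occurs. The sketch below illustrates only that featurization step; the value k=3, the function name `kmer_counts`, and the example sequence are assumptions, and the layout of the archived models themselves is not described here.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count every overlapping k-mer in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# A short invented sequence; real inputs would be regions around binding sites.
features = kmer_counts("ACGTACGT", k=3)
```

A classifier trained on such count vectors can then score new sequences for likely binding, which is the general use the description points to.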

For more information about the project: The transcription regulatory code of a plant leaf
Data from: Temporal transcriptional logic of dynamic regulatory networks...
datadryad.org
zenodo.org
zip
Updated Jun 1, 2019
Cite
Kranthi Varala; Amy Marshall-Colón; Jacopo Cirrone; Matthew D. Brooks; Angelo V. Pasquino; Sophie Léran; Shipra Mittal; Tara M. Rock; Molly B. Edwards; Grace J. Kim; Sandrine Ruffel; W. Richard McCombie; Dennis Shasha; Gloria M. Coruzzi (2019). Temporal transcriptional logic of dynamic regulatory networks underlying nitrogen signaling and use in plants [Dataset]. http://doi.org/10.5061/dryad.248g184
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.248g184
Dataset updated
Jun 1, 2019
Dataset provided by
Dryad
Authors
Kranthi Varala; Amy Marshall-Colón; Jacopo Cirrone; Matthew D. Brooks; Angelo V. Pasquino; Sophie Léran; Shipra Mittal; Tara M. Rock; Molly B. Edwards; Grace J. Kim; Sandrine Ruffel; W. Richard McCombie; Dennis Shasha; Gloria M. Coruzzi
Time period covered
May 6, 2018
Description
This study exploits time, the relatively unexplored fourth dimension of gene regulatory networks (GRNs), to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. Our “just-in-time” analysis of time-series transcriptome data uncovered a temporal cascade of cis elements underlying dynamic N signaling. To infer transcription factor (TF)-target edges in a GRN, we applied a time-based machine learning method to 2,174 dynamic N-responsive genes. We experimentally determined a network precision cutoff, using TF-regulated genome-wide targets of three TF hubs (CRF4, SNZ, and CDF1), used to “prune” the network to 155 TFs and 608 targets. This network precision was reconfirmed using genome-wide TF-target regulation data for four additional TFs (TGA1, HHO5/6, and PHL1) not used in network pruning. These higher-confidence edges in the GRN were further filtered by independent TF-target binding data, used to calculate a TF “N-specificity” index. This refined GRN...
Genome-Wide Signatures of Transcription Factor Activity: Connecting...
figshare.com
tiff
Updated Jan 18, 2016
Cite
Jing Chen; Zhen Hu; Mukta Phatak; John Reichard; Johannes M. Freudenberg; Siva Sivaganesan; Mario Medvedovic (2016). Genome-Wide Signatures of Transcription Factor Activity: Connecting Transcription Factors, Disease, and Small Molecules [Dataset]. http://doi.org/10.1371/journal.pcbi.1003198
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pcbi.1003198
Dataset updated
Jan 18, 2016
Dataset provided by
PLOS Computational Biology
Authors
Jing Chen; Zhen Hu; Mukta Phatak; John Reichard; Johannes M. Freudenberg; Siva Sivaganesan; Mario Medvedovic
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identifying transcription factors (TFs) involved in producing a genome-wide transcriptional profile is an essential step in building a mechanistic model that can explain observed gene expression data. We developed a statistical framework for constructing genome-wide signatures of TF activity, and for using such signatures in the analysis of gene expression data produced by complex transcriptional regulatory programs. Our framework integrates ChIP-seq data and appropriately matched gene expression profiles to identify True REGulatory (TREG) TF-gene interactions. It provides genome-wide quantification of the likelihood of regulatory TF-gene interaction that can be used either to identify regulated genes or as a genome-wide signature of TF activity. To effectively use ChIP-seq data, we introduce a novel statistical model that integrates information from all binding “peaks” within a 2 Mb window around a gene's transcription start site (TSS), and provides gene-level binding scores and probabilities of regulatory interaction. In the second step we integrate these binding scores and regulatory probabilities with gene expression data to assess the likelihood of TREG TF-gene interactions. We demonstrate the advantages of the TREG framework in identifying genes regulated by two TFs with widely different distributions of functional binding events (ERα and E2f1). We also show that TREG signatures of TF activity vastly improve our ability to detect involvement of ERα in producing complex disease-related transcriptional profiles. Through a large study of disease-related transcriptional signatures and transcriptional signatures of drug activity, we demonstrate that the increase in statistical power associated with the use of TREG signatures makes the crucial difference in identifying key targets for treatment, and drugs to use for treatment. All methods are implemented in an open-source R package, treg.
The package also contains all data used in the analysis, including 494 TREG binding profiles based on ENCODE ChIP-seq data. The treg package can be downloaded at http://GenomicsPortals.org.
Transcription Factors Expressed in Mouse Cochlear Inner and Outer Hair Cells...
plos.figshare.com
datasetcatalog.nlm.nih.gov
qt
Updated May 31, 2023
Yi Li; Huizhan Liu; Cody L. Barta; Paul D. Judge; Lidong Zhao; Weiping J. Zhang; Shusheng Gong; Kirk W. Beisel; David Z. Z. He (2023). Transcription Factors Expressed in Mouse Cochlear Inner and Outer Hair Cells [Dataset]. http://doi.org/10.1371/journal.pone.0151291
Unique identifier
https://doi.org/10.1371/journal.pone.0151291
Dataset updated
May 31, 2023
Dataset provided by
PLOS, http://plos.org/
Authors
Yi Li; Huizhan Liu; Cody L. Barta; Paul D. Judge; Lidong Zhao; Weiping J. Zhang; Shusheng Gong; Kirk W. Beisel; David Z. Z. He
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Regulation of gene expression is essential to determining the functional complexity and morphological diversity seen among different cells. Transcriptional regulation is a crucial step in gene expression regulation because the genetic information is directly read from DNA by sequence-specific transcription factors (TFs). Although several mouse TF databases created from genome sequences and transcriptomes are available, a cell type-specific TF database from any normal cell populations is still lacking. We identify cell type-specific TF genes expressed in cochlear inner hair cells (IHCs) and outer hair cells (OHCs) using hair cell-specific transcriptomes from adult mice. IHCs and OHCs are the two types of sensory receptor cells in the mammalian cochlea. We show that 1,563 and 1,616 TF genes are respectively expressed in IHCs and OHCs among 2,230 putative mouse TF genes. While 1,536 are commonly expressed in both populations, 73 genes are differentially expressed (with at least a twofold difference) in IHCs and 13 are differentially expressed in OHCs. Our datasets represent the first cell type-specific TF databases for two populations of sensory receptor cells and are key informational resources for understanding the molecular mechanism underlying the biological properties and phenotypical differences of these cells.
PRODORIC
dknet.org
scicrunch.org
Updated Jan 29, 2022
(2022). PRODORIC [Dataset]. http://identifiers.org/RRID:SCR_007074
Unique identifier
https://identifiers.org/RRID:SCR_007074
Dataset updated
Jan 29, 2022
Description
Database about gene regulation and gene expression in prokaryotes. It includes a manually curated and unique collection of transcription factor binding sites. A variety of bioinformatics tools for the prediction, analysis and visualization of regulons and gene regulatory networks is included. The integrated approach provides information about molecular networks in prokaryotes with a focus on pathogenic organisms. In detail this concerns:
* transcriptional regulation (transcription factors and their DNA binding sites)
* signal transduction (two-component systems, phosphorylation cascades)
* protein interactions (complex formation, oligomerization)
* biochemical pathways (chemical reactions)
* other regulation events (e.g. codon usage)
It aims to be a resource to model protein-host interactions and to be a suitable platform to analyze high-throughput data from proteomics and transcriptomics experiments (systems biology). Currently it mainly contains detailed information about operon and promoter structures, including huge collections of transcription factor binding sites. If an appropriate number of regulatory binding sites is available, a position weight matrix (PWM) and a sequence logo are provided, which can be used to predict new binding sites. This data is collected manually by screening the original scientific literature. PRODORIC also handles protein-protein interactions and signal-transduction cascades that commonly occur in the form of two-component systems in prokaryotes. Furthermore, it contains metabolic network data imported from the KEGG database. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
The top ten target genes of transcription factor AHR predicted by KGE-TGI...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 30, 2023
Yang-Han Wu; Yu-An Huang; Jian-Qiang Li; Zhu-Hong You; Peng-Wei Hu; Lun Hu; Victor C. M. Leung; Zhi-Hua Du (2023). The top ten target genes of transcription factor AHR predicted by KGE-TGI model. [Dataset]. http://doi.org/10.1371/journal.pcbi.1011207.t008
Unique identifier
https://doi.org/10.1371/journal.pcbi.1011207.t008
Dataset updated
Jun 30, 2023
Dataset provided by
PLOS, http://plos.org/
Authors
Yang-Han Wu; Yu-An Huang; Jian-Qiang Li; Zhu-Hong You; Peng-Wei Hu; Lun Hu; Victor C. M. Leung; Zhi-Hua Du
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The top ten target genes of transcription factor AHR predicted by KGE-TGI model.
Data_Sheet_1_Identification of hub genes and transcription factor regulatory...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Oct 28, 2022
Guo, Zhifu; Zeng, ZhenYu; Xu, Qiang; Song, Xiaowei; Tu, Dingyuan; Ma, Chaoqun; Zhao, Xianxian (2022). Data_Sheet_1_Identification of hub genes and transcription factor regulatory network for heart failure using RNA-seq data and robust rank aggregation analysis.PDF [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000415269
Dataset updated
Oct 28, 2022
Authors
Guo, Zhifu; Zeng, ZhenYu; Xu, Qiang; Song, Xiaowei; Tu, Dingyuan; Ma, Chaoqun; Zhao, Xianxian
Description
Background: Heart failure (HF) is the end stage of various cardiovascular diseases with a high mortality rate. Novel diagnostic and therapeutic biomarkers for HF are urgently required. Our research aims to identify HF-related hub genes and regulatory networks using bioinformatics and validation assays. Methods: Using four RNA-seq datasets in the Gene Expression Omnibus (GEO) database, we screened differentially expressed genes (DEGs) of HF using Removal of Unwanted Variation from RNA-seq data (RUVSeq) and the robust rank aggregation (RRA) method. Then, hub genes were recognized using the STRING database and Cytoscape software with cytoHubba plug-in. Furthermore, reliable hub genes were validated by the GEO microarray datasets and quantitative reverse transcription polymerase chain reaction (qRT-PCR) using heart tissues from patients with HF and non-failing donors (NFDs). In addition, R packages “clusterProfiler” and “GSVA” were utilized for enrichment analysis. Moreover, the transcription factor (TF)–DEG regulatory network was constructed by Cytoscape and verified in a microarray dataset. Results: A total of 201 robust DEGs were identified in patients with HF and NFDs. STRING and Cytoscape analysis recognized six hub genes, among which ASPN, COL1A1, and FMOD were confirmed as reliable hub genes through microarray datasets and qRT-PCR validation. Functional analysis showed that the DEGs and hub genes were enriched in T-cell-mediated immune response and myocardial glucose metabolism, which were closely associated with myocardial fibrosis. In addition, the TF–DEG regulatory network was constructed, and 13 significant TF–DEG pairs were finally identified. Conclusion: Our study integrated different RNA-seq datasets using RUVSeq and the RRA method and identified ASPN, COL1A1, and FMOD as potential diagnostic biomarkers for HF. The results provide new insights into the underlying mechanisms and effective treatments of HF.

