Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data provided here are part of a Galaxy tutorial that analyzes ChIP-seq data from a study published by Wu et al., 2014 (DOI:10.1101/gr.164830.113). The goal of this study was to investigate "the dynamics of occupancy and the role in gene regulation of the transcription factor Tal1, a critical regulator of hematopoiesis, at multiple stages of hematopoietic differentiation." To this end, ChIP-seq experiments were performed in multiple mouse cell types including a G1E cell line and megakaryocytes, the two cell types represented here. The dataset contains biological replicate Tal1 ChIP-seq and input control experiments (*.fastqsanger files). Because of the long processing time for the large original files, we have downsampled the original raw data files to include only reads that align to chromosome 19 and a subset of interesting genomic loci (ChIPseq_regions_of_interest_v4.bed) pulled from the Wu et al. publication. Also included is a gene annotation file (RefSeq_gene_annotations_mm10.bed) with gene names added for viewing in a genome browser.
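The downsampling rule described above (keep a read if it lies on chromosome 19 or inside one of the regions of interest) can be sketched as follows. This is only an illustration under stated assumptions: the tutorial does not say which tool produced the subset, and the names `parse_bed` and `keep_read`, as well as the toy coordinates in the usage note, are hypothetical.

```python
from typing import Iterable, List, Tuple

# (chrom, start, end) as a BED-style half-open interval
Region = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> List[Region]:
    """Parse the first three columns of BED lines, skipping headers and blanks."""
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def keep_read(chrom: str, start: int, end: int, regions: List[Region],
              whole_chrom: str = "chr19") -> bool:
    """Return True if the read is on `whole_chrom` or overlaps any region."""
    if chrom == whole_chrom:
        return True
    return any(c == chrom and start < r_end and r_start < end
               for c, r_start, r_end in regions)
```

For example, with `regions = parse_bed(open("ChIPseq_regions_of_interest_v4.bed"))`, a call such as `keep_read("chr1", 190, 230, regions)` reports whether that aligned read would survive the filter. A linear scan over the regions is fine for a short BED file; a large region list would call for an interval index instead.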
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mapping the chromosomal locations of transcription factors, nucleosomes, histone modifications, chromatin remodeling enzymes, chaperones, and polymerases is one of the key tasks of modern biology, as evidenced by the Encyclopedia of DNA Elements (ENCODE) Project. To this end, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard methodology. Mapping such protein-DNA interactions in vivo using ChIP-seq presents multiple challenges not only in sample preparation and sequencing but also for computational analysis. Here, we present step-by-step guidelines for the computational analysis of ChIP-seq data. We address all the major steps in the analysis of ChIP-seq data: sequencing depth selection, quality checking, mapping, data normalization, assessment of reproducibility, peak calling, differential binding analysis, controlling the false discovery rate, peak annotation, visualization, and motif analysis. At each step in our guidelines we discuss some of the software tools most frequently used. We also highlight the challenges and problems associated with each step in ChIP-seq data analysis. We present a concise workflow for the analysis of ChIP-seq data in Figure 1 that complements and expands on the recommendations of the ENCODE and modENCODE projects. Each step in the workflow is described in detail in the following sections.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Shown are the number of peaks called and the total number of bp covered by each peak set for H3K4me3, H3K36me3, and H3K9me3 using the original Sole-search program or the program which has been modified to identify broad regions covered by modified histones. Also shown is the increase in genome coverage (fold difference) that results when using the modified peak calling program. Both the original and the modified program can be accessed at http://chipseq.genomecenter.ucdavis.edu/cgi-bin/chipseq.cgi.
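The two summary quantities mentioned above, total bp covered by a peak set and the fold difference in coverage between two peak sets, can be computed along these lines. This is a minimal sketch, not Sole-search code: the function names are hypothetical, peaks are assumed to be half-open `(chrom, start, end)` intervals, and overlapping peaks are assumed to be counted once.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open


def covered_bp(peaks: List[Peak]) -> int:
    """Total bases covered by a peak set, counting overlapping bases once."""
    total = 0
    cur_chrom, cur_start, cur_end = None, 0, 0
    for chrom, start, end in sorted(peaks):
        if chrom == cur_chrom and start <= cur_end:
            # Overlapping or touching peak: extend the current merged interval.
            cur_end = max(cur_end, end)
        else:
            # Close the previous merged interval and start a new one.
            total += cur_end - cur_start
            cur_chrom, cur_start, cur_end = chrom, start, end
    return total + (cur_end - cur_start)


def fold_difference(original: List[Peak], modified: List[Peak]) -> float:
    """Coverage of the modified peak set divided by that of the original.

    Raises ZeroDivisionError if the original peak set covers no bases.
    """
    return covered_bp(modified) / covered_bp(original)
```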
Summary of MACS analysis of the ChIP-seq data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data provided here are part of a Galaxy Training Network tutorial that analyzes ChIP-seq data from a study published by Ross-Innes et al., 2012 (DOI:10.1038/nature10730) to identify the binding sites of the estrogen receptor, a transcription factor known to be associated with different types of breast cancer.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We downloaded data from 2,216 ChIP-seq experiments from the ENCODE Project. The list of the data is in Supplementary Table S8. The data were lifted over from hg19 to hg38. We identified overlapping peaks in four categories: (1) the 500 bp upstream promoter region of pcRNA-associated coding genes, (2) the 500 bp upstream promoter region of pcRNAs, (3) pcRNA genomic loci, and (4) pcRNA genomic loci not overlapping the promoter region. To understand the correlation of TF binding patterns in the four categories, we built one binary matrix per category, with TFs as rows and pcRNAs or their associated coding genes as columns, so that each entry records whether a TF is connected to a pcRNA or associated coding gene. The category 2 matrix was clustered by Euclidean distance. To check the extent to which promoter sharing or proximity determines TFBS correlation, we also split the clustered heat map into the pcRNA bidirectional transcript (BIDIR) subgroup and the remaining subgroups (Non-BIDIR). To directly compare the TF binding patterns between categories, the other three matrices were sorted in the same order as the clustered matrix. We used the MATLAB function corr2 to calculate the r-value between categories (1) and (2). We performed Monte Carlo simulation to calculate the p-value and test the significance of the r-value.
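The comparison of two binary TF-by-gene matrices can be sketched in Python as follows. The `corr2` function here reproduces the standard 2-D correlation coefficient that MATLAB's corr2 computes. The Monte Carlo step is an assumption: the description does not say what was randomized, so this sketch shuffles the TF rows of one matrix to build the null distribution, and `monte_carlo_p` is a hypothetical name.

```python
import random
from typing import List

Matrix = List[List[int]]  # rows = TFs, columns = pcRNAs / coding genes


def corr2(a: Matrix, b: Matrix) -> float:
    """2-D correlation coefficient of two equally sized matrices.

    Raises ZeroDivisionError if either matrix is constant.
    """
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("matrices must have the same number of elements")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den


def monte_carlo_p(a: Matrix, b: Matrix, n_iter: int = 1000, seed: int = 0) -> float:
    """One-sided p-value: fraction of row shuffles of `b` with r >= observed r.

    Assumption: the null model permutes TF rows; the original null is unknown.
    """
    rng = random.Random(seed)
    observed = corr2(a, b)
    rows = list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(rows)
        if corr2(a, rows) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)
```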
According to our latest research, the global ChIP-Seq market size reached USD 1.42 billion in 2024, driven by the rapid adoption of next-generation sequencing technologies and the increasing demand for advanced epigenetic research tools. The market is expected to grow at a robust CAGR of 14.9% from 2025 to 2033, with the forecasted market size projected to reach USD 4.33 billion by 2033. This remarkable growth is primarily attributed to the expanding applications of ChIP-Seq in drug discovery, personalized medicine, and cancer research, as well as continuous technological advancements in sequencing platforms and bioinformatics analysis.
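The relationship between a base value, a compound annual growth rate (CAGR), and a projected value is plain compounding, which can be checked with a few lines of Python. The report does not state how many compounding periods it uses between 2024 and 2033, so the period counts below are assumptions, and the function names are hypothetical.

```python
def project(value: float, cagr: float, periods: int) -> float:
    """Compound `value` forward by `periods` years at rate `cagr` (0.149 = 14.9%)."""
    return value * (1.0 + cagr) ** periods


def implied_cagr(start: float, end: float, periods: int) -> float:
    """Growth rate that takes `start` to `end` in `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


if __name__ == "__main__":
    # Assumption: try both 8 and 9 compounding periods, since the report
    # does not say whether 2024 or 2025 is the base year.
    for periods in (8, 9):
        print(periods, round(project(1.42, 0.149, periods), 2),
              round(implied_cagr(1.42, 4.33, periods), 4))
```

Comparing the printed projections with the quoted USD 4.33 billion shows which period count the report's figures are closer to.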
The primary growth factor for the ChIP-Seq market is the surging interest in epigenetics and gene regulation research, which has become a cornerstone of modern molecular biology and precision medicine. Researchers and clinicians are increasingly leveraging chromatin immunoprecipitation sequencing (ChIP-Seq) to unravel complex gene regulatory mechanisms, identify disease-associated biomarkers, and develop targeted therapies. The availability of high-quality antibodies, improvements in library preparation protocols, and the reduction in sequencing costs have further democratized access to ChIP-Seq technologies, enabling a broader range of institutions and laboratories to participate in cutting-edge genomics research. Furthermore, the integration of ChIP-Seq data with other omics datasets, such as transcriptomics and proteomics, is unlocking new frontiers in systems biology and disease modeling, fueling sustained market growth.
Another significant driver for the ChIP-Seq market is the increasing investment by pharmaceutical and biotechnology companies in drug discovery and development processes. ChIP-Seq has emerged as a critical tool for identifying druggable targets, elucidating mechanisms of action, and understanding off-target effects at the chromatin level. The growing emphasis on personalized and precision medicine, particularly in oncology and rare diseases, has spurred demand for comprehensive epigenomic profiling solutions. This trend is further supported by government initiatives and funding programs aimed at accelerating genomics research, fostering collaborations between academia and industry, and establishing large-scale biobanks that utilize ChIP-Seq for functional annotation of the genome.
Technological advancements have played a pivotal role in shaping the trajectory of the ChIP-Seq market. The introduction of automated sample preparation systems, high-throughput sequencing platforms, and sophisticated bioinformatics software has significantly improved the reproducibility, scalability, and cost-effectiveness of ChIP-Seq workflows. Cloud-based data analysis solutions and machine learning algorithms are enabling researchers to handle and interpret massive datasets with greater accuracy and efficiency. These innovations are not only enhancing the quality of ChIP-Seq data but also expanding its utility across diverse applications, including developmental biology, neuroscience, immunology, and environmental genomics. As a result, the market is witnessing a surge in demand for integrated ChIP-Seq solutions that combine instrumentation, consumables, software, and services into seamless, end-to-end offerings.
From a regional perspective, North America continues to dominate the ChIP-Seq market due to its advanced research infrastructure, strong presence of leading biotechnology firms, and substantial government funding for genomics initiatives. However, the Asia Pacific region is rapidly emerging as a key growth engine, fueled by increasing investments in life sciences research, expanding biopharmaceutical industries, and rising awareness of precision medicine. Europe also maintains a significant market share, supported by collaborative research networks and a robust regulatory framework for genomic technologies. Meanwhile, Latin America and the Middle East & Africa are gradually catching up, driven by improvements in healthcare infrastructure and growing participation in international genomics consortia. This dynamic regional landscape underscores the global nature of the ChIP-Seq market and its critical role in advancing biomedical research worldwide.
Chromatin immunoprecipitation and sequencing (ChIP-seq) has been widely used to map DNA-binding proteins, histone proteins and their modifications. ChIP-seq data contains redundant reads termed duplicates, referring to those mapping to the same genomic location and strand. There are two main sources of duplicates: polymerase chain reaction (PCR) duplicates and natural duplicates. Unlike natural duplicates that represent true signals from sequencing of independent DNA templates, PCR duplicates are artifacts originating from sequencing of identical copies amplified from the same DNA template. In analysis, duplicates are typically removed before peak calling and signal quantification. Nevertheless, a significant portion of the duplicates is believed to represent true signals. Removing all duplicates will therefore underestimate the signal level in peaks and impact the identification of signal changes across samples, so an in-depth evaluation of the impact of duplicate removal is needed. Using eight public ChIP-seq datasets from three narrow-peak and two broad-peak marks, we tried to understand the distribution of duplicates in the genome, the extent to which duplicate removal impacts peak calling and signal estimation, and the factors associated with duplicate level in peaks. The three PCR-free histone H3 lysine 4 trimethylation (H3K4me3) ChIP-seq datasets had about 40% duplicates and 97% of them were within peaks. For the other datasets generated with PCR amplification of ChIP DNA, as expected, the narrow-peak marks have a much higher proportion of duplicates than the broad-peak marks. We found that duplicates are enriched in peaks and largely represent true signals, most conspicuously in high-confidence peaks. Furthermore, duplicate level in peaks is strongly correlated with the target enrichment level estimated using nonredundant reads, which provides the basis to properly allocate duplicates between noise and signal.
Our analysis supports the feasibility of retaining the signal portion of duplicates in downstream analysis, thus alleviating the limitation of complete deduplication.
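The definition of a duplicate used above (reads mapping to the same genomic location and strand) translates directly into a counting exercise. The sketch below is a minimal illustration, assuming each read has already been reduced to a `(chrom, position, strand)` tuple; it does not attempt the allocation of duplicates between noise and signal that the study proposes, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Read = Tuple[str, int, str]  # (chrom, 5' mapping position, strand)


def duplicate_stats(reads: Iterable[Read]) -> Tuple[int, int, int]:
    """Return (total reads, nonredundant reads, duplicate reads)."""
    counts = Counter(reads)
    total = sum(counts.values())
    nonredundant = len(counts)  # one representative per location and strand
    return total, nonredundant, total - nonredundant


def duplicate_fraction(reads: Iterable[Read]) -> float:
    """Fraction of reads that are redundant copies; 0.0 for empty input."""
    total, _, duplicates = duplicate_stats(reads)
    return duplicates / total if total else 0.0
```

Running `duplicate_fraction` separately on reads inside and outside peaks gives the kind of in-peak versus background comparison the abstract describes.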
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Large sets of genomic regions are generated by the initial analysis of various genome-wide sequencing data, such as ChIP-seq and ATAC-seq experiments. Gene set enrichment (GSE) methods are commonly employed to determine the pathways associated with them. Given the pathways and other gene sets (e.g., GO terms) of significance, it is of great interest to know the extent to which each is driven by binding near transcription start sites (TSS) or near enhancers. Currently, no tool performs such an analysis. Here, we present a method that addresses this question to complement GSE methods for genomic regions. Specifically, the new method tests whether the genomic regions in a gene set are significantly closer to a TSS (or to an enhancer) than expected by chance given the total list of genomic regions, using a non-parametric test. Combining the results from a GSE test with our novel method provides additional information regarding the mode of regulation of each pathway, and additional evidence that the pathway is truly enriched. We illustrate our new method with a large set of ENCODE ChIP-seq data, using the chipenrich Bioconductor package. The results show that our method is a powerful complementary approach to help researchers interpret large sets of genomic regions.
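The idea of testing whether the regions in a gene set lie closer to a TSS than expected given all regions can be illustrated with a simple resampling test. The abstract only says "a non-parametric test", so what follows is a swapped-in permutation test on the mean distance, not the chipenrich implementation; the function name and inputs are hypothetical.

```python
import random
from statistics import mean
from typing import List


def closer_to_tss_p(set_dists: List[float], all_dists: List[float],
                    n_iter: int = 2000, seed: int = 0) -> float:
    """One-sided p-value that gene-set regions are closer to a TSS than chance.

    set_dists: distances to the nearest TSS for regions assigned to the gene set.
    all_dists: distances to the nearest TSS for every region in the experiment.
    The null draws random subsets of the same size from `all_dists`.
    """
    rng = random.Random(seed)
    observed = mean(set_dists)
    k = len(set_dists)
    hits = sum(mean(rng.sample(all_dists, k)) <= observed for _ in range(n_iter))
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_iter + 1)
```

Passing distances to the nearest enhancer instead of the nearest TSS gives the enhancer version of the same question.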
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) has revolutionized the study of epigenomes, and the massive increase in ChIP-seq datasets calls for robust and user-friendly computational tools for quantitative ChIP-seq. Quantitative ChIP-seq comparisons have been challenging due to the noise and variation inherent to ChIP-seq and epigenomes. By employing statistical approaches tailored to the distribution of ChIP-seq data, together with simulations and extensive benchmarking studies, we developed and validated CSSQ as a nimble statistical analysis pipeline capable of differential binding analysis across ChIP-seq datasets over any defined regions, with high confidence and sensitivity and a low false discovery rate. CSSQ models ChIP-seq data as a finite mixture of Gaussians that faithfully reflects the ChIP-seq data distribution. By a combination of Anscombe transformation, k-means clustering, and estimated maximum normalization, CSSQ minimizes noise and bias from experimental variations. Further, CSSQ utilizes a non-parametric approach and incorporates comparisons under the null hypothesis by unaudited column permutation to perform robust statistical tests that account for the small number of replicates in ChIP-seq datasets. In sum, we present CSSQ as a powerful statistical computational pipeline tailored for ChIP-seq data quantitation and a timely addition to the tool kits of differential binding analysis to decipher epigenomes.
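Of the steps listed above, the Anscombe transformation has a standard closed form, 2·sqrt(x + 3/8), which maps Poisson-like counts to values with approximately constant variance. The sketch below shows that textbook definition only; how CSSQ applies it internally is not stated here, and the function names are hypothetical.

```python
import math


def anscombe(count: float) -> float:
    """Anscombe variance-stabilizing transform for a non-negative count."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def inverse_anscombe(value: float) -> float:
    """Algebraic inverse of `anscombe` (not the bias-corrected inverse)."""
    return (value / 2.0) ** 2 - 3.0 / 8.0
```

Applied to per-region read counts, for example `[anscombe(c) for c in counts]`, the transformed values are better suited to Gaussian-based modelling than the raw counts.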
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
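Allocating multi-reads as fractional counts can be illustrated with a deliberately simplified scheme. In this sketch each multi-read is split across its candidate bins in proportion to the uni-read counts already observed there, plus a pseudocount. This is an assumption made for illustration, not the weighted alignment scheme of the paper, and the function and parameter names are hypothetical.

```python
from typing import Dict, List


def allocate_multireads(uni_counts: Dict[str, float],
                        multireads: List[List[str]],
                        pseudocount: float = 1.0) -> Dict[str, float]:
    """Add each multi-read as fractional counts over its candidate bins.

    uni_counts: uni-read count per genomic bin (bin id -> count).
    multireads: for each multi-read, the bin ids it aligns to.
    pseudocount: must be > 0 so bins without uni-reads still receive weight.
    """
    totals = dict(uni_counts)
    for candidates in multireads:
        # Weights come from uni-read evidence only (single pass, no iteration).
        weights = [uni_counts.get(b, 0.0) + pseudocount for b in candidates]
        norm = sum(weights)
        for bin_id, weight in zip(candidates, weights):
            totals[bin_id] = totals.get(bin_id, 0.0) + weight / norm
    return totals
```

Each multi-read contributes a total of exactly one count, so sequencing depth increases by the number of multi-reads while the fractions favour locations that uni-reads already support.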
According to our latest research, the global ChIP-Seq market size reached USD 1.25 billion in 2024, driven by the increasing adoption of advanced genomic technologies and the growing demand for high-throughput sequencing solutions in biomedical research. The market is projected to exhibit a robust CAGR of 13.8% from 2025 to 2033, reaching an estimated USD 3.82 billion by the end of the forecast period. This impressive growth is primarily attributed to the expanding applications of ChIP-Seq in drug discovery, epigenetics, and cancer research, as well as technological advancements in sequencing platforms and data analysis tools.
The growth trajectory of the ChIP-Seq market is strongly influenced by the rising prevalence of chronic diseases and cancers globally, which has intensified the need for precise genomic profiling and biomarker discovery. Researchers and clinicians are increasingly leveraging ChIP-Seq technology to unravel complex gene regulation mechanisms and epigenetic modifications, facilitating the development of targeted therapies and personalized medicine. The integration of ChIP-Seq with other next-generation sequencing (NGS) methodologies has further enhanced its utility in large-scale genomic studies, enabling comprehensive insights into chromatin structure and transcription factor binding. Continuous investments in genomics research by both public and private entities are fostering innovation and expanding the scope of ChIP-Seq applications, thereby fueling market expansion.
Another significant growth driver for the ChIP-Seq market is the rapid evolution of sequencing technologies and bioinformatics tools. The advent of high-throughput, cost-effective sequencing platforms has made ChIP-Seq more accessible to a broader range of end-users, including academic institutions, biotechnology firms, and pharmaceutical companies. The development of sophisticated data analysis software and cloud-based platforms has addressed the challenges associated with managing and interpreting large-scale ChIP-Seq datasets, enabling more accurate and reproducible results. Furthermore, the growing trend of collaborative research initiatives and consortia focused on epigenomics and functional genomics is accelerating the adoption of ChIP-Seq, particularly in emerging economies with increasing research funding and infrastructure development.
The ChIP-Seq market is also benefiting from the expanding applications of epigenetic research in drug discovery and development. Pharmaceutical and biotechnology companies are utilizing ChIP-Seq to identify novel drug targets, elucidate mechanisms of action, and optimize therapeutic strategies for complex diseases such as cancer and neurological disorders. The technology’s ability to provide high-resolution maps of protein-DNA interactions and histone modifications is proving invaluable in the identification of epigenetic biomarkers and the development of precision medicine approaches. As regulatory agencies emphasize the importance of genomics in clinical trials and drug approval processes, the demand for ChIP-Seq solutions is expected to surge, further propelling market growth.
Regionally, the ChIP-Seq market demonstrates significant growth potential across North America, Europe, and Asia Pacific, with North America currently leading in terms of market share and technological advancement. The presence of a robust biomedical research infrastructure, substantial government funding, and a high concentration of key market players are key factors driving market growth in this region. Europe follows closely, supported by strong academic research networks and increasing investments in genomics. The Asia Pacific region is emerging as a high-growth market, fueled by expanding research initiatives, rising healthcare expenditure, and the growing adoption of advanced sequencing technologies in countries such as China, Japan, and India. Latin America and the Middle East & Africa, while currently smaller in market size, are expected to witness steady growth as research capabilities and healthcare infrastructure continue to develop.
The ChIP-Seq market by product type is segmented into kits, reagents, instruments, software, and services. Kits and reagents collectively account for a substantial portion of the market, driven by their critical role in sample preparation, library construction, and the overall workflow of ChIP-Seq experiments. The increasing demand
Discover the booming Chromatin Immunoprecipitation Sequencing (ChIP-seq) market. This comprehensive analysis reveals key drivers, trends, and restraints impacting growth from 2019-2033, with insights into leading companies, applications (hospital, diagnostic centers), and types (DNase-Seq, FAIRE-Seq). Explore market size projections, regional data, and future growth opportunities.
The experiment contains ChIP-seq data for an rpoS- version of Vibrio cholerae strain A1552, or a derivative encoding rpoS-3xFLAG. In both cases, smooth colony variants were used. Both strains were grown at 37 °C in LB medium to an OD600 of 2.0 and crosslinked with 1% (v/v) formaldehyde. After sonication to break open cells and fragment DNA, immunoprecipitations were done using anti-FLAG antibodies. Libraries were prepared using DNA remaining after immunoprecipitation.
RNA-seq is a sensitive and accurate technique to compare steady state levels of RNA between different cellular states. However, as it does not provide an account of transcriptional activity per se, other technologies are needed to more precisely determine acute transcriptional responses. Here, we have developed an easy, sensitive and accurate novel method, iRNA-seq, for genome-wide assessment of transcriptional activity based on analysis of intron coverage from total RNA-seq data. To test our method, we have performed total RNA-seq and RNA polymerase II (RNAPII) ChIP-seq profiling of the acute transcriptional response of human adipocytes to TNFα treatment and analyzed these using iRNA-seq, in addition to different publicly available datasets. Comparison of the results derived from iRNA-seq analyses with results derived using current methods for genome-wide determination of transcriptional activity, i.e. Global Run-On (GRO)-seq and RNAPII ChIP-seq, demonstrates that iRNA-seq provides very similar results in terms of the number of regulated genes and their fold change. However, unlike the current methods, which are all very labor-intensive and demanding in terms of sample material and technologies, iRNA-seq is cheap and easy and requires very little sample material. In conclusion, iRNA-seq offers an attractive novel alternative to current methods for determination of changes in transcriptional activity at a genome-wide level. Genome-wide assessment of the acute transcriptional response to TNFα in human SGBS adipocytes using total RNA-seq data and RNAPII ChIP-seq.
ChIP-seq (chromatin immunoprecipitation followed by sequencing) is commonly used to identify genome-wide protein-DNA interactions. However, ChIP-seq often gives a low yield, which is not ideal for quantitative outcomes. An alternative method to ChIP-seq is ChEC-seq (Chromatin endogenous cleavage with high-throughput sequencing). In this method, the endogenous TF (transcription factor) of interest is fused with MNase (micrococcal nuclease) that non-specifically cleaves DNA near binding sites. Compared to the original ChEC-seq method, the modified version requires far less amplification. Since MACS3 failed to identify peaks in data generated from the modified ChEC-seq method, a new peak finder has been developed specifically for it. There are three functions in the peak_finder/. callpeaks() is used to identify peaks from BAM files. goanalysis() is used to make GO (Gene Ontology) term plots from peaks. bedtomeme() is a wrapper function to perform MEME analysis in R after MEME Suite is inst...
****EXCERPTED FROM BIORXIV PREPRINT; SEE PREPRINT OR PUBLISHED PAPER FOR REFERENCES AND DETAILS****
Yeast strains
All yeast strains were derived from BY4741. A C-terminal micrococcal nuclease fusion was introduced to the protein of interest through transformation and homologous recombination of PCR-amplified DNA. Primers were designed with 50-bp of homology to the 3’ end of the coding sequence of interest. The 3xFLAG-MNase with a KanR marker was amplified from pGZ108 (Zentner et al., 2015) and transformed into BY4741 as previously described. Successful transformation was confirmed by immunoblotting and PCR, followed by sequencing.
Lyophilized DNA oligonucleotides were resuspended in molecular-grade water to a concentration of 100 µM. For ligation, the following pair of oligonucleotides were annealed to produce the Y-adapter: Tn5ME-A (5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3’) and Y-Adapt-i5 R (5’-CTGTCTCTTATACACATCTTCATAGTAATCATC-3’).
For Tn5 Tagmentation, the following i7 oligonucle...
# DoubleChEC TF binding site finder
ChIP-seq (chromatin immunoprecipitation followed by sequencing) is commonly used to identify genome-wide protein-DNA interactions. However, ChIP-seq often gives a low yield, which is not ideal for quantitative outcomes. An alternative method to ChIP-seq is ChEC-seq (Chromatin endogenous cleavage with high-throughput sequencing). In this method, an endogenous TF (transcription factor) fused to MNase (micrococcal nuclease) cleaves DNA near binding sites. This package is designed to identify high-confidence binding sites from cleavage patterns from ChEC-seq2, a variant form of ChEC-seq.
There are three functions in the peak_finder/. callpeaks() is used to identify peaks from single-end mapped reads input as BAM files. goanalysis() is used to make GO (Gene Ontology) term plots from peaks. bedtomeme() is a wrapper function to perform MEME analysis in R **after [MEME Suite](https://meme-...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Direct Alternative Splicing Regulator predictor (DASiRe) is a web application that allows non-expert users to perform different types of splicing analysis from RNA-seq experiments and also incorporates ChIP-seq data of a DNA-binding protein of interest to evaluate whether its presence is associated with the splicing changes detected in the RNA-seq dataset.
DASiRe is an accessible web-based platform that performs the analysis of raw RNA-seq and ChIP-seq data to study the relationship between DNA-binding proteins and alternative splicing regulation. It provides a fully integrated pipeline that takes raw reads from RNA-seq and performs extensive splicing analysis by incorporating the three current methodological approaches to study alternative splicing: isoform switching, exon and event-level. Once the initial splicing analysis is finished, DASiRe performs ChIP-seq peak enrichment in the spliced genes detected by each one of the three approaches.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets produced during the validation of CWL-based pipelines designed for the analysis of data from RNA-Seq, ChIP-Seq and germline variant calling experiments. Specifically, the workflows were tested using publicly available high-throughput sequencing (HTS) data from published studies on Chronic Lymphocytic Leukemia (CLL) (accession numbers: E-MTAB-6962, GSE115772) and Genome in a Bottle (GIAB) project samples (accession numbers: SRR6794144, SRR22476789, SRR22476790, SRR22476791).
The supporting data include:
Background: Germ cell cancers (GCC), which originate from primordial germ cells/gonocytes, are the most common cancer in young men and are subdivided into seminoma (SE) and non-seminoma (NS; stem cell component: embryonal carcinoma (EC)). Somatic mutations are rarely found in GCC. It has been proposed that disruption of the epigenetic constitution, either primary or secondary (e.g. environmental influences), is involved in cancer, and specifically in GCC. Results: This study aims at identifying epigenetic footprints of SE and EC cell lines in genome-wide profiles by studying the interaction between gene expression, DNA CpG methylation and histone modifications, and their function in GCC and related disruption of germ cell maturation. Two well-characterized GCC-derived cell lines were compared, one representative of SE (TCam-2) and the other of EC (NCCIT). Data were acquired using the Illumina HumanHT-12-v4 (gene expression) and HumanMethylation450 BeadChip (methylation) microarrays as well as ChIP sequencing of activating histone modifications (H3K4me3, H3K27ac). The data show that known germ cell markers are present and differentiate between SE and NS not only at the expression level but also in the epigenetic landscape. Conclusion: The overall similarity between TCam-2 and NCCIT supports an erased embryonic germ cell arrested in early gonadal development as their common origin. Subtle differences in the (integrated) epigenetic and expression profiles indicated that TCam-2 exhibits a more germ cell-like profile (enrichment of androgen regulation), while NCCIT proved more pluripotent. The results provide insight into an integrated analysis of the functional genome in GCC cell lines. Two wildtype germ cell cancer (type II germ cell tumor) cell lines were analyzed: TCam-2 (representative of the seminomatous subtype of germ cell cancer [1, 2]) and NCCIT (representative of the non-seminomatous (embryonal carcinoma) subtype of germ cell cancer [3]).
Two biological replicates of each cell line were included. Genomic positions reported are based on the GRCh37/hg19 assembly.
1. Mizuno, Y., et al., [Establishment and characterization of a new human testicular germ cell tumor cell line (TCam-2)]. Nihon Hinyokika Gakkai Zasshi, 1993. 84(7): p. 1211-8.
2. de Jong, J., et al., Further characterization of the first seminoma cell line TCam-2. Genes Chromosomes Cancer, 2008. 47(3): p. 185-96.
3. Teshima, S., et al., Four new human germ cell tumor cell lines. Lab Invest, 1988. 59(3): p. 328-36.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract:
Spatiotemporal regulation of gene expression is controlled by transcription factor (TF) binding to regulatory elements, resulting in a plethora of cell types and cell states from the same genetic information. Due to the importance of regulatory elements, various sequencing methods have been developed to localise them in genomes, for example using ChIP-seq profiling of the histone mark H3K27ac that marks active regulatory regions. Moreover, multiple tools have been developed to predict TF binding to these regulatory elements based on DNA sequence. As altered gene expression is a hallmark of disease phenotypes, identifying TFs driving such gene expression programs is critical for the identification of novel drug targets. In this study, we curated 84 chromatin profiling experiments (H3K27ac ChIP-seq) where TFs were perturbed through e.g., genetic knockout or overexpression. We ran nine published tools to prioritize TFs using these real-world data sets and evaluated the performance of the methods in identifying the perturbed TFs. This allowed the nomination of three frontrunner tools, namely RcisTarget, MEIRLOP and monaLisa. Our analyses revealed opportunities and commonalities of tools that will help to guide further improvements and developments in the field.
Dataset description:
Contact: Sebastian Steinhauser - sebastian.steinhauser@novartis.com