100+ datasets found

Table_1_Gaining Insights Into Metabolic Networks Using Chemometrics and...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Jun 8, 2023
Cite
Julien Boccard; Domitille Schvartz; Santiago Codesido; Mohamed Hanafi; Yoric Gagnebin; Belén Ponte; Fabien Jourdan; Serge Rudaz (2023). Table_1_Gaining Insights Into Metabolic Networks Using Chemometrics and Bioinformatics: Chronic Kidney Disease as a Clinical Model.XLSX [Dataset]. http://doi.org/10.3389/fmolb.2021.682559.s001
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fmolb.2021.682559.s001
Dataset updated
Jun 8, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Julien Boccard; Domitille Schvartz; Santiago Codesido; Mohamed Hanafi; Yoric Gagnebin; Belén Ponte; Fabien Jourdan; Serge Rudaz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Because of its ability to generate biological hypotheses, metabolomics offers an innovative and promising approach in many fields, including clinical research. However, collecting specimens in this setting can be difficult to standardize, especially when groups of patients with different degrees of disease severity are considered. In addition, despite major technological advances, it remains challenging to measure all the compounds defining the metabolic network of a biological system. In this context, the characterization of samples based on several analytical setups is now recognized as an efficient strategy to improve the coverage of metabolic complexity. For this purpose, chemometrics proposes efficient methods to reduce the dimensionality of these complex datasets spread over several matrices, allowing the integration of different sources or structures of metabolic information. Bioinformatics databases and query tools designed to describe and explore metabolic network models offer extremely useful solutions for the contextualization of potential biomarker subsets, enabling mechanistic hypotheses to be considered rather than simple associations. In this study, network principal component analysis was used to investigate samples collected from three cohorts of patients including multiple stages of chronic kidney disease. Metabolic profiles were measured using a combination of four analytical setups involving different separation modes in liquid chromatography coupled to high resolution mass spectrometry. Based on the chemometric model, specific patterns of metabolites, such as N-acetyl amino acids, could be associated with the different subgroups of patients. Further investigation of the metabolic signatures carried out using genome-scale network modeling confirmed both tryptophan metabolism and nucleotide interconversion as relevant pathways potentially associated with disease severity. 
Metabolic modules composed of chemically adjacent or close compounds of biological relevance were further investigated using carbon transfer reaction paths. Overall, the proposed integrative data analysis strategy allowed deeper insights into the metabolic routes associated with different groups of patients to be gained. Because of their complementary role in the knowledge discovery process, the association of chemometrics and bioinformatics in a common workflow is therefore shown as an efficient methodology to gain meaningful insights in a clinical context.
Data from: Integrative Genomic Analysis Identifies Isoleucine and CodY as...
datasetcatalog.nlm.nih.gov
Updated Sep 6, 2012
Cite
Lobel, Lior; Borovok, Ilya; Herskovits, Anat A.; Sigal, Nadejda; Ruppin, Eytan (2012). Integrative Genomic Analysis Identifies Isoleucine and CodY as Regulators of Listeria monocytogenes Virulence [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001146369
Dataset updated
Sep 6, 2012
Authors
Lobel, Lior; Borovok, Ilya; Herskovits, Anat A.; Sigal, Nadejda; Ruppin, Eytan
Description
Intracellular bacterial pathogens are metabolically adapted to grow within mammalian cells. While these adaptations are fundamental to the ability to cause disease, we know little about the relationship between the pathogen's metabolism and virulence. Here we used an integrative Metabolic Analysis Tool that combines transcriptome data with genome-scale metabolic models to define the metabolic requirements of Listeria monocytogenes during infection. Twelve metabolic pathways were identified as differentially active during L. monocytogenes growth in macrophage cells. Intracellular replication requires de novo synthesis of histidine, arginine, purine, and branch chain amino acids (BCAAs), as well as catabolism of L-rhamnose and glycerol. The importance of each metabolic pathway during infection was confirmed by generation of gene knockout mutants in the respective pathways. Next, we investigated the association of these metabolic requirements in the regulation of L. monocytogenes virulence. Here we show that limiting BCAA concentrations, primarily isoleucine, results in robust induction of the master virulence activator gene, prfA, and the PrfA-regulated genes. This response was specific and required the nutrient responsive regulator CodY, which is known to bind isoleucine. Further analysis demonstrated that CodY is involved in prfA regulation, playing a role in prfA activation under limiting conditions of BCAAs. This study evidences an additional regulatory mechanism underlying L. monocytogenes virulence, placing CodY at the crossroads of metabolism and virulence.
Data_Sheet_1_STATegra: Multi-Omics Data Integration – A Conceptual Scheme...
frontiersin.figshare.com
zip
Updated Jun 1, 2023
Cite
Nuria Planell; Vincenzo Lagani; Patricia Sebastian-Leon; Frans van der Kloet; Ewoud Ewing; Nestoras Karathanasis; Arantxa Urdangarin; Imanol Arozarena; Maja Jagodic; Ioannis Tsamardinos; Sonia Tarazona; Ana Conesa; Jesper Tegner; David Gomez-Cabrero (2023). Data_Sheet_1_STATegra: Multi-Omics Data Integration – A Conceptual Scheme With a Bioinformatics Pipeline.zip [Dataset]. http://doi.org/10.3389/fgene.2021.620453.s001
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.3389/fgene.2021.620453.s001
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Nuria Planell; Vincenzo Lagani; Patricia Sebastian-Leon; Frans van der Kloet; Ewoud Ewing; Nestoras Karathanasis; Arantxa Urdangarin; Imanol Arozarena; Maja Jagodic; Ioannis Tsamardinos; Sonia Tarazona; Ana Conesa; Jesper Tegner; David Gomez-Cabrero
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decision over which tools to use and how to combine them. Therefore, it is an unmet need to conceptualize how to integrate such data and implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, combining available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two The Cancer Genome Atlas (TCGA) case studies. For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as OpenSource in the STATegRa Bioconductor package.
An integrative genomic analysis of the Longshanks selection experiment for...
datasetcatalog.nlm.nih.gov
data.niaid.nih.gov
Updated Jun 6, 2019
Cite
Barton, Nick H.; Yancoskie, Michelle N.; Kučka, Marek; Marchini, Marta; Naumann, Ronald; Castro, João P. L.; Skuplik, Isabella; Belohlavy, Stefanie; Hiramatsu, Layla; Cobb, John; Beluch, William H.; Chan, Yingguang Frank; Rolian, Campbell (2019). An integrative genomic analysis of the Longshanks selection experiment for longer limbs in mice [Dataset]. http://doi.org/10.5061/dryad.0q2h6tk
Unique identifier
https://doi.org/10.5061/dryad.0q2h6tk
Dataset updated
Jun 6, 2019
Authors
Barton, Nick H.; Yancoskie, Michelle N.; Kučka, Marek; Marchini, Marta; Naumann, Ronald; Castro, João P. L.; Skuplik, Isabella; Belohlavy, Stefanie; Hiramatsu, Layla; Cobb, John; Beluch, William H.; Chan, Yingguang Frank; Rolian, Campbell
Description
Evolutionary studies are often limited by missing data that are critical to understanding the history of selection. Selection experiments, which reproduce rapid evolution under controlled conditions, are excellent tools to study how genomes evolve under selection. Here we present a genomic dissection of the Longshanks selection experiment, in which mice were selectively bred over 20 generations for longer tibiae relative to body mass, resulting in 13% longer tibiae in two replicates. We synthesized evolutionary theory, genome sequences and molecular genetics to understand the selection response and found that it involved both polygenic adaptation and discrete loci of major effect, with the strongest loci tending to be selected in parallel between replicates. We show that selection may favor de-repression of bone growth through inactivating two limb enhancers of an inhibitor, Nkx3-2. Our integrative genomic analyses thus show that it is possible to connect individual base-pair changes to the overall selection response.
Integrative multi-platform meta-analysis of gene expression profiles in...
datasetcatalog.nlm.nih.gov
figshare.com
Updated Apr 4, 2018
Cite
Torres, Carolina; Jimenez-Luna, Cristina; Irigoyen, Antonio; Prados, Jose; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Benavides, Manuel; Ortuño, Francisco Manuel; Caba, Octavio; Gallego, Javier (2018). Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000722752
Dataset updated
Apr 4, 2018
Authors
Torres, Carolina; Jimenez-Luna, Cristina; Irigoyen, Antonio; Prados, Jose; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Benavides, Manuel; Ortuño, Francisco Manuel; Caba, Octavio; Gallego, Javier
Description
Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due to its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses (‘gained’ genes) were discovered. Several of these gained genes have already been related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.
Research data supporting 'Integrative Multivariate Analysis of Mouse Liver...
repository.cam.ac.uk
xls
Updated Jan 3, 2025
Cite
Cornelius, Mercedes (2025). Research data supporting 'Integrative Multivariate Analysis of Mouse Liver Acini' [Dataset]. http://doi.org/10.17863/CAM.114685
Explore at:
Available download formats: xls (15199 bytes), xls (9476 bytes), xls (15153 bytes), xls (15030 bytes)
Unique identifier
https://doi.org/10.17863/CAM.114685
Dataset updated
Jan 3, 2025
Dataset provided by
Apollo
University of Cambridge
Authors
Cornelius, Mercedes
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains p-values and statistical significance data derived from analyzing various metabolic and dietary states in mice. The data supports research investigating the effects of diet and metabolic conditions on localized variables in specific regions of mice. The files included are:

PValues_and_Significance_Fasted.xlsx: P-values for variables under a fasted metabolic state.

PValues_and_Significance_CTRL.xlsx: P-values for variables under a control dietary state.

PValues_and_Significance_Western.xlsx: P-values for variables under a western dietary state.

PValues_and_Significance_Interdietary.xlsx: P-values comparing variables between different dietary states.

Data Collection Methods

The data was collected by analyzing correlations between variables within localized regions of the mice. These variables were consistent within individuals but showed variation dependent on dietary or metabolic states. Data collection involved the following steps:

1. Selection of experimental groups based on dietary and metabolic conditions.

2. Quantitative measurement of specific variables in localized regions of mice.

3. Statistical analysis to determine the significance of correlations across the groups.

Data Generation and Processing

1. Generation: Measurements were obtained through laboratory analysis using standardized protocols for each dietary/metabolic condition.

2. Processing:
- Statistical tests were performed to identify significant correlations (e.g., t-tests, ANOVA).
- P-values were computed to quantify the significance of the relationships observed.
- Data was compiled into Excel sheets for organization and clarity.

Technical and Non-Technical Information

- Technical Details: Each file contains tabular data with headers indicating the variable pairs analyzed, their respective p-values, and the significance level (e.g., p<0.05, p<0.01).
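
The significance levels mentioned in the description (e.g., p<0.05, p<0.01) could be assigned to p-values roughly as follows. This is only a minimal sketch: the function name, the "n.s." label, and the example rows are hypothetical and are not taken from the dataset files, whose exact column layout is not shown here.

```python
def significance_label(p_value: float) -> str:
    """Map a p-value to a coarse significance label.

    The two thresholds come from the description (p<0.05, p<0.01);
    the label strings themselves are an assumption for illustration.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value out of range: {p_value}")
    if p_value < 0.01:
        return "p<0.01"
    if p_value < 0.05:
        return "p<0.05"
    return "n.s."  # not significant at the 0.05 level


# Hypothetical variable pairs and p-values, for illustration only.
example_rows = [
    ("var_a vs var_b", 0.003),
    ("var_a vs var_c", 0.03),
    ("var_b vs var_c", 0.4),
]
labelled = [(pair, p, significance_label(p)) for pair, p in example_rows]
```

Checking the stricter threshold first ensures that a value such as 0.003 is reported at the 0.01 level rather than the looser 0.05 level.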
Data_Sheet_1_QuNex—An integrative platform for reproducible neuroimaging...
datasetcatalog.nlm.nih.gov
Updated Apr 5, 2023
Cite
Zerbi, Valerio; Harms, Michael P.; Glasser, Matthew F.; Helmer, Markus; Demšar, Jure; Matkovič, Andraž; Winkler, Anderson; Pan, Lining; Anticevic, Alan; Fonteneau, Clara; Repovš, Grega; Tamayo, Zailyn; Warrington, Shaun; Purg, Nina; Sotiropoulos, Stamatios N.; Murray, John D.; Ji, Jie Lisa; Kraljič, Aleksij; Coalson, Timothy S. (2023). Data_Sheet_1_QuNex—An integrative platform for reproducible neuroimaging analytics.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001082900
Dataset updated
Apr 5, 2023
Authors
Zerbi, Valerio; Harms, Michael P.; Glasser, Matthew F.; Helmer, Markus; Demšar, Jure; Matkovič, Andraž; Winkler, Anderson; Pan, Lining; Anticevic, Alan; Fonteneau, Clara; Repovš, Grega; Tamayo, Zailyn; Warrington, Shaun; Purg, Nina; Sotiropoulos, Stamatios N.; Murray, John D.; Ji, Jie Lisa; Kraljič, Aleksij; Coalson, Timothy S.
Description
Introduction: Neuroimaging technology has experienced explosive growth and transformed the study of neural mechanisms across health and disease. However, given the diversity of sophisticated tools for handling neuroimaging data, the field faces challenges in method integration, particularly across multiple modalities and species. Specifically, researchers often have to rely on siloed approaches which limit reproducibility, with idiosyncratic data organization and limited software interoperability. Methods: To address these challenges, we have developed Quantitative Neuroimaging Environment & Toolbox (QuNex), a platform for consistent end-to-end processing and analytics. QuNex provides several novel functionalities for neuroimaging analyses, including a “turnkey” command for the reproducible deployment of custom workflows, from onboarding raw data to generating analytic features. Results: The platform enables interoperable integration of multi-modal, community-developed neuroimaging software through an extension framework with a software development kit (SDK) for seamless integration of community tools. Critically, it supports high-throughput, parallel processing in high-performance compute environments, either locally or in the cloud. Notably, QuNex has successfully processed over 10,000 scans across neuroimaging consortia, including multiple clinical datasets. Moreover, QuNex enables integration of human and non-human workflows via a cohesive translational platform. Discussion: Collectively, this effort stands to significantly impact neuroimaging method integration across acquisition approaches, pipelines, datasets, computational environments, and species. Building on this platform will enable more rapid, scalable, and reproducible impact of neuroimaging technology across health and disease.
Data from: Deep Integrated Network Analysis – a data-driven tool to discover...
zenodo.org
data.niaid.nih.gov
Updated Mar 17, 2025
Cite
Jaclyn Quin; Miren Urrutia Iturritza; Katherine D. Mosquera; Franziska Hildebrandt; Fredrik Barrenäs; Carl-Johan Ankarklev (2025). Deep Integrated Network Analysis – a data-driven tool to discover and characterize disease pathways in the liver [Dataset]. http://doi.org/10.5281/zenodo.15040422
Unique identifier
https://doi.org/10.5281/zenodo.15040422
Dataset updated
Mar 17, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jaclyn Quin; Miren Urrutia Iturritza; Katherine D. Mosquera; Franziska Hildebrandt; Fredrik Barrenäs; Carl-Johan Ankarklev
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
General description:
Supplementary information belonging to the study "Deep Integrated Network Analysis – a tool to discover and characterize disease pathways in the liver".

Files:

1) Supplementary Figure 1 _ TLN .pdf

Contains the Tree-and-Leaf (TLN) network on which the leaves have been classified according to Gene Ontology Biological Processes.

2) Supplementary Table 1 _ Datasets.xlsx

Contains the list of datasets included in Liver DINA Resource.

For each dataset the GEO series, title, taxonomy, and liver sample count are shown, as well as the classification of dataset condition.

3) Supplementary Table 2 _ Top1000 subset _gene interaction networks.xlsx

Contains the results from the analysis of the 1,000 gene-gene interactions with the highest statistical weight in the Liver DINA Resource.

4) Supplementary Table 3_ TLN modules.xlsx

Contains the classification of the leaves in the Liver DINA Resource Tree-and-Leaf Network (TLN).
Integrative Bioinformatics Analysis of Genomic and Proteomic Approaches to...
plos.figshare.com
datasetcatalog.nlm.nih.gov
doc
Updated Jun 3, 2023
Cite
Rajani Kanth Vangala; Vandana Ravindran; Madan Ghatge; Jayashree Shanker; Prathima Arvind; Hima Bindu; Meghala Shekar; Veena S. Rao (2023). Integrative Bioinformatics Analysis of Genomic and Proteomic Approaches to Understand the Transcriptional Regulatory Program in Coronary Artery Disease Pathways [Dataset]. http://doi.org/10.1371/journal.pone.0057193
Explore at:
Available download formats: doc
Unique identifier
https://doi.org/10.1371/journal.pone.0057193
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Rajani Kanth Vangala; Vandana Ravindran; Madan Ghatge; Jayashree Shanker; Prathima Arvind; Hima Bindu; Meghala Shekar; Veena S. Rao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. A few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways has not yet been addressed in its entirety. Our present work focuses on understanding the regulatory mechanisms at the transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology, we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease, assayed 24 biomarkers along with gene expression studies, and built network modules which are highly regulated by 5 core regulators: PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise biomarkers from different pathways, showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies for developing multimarker module based risk predictions.
Supplementary Material for: Inferring Gene-Disease Association by an...
datasetcatalog.nlm.nih.gov
Updated Jan 22, 2019
Cite
Deng, M.; Li, H.; Zheng, J.; Wang, J.; Wang, Z. (2019). Supplementary Material for: Inferring Gene-Disease Association by an Integrative Analysis of eQTL Genome-Wide Association Study and Protein-Protein Interaction Data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000108726
Dataset updated
Jan 22, 2019
Authors
Deng, M.; Li, H.; Zheng, J.; Wang, J.; Wang, Z.
Description
Objectives: Genome-wide association studies (GWASs) have revealed many candidate SNPs, but the mechanisms by which these SNPs influence diseases are largely unknown. In order to decipher the underlying mechanisms, several methods have been developed to predict disease-associated genes based on the integration of GWAS and eQTL data (e.g., Sherlock and COLOC). A number of studies have also incorporated information from gene networks into GWAS analysis to reprioritize candidate genes. Methods: Motivated by these two different approaches, we have developed a statistical framework to integrate information from GWAS, eQTL, and protein-protein interaction (PPI) data to predict disease-associated genes. Our approach is based on a hidden Markov random field (HMRF) model, and we called the resulting computational algorithm GeP-HMRF (a GWAS-eQTL-PPI-based HMRF). Results: We compared the performance of GeP-HMRF with Sherlock, COLOC, and NetWAS methods on 9 GWAS datasets, using the disease-related genes in the MalaCards database as the standard, and found that GeP-HMRF significantly improves the prediction accuracy. We also applied GeP-HMRF to an age-related macular degeneration disease (AMD) dataset. Among the top 50 genes predicted by GeP-HMRF, 7 are reported by the MalaCards database to be AMD-related with an enrichment p value of 3.61 × 10^-119. Among the top 20 genes predicted by GeP-HMRF, CFHR1, CGHR3, HTRA1, and CFH are AMD-related in the MalaCards database, and another 9 genes are supported by the literature. Conclusions: We built a unified statistical model to predict disease-related genes by integrating GWAS, eQTL, and PPI data. Our approach outperforms Sherlock, COLOC, and NetWAS in simulation studies and 9 GWAS datasets. Our approach can be generalized to incorporate other molecular trait data beyond eQTL and other interaction data beyond PPI.
LegumeIP
neuinfo.org
Cite
LegumeIP [Dataset]. http://identifiers.org/RRID:SCR_008906
Unique identifier
https://identifiers.org/RRID:SCR_008906
Description
LegumeIP is an integrative database and bioinformatics platform for comparative genomics and transcriptomics to facilitate the study of gene function and genome evolution in legumes, and ultimately to generate molecular based breeding tools to improve the quality of crop legumes. LegumeIP currently hosts large-scale genomics and transcriptomics data, including:
* Genomic sequences of three model legumes, i.e. Medicago truncatula, Glycine max (soybean) and Lotus japonicus, along with two reference plant species, Arabidopsis thaliana and Populus trichocarpa, with the annotation based on UniProt TrEMBL, InterProScan, Gene Ontology and KEGG databases. LegumeIP covers a total of 222,217 protein-coding gene sequences.
* Large-scale gene expression data compiled from 104 array hybridizations from L. japonicus, 156 array hybridizations from the M. truncatula gene atlas database, and 14 RNA-Seq-based gene expression profiles from G. max on different tissues including four common tissues: Nodule, Flower, Root and Leaf.
* Systematic synteny analysis among M. truncatula, G. max, L. japonicus and A. thaliana.
* Reconstruction of gene family and gene family-wide phylogenetic analysis across the five hosted species.
LegumeIP features comprehensive search and visualization tools to enable flexible queries on gene annotation, gene family, synteny, and relative abundance of gene expression.
Data from: Comparative Analysis of Different Label-Free Mass Spectrometry...
datasetcatalog.nlm.nih.gov
acs.figshare.com
Updated Feb 21, 2016
Cite
Nesvizhskii, Alexey I.; Ning, Kang; Fermin, Damian (2016). Comparative Analysis of Different Label-Free Mass Spectrometry Based Protein Abundance Estimates and Their Correlation with RNA-Seq Gene Expression Data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001150131
Dataset updated
Feb 21, 2016
Authors
Nesvizhskii, Alexey I.; Ning, Kang; Fermin, Damian
Description
An increasing number of studies involve integrative analysis of gene and protein expression data taking advantage of new technologies such as next-generation transcriptome sequencing (RNA-Seq) and highly sensitive mass spectrometry (MS) instrumentation. Thus, it becomes interesting to revisit the correlative analysis of gene and protein expression data using more recently generated data sets. Furthermore, within the proteomics community there is a substantial interest in comparing the performance of different label-free quantitative proteomic strategies. Gene expression data can be used as an indirect benchmark for such protein-level comparisons. In this work we use publicly available mouse data to perform a joint analysis of genomic and proteomic data obtained on the same organism. First, we perform a comparative analysis of different label-free protein quantification methods (intensity based and spectral count based and using various associated data normalization steps) using several software tools on the proteomic side. Similarly, we perform correlative analysis of gene expression data derived using microarray and RNA-Seq methods on the genomic side. We also investigate the correlation between gene and protein expression data, and various factors affecting the accuracy of quantitation at both levels. It is observed that spectral count based protein abundance metrics, which are easy to extract from any published data, are comparable to intensity based measures with respect to correlation with gene expression data. The results of this work should be useful for designing robust computational pipelines for extraction and joint analysis of gene and protein expression data in the context of integrative studies.
Benchmark Multi-Omics Datasets for Methods Comparison
zenodo.org
resodate.org
bin, zip
Updated Nov 14, 2021
Cite
Gabriel Odom; Lily Wang (2021). Benchmark Multi-Omics Datasets for Methods Comparison [Dataset]. http://doi.org/10.5281/zenodo.5683002
Explore at:
Available download formats: bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.5683002
Dataset updated
Nov 14, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gabriel Odom; Lily Wang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pathway Multi-Omics Simulated Data

These are synthetic variations of the TCGA COADREAD data set (original data available at http://linkedomics.org/data_download/TCGA-COADREAD/). This data set is used as a comprehensive benchmark data set to compare multi-omics tools in the manuscript "pathwayMultiomics: An R package for efficient integrative analysis of multi-omics datasets with matched or un-matched samples".

There are 100 sets (stored as 100 sub-folders, the first 50 in "pt1" and the second 50 in "pt2") of random modifications to centred and scaled copy number, gene expression, and proteomics data saved as compressed data files for the R programming language. These data sets are stored in subfolders labelled "sim001", "sim002", ..., "sim100". Each folder contains the following contents:

1) "indicatorMatricesXXX_ls.RDS" is a list of simple triplet matrices showing which genes (in which pathways) and which samples received the synthetic treatment (where XXX is the simulation run label: 001, 002, ...).

2) "CNV_partitionA_deltaB.RDS" is the synthetically modified copy number variation data (where A represents the proportion of genes in each gene set to receive the synthetic treatment [partition 1 is 20%, 2 is 40%, 3 is 60% and 4 is 80%] and B is the signal strength in units of standard deviations).

3) "RNAseq_partitionA_deltaB.RDS" is the synthetically modified gene expression data (same parameter legend as CNV).

4) "Prot_partitionA_deltaB.RDS" is the synthetically modified protein expression data (same parameter legend as CNV).
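
The naming scheme described above can be sketched as a small path builder. This is an illustration under stated assumptions, not part of the dataset: it assumes the "simXXX" folders sit directly inside "pt1" and "pt2", and it takes the delta value as a string because the exact formatting of B in the file names is not given in the description. The helper names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List

# Partition index -> proportion of genes in each gene set that received
# the synthetic treatment (values taken from the description).
PARTITION_FRACTION = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8}


def sim_label(run: int) -> str:
    """Zero-padded simulation run label, e.g. 7 -> '007'."""
    if not 1 <= run <= 100:
        raise ValueError("run must be between 1 and 100")
    return f"{run:03d}"


def sim_dir(run: int) -> PurePosixPath:
    """Folder for one run; assumes simXXX lives directly under pt1 or pt2."""
    part = "pt1" if run <= 50 else "pt2"
    return PurePosixPath(part) / f"sim{sim_label(run)}"


def sim_files(run: int, partition: int, delta: str) -> List[PurePosixPath]:
    """Expected file paths for one run, partition, and signal strength.

    `delta` is passed as a string because its formatting in the real
    file names is an assumption here.
    """
    if partition not in PARTITION_FRACTION:
        raise ValueError("partition must be 1, 2, 3, or 4")
    base = sim_dir(run)
    names = [f"indicatorMatrices{sim_label(run)}_ls.RDS"]
    names += [
        f"{omic}_partition{partition}_delta{delta}.RDS"
        for omic in ("CNV", "RNAseq", "Prot")
    ]
    return [base / name for name in names]
```

The resulting paths are plain strings of folder and file names; the .RDS files themselves are R serialized objects and would be loaded in R (for example with readRDS) rather than in Python.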

Supplemental Files

The file "cluster_pathway_collection_20201117.gmt" is the collection of gene sets used for the simulation study, in Gene Matrix Transpose format. Scripts to create and analyze these data sets are available at: https://github.com/TransBioInfoLab/pathwayMultiomics_manuscript_supplement
Cross Methylome Omnibus (CMO) models
zenodo.org
nde-dev.biothings.io
application/gzip
Updated Feb 5, 2021
Cite
Chong Wu (2021). Cross Methylome Omnibus (CMO) models [Dataset]. http://doi.org/10.5281/zenodo.4475935
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.4475935
Dataset updated
Feb 5, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Chong Wu
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
CMO is a gene-level association test that can identify many significant and novel genes ignored by many benchmark methods. Specifically, CMO integrates genetically regulated DNAm in enhancers, promoters, and the gene body to identify additional disease-associated genes. This repo contains the models needed to run the CMO test.

The corresponding software: https://github.com/ChongWuLab/CMO

Thank you for using this software! Let me (cwu3@fsu.edu) know if you have any questions!
Data from: CEN-tools: An integrative platform to identify the contexts of...
ebi.ac.uk
Updated Aug 14, 2020
Cite
Sumana Sharma; Cansu Dincer; Paula Weidemüller; Gavin Wright; Evangelia Petsalaki (2020). CEN-tools: An integrative platform to identify the contexts of essential genes. [Dataset]. https://www.ebi.ac.uk/biostudies/studies/S-BSST479
Explore at:
Dataset updated
Aug 14, 2020
Authors
Sumana Sharma; Cansu Dincer; Paula Weidemüller; Gavin Wright; Evangelia Petsalaki
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
All generated networks and analysis for the paper with the same title.
Data from: Metabolomics signatures in type 2 diabetes: a systematic review...
datadryad.org
search.dataone.org
zip
Updated Mar 31, 2020
Cite
Yue Sun; Hao-Yu Gao; Zhi-Yuan Fan; Yan He; Yu-Xiang Yan (2020). Metabolomics signatures in type 2 diabetes: a systematic review and integrative analysis [Dataset]. http://doi.org/10.5061/dryad.2fqz612k4
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.2fqz612k4
Dataset updated
Mar 31, 2020
Dataset provided by
Dryad
Authors
Yue Sun; Hao-Yu Gao; Zhi-Yuan Fan; Yan He; Yu-Xiang Yan
Time period covered
Oct 30, 2019
Description
Objective

Metabolic signatures have emerged as valuable signaling molecules in the biochemical process of type 2 diabetes (T2D). To summarize and identify metabolic biomarkers in T2D, we performed a systematic review and meta-analysis of the associations between metabolites and T2D using high-throughput metabolomics techniques.

Methods

We searched for relevant studies in MEDLINE (PubMed), Embase, Web of Science, and the Cochrane Library, as well as Chinese databases (Wanfang, Vip, and CNKI), from inception through 31 December 2018. Meta-analysis was conducted using STATA 14.0 under a random-effects model. In addition, bioinformatic analysis was performed to explore molecular mechanisms using MetaboAnalyst and R 3.5.2.

Results

In total, 46 articles were included in this review, covering metabolites including amino acids, acylcarnitines, lipids, carbohydrates, organic acids, and others. Results of the meta-analysis of prospective studies indicated that isoleucine, leucine, valine, tyrosine, phenylalanine, glutamate, alanine, v...
Genome wide analysis of human iPS cell lines generated with non-integrative...
ebi.ac.uk
Updated Dec 30, 2013
Cite
Christian Clausen; Mikkel Rasmussen; Bjoern Holst; Joergen Nielsen; Lis Hasholt; Poul Hyttel (2013). Genome wide analysis of human iPS cell lines generated with non-integrative plasmids and knock-down of P53 gene [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-GEOD-48665
Explore at:
Dataset updated
Dec 30, 2013
Authors
Christian Clausen; Mikkel Rasmussen; Bjoern Holst; Joergen Nielsen; Lis Hasholt; Poul Hyttel
Description
The primary aim of this study is to evaluate the effect of transient knockdown of P53 as a tool to increase the efficiency of a non-integrative methodology for reprogramming adult human normal dermal fibroblasts. This study demonstrates that transient knockdown of P53 is an efficient way to produce iPSC containing minimal genomic alterations, which meets the increased demand for iPSC in personalized drug screening campaigns. Total RNA was isolated from 3 iPS cell lines generated without P53 knockdown and 3 generated with P53 knockdown. In addition, total RNA was isolated from the parental normal human dermal fibroblasts and from a reference human iPS cell line from Systembio (SBI).
Additional file 2 of Integrative analyses of single-cell transcriptome and...
datasetcatalog.nlm.nih.gov
springernature.figshare.com
Updated Aug 8, 2020
Cite
Wang, Chenfei; Qin, Qian; Long, Henry; Brown, Myles; Huang, Xin; Liu, Tao; Meyer, Clifford A.; Han, Ya; Sun, Dongqing; Xie, Yingtian; Li, Ziyi; Wan, Changxin; Fan, Jingyu; Liu, X. Shirley; Tang, Ming; Qiu, Xintao (2020). Additional file 2 of Integrative analyses of single-cell transcriptome and regulome using MAESTRO [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000528952
Explore at:
Dataset updated
Aug 8, 2020
Authors
Wang, Chenfei; Qin, Qian; Long, Henry; Brown, Myles; Huang, Xin; Liu, Tao; Meyer, Clifford A.; Han, Ya; Sun, Dongqing; Xie, Yingtian; Li, Ziyi; Wan, Changxin; Fan, Jingyu; Liu, X. Shirley; Tang, Ming; Qiu, Xintao
Description
Additional file 2: Table S2. Running time and memory comparison between MAESTRO and other tools for scATAC-seq analysis.
Data from: Integrative analysis of hepatic transcriptional profiles reveals...
agdatacommons.nal.usda.gov
bin
Updated Mar 12, 2025
Cite
USDA ARS-WHNRC (2025). Integrative analysis of hepatic transcriptional profiles reveals genetic regulation of atherosclerosis in hyperlipidemic Diversity Outbred-F1 mice [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Integrative_analysis_of_hepatic_transcriptional_profiles_reveals_genetic_regulation_of_atherosclerosis_in_hyperlipidemic_Diversity_Outbred-F1_mice/25088717
Explore at:
Available download formats: bin
Dataset updated
Mar 12, 2025
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Authors
USDA ARS-WHNRC
License
https://rightsstatements.org/vocab/UND/1.0/
Description
Purpose: To investigate the sex-dependence of the liver transcriptome in Diversity Outbred (DO)-F1 mice.

Methods: Total RNA was extracted from snap-frozen liver using the miRVana total RNA isolation kit (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s protocol. The quality and amount of liver RNA were evaluated using a Bioanalyzer (Agilent, Inc., Santa Clara, CA). The average RNA-integrity score for 162 DO-F1 liver samples was 9.01 ± 0.4. RNA samples from 85 females and 77 males were submitted to the UC Davis DNA Technologies Core at the Genome Center. The RNA-seq libraries were constructed from 1 µg total RNA after poly-A library preparation. To minimize technical variability, all samples were assigned to each lane and the pooled libraries were sequenced on two lanes of an Illumina NovaSeq 6000 (Illumina Inc., San Diego, CA, USA) to obtain at least 25 million 150 bp paired-end reads. Only R1 was used in the analysis and only R1 was submitted.

Results: Our results demonstrate the tremendous effects of sex on hepatic gene expression. In support of this, genetic loci associated with the transcripts frequently showed sex specificity. We revealed sex-specific candidate genes that were mapped to the quantitative trait loci for aortic lesion area and whose expression was locally regulated via the global liver transcriptome.

Conclusions: Our study provides a valuable data resource to the research community and shows that liver transcriptomic analysis identified diet- or strain-specific pathways to pathogenesis of metabolic syndrome.

Overall design: Liver mRNA profiles of 24-week-old Diversity Outbred-F1 mice.
Supplementary Material for: Integrative Bioinformatics Analysis Provides...
datasetcatalog.nlm.nih.gov
karger.figshare.com
Updated Apr 11, 2018
Cite
S. Qiu; L.-L. Lv; L.-T. Zhou; K.-L. Ma; B.-C. Liu; H. Liu; Z.-L. Li; R.-N. Tang (2018). Supplementary Material for: Integrative Bioinformatics Analysis Provides Insight into the Molecular Mechanisms of Chronic Kidney Disease [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000631776
Explore at:
Dataset updated
Apr 11, 2018
Authors
S. Qiu; L.-L. Lv; L.-T. Zhou; K.-L. Ma; B.-C. Liu; H. Liu; Z.-L. Li; R.-N. Tang
Description
Background/Aims: Chronic kidney disease (CKD) is a worldwide public health problem. Regardless of the underlying primary disease, CKD tends to progress to end-stage kidney disease, resulting in unsatisfactory and costly treatment. Its common pathogenesis, however, remains unclear. The aim of this study was to provide an unbiased catalog of common gene-expression changes of CKD and reveal the underlying molecular mechanism using an integrative bioinformatics approach.

Methods: We systematically collected over 250 Affymetrix microarray datasets from the glomerular and tubulointerstitial compartments of healthy renal tissues and those with various types of established CKD (diabetic kidney disease, hypertensive nephropathy, and glomerular nephropathy). Then, using stringent bioinformatics analysis, shared differentially expressed genes (DEGs) of CKD were obtained. These shared DEGs were further analyzed by the gene ontology (GO) and pathway enrichment analysis. Finally, the protein-protein interaction networks (PINs) were constructed to further refine our results.

Results: Our analysis identified 176 and 50 shared DEGs in diseased glomeruli and tubules, respectively, including many transcripts that have not been previously reported to be involved in kidney disease. Enrichment analysis also showed that the glomerular and tubulointerstitial compartments underwent a wide range of unique pathological changes during chronic injury. As revealed by the GO enrichment analysis, shared DEGs in glomeruli were significantly enriched in exosomes. By constructing PINs, we identified several hub genes (e.g. OAS1, JUN, and FOS) and clusters that might play key roles in regulating the development of CKD.

Conclusion: Our study not only further reveals the unifying molecular mechanism of CKD pathogenesis but also provides a valuable resource of potential biomarkers and therapeutic targets.

Cite

Julien Boccard; Domitille Schvartz; Santiago Codesido; Mohamed Hanafi; Yoric Gagnebin; Belén Ponte; Fabien Jourdan; Serge Rudaz (2023). Table_1_Gaining Insights Into Metabolic Networks Using Chemometrics and Bioinformatics: Chronic Kidney Disease as a Clinical Model.XLSX [Dataset]. http://doi.org/10.3389/fmolb.2021.682559.s001

Explore at:

Available download formats: xlsx

Unique identifier

https://doi.org/10.3389/fmolb.2021.682559.s001

Dataset updated

Jun 8, 2023

Dataset provided by

Frontiers Media (http://www.frontiersin.org/)

Authors

Julien Boccard; Domitille Schvartz; Santiago Codesido; Mohamed Hanafi; Yoric Gagnebin; Belén Ponte; Fabien Jourdan; Serge Rudaz

License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

Because of its ability to generate biological hypotheses, metabolomics offers an innovative and promising approach in many fields, including clinical research. However, collecting specimens in this setting can be difficult to standardize, especially when groups of patients with different degrees of disease severity are considered. In addition, despite major technological advances, it remains challenging to measure all the compounds defining the metabolic network of a biological system. In this context, the characterization of samples based on several analytical setups is now recognized as an efficient strategy to improve the coverage of metabolic complexity. For this purpose, chemometrics proposes efficient methods to reduce the dimensionality of these complex datasets spread over several matrices, allowing the integration of different sources or structures of metabolic information. Bioinformatics databases and query tools designed to describe and explore metabolic network models offer extremely useful solutions for the contextualization of potential biomarker subsets, enabling mechanistic hypotheses to be considered rather than simple associations. In this study, network principal component analysis was used to investigate samples collected from three cohorts of patients including multiple stages of chronic kidney disease. Metabolic profiles were measured using a combination of four analytical setups involving different separation modes in liquid chromatography coupled to high resolution mass spectrometry. Based on the chemometric model, specific patterns of metabolites, such as N-acetyl amino acids, could be associated with the different subgroups of patients. Further investigation of the metabolic signatures carried out using genome-scale network modeling confirmed both tryptophan metabolism and nucleotide interconversion as relevant pathways potentially associated with disease severity. 
Metabolic modules composed of chemically adjacent or close compounds of biological relevance were further investigated using carbon transfer reaction paths. Overall, the proposed integrative data analysis strategy allowed deeper insights into the metabolic routes associated with different groups of patients to be gained. Because of their complementary role in the knowledge discovery process, the association of chemometrics and bioinformatics in a common workflow is therefore shown as an efficient methodology to gain meaningful insights in a clinical context.
