100+ datasets found

Additional file 2: Table S2. of ODG: Omics database generator - a tool for...
springernature.figshare.com
xlsx
Updated Jun 4, 2023
Cite
Joseph Guhlin; Kevin Silverstein; Peng Zhou; Peter Tiffin; Nevin Young (2023). Additional file 2: Table S2. of ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding [Dataset]. http://doi.org/10.6084/m9.figshare.c.3850801_D2.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3850801_D2.v1
Dataset updated
Jun 4, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Joseph Guhlin; Kevin Silverstein; Peng Zhou; Peter Tiffin; Nevin Young
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PFam Domains and biological process GO categories for the four rhizobia strains. Predicted proteins related to multiple GO biological process categories are joined together with the pipe character. (XLSX 639 kb)
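A minimal Python sketch of how a pipe-joined cell could be split back into separate GO categories. It assumes that the categories for one predicted protein share a single cell and are separated by `|`; the function name and the sample string are hypothetical and are not taken from the spreadsheet itself.

```python
def split_go_categories(cell: str) -> list[str]:
    """Split a pipe-joined cell into individual category labels.

    Assumption: categories are separated by '|' and may carry
    surrounding whitespace; empty fragments are dropped.
    """
    return [part.strip() for part in cell.split("|") if part.strip()]


# Hypothetical cell content, for illustration only.
example_cell = "nitrogen fixation | transport|regulation of transcription"
print(split_go_categories(example_cell))
```

Splitting on the literal character rather than a regular expression keeps the behaviour predictable if a category label itself contains punctuation.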
Dataset 2 - Protein Libraries Of Seven Databases From Cnidaria Omics Data...
data.mendeley.com
Updated Dec 6, 2024
+ more versions
Cite
Alexandre Barroso (2024). Dataset 2 - Protein Libraries Of Seven Databases From Cnidaria Omics Data After Duplicates Removal and AMPir Prediction [Dataset]. http://doi.org/10.17632/myp4j56gpz.1
Unique identifier
https://doi.org/10.17632/myp4j56gpz.1
Dataset updated
Dec 6, 2024
Authors
Alexandre Barroso
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Non-duplicated AMP precursor protein libraries from 7 databases of Cnidaria:
Db1 – 6 proteomes derived from sequenced genomes of Anthozoa
Db2 – 2 proteomes derived from sequenced genomes of Medusozoa
Db3 – 46 whole-body/non-specific transcriptomes of Anthozoa
Db4 – 24 whole-body/non-specific transcriptomes of Medusozoa
Db5 – 25 transcriptomes specific to the tentacles of Anthozoa
Db6 – 7 transcriptomes specific to the tentacles of Medusozoa
Db7 – 2 transcriptomes specific to the nematocysts of Anthozoa
Data Sheet 2_Visual analysis of multi-omics data.csv
frontiersin.figshare.com
csv
Updated Sep 10, 2024
+ more versions
Cite
Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp (2024). Data Sheet 2_Visual analysis of multi-omics data.csv [Dataset]. http://doi.org/10.3389/fbinf.2024.1395981.s002
Available download formats: csv
Unique identifier
https://doi.org/10.3389/fbinf.2024.1395981.s002
Dataset updated
Sep 10, 2024
Dataset provided by
Frontiers
Authors
Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool’s interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different “visual channel” of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.
Data from: MangroveDB: A comprehensive online database for mangroves based...
zenodo.org
zip
Updated Oct 9, 2024
Cite
Chaoqun Xu (2024). MangroveDB: A comprehensive online database for mangroves based on multi-omics data [Dataset]. http://doi.org/10.5281/zenodo.13907062
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13907062
Dataset updated
Oct 9, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Chaoqun Xu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Oct 8, 2024
Description
Mangroves are the dominant flora of intertidal zones along tropical and subtropical coastlines around the world and offer important ecological and economic value. Recently, the genomes of mangroves have been decoded, and massive omics data have been generated and deposited in public databases. Reanalysis of multi-omics data can provide new biological insights that were missed in the original studies. However, the computational resources required, and the lack of bioinformatics skills among experimental researchers, limit the effective use of the original data. To fill this gap, we uniformly processed 942 transcriptome datasets and 386 whole-genome sequencing datasets, and provided 13 reference genomes and 40 reference transcriptomes for 53 mangroves. Finally, we built an interactive web-based database platform, MangroveDB (https://github.com/Jasonxu0109/MangroveDB), which was designed to provide comprehensive gene expression datasets and to facilitate their exploration, and which is equipped with several online analysis tools, including principal components analysis, differential gene expression analysis, tissue-specific gene expression analysis, and GO and KEGG enrichment analysis. MangroveDB not only provides query functions for gene annotation, but also supports useful visualization functions for analysis results, such as volcano plots, heatmaps, dot plots, PCA plots, bubble plots, and population structure plots. In conclusion, MangroveDB is a valuable resource that helps the mangrove research community make efficient use of the massive public omics datasets.
NCI-60 Cancer Cell Lines
bigomics.ch
Updated Nov 8, 2024
Cite
National Cancer Institute (NCI) (2024). NCI-60 Cancer Cell Lines [Dataset]. https://bigomics.ch/blog/top-databases-for-drug-discovery/
Dataset updated
Nov 8, 2024
Dataset authored and provided by
National Cancer Institute (NCI)
Description
A panel of 60 human cancer cell lines used for screening anticancer drugs.
DataSheet_1_AppleMDO: A Multi-Dimensional Omics Database for Apple...
datasetcatalog.nlm.nih.gov
Updated Oct 22, 2019
Cite
Liu, Yue; Tian, Tian; Xu, Wenying; Ma, Xuelian; Yang, Jiaotong; Su, Zhen; Da, Lingling; She, Jiajie (2019). DataSheet_1_AppleMDO: A Multi-Dimensional Omics Database for Apple Co-Expression Networks and Chromatin States.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000136732
Dataset updated
Oct 22, 2019
Authors
Liu, Yue; Tian, Tian; Xu, Wenying; Ma, Xuelian; Yang, Jiaotong; Su, Zhen; Da, Lingling; She, Jiajie
Description
As an economically important crop, apple is one of the most cultivated fruit trees in temperate regions worldwide. Recently, a large number of high-quality transcriptomic and epigenomic datasets for apple were made available to the public, which could be helpful in inferring gene regulatory relationships and thus predicting gene function at the genome level. Through integration of the available apple genomic, transcriptomic, and epigenomic datasets, we constructed co-expression networks, identified functional modules, and predicted chromatin states. A total of 112 RNA-seq datasets were integrated to construct a global network and a conditional network (tissue-preferential network). Furthermore, a total of 1,076 functional modules with closely related gene sets were identified to assess the modularity of biological networks and further subjected to functional enrichment analysis. The results showed that the function of many modules was related to development, secondary metabolism, hormone response, and transcriptional regulation. Transcriptional regulation is closely related to epigenetic marks on chromatin. A total of 20 epigenomic datasets, which included ChIP-seq, DNase-seq, and DNA methylation analysis datasets, were integrated and used to classify chromatin states. Based on the ChromHMM algorithm, the genome was divided into 620,122 fragments, which were classified into 24 states according to the combination of epigenetic marks and enriched-feature regions. Finally, through the collaborative analysis of different omics datasets, the online database AppleMDO (http://bioinformatics.cau.edu.cn/AppleMDO/) was established for cross-referencing and the exploration of possible novel functions of apple genes. In addition, gene annotation information and functional support toolkits were also provided. 
Our database might be convenient for researchers to develop insights into the function of genes related to important agronomic traits and might serve as a reference for other fruit trees.
Multi-omics for Understanding Climate Change (MUCC) database v2.0.0
zenodo.org
text/x-python
Updated Oct 9, 2024
+ more versions
Cite
Emily Bechtold; Kelly Wrighton; Mike Wilkins (2024). Multi-omics for Understanding Climate Change (MUCC) database v2.0.0 [Dataset]. http://doi.org/10.5281/zenodo.13909730
Available download formats: text/x-python
Unique identifier
https://doi.org/10.5281/zenodo.13909730
Dataset updated
Oct 9, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Emily Bechtold; Kelly Wrighton; Mike Wilkins
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the Multi-omics for Understanding Climate Change (MUCC database) version 2.0.0. This current version is based on amplicon and metagenomic sequencing of Old Woman Creek (OWC), Prairie Pothole Region (PPR7 and PPR8), Jean Lafitte National Historical Park and Preserve (JLA), AmeriFlux site US-LA2 (LA2), Stordalen Mire (STM-fen and STM-bog), AmeriFlux site-ID US-Twt (TWI), and Peatland Responses Under Changing Environments (SPRUCE) and wetland soils. Additionally, this includes metatranscriptome sequencing from OWC. In the future, this will be expanded to include more data from these sites and from additional wetlands.

OWC, PPR, JLA and LA2 data are deposited in NCBI Bioproject PRJNA1007388

Stordalen Mire MAGs are deposited in BioProject PRJNA386538

AmeriFlux site-ID US-Twt data are deposited in SRA under SRP003022, SRP010671, SRP010730, SRP010738, SRP010741, SRP010747, SRP010748, SRP010751, SRP010862, SRP010870, and SRP011309.

SPRUCE data are deposited in PRJNA638786 and PRJNA638601

Files and datasets included here:

16S.zip 16S amplicon sequencing data and site metadata for 1,112 samples (fastq files)

MQ_HQ_MAGs.zip Database of 4745 Medium and High Quality MAGs (fasta files)

MUCC_v2.0.0_HQMQ_genes.faa.zip MAG amino acid gene sequences derived from DRAM gene calls (fasta file)

MUCC_v2.0.0_HQMQ_annotations.tsv MAG DRAM annotations

owc_metat_table_methanoregula_genes.csv Metatranscriptomic expression per genes in Methanoregula across 133 metatranscriptomes (csv table)

gtdbtk.ar53.decorated.tree newick file for GTDB de novo work flow Methanoregula MAG tree

Newick_gene_trees.zip Trees used in blast identification of methylotrophic gene homologs to curate MR for methylotrophy

fasta_reference_genes.zip FASTA reference files of genes used as BLAST query to mine Methanoregula MAGs for genes involved in detoxification of reactive oxygen species (ROS) and methanogenic metabolism of methylated compounds

protpipeliner.py Python script is a modification of protpipeliner.rb for building RAXML trees
Table1_Unsupervised Multi-Omics Data Integration Methods: A Comprehensive...
frontiersin.figshare.com
docx
Updated Jun 5, 2023
+ more versions
Cite
Nasim Vahabi; George Michailidis (2023). Table1_Unsupervised Multi-Omics Data Integration Methods: A Comprehensive Review.DOCX [Dataset]. http://doi.org/10.3389/fgene.2022.854752.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fgene.2022.854752.s001
Dataset updated
Jun 5, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Nasim Vahabi; George Michailidis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Through the developments of Omics technologies and dissemination of large-scale datasets, such as those from The Cancer Genome Atlas, Alzheimer’s Disease Neuroimaging Initiative, and Genotype-Tissue Expression, it is becoming increasingly possible to study complex biological processes and disease mechanisms more holistically. However, to obtain a comprehensive view of these complex systems, it is crucial to integrate data across various Omics modalities, and also leverage external knowledge available in biological databases. This review aims to provide an overview of multi-Omics data integration methods with different statistical approaches, focusing on unsupervised learning tasks, including disease onset prediction, biomarker discovery, disease subtyping, module discovery, and network/pathway analysis. We also briefly review feature selection methods, multi-Omics data sets, and resources/tools that constitute critical components for carrying out the integration.
MLOmics
huggingface.co
Updated May 29, 2025
Cite
AI for Bio Informatics and Care (2025). MLOmics [Dataset]. https://huggingface.co/datasets/AIBIC/MLOmics
Dataset updated
May 29, 2025
Dataset authored and provided by
AI for Bio Informatics and Care
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MLOmics: Cancer Multi-Omics Database for Machine Learning

Abstract

Framing the investigation of diverse cancers as a machine learning problem has recently shown significant potential in multi-omics analysis and cancer research. Empowering these successful machine learning models are the high-quality training datasets with sufficient data volume and adequate preprocessing. However, while there exist several public data portals including The Cancer Genome Atlas (TCGA)… See the full description on the dataset page: https://huggingface.co/datasets/AIBIC/MLOmics.
omics - eNanoMapper database
enm-dev.adma.ai
json
Updated Oct 11, 2023
Cite
omics metadata (2023). omics - eNanoMapper database [Dataset]. https://enm-dev.adma.ai/projects/omics/
Available download formats: json
Dataset updated
Oct 11, 2023
Dataset provided by
Ideaconsult Ltd.
Authors
omics metadata
License
https://enanomapper.adma.ai/about/omics
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
omics metadata project data: Nanosafety-relevant omics data - a database covering metadata for transcriptomics, proteomics and microRNA expression data relevant to safety assessment analyses of nanomaterials
Newtomics
rrid.site
Updated Jan 29, 2022
Cite
(2022). Newtomics [Dataset]. http://identifiers.org/RRID:SCR_006073
Unique identifier
https://identifiers.org/RRID:SCR_006073
Dataset updated
Jan 29, 2022
Description
Newt-omics is a database that enables researchers to locate, retrieve and store data sets dedicated to the molecular characterization of newts. Newt-omics is a transcript-centered database, based on an Expressed Sequence Tag (EST) data set from the newt, covering ~50,000 Sanger-sequenced transcripts and a set of high-density microarray data generated from regenerating hearts. Newt-omics also contains a large set of peptides identified by mass spectrometry, which was used to validate 13,810 ESTs as true protein coding. Newt-omics is open to implementing additional high-throughput data sets without changing the database structure. Via a user-friendly interface, Newt-omics allows access to a huge set of molecular data without the need for prior bioinformatics expertise. The newt Notophthalmus viridescens is the master of regeneration. This organism has been known for more than 200 years for its exceptional regenerative capabilities. Newts can completely replace lost appendages such as limb and tail, lens and retina, and parts of the central nervous system. Moreover, after cardiac injury newts can rebuild the functional myocardium with no scar formation. To date, only very limited information is available from public databases. Newt-Omics aims to provide a comprehensive platform of genes expressed during tissue regeneration, including extensive annotations, expression data and experimentally verified peptide sequences that as yet have no homology to other publicly available gene sequences. The goal is to obtain a detailed understanding of the molecular processes underlying tissue regeneration in the newt, which may lead to the development of approaches that efficiently stimulate regenerative pathways in mammals.
* Number of contigs: 26594
* Number of est in contigs: 48537
* Number of transcripts with verified peptide: 5291
* Number of peptides: 15169
Unlocking natural history collections to improve eDNA reference databases...
search.dataone.org
Updated Oct 17, 2025
Cite
Sarah Schmid; Nicolas Straube; Camille Albouy; Bo Delling; James Maclaine; Michael Matschiner; Peter Rask Møller; Annamaria Nocita; Anja Palandačić; Lukas Rüber; Moritz Sonnewald; Nadir Alvarez; Stéphanie Manel; Loïc Pellissier (2025). Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring [Dataset]. http://doi.org/10.5061/dryad.0zpc8677g
Unique identifier
https://doi.org/10.5061/dryad.0zpc8677g
Dataset updated
Oct 17, 2025
Dataset provided by
Dryad Digital Repository
Authors
Sarah Schmid; Nicolas Straube; Camille Albouy; Bo Delling; James Maclaine; Michael Matschiner; Peter Rask Møller; Annamaria Nocita; Anja Palandačić; Lukas Rüber; Moritz Sonnewald; Nadir Alvarez; Stéphanie Manel; Loïc Pellissier
Description
Biodiversity changes due to human activities highlight the need for efficient biodiversity monitoring approaches. Environmental DNA (eDNA) metabarcoding offers a non-invasive method used for biodiversity monitoring and ecosystem assessment, but its accuracy depends on comprehensive DNA reference databases. Natural history collections often contain rare or difficult-to-obtain samples that can serve as a valuable resource to fill gaps in eDNA reference databases. Here, we discuss the utility of specimens from natural history collections in supporting future eDNA applications. Museomics, the application of -omics techniques to museum specimens, offers a promising avenue for improving eDNA reference databases by increasing species coverage. Furthermore, museomics can provide transferable methodological advancements for extracting genetic material from samples with low and degraded DNA. The integration of natural history collections, museomics, and eDNA approaches has the potential to signific...

Dataset for analyzing the potential of museum specimens to improve the DNA reference database

To examine the cumulative number of species sequenced for a given DNA barcode/mitochondrial genome (also referred to as mitogenome) over the years, we retrieved all data available from NCBI using the R package rentrez v1.2.3 (Winter 2017). We searched the nucleotide database for the rRNA 12S, rRNA 16S, rRNA 18S, cytochrome B (cytB), cytochrome oxidase I (COI) barcodes, as well as for the complete mitogenomes for all fish orders. In addition, we also retrieved all the fish species with available data on the sequence read archive (SRA) using the Entrez Direct (Kans 2024), which provides access to the NCBI databases from a Unix terminal window.
To highlight the potential of museum specimens for increasing the number of species with an available barcode/mitogenome sequence, we first downloaded all available datasets on the Global Biodiversity Information Facility (GBIF) listing fish specimens store...

# Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring

Description of the data and file structure

The dataset consists of a main folder, data.zip.

Various

kit_custom_prices.xlsx - price estimate for DNA extraction and ssDNA library prep using a commercial kit or the custom protocol from Nicolas Straube.

barcodes_data

output from the cumul_barcodes_plot.R script.

species_with_barcodes.csv - list of all fishes (marine + freshwater) with a given barcode available, according to NCBI. (1) species name, (2) NCBI taxon ID, (3) date when the species sequence was first uploaded on NCBI, (4) marker of interest, (5) year the species sequence was first uploaded on NCBI.

occurence_data

contains a different type of list of species (museum, 12S availability, etc.)

combined_gbif_species.csv - output from the script museum_potential/1_process_gbif_datasets.R. Contains all the species of fish found in the main natural ...,
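The cumulative species count per marker described above could be derived from a table shaped like species_with_barcodes.csv roughly as follows. This is a sketch only: the column names `species`, `marker` and `year` and the sample rows are hypothetical, since the listing describes the columns but does not give their headers, and the authors' own analysis was done in R (cumul_barcodes_plot.R), not with this code.

```python
import csv
import io
from collections import defaultdict


def cumulative_species_by_year(rows):
    """Return {marker: [(year, cumulative species count), ...]}.

    Assumption: each row is a dict with the hypothetical keys
    'species', 'marker' and 'year'. A species counts once per
    marker, in the earliest year it appears.
    """
    first_year = {}
    for row in rows:
        key = (row["marker"], row["species"])
        year = int(row["year"])
        if key not in first_year or year < first_year[key]:
            first_year[key] = year

    # Count how many species first appear in each year, per marker.
    new_per_year = defaultdict(lambda: defaultdict(int))
    for (marker, _species), year in first_year.items():
        new_per_year[marker][year] += 1

    # Turn yearly counts into a running total.
    result = {}
    for marker, counts in new_per_year.items():
        total = 0
        series = []
        for year in sorted(counts):
            total += counts[year]
            series.append((year, total))
        result[marker] = series
    return result


# Made-up rows for illustration; not taken from the dataset.
sample_csv = """species,marker,year
Salmo trutta,12S,2005
Esox lucius,12S,2005
Perca fluviatilis,12S,2010
Salmo trutta,COI,2008
"""
sample_rows = list(csv.DictReader(io.StringIO(sample_csv)))
print(cumulative_species_by_year(sample_rows))
```

Keying on the earliest year per species and marker avoids double counting when a species has several sequences for the same marker.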
Multi-omics for Understanding Climate Change (MUCC) database v1.0.0
zenodo.org
bin
Updated Feb 5, 2024
Cite
Angela A. Oliverio; Mikayla A. Borton; Adrienne Narrowe; Kelly C. Wrighton (2024). Multi-omics for Understanding Climate Change (MUCC) database v1.0.0 [Dataset]. http://doi.org/10.5281/zenodo.10622292
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.10622292
Dataset updated
Feb 5, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Angela A. Oliverio; Mikayla A. Borton; Adrienne Narrowe; Kelly C. Wrighton
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the Multi-omics for Understanding Climate Change (MUCC database) version 1.0.0. This current version is based on metagenomic and metatranscriptomic sequencing of Old Woman Creek wetland soils, but will be expanded in the future to include data from additional wetlands. Files and datasets included here:

MAGs.zip Dereplicated database of 2,502 MAGs (fasta files)

OWC_HQMQ_DB_genes.faa.gz MAG amino acid gene sequences derived from DRAM gene calls (fasta file)

OWC_HQMQ_DB_ANNOTATIONS_20220208.txt.gz MAG DRAM annotations

owc_metat_table_mags.csv Metatranscriptomic expression per MAG across 133 metatranscriptomes (csv table)

owc_metat_table_mags_genes.csv Metatranscriptomic expression per gene across 133 metatranscriptomes (csv table)

owc_metat_table_mags_genes_annotations.csv DRAM annotations corresponding to #5 (owc_metat_table_mags_genes.csv) for transcribed genes (csv table)

gtdbtk.ar53.decorated.tree newick file for tree in figure S4
Additional file 3 of Galbase: a comprehensive repository for integrating...
springernature.figshare.com
xlsx
Updated Feb 20, 2024
+ more versions
Cite
Weiwei Fu; Rui Wang; Naiyi Xu; Jinxin Wang; Ran Li; Hojjat Asadollahpour Nanaei; Qinghua Nie; Xin Zhao; Jianlin Han; Ning Yang; Yu Jiang (2024). Additional file 3 of Galbase: a comprehensive repository for integrating chicken multi-omics data [Dataset]. http://doi.org/10.6084/m9.figshare.19759727.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.19759727.v1
Dataset updated
Feb 20, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Weiwei Fu; Rui Wang; Naiyi Xu; Jinxin Wang; Ran Li; Hojjat Asadollahpour Nanaei; Qinghua Nie; Xin Zhao; Jianlin Han; Ning Yang; Yu Jiang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 3:
Table_5_PSDX: A Comprehensive Multi-Omics Association Database of Populus...
datasetcatalog.nlm.nih.gov
Updated May 20, 2021
+ more versions
Cite
Gao, Yubang; Liu, Bo; Wang, Huihui; Liu, Sheng; Xu, Xi; Yang, Yongkang; Jaiswal, Pankaj; Liu, Xuqing; Wei, Wentao; Reddy, Anireddy S. N.; Wang, Huiyuan; Li, Wei; Luo, Yunjun; Gu, Lianfeng; Dai, Xiufang (2021). Table_5_PSDX: A Comprehensive Multi-Omics Association Database of Populus trichocarpa With a Focus on the Secondary Growth in Response to Stresses.XLS [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000829820
Dataset updated
May 20, 2021
Authors
Gao, Yubang; Liu, Bo; Wang, Huihui; Liu, Sheng; Xu, Xi; Yang, Yongkang; Jaiswal, Pankaj; Liu, Xuqing; Wei, Wentao; Reddy, Anireddy S. N.; Wang, Huiyuan; Li, Wei; Luo, Yunjun; Gu, Lianfeng; Dai, Xiufang
Description
Populus trichocarpa (P. trichocarpa) is a model tree for the investigation of wood formation. In recent years, researchers have generated a large number of high-throughput sequencing data in P. trichocarpa. However, no comprehensive database that provides multi-omics associations for the investigation of secondary growth in response to diverse stresses has been reported. Therefore, we developed a public repository that presents comprehensive measurements of gene expression and post-transcriptional regulation by integrating 144 RNA-Seq, 33 ChIP-seq, and six single-molecule real-time (SMRT) isoform sequencing (Iso-seq) libraries prepared from tissues subjected to different stresses. All the samples from different studies were analyzed to obtain gene expression, co-expression network, and differentially expressed genes (DEG) using unified parameters, which allowed comparison of results from different studies and treatments. In addition to gene expression, we also identified and deposited pre-processed data about alternative splicing (AS), alternative polyadenylation (APA) and alternative transcription initiation (ATI). The post-transcriptional regulation, differential expression, and co-expression network datasets were integrated into a new P. trichocarpa Stem Differentiating Xylem (PSDX) database, which further highlights gene families of RNA-binding proteins and stress-related genes. The PSDX also provides tools for data query, visualization, a genome browser, and the BLAST option for sequence-based query. Much of the data is also available for bulk download. The availability of PSDX contributes to the research related to the secondary growth in response to stresses in P. trichocarpa, which will provide new insights that can be useful for the improvement of stress tolerance in woody plants.
Data_Sheet_2_Integrative Analysis Reveals Across-Cancer Expression Patterns...
frontiersin.figshare.com
docx
Updated May 31, 2023
+ more versions
Cite
Yongfeng Ding; Tingting Zhong; Min Wang; Xueping Xiang; Guoping Ren; Zhongjuan Jia; Qinghui Lin; Qian Liu; Jingwen Dong; Linrong Li; Xiawei Li; Haiping Jiang; Lijun Zhu; Haoran Li; Dejun Shen; Lisong Teng; Chen Li; Jimin Shao (2023). Data_Sheet_2_Integrative Analysis Reveals Across-Cancer Expression Patterns and Clinical Relevance of Ribonucleotide Reductase in Human Cancers.docx [Dataset]. http://doi.org/10.3389/fonc.2019.00956.s002
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fonc.2019.00956.s002
Dataset updated
May 31, 2023
Dataset provided by
Frontiers Media (http://www.frontiersin.org/)
Authors
Yongfeng Ding; Tingting Zhong; Min Wang; Xueping Xiang; Guoping Ren; Zhongjuan Jia; Qinghui Lin; Qian Liu; Jingwen Dong; Linrong Li; Xiawei Li; Haiping Jiang; Lijun Zhu; Haoran Li; Dejun Shen; Lisong Teng; Chen Li; Jimin Shao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Mining cancer-omics databases deepens our understanding of cancer biology and can lead to potential breakthroughs in cancer treatment. Here, we propose an integrative analytical approach to reveal across-cancer expression patterns and identify potential clinical impacts for genes of interest from five representative public databases. Using ribonucleotide reductase (RR), a key enzyme in DNA synthesis and cancer-therapeutic targeting, as an example, we characterized the mRNA expression profiles and inter-component associations of three RR subunit genes and assess their differing pathological and prognostic significance across over 30-types of cancers and their related subtypes. Findings were validated by immunohistochemistry with clinical tissue samples (n = 211) collected from multiple cancer centers in China and with clinical follow-up. Underlying mechanisms were further explored and discussed using co-expression gene network analyses. This framework represents a simple, efficient, accurate, and comprehensive approach for cancer-omics resource analysis and underlines the necessity to separate the tumors by their histological or pathological subtypes during the clinical evaluation of molecular biomarkers.
Integration of Proteomics and Transcriptomics Data Sets for the Analysis of...
acs.figshare.com
zip
Updated Jun 1, 2023
Cite
Paula Díez; Conrad Droste; Rosa M. Dégano; María González-Muñoz; Nieves Ibarrola; Martín Pérez-Andrés; Alba Garin-Muga; Víctor Segura; Gyorgy Marko-Varga; Joshua LaBaer; Alberto Orfao; Fernando J. Corrales; Javier De Las Rivas; Manuel Fuentes (2023). Integration of Proteomics and Transcriptomics Data Sets for the Analysis of a Lymphoma B‑Cell Line in the Context of the Chromosome-Centric Human Proteome Project [Dataset]. http://doi.org/10.1021/acs.jproteome.5b00474.s001
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.jproteome.5b00474.s001
Dataset updated
Jun 1, 2023
Dataset provided by
ACS Publications
Authors
Paula Díez; Conrad Droste; Rosa M. Dégano; María González-Muñoz; Nieves Ibarrola; Martín Pérez-Andrés; Alba Garin-Muga; Víctor Segura; Gyorgy Marko-Varga; Joshua LaBaer; Alberto Orfao; Fernando J. Corrales; Javier De Las Rivas; Manuel Fuentes
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
A comprehensive study of the active molecular landscape of human cells can be undertaken by integrating two different but complementary perspectives: transcriptomics and proteomics. After the genome era, proteomics has emerged as a powerful tool to simultaneously identify and characterize the compendium of thousands of different proteins active in a cell. Thus, the Chromosome-centric Human Proteome Project (C-HPP) is promoting a full characterization of the human proteome combining high-throughput proteomics with the data derived from genome-wide expression profiling of protein-coding genes. Here we present a full proteomic profiling of a human lymphoma B-cell line (Ramos) performed using a nanoUPLC-LTQ-Orbitrap Velos proteomic platform, combined with an in-depth transcriptomic profiling of the same cell type. Data are available via ProteomeXchange with identifier PXD001933. Integration of the proteomic and transcriptomic data sets revealed a 94% overlap in the proteins identified by both -omics approaches. Moreover, functional enrichment analysis of the proteomic profiles showed an enrichment of several functions directly related to the biological and morphological characteristics of B-cells. In turn, about 30% of all protein-coding genes present in the whole human genome were identified as being expressed by the Ramos cells (a stable average of 30% of genes across all chromosomes), revealing the size of the protein expression-set present in one specific human cell type. Additionally, the identification of missing proteins in our data sets has been reported, highlighting the power of the approach. Also, a comparison between neXtProt and UniProt database searches has been performed. In summary, our transcriptomic and proteomic experimental profiling provided a high-coverage report of the expressed proteome from a human lymphoma B-cell type with a clear insight into the biological processes that characterize these cells. In this way, we demonstrated the usefulness of combining -omics for a comprehensive characterization of specific biological systems.
BioCyc
neuinfo.org
BioCyc [Dataset]. http://identifiers.org/RRID:SCR_002298
Unique identifier
https://identifiers.org/RRID:SCR_002298
Description
A collection of Pathway/Genome Databases, each of which describes the genome and metabolic pathways of a single organism. The BioCyc collection of Pathway/Genome Databases (PGDBs) provides an electronic reference source on the genomes and metabolic pathways of sequenced organisms. BioCyc PGDBs are generated by software that predicts the metabolic pathway complements of completely sequenced organisms from their genome sequences. They also include the results of a number of other computational inference procedures applied to these genomes, including predictions of which genes code for missing enzymes in metabolic pathways, and predicted operons. The BioCyc Web site provides a suite of software tools for database searching and visualization, for omics data analysis, and for comparative genomics and comparative pathway questions. The databases within the BioCyc collection are organized into tiers according to the amount of manual review and updating they have received. Tier 1 PGDBs have been created through intensive manual efforts, and receive continuous updating. Tier 2 PGDBs were computationally generated by the PathoLogic program, and have undergone moderate amounts of review and updating. Tier 3 PGDBs were computationally generated by the PathoLogic program, and have undergone no review or updating. There are 967 DBs in Tier 3. The downloadable version of BioCyc that includes the Pathway Tools software provides more speed and power than the BioCyc Web site.
DiabetesOmic
bioregistry.io
Updated Aug 15, 2025
(2025). DiabetesOmic [Dataset]. https://bioregistry.io/diabetesomic
Dataset updated
Aug 15, 2025
Description
DiabetesOmic is a multi-omics database designed to collect and analyze transcriptional regulatory information across five high-throughput sequencing modalities: ChIP-seq, RNA-seq, ATAC-seq, scATAC-seq, and scRNA-seq. Each identifier in this database represents a single sample. It contains clinical complication annotations, including diabetic nephropathy, retinopathy, and atherosclerosis, to enhance translational relevance. It enables the identification of disease-associated regulatory elements, epigenetic modifications, and cell type-specific molecular signatures, providing valuable insights into the molecular mechanisms of diabetes and its complications.
CLImate for Maize OMICS: CLIM4OMICS Analytics and Database
resodate.org
data-staging.niaid.nih.gov
Updated Jun 25, 2023
+ more versions
Hasnat Aslam; Parisa Sarzaeim; Francisco Munoz-Arriola (2023). CLImate for Maize OMICS: CLIM4OMICS Analytics and Database [Dataset]. https://resodate.org/resources/aHR0cHM6Ly96ZW5vZG8ub3JnL3JlY29yZHMvODAwMjkwOQ==
Dataset updated
Jun 25, 2023
Dataset provided by
Zenodo
Authors
Hasnat Aslam; Parisa Sarzaeim; Francisco Munoz-Arriola
Description
CLIM4OMICS Analytics and Database is an improved database of the G2F data repository that contains OMICs (genetic and phenotypic) and environmental data for maize yield predictability across 84 experimental fields in the U.S. and the province of Ontario, Canada, between 2014 and 2021. The goal of this pipeline is to aggregate, improve, and synthesize multi-dimensional G2F data, including genotype, phenotype, and environmental data, for GxE modeling. This dataset contains 79,122 phenotype measurements, 378 genotypes of maize lines, environmental data for 178 locations, and Python scripts for the quality control (QC) and consistency control (CC) steps and for ML models of GxE interactions. The environmental data are extracted from the NWS, DayMet, and NSRDB databases and processed for QC and CC. The environmental dataset contains the minimum temperature (Tmin), average temperature (Tmean), maximum temperature (Tmax), minimum dew point (DPmin), average dew point (DPmean), maximum dew point (DPmax), minimum relative humidity (RHmin), average relative humidity (RHmean), maximum relative humidity (RHmax), minimum solar radiation (SRmin), average solar radiation (SRmean), maximum solar radiation (SRmax), accumulative rainfall (Racc), average wind speed (WSmean), and average wind direction (WDmean). This package also contains the raw G2F data and the preprocessing pipeline.
