100+ datasets found
  1. Additional file 2: Table S2. of ODG: Omics database generator - a tool for...

    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2023
    Cite
    Joseph Guhlin; Kevin Silverstein; Peng Zhou; Peter Tiffin; Nevin Young (2023). Additional file 2: Table S2. of ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding [Dataset]. http://doi.org/10.6084/m9.figshare.c.3850801_D2.v1
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Joseph Guhlin; Kevin Silverstein; Peng Zhou; Peter Tiffin; Nevin Young
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PFam Domains and biological process GO categories for the four rhizobia strains. Predicted proteins related to multiple GO biological process categories are joined together with the pipe character. (XLSX 639 kb)
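
    Because proteins with several GO biological process categories have those categories joined by a pipe character, such cells likely need to be split before the categories can be counted. The following is a minimal Python sketch, assuming each cell is a plain string; the function names and the example values are hypothetical and are not taken from Table S2.

```python
from collections import Counter


def split_go_categories(cell):
    """Split one pipe-joined GO category cell into a list of clean labels."""
    if cell is None:
        return []
    # Drop empty fragments so "a||b" or a trailing pipe does not yield blanks.
    return [part.strip() for part in str(cell).split("|") if part.strip()]


def count_go_categories(cells):
    """Count how often each GO category appears across all cells."""
    counts = Counter()
    for cell in cells:
        counts.update(split_go_categories(cell))
    return counts


# Hypothetical example values, for illustration only.
example_cells = [
    "nitrogen fixation|transport",
    "transport",
    "",
]
go_counts = count_go_categories(example_cells)
```

    Counting after splitting means a protein annotated with two categories contributes once to each, which is usually what a per-category summary needs.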

  2. Dataset 2 - Protein Libraries Of Seven Databases From Cnidaria Omics Data...

    • data.mendeley.com
    Updated Dec 6, 2024
    Cite
    Alexandre Barroso (2024). Dataset 2 - Protein Libraries Of Seven Databases From Cnidaria Omics Data After Duplicates Removal and AMPir Prediction [Dataset]. http://doi.org/10.17632/myp4j56gpz.1
    Dataset updated
    Dec 6, 2024
    Authors
    Alexandre Barroso
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Non-duplicated AMP precursor protein libraries from 7 databases of Cnidaria:

    • Db1 – 6 proteomes derived from sequenced genomes of Anthozoa
    • Db2 – 2 proteomes derived from sequenced genomes of Medusozoa
    • Db3 – 46 whole-body/non-specific transcriptomes of Anthozoa
    • Db4 – 24 whole-body/non-specific transcriptomes of Medusozoa
    • Db5 – 25 transcriptomes specific to the tentacles of Anthozoa
    • Db6 – 7 transcriptomes specific to the tentacles of Medusozoa
    • Db7 – 2 transcriptomes specific to the nematocysts of Anthozoa

  3. Data Sheet 2_Visual analysis of multi-omics data.csv

    • frontiersin.figshare.com
    csv
    Updated Sep 10, 2024
    Cite
    Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp (2024). Data Sheet 2_Visual analysis of multi-omics data.csv [Dataset]. http://doi.org/10.3389/fbinf.2024.1395981.s002
    Available download formats: csv
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Frontiers
    Authors
    Austin Swart; Ron Caspi; Suzanne Paley; Peter D. Karp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool’s interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different “visual channel” of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.

  4. Data from: MangroveDB: A comprehensive online database for mangroves based...

    • zenodo.org
    zip
    Updated Oct 9, 2024
    Cite
    Chaoqun Xu (2024). MangroveDB: A comprehensive online database for mangroves based on multi-omics data [Dataset]. http://doi.org/10.5281/zenodo.13907062
    Available download formats: zip
    Dataset updated
    Oct 9, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Chaoqun Xu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 8, 2024
    Description

    Mangroves are dominant flora of intertidal zones along tropical and subtropical coastline around the world that offer important ecological and economic value. Recently, the genomes of mangroves have been decoded, and massive omics data were generated and deposited in the public databases. Reanalysis of multi-omics data can provide new biological insights excluded in the original studies. However, the requirements for computational resource and lack of bioinformatics skill for experimental researchers limit the effective use of the original data. To fill this gap, we uniformly processed 942 transcriptome data, 386 whole-genome sequencing data, and provided 13 reference genomes and 40 reference transcriptomes for 53 mangroves. Finally, we built an interactive web-based database platform MangroveDB (https://github.com/Jasonxu0109/MangroveDB), which was designed to provide comprehensive gene expression datasets to facilitate their exploration and equipped with several online analysis tools, including principal components analysis, differential gene expression analysis, tissue-specific gene expression analysis, GO and KEGG enrichment analysis. MangroveDB not only provides query functions about genes annotation, but also supports some useful visualization functions for analysis results, such as volcano plot, heatmap, dotplot, PCA plot, bubble plot, population structure etc. In conclusion, MangroveDB is a valuable resource for the mangroves research community to efficiently use the massive public omics datasets.

  5. NCI-60 Cancer Cell Lines

    • bigomics.ch
    Updated Nov 8, 2024
    Cite
    National Cancer Institute (NCI) (2024). NCI-60 Cancer Cell Lines [Dataset]. https://bigomics.ch/blog/top-databases-for-drug-discovery/
    Dataset updated
    Nov 8, 2024
    Dataset authored and provided by
    National Cancer Institute (NCI)
    Description

    A panel of 60 human cancer cell lines used for screening anticancer drugs.

  6. DataSheet_1_AppleMDO: A Multi-Dimensional Omics Database for Apple...

    • datasetcatalog.nlm.nih.gov
    Updated Oct 22, 2019
    Cite
    Liu, Yue; Tian, Tian; Xu, Wenying; Ma, Xuelian; Yang, Jiaotong; Su, Zhen; Da, Lingling; She, Jiajie (2019). DataSheet_1_AppleMDO: A Multi-Dimensional Omics Database for Apple Co-Expression Networks and Chromatin States.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000136732
    Dataset updated
    Oct 22, 2019
    Authors
    Liu, Yue; Tian, Tian; Xu, Wenying; Ma, Xuelian; Yang, Jiaotong; Su, Zhen; Da, Lingling; She, Jiajie
    Description

    As an economically important crop, apple is one of the most cultivated fruit trees in temperate regions worldwide. Recently, a large number of high-quality transcriptomic and epigenomic datasets for apple were made available to the public, which could be helpful in inferring gene regulatory relationships and thus predicting gene function at the genome level. Through integration of the available apple genomic, transcriptomic, and epigenomic datasets, we constructed co-expression networks, identified functional modules, and predicted chromatin states. A total of 112 RNA-seq datasets were integrated to construct a global network and a conditional network (tissue-preferential network). Furthermore, a total of 1,076 functional modules with closely related gene sets were identified to assess the modularity of biological networks and further subjected to functional enrichment analysis. The results showed that the function of many modules was related to development, secondary metabolism, hormone response, and transcriptional regulation. Transcriptional regulation is closely related to epigenetic marks on chromatin. A total of 20 epigenomic datasets, which included ChIP-seq, DNase-seq, and DNA methylation analysis datasets, were integrated and used to classify chromatin states. Based on the ChromHMM algorithm, the genome was divided into 620,122 fragments, which were classified into 24 states according to the combination of epigenetic marks and enriched-feature regions. Finally, through the collaborative analysis of different omics datasets, the online database AppleMDO (http://bioinformatics.cau.edu.cn/AppleMDO/) was established for cross-referencing and the exploration of possible novel functions of apple genes. In addition, gene annotation information and functional support toolkits were also provided. 
    Our database might be convenient for researchers to develop insights into the function of genes related to important agronomic traits and might serve as a reference for other fruit trees.

  7. Multi-omics for Understanding Climate Change (MUCC) database v2.0.0

    • zenodo.org
    text/x-python
    Updated Oct 9, 2024
    Cite
    Emily Bechtold; Kelly Wrighton; Mike Wilkins (2024). Multi-omics for Understanding Climate Change (MUCC) database v2.0.0 [Dataset]. http://doi.org/10.5281/zenodo.13909730
    Available download formats: text/x-python
    Dataset updated
    Oct 9, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Emily Bechtold; Kelly Wrighton; Mike Wilkins
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the Multi-omics for Understanding Climate Change (MUCC database) version 2.0.0. This current version is based on amplicon and metagenomic sequencing of Old Woman Creek (OWC), Prairie Pothole Region (PPR7 and PPR8), Jean Lafitte National Historical Park and Preserve (JLA), AmeriFlux site US-LA2 (LA2), Stordalen Mire (STM-fen and STM-bog), AmeriFlux site-ID US-Twt (TWI), and Peatland Responses Under Changing Environments (SPRUCE) and wetland soils. Additionally, this includes metatranscriptome sequencing from OWC. In the future, this will be expanded to include more data from these sites and from additional wetlands.

    OWC, PPR, JLA and LA2 data are deposited in NCBI Bioproject PRJNA1007388

    Stordalen Mire MAGs are deposited in BioProject PRJNA386538

    AmeriFlux site-ID US-Twt data are deposited in SRA under SRP003022, SRP010671, SRP010730, SRP010738, SRP010741, SRP010747, SRP010748, SRP010751, SRP010862, SRP010870, and SRP011309.

    SPRUCE data are deposited in PRJNA638786 and PRJNA638601

    Files and datasets included here:

    1. 16S.zip 16S amplicon sequencing data and site metadata for 1,112 samples (fastq files)
    2. MQ_HQ_MAGs.zip Database of 4745 Medium and High Quality MAGs (fasta files)
    3. MUCC_v2.0.0_HQMQ_genes.faa.zip MAG amino acid gene sequences derived from DRAM gene calls (fasta file)
    4. MUCC_v2.0.0_HQMQ_annotations.tsv MAG DRAM ANNOTATIONS
    5. owc_metat_table_methanoregula_genes.csv Metatranscriptomic expression per genes in Methanoregula across 133 metatranscriptomes (csv table)
    6. gtdbtk.ar53.decorated.tree newick file for GTDB de novo work flow Methanoregula MAG tree
    7. Newick_gene_trees.zip Trees used in blast identification of methylotrophic gene homologs to curate MR for methylotrophy
    8. fasta_reference_genes.zip FASTA reference files of genes used as BLAST query to mine Methanoregula MAGs for genes involved in detoxification of reactive oxygen species (ROS) and methanogenic metabolism of methylated compounds
    9. protpipeliner.py Python script is a modification of protpipeliner.rb for building RAXML trees
  8. Table1_Unsupervised Multi-Omics Data Integration Methods: A Comprehensive...

    • frontiersin.figshare.com
    docx
    Updated Jun 5, 2023
    Cite
    Nasim Vahabi; George Michailidis (2023). Table1_Unsupervised Multi-Omics Data Integration Methods: A Comprehensive Review.DOCX [Dataset]. http://doi.org/10.3389/fgene.2022.854752.s001
    Available download formats: docx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Nasim Vahabi; George Michailidis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Through the developments of Omics technologies and dissemination of large-scale datasets, such as those from The Cancer Genome Atlas, Alzheimer’s Disease Neuroimaging Initiative, and Genotype-Tissue Expression, it is becoming increasingly possible to study complex biological processes and disease mechanisms more holistically. However, to obtain a comprehensive view of these complex systems, it is crucial to integrate data across various Omics modalities, and also leverage external knowledge available in biological databases. This review aims to provide an overview of multi-Omics data integration methods with different statistical approaches, focusing on unsupervised learning tasks, including disease onset prediction, biomarker discovery, disease subtyping, module discovery, and network/pathway analysis. We also briefly review feature selection methods, multi-Omics data sets, and resources/tools that constitute critical components for carrying out the integration.

  9. MLOmics

    • huggingface.co
    Updated May 29, 2025
    Cite
    AI for Bio Informatics and Care (2025). MLOmics [Dataset]. https://huggingface.co/datasets/AIBIC/MLOmics
    Dataset updated
    May 29, 2025
    Dataset authored and provided by
    AI for Bio Informatics and Care
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MLOmics: Cancer Multi-Omics Database for Machine Learning

    Abstract

    Framing the investigation of diverse cancers as a machine learning problem has recently shown significant potential in multi-omics analysis and cancer research. Empowering these successful machine learning models are the high-quality training datasets with sufficient data volume and adequate preprocessing. However, while there exist several public data portals including The Cancer Genome Atlas (TCGA)… See the full description on the dataset page: https://huggingface.co/datasets/AIBIC/MLOmics.

  10. omics - eNanoMapper database

    • enm-dev.adma.ai
    json
    Updated Oct 11, 2023
    Cite
    omics metadata (2023). omics - eNanoMapper database [Dataset]. https://enm-dev.adma.ai/projects/omics/
    Available download formats: json
    Dataset updated
    Oct 11, 2023
    Dataset provided by
    Ideaconsult Ltd.
    Authors
    omics metadata
    License

    https://enanomapper.adma.ai/about/omics

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    omics metadata project data: Nanosafety-relevant omics data - a database covering metadata for transcriptomics, proteomics and microRNA expression data relevant to safety assessment analyses of nanomaterials

  11. Newtomics

    • rrid.site
    Updated Jan 29, 2022
    Cite
    (2022). Newtomics [Dataset]. http://identifiers.org/RRID:SCR_006073
    Dataset updated
    Jan 29, 2022
    Description

    Newt-omics is a database that enables researchers to locate, retrieve and store data sets dedicated to the molecular characterization of newts. Newt-omics is a transcript-centered database, based on an Expressed Sequence Tag (EST) data set from the newt, covering ~50,000 Sanger-sequenced transcripts and a set of high-density microarray data generated from regenerating hearts. Newt-omics also contains a large set of peptides identified by mass spectrometry, which was used to validate 13,810 ESTs as true protein coding. Newt-omics can accommodate additional high-throughput data sets without changes to the database structure. Via a user-friendly interface, Newt-omics allows access to a huge set of molecular data without the need for prior bioinformatics expertise.

    The newt Notophthalmus viridescens is the master of regeneration. This organism has been known for more than 200 years for its exceptional regenerative capabilities. Newts can completely replace lost appendages like limb and tail, lens and retina, and parts of the central nervous system. Moreover, after cardiac injury newts can rebuild the functional myocardium with no scar formation. To date only very limited information from public databases is available. Newt-omics aims to provide a comprehensive platform of expressed genes during tissue regeneration, including extensive annotations, expression data and experimentally verified peptide sequences that as yet have no homology to other publicly available gene sequences. The goal is to obtain a detailed understanding of the molecular processes underlying tissue regeneration in the newt, which may lead to the development of approaches that efficiently stimulate regenerative pathways in mammals.

    • Number of contigs: 26594
    • Number of ESTs in contigs: 48537
    • Number of transcripts with verified peptide: 5291
    • Number of peptides: 15169

  12. Unlocking natural history collections to improve eDNA reference databases...

    • search.dataone.org
    Updated Oct 17, 2025
    Cite
    Sarah Schmid; Nicolas Straube; Camille Albouy; Bo Delling; James Maclaine; Michael Matschiner; Peter Rask Møller; Annamaria Nocita; Anja Palandačić; Lukas Rüber; Moritz Sonnewald; Nadir Alvarez; Stéphanie Manel; Loïc Pellissier (2025). Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring [Dataset]. http://doi.org/10.5061/dryad.0zpc8677g
    Dataset updated
    Oct 17, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Sarah Schmid; Nicolas Straube; Camille Albouy; Bo Delling; James Maclaine; Michael Matschiner; Peter Rask Møller; Annamaria Nocita; Anja Palandačić; Lukas Rüber; Moritz Sonnewald; Nadir Alvarez; Stéphanie Manel; Loïc Pellissier
    Description

    Biodiversity changes due to human activities highlight the need for efficient biodiversity monitoring approaches. Environmental DNA (eDNA) metabarcoding offers a non-invasive method used for biodiversity monitoring and ecosystem assessment, but its accuracy depends on comprehensive DNA reference databases. Natural history collections often contain rare or difficult-to-obtain samples that can serve as a valuable resource to fill gaps in eDNA reference databases. Here, we discuss the utility of specimens from natural history collections in supporting future eDNA applications. Museomics, the application of -omics techniques to museum specimens, offers a promising avenue for improving eDNA reference databases by increasing species coverage. Furthermore, museomics can provide transferable methodological advancements for extracting genetic material from samples with low and degraded DNA. The integration of natural history collections, museomics, and eDNA approaches has the potential to signific...

    Dataset for analyzing the potential of museum specimens to improve the DNA reference database

    To examine the cumulative number of species sequenced for a given DNA barcode/mitochondrial genome (also referred to as mitogenome) over the years, we retrieved all data available from NCBI using the R package rentrez v1.2.3 (Winter 2017). We searched the nucleotide database for the rRNA 12S, rRNA 16S, rRNA 18S, cytochrome B (cytB), cytochrome oxidase I (COI) barcodes, as well as for the complete mitogenomes for all fish orders. In addition, we also retrieved all the fish species with available data on the sequence read archive (SRA) using the Entrez Direct (Kans 2024), which provides access to the NCBI databases from a Unix terminal window.

    To highlight the potential of museum specimens for increasing the number of species with an available barcode/mitogenome sequence, we first downloaded all available datasets on the Global Biodiversity Information Facility (GBIF) listing fish specimens store...

    Unlocking natural history collections to improve eDNA reference databases and biodiversity monitoring

    Description of the data and file structure

    The dataset consists of a main folder, data.zip.

    Various

    • kit_custom_prices.xlsx - price estimate for DNA extraction and ssDNA library prep using a commercial kit or the custom protocol from Nicolas Straube.

    barcodes_data

    output from the cumul_barcodes_plot.R script.

    • species_with_barcodes.csv - list of all fishes (marine + freshwater) with a given barcode available, according to NCBI. (1) species name, (2) NCBI taxon ID, (3) date when the species sequence was first uploaded on NCBI, (4) marker of interest, (5) year the species sequence was first uploaded on NCBI.

    occurence_data

    contains a different type of list of species (museum, 12S availability, etc.)

    • combined_gbif_species.csv - output from the script museum_potential/1_process_gbif_datasets.R. Contains all the species of fish found in the main natural ...,
  13. Multi-omics for Understanding Climate Change (MUCC) database v1.0.0

    • zenodo.org
    bin
    Updated Feb 5, 2024
    Cite
    Angela A. Oliverio; Mikayla A. Borton; Adrienne Narrowe; Kelly C. Wrighton (2024). Multi-omics for Understanding Climate Change (MUCC) database v1.0.0 [Dataset]. http://doi.org/10.5281/zenodo.10622292
    Available download formats: bin
    Dataset updated
    Feb 5, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Angela A. Oliverio; Mikayla A. Borton; Adrienne Narrowe; Kelly C. Wrighton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the Multi-omics for Understanding Climate Change (MUCC database) version 1.0.0. This current version is based on metagenomic and metatranscriptomic sequencing of Old Woman Creek wetland soils, but will be expanded in the future to include data from additional wetlands. Files and datasets included here:

    1. MAGs.zip Dereplicated database of 2,502 MAGs (fasta files)
    2. OWC_HQMQ_DB_genes.faa.gz MAG amino acid gene sequences derived from DRAM gene calls (fasta file)
    3. OWC_HQMQ_DB_ANNOTATIONS_20220208.txt.gz MAG DRAM annotations
    4. owc_metat_table_mags.csv Metatranscriptomic expression per MAG across 133 metatranscriptomes (csv table)
    5. owc_metat_table_mags_genes.csv Metatranscriptomic expression per gene across 133 metatranscriptomes (csv table)
    6. owc_metat_table_mags_genes_annotations.csv corresponding DRAM annotations to #5 for transcribed genes (csv table)
    7. gtdbtk.ar53.decorated.tree newick file for tree in figure S4
  14. Additional file 3 of Galbase: a comprehensive repository for integrating...

    • springernature.figshare.com
    xlsx
    Updated Feb 20, 2024
    Cite
    Weiwei Fu; Rui Wang; Naiyi Xu; Jinxin Wang; Ran Li; Hojjat Asadollahpour Nanaei; Qinghua Nie; Xin Zhao; Jianlin Han; Ning Yang; Yu Jiang (2024). Additional file 3 of Galbase: a comprehensive repository for integrating chicken multi-omics data [Dataset]. http://doi.org/10.6084/m9.figshare.19759727.v1
    Available download formats: xlsx
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Weiwei Fu; Rui Wang; Naiyi Xu; Jinxin Wang; Ran Li; Hojjat Asadollahpour Nanaei; Qinghua Nie; Xin Zhao; Jianlin Han; Ning Yang; Yu Jiang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 3:

  15. Table_5_PSDX: A Comprehensive Multi-Omics Association Database of Populus...

    • datasetcatalog.nlm.nih.gov
    Updated May 20, 2021
    Cite
    Gao, Yubang; Liu, Bo; Wang, Huihui; Liu, Sheng; Xu, Xi; Yang, Yongkang; Jaiswal, Pankaj; Liu, Xuqing; Wei, Wentao; Reddy, Anireddy S. N.; Wang, Huiyuan; Li, Wei; Luo, Yunjun; Gu, Lianfeng; Dai, Xiufang (2021). Table_5_PSDX: A Comprehensive Multi-Omics Association Database of Populus trichocarpa With a Focus on the Secondary Growth in Response to Stresses.XLS [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000829820
    Dataset updated
    May 20, 2021
    Authors
    Gao, Yubang; Liu, Bo; Wang, Huihui; Liu, Sheng; Xu, Xi; Yang, Yongkang; Jaiswal, Pankaj; Liu, Xuqing; Wei, Wentao; Reddy, Anireddy S. N.; Wang, Huiyuan; Li, Wei; Luo, Yunjun; Gu, Lianfeng; Dai, Xiufang
    Description

    Populus trichocarpa (P. trichocarpa) is a model tree for the investigation of wood formation. In recent years, researchers have generated a large number of high-throughput sequencing data in P. trichocarpa. However, no comprehensive database that provides multi-omics associations for the investigation of secondary growth in response to diverse stresses has been reported. Therefore, we developed a public repository that presents comprehensive measurements of gene expression and post-transcriptional regulation by integrating 144 RNA-Seq, 33 ChIP-seq, and six single-molecule real-time (SMRT) isoform sequencing (Iso-seq) libraries prepared from tissues subjected to different stresses. All the samples from different studies were analyzed to obtain gene expression, co-expression network, and differentially expressed genes (DEG) using unified parameters, which allowed comparison of results from different studies and treatments. In addition to gene expression, we also identified and deposited pre-processed data about alternative splicing (AS), alternative polyadenylation (APA) and alternative transcription initiation (ATI). The post-transcriptional regulation, differential expression, and co-expression network datasets were integrated into a new P. trichocarpa Stem Differentiating Xylem (PSDX) database, which further highlights gene families of RNA-binding proteins and stress-related genes. The PSDX also provides tools for data query, visualization, a genome browser, and the BLAST option for sequence-based query. Much of the data is also available for bulk download. The availability of PSDX contributes to the research related to the secondary growth in response to stresses in P. trichocarpa, which will provide new insights that can be useful for the improvement of stress tolerance in woody plants.

  16. Data_Sheet_2_Integrative Analysis Reveals Across-Cancer Expression Patterns...

    • frontiersin.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Yongfeng Ding; Tingting Zhong; Min Wang; Xueping Xiang; Guoping Ren; Zhongjuan Jia; Qinghui Lin; Qian Liu; Jingwen Dong; Linrong Li; Xiawei Li; Haiping Jiang; Lijun Zhu; Haoran Li; Dejun Shen; Lisong Teng; Chen Li; Jimin Shao (2023). Data_Sheet_2_Integrative Analysis Reveals Across-Cancer Expression Patterns and Clinical Relevance of Ribonucleotide Reductase in Human Cancers.docx [Dataset]. http://doi.org/10.3389/fonc.2019.00956.s002
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Yongfeng Ding; Tingting Zhong; Min Wang; Xueping Xiang; Guoping Ren; Zhongjuan Jia; Qinghui Lin; Qian Liu; Jingwen Dong; Linrong Li; Xiawei Li; Haiping Jiang; Lijun Zhu; Haoran Li; Dejun Shen; Lisong Teng; Chen Li; Jimin Shao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mining cancer-omics databases deepens our understanding of cancer biology and can lead to potential breakthroughs in cancer treatment. Here, we propose an integrative analytical approach to reveal across-cancer expression patterns and identify potential clinical impacts for genes of interest from five representative public databases. Using ribonucleotide reductase (RR), a key enzyme in DNA synthesis and cancer-therapeutic targeting, as an example, we characterized the mRNA expression profiles and inter-component associations of three RR subunit genes and assess their differing pathological and prognostic significance across over 30-types of cancers and their related subtypes. Findings were validated by immunohistochemistry with clinical tissue samples (n = 211) collected from multiple cancer centers in China and with clinical follow-up. Underlying mechanisms were further explored and discussed using co-expression gene network analyses. This framework represents a simple, efficient, accurate, and comprehensive approach for cancer-omics resource analysis and underlines the necessity to separate the tumors by their histological or pathological subtypes during the clinical evaluation of molecular biomarkers.

  17. Integration of Proteomics and Transcriptomics Data Sets for the Analysis of...

    • acs.figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Paula Díez; Conrad Droste; Rosa M. Dégano; María González-Muñoz; Nieves Ibarrola; Martín Pérez-Andrés; Alba Garin-Muga; Víctor Segura; Gyorgy Marko-Varga; Joshua LaBaer; Alberto Orfao; Fernando J. Corrales; Javier De Las Rivas; Manuel Fuentes (2023). Integration of Proteomics and Transcriptomics Data Sets for the Analysis of a Lymphoma B‑Cell Line in the Context of the Chromosome-Centric Human Proteome Project [Dataset]. http://doi.org/10.1021/acs.jproteome.5b00474.s001
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    ACS Publications
    Authors
    Paula Díez; Conrad Droste; Rosa M. Dégano; María González-Muñoz; Nieves Ibarrola; Martín Pérez-Andrés; Alba Garin-Muga; Víctor Segura; Gyorgy Marko-Varga; Joshua LaBaer; Alberto Orfao; Fernando J. Corrales; Javier De Las Rivas; Manuel Fuentes
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    A comprehensive study of the molecular active landscape of human cells can be undertaken to integrate two different but complementary perspectives: transcriptomics and proteomics. After the genome era, proteomics has emerged as a powerful tool to simultaneously identify and characterize the compendium of thousands of different proteins active in a cell. Thus, the Chromosome-centric Human Proteome Project (C-HPP) is promoting a full characterization of the human proteome combining high-throughput proteomics with the data derived from genome-wide expression profiling of protein-coding genes. Here we present a full proteomic profiling of a human lymphoma B-cell line (Ramos) performed using a nanoUPLC-LTQ-Orbitrap Velos proteomic platform, combined with an in-depth transcriptomic profiling of the same cell type. Data are available via ProteomeXchange with identifier PXD001933. Integration of the proteomic and transcriptomic data sets revealed a 94% overlap in the proteins identified by both -omics approaches. Moreover, functional enrichment analysis of the proteomic profiles showed an enrichment of several functions directly related to the biological and morphological characteristics of B-cells. In turn, about 30% of all protein-coding genes present in the whole human genome were identified as being expressed by the Ramos cells (stable average of 30% of genes across all the chromosomes), revealing the size of the protein expression-set present in one specific human cell type. Additionally, the identification of missing proteins in our data sets has been reported, highlighting the power of the approach. Also, a comparison between neXtProt and UniProt database searches has been performed. In summary, our transcriptomic and proteomic experimental profiling provided a high coverage report of the expressed proteome from a human lymphoma B-cell type with a clear insight into the biological processes that characterized these cells. In this way, we demonstrated the usefulness of combining -omics for a comprehensive characterization of specific biological systems.

  18. BioCyc

    • neuinfo.org
    Cite
    BioCyc [Dataset]. http://identifiers.org/RRID:SCR_002298
    Explore at:
    Description

    A collection of Pathway/Genome Databases, each of which describes the genome and metabolic pathways of a single organism. The BioCyc collection of Pathway/Genome Databases (PGDBs) provides an electronic reference source on the genomes and metabolic pathways of sequenced organisms. BioCyc PGDBs are generated by software that predicts the metabolic pathway complements of completely sequenced organisms from their genome sequences. They also include the results of a number of other computational inference procedures applied to these genomes, including predictions of which genes code for missing enzymes in metabolic pathways, and predicted operons. The BioCyc Web site provides a suite of software tools for database searching and visualization, for omics data analysis, and for comparative genomics and comparative pathway questions. The databases within the BioCyc collection are organized into tiers according to the amount of manual review and updating they have received. Tier 1 PGDBs have been created through intensive manual efforts, and receive continuous updating. Tier 2 PGDBs were computationally generated by the PathoLogic program, and have undergone moderate amounts of review and updating. Tier 3 PGDBs were computationally generated by the PathoLogic program, and have undergone no review or updating. There are 967 DBs in Tier 3. The downloadable version of BioCyc that includes the Pathway Tools software provides more speed and power than the BioCyc Web site.

  19. DiabetesOmic

    • bioregistry.io
    Updated Aug 15, 2025
    Cite
    (2025). DiabetesOmic [Dataset]. https://bioregistry.io/diabetesomic
    Explore at:
    Dataset updated
    Aug 15, 2025
    Description

    DiabetesOmic is a multi-omics database designed to collect and analyze transcriptional regulatory information across five high-throughput sequencing modalities, including ChIP-seq, RNA-seq, ATAC-seq, scATAC-seq, and scRNA-seq. This database's identifiers each represent a single sample. It contains clinical complication annotations including diabetic nephropathy, retinopathy, and atherosclerosis to enhance translational relevance. It enables the identification of disease-associated regulatory elements, epigenetic modifications, and cell type-specific molecular signatures, providing valuable insights into the molecular mechanisms of diabetes and its complications.

  20. CLImate for Maize OMICS: CLIM4OMICS Analytics and Database

    • resodate.org
    • data-staging.niaid.nih.gov
    Updated Jun 25, 2023
    + more versions
    Cite
    Hasnat Aslam; Parisa Sarzaeim; Francisco Munoz-Arriola (2023). CLImate for Maize OMICS: CLIM4OMICS Analytics and Database [Dataset]. https://resodate.org/resources/aHR0cHM6Ly96ZW5vZG8ub3JnL3JlY29yZHMvODAwMjkwOQ==
    Explore at:
    Dataset updated
    Jun 25, 2023
    Dataset provided by
    Zenodo
    Authors
    Hasnat Aslam; Parisa Sarzaeim; Francisco Munoz-Arriola
    Description

    CLIM4OMICS Analytics and Database is an improved database of the G2F data repository that contains OMICs (genetic and phenotypic) and environmental data for maize yield predictability across 84 experimental fields in the U.S. and the province of Ontario, Canada, between 2014 and 2021. The goal of this pipeline is to aggregate, improve, and synthesize multi-dimensional G2F data, including genotype, phenotype, and environmental data, for GxE modeling. This dataset contains 79,122 phenotype measurements, 378 genotypes of maize lines, environmental data for 178 locations, and Python scripts for the quality control (QC) and consistency control (CC) steps and for ML models of GxE interactions. The environmental data are extracted from the NWS, DayMet, and NSRDB databases and processed for QC and CC. The environmental dataset contains the minimum temperature (Tmin), average temperature (Tmean), maximum temperature (Tmax), minimum dew point (DPmin), average dew point (DPmean), maximum dew point (DPmax), minimum relative humidity (RHmin), average relative humidity (RHmean), maximum relative humidity (RHmax), minimum solar radiation (SRmin), average solar radiation (SRmean), maximum solar radiation (SRmax), accumulative rainfall (Racc), average wind speed (WSmean), and average wind direction (WDmean). This package also contains the raw G2F data and preprocessing pipeline.
