100+ datasets found
  1. Data Object 1-1 (Supplemental Data 1-S1)

    • figshare.com
    xlsx
    Updated Nov 12, 2023
    Cite
    Colbie Reed (2023). Data Object 1-1 (Supplemental Data 1-S1) [Dataset]. http://doi.org/10.6084/m9.figshare.24548935.v1
    Available download formats: xlsx
    Dataset updated
    Nov 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Colbie Reed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplemental Data 1-S1. Timeline of important events shaping contemporary bioinformatics and comparative genomics. The timeline is not intended to be a comprehensive history of each field covered. See the footnotes for key review publications and sources beyond those listed in the Reference column. Contributions are color-coded by field: purple = computer science/engineering, blue = legislation/government action, green = biology, orange = economics/markets, pink = academic institutions.

  2. blog-bioinformatics.science - Historical whois Lookup

    • whoisdatacenter.com
    csv
    + more versions
    Cite
    AllHeart Web Inc, blog-bioinformatics.science - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/blog-bioinformatics.science/
    Available download formats: csv
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/terms-of-use/

    Time period covered
    Mar 15, 1985 - Oct 6, 2025
    Description

    Explore the historical Whois records related to blog-bioinformatics.science (Domain). Get insights into ownership history and changes over time.

  3. Biodiversity Informatics at the Natural History Museum

    • figshare.com
    pptx
    Updated Jun 19, 2023
    Cite
    Edward Baker (2023). Biodiversity Informatics at the Natural History Museum [Dataset]. http://doi.org/10.6084/m9.figshare.722897.v1
    Available download formats: pptx
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Edward Baker
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview of the NHM Informatics Initiative, based around the data life cycle.

  4. Test dataset from: GenErode: a bioinformatics pipeline to investigate genome...

    • figshare.scilifelab.se
    • datasetcatalog.nlm.nih.gov
    • +3more
    application/x-gzip
    Updated Jan 15, 2025
    Cite
    Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino (2025). Test dataset from: GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species [Dataset]. http://doi.org/10.17044/scilifelab.19248172.v2
    Available download formats: application/x-gzip
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    National Bioinformatics Infrastructure Sweden (Stockholm University & Science for Life Laboratory)
    Authors
    Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    This item contains a test dataset based on Sumatran rhinoceros (Dicerorhinus sumatrensis) whole-genome re-sequencing data that we publish along with the GenErode pipeline (https://github.com/NBISweden/GenErode; Kutschera et al. 2022) and that we reduced in size so that users have the possibility to get familiar with the pipeline before analyzing their own genome-wide datasets. We extracted scaffold ‘Sc9M7eS_2_HRSCAF_41’ of size 40,842,778 bp from the Sumatran rhinoceros genome assembly (Dicerorhinus sumatrensis harrissoni; GenBank accession number GCA_014189135.1) to be used as reference genome in GenErode. Some GenErode steps require the reference genome of a closely related species, so we additionally provide three scaffolds from the White rhinoceros genome assembly (Ceratotherium simum simum; GenBank accession number GCF_000283155.1) with a combined length of 41,195,616 bp that are putatively orthologous to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with gene predictions in GTF format. The repository also contains a Sumatran rhinoceros mitochondrial genome (GenBank accession number NC_012684.1) to be used as reference for the optional mitochondrial mapping step in GenErode. The test dataset contains whole-genome re-sequencing data from three historical and three modern Sumatran rhinoceros samples from the now-extinct Malay Peninsula population from von Seth et al. (2021) that was subsampled to paired-end reads that mapped to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with a small proportion of randomly selected reads that mapped to the Sumatran rhinoceros mitochondrial genome or elsewhere in the genome. For GERP analyses, scaffolds from the genome assemblies of 30 mammalian outgroup species are provided that had reciprocal blast hits to gene predictions from Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’. 
Further, a phylogeny of the White rhinoceros and the 30 outgroup species including divergence time estimates (in billions of years) from timetree.org is available. Finally, the item contains configuration and metadata files that were used for three separate runs of GenErode to generate the results presented in Kutschera et al. (2022). Bash scripts and a workflow description for the test dataset generation are available in the GenErode GitHub repository (https://github.com/NBISweden/GenErode/docs/extras/test_dataset_generation).

    References:
    Kutschera VE, Kierczak M, van der Valk T, von Seth J, Dussex N, Lord E, et al. GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species. BMC Bioinformatics 2022;23:228. https://doi.org/10.1186/s12859-022-04757-0
    von Seth J, Dussex N, Díez-Del-Molino D, van der Valk T, Kutschera VE, Kierczak M, et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nature Communications 2021;12:2393.

  5. [DATA_SCIENCE] Interviews PomBase Users, January-February 2016

    • figshare.com
    • data.niaid.nih.gov
    • +2more
    doc
    Updated Jun 3, 2023
    Cite
    Sabina Leonelli (2023). [DATA_SCIENCE] Interviews PomBase Users, January-February 2016 [Dataset]. http://doi.org/10.6084/m9.figshare.5484010.v1
    Available download formats: doc
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sabina Leonelli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Here you find the transcripts of interviews collected by Sabina Leonelli as part of the ERC project "The Epistemology of Data-Intensive Science". You also find the information sheet provided to interviewees, which gives you the context for this project. Further information and related publications can be found at www.datastudies.eu. One paper that specifically makes use of these interviews was published by Sabina Leonelli in the journal Philosophy of Science in 2018, under the title "Data in Time: Time-Scales of Data Use in the Life Sciences." The transcripts document yeast researchers' attitudes to data curation and the use of databases in their field. Researchers have consented to have these transcripts made available as Open Data. Other interviewees did not give consent, so those transcripts are held securely by the research team in Exeter.

  6. Swiss-Institute-of-Bioinformatics (Company) - Reverse Whois Lookup

    • whoisdatacenter.com
    csv
    Cite
    AllHeart Web Inc, Swiss-Institute-of-Bioinformatics (Company) - Reverse Whois Lookup [Dataset]. https://whoisdatacenter.com/company/Swiss-Institute-of-Bioinformatics/
    Available download formats: csv
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/terms-of-use/

    Time period covered
    Mar 15, 1985 - Nov 4, 2025
    Description

    Uncover ownership history and changes over time by performing a reverse Whois lookup for the company Swiss-Institute-of-Bioinformatics.

  7. Variation data of pan-genome in 1913-based allotetraploid cottons

    • figshare.com
    txt
    Updated Oct 28, 2020
    Cite
    Jianying Li (2020). Variation data of pan-genome in 1913-based allotetraploid cottons [Dataset]. http://doi.org/10.6084/m9.figshare.13014314.v4
    Available download formats: txt
    Dataset updated
    Oct 28, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jianying Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The variome data sets (SNPs, InDels, SVs, CNVs) for 1,913 cotton accessions, together with non-reference genome sequences and annotated genes of the G. hirsutum and G. barbadense pan-genomes:
    1. SNP and InDel calls in HapMap format for 1,913 cotton accessions.
    2. SVs and CNVs in VCF format for 742 cotton accessions.
    3. Non-reference genome sequences and gene annotations of G. hirsutum and G. barbadense accessions.
    4. Gene number and presence frequency in the G. hirsutum and G. barbadense pan-genomes.

  8. NCBI Nt (Nucleotide) database FASTA file from 2017-10-26

    • zenodo.org
    application/gzip
    Updated Dec 23, 2020
    Cite
    James Fellows Yates (2020). NCBI Nt (Nucleotide) database FASTA file from 2017-10-26 [Dataset]. http://doi.org/10.5281/zenodo.4382154
    Available download formats: application/gzip
    Dataset updated
    Dec 23, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    James Fellows Yates
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    This FASTA file is the NCBI Nt (Nucleotide) database (public domain) used for holistic metagenomic screening of ancient DNA data at the Department of Archaeogenetics at the Max Planck Institute for the Science of Human History. We offer here the FASTA file used to construct MALT databases (https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/malt/), which are generally too large for uploading. Please see the relevant publications that use the database for MALT database construction commands.

    NCBI does not retain older versions of this database which is why this has been uploaded here. It was downloaded on 2017-10-26 12:39 from: ftp://ftp-trace.ncbi.nih.gov/blast/db/FASTA/nt.gz. The NCBI Nt database is released into the public domain as per https://www.ncbi.nlm.nih.gov/home/about/policies/.

  9. Data from: Bioinformatics is a BLAST: Engaging First-Year Biology Students...

    • qubeshub.org
    Updated Oct 4, 2022
    Cite
    Shem Unger*; Mark Rollins (2022). Bioinformatics is a BLAST: Engaging First-Year Biology Students on Campus Biodiversity Using DNA Barcoding [Dataset]. https://qubeshub.org/community/groups/coursesource/publications?id=3520
    Dataset updated
    Oct 4, 2022
    Dataset provided by
    QUBES
    Authors
    Shem Unger*; Mark Rollins
    Description

    In order to introduce students to the concept of molecular diversity, we developed a short, engaging online lesson using basic bioinformatics techniques. Students were introduced to basic bioinformatics while learning about local on-campus species diversity by 1) identifying species based on a given sequence (performing Basic Local Alignment Search Tool [BLAST] analysis) and 2) researching and documenting the natural history of each species identified in a concise write-up. To assess the student’s perception of this lesson, we surveyed students using a Likert scale and asking them to elaborate in written reflection on this activity. When combined, student responses indicated that 94% of students agreed this lesson helped them understand DNA barcoding and how it is used to identify species. The majority of students, 89.5%, reported they enjoyed the lesson and mainly provided positive feedback, including “It really opened my eyes to different species on campus by looking at DNA sequences”, “I loved searching information and discovering all this new information from a DNA sequence”, and finally, “the database was fun to navigate and identifying species felt like a cool puzzle.” Our results indicate this lesson both engaged and informed students on the use of DNA barcoding as a tool to identify local species biodiversity.

    Primary Image: DNA Barcoded Specimens. Crane fly, dragonfly, ant, and spider identified using DNA barcoding.

  10. Bioinformatics Services Market by Type (Sequencing Services, Data Analysis,...

    • zionmarketresearch.com
    pdf
    Updated Nov 12, 2025
    Cite
    Zion Market Research (2025). Bioinformatics Services Market by Type (Sequencing Services, Data Analysis, Drug Discovery Services, Differential Gene Expression Analysis, Database and Management Services, and Other Services), By Application Type (Genomics, Chemoinformatics and Drug Design, Proteomics, Transcriptomics, Metabolomics, and Others), and By End-users (Research Centers & Academic Institutes, Hospitals, Pharmaceutical & Biotechnology Companies, and Others), And By Region - Global And Regional Industry Overview, Market Intelligence, Comprehensive Analysis, Historical Data, And Forecasts 2024 - 2032 [Dataset]. https://www.zionmarketresearch.com/report/bioinformatics-services-market
    Available download formats: pdf
    Dataset updated
    Nov 12, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global Bioinformatics Services market size was USD 3.12 billion in 2023 and is projected to grow to around USD 10.87 billion by 2032, a CAGR of roughly 14.86%.
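
    The growth rate quoted above can be sanity-checked with the usual compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is only an illustration: the function name `cagr` is hypothetical, and treating 2023 to 2032 as nine compounding periods is an assumption about how the report counts years.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.15 means 15% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures taken from the description; the 9-year span is an assumption.
implied = cagr(3.12, 10.87, 2032 - 2023)
print(f"Implied CAGR: {implied:.2%}")
```

    Under that nine-period assumption the implied rate comes out at about 14.9% per year, which is consistent with the "roughly 14.86%" stated in the listing.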

  11. Data from: where the minor things are: a pan-eukaryotic survey suggests...

    • zenodo.org
    • datadryad.org
    application/gzip, bin
    Updated Sep 23, 2023
    Cite
    Graham Larue; Scott Roy (2023). Data from: where the minor things are: a pan-eukaryotic survey suggests neutral processes may dominate minor spliceosomal intron evolution [Dataset]. http://doi.org/10.6071/m36q39
    Available download formats: application/gzip, bin
    Dataset updated
    Sep 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Graham Larue; Scott Roy
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Spliceosomal introns are gene segments removed ("spliced") from RNA transcripts by large ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second spliceosome (the minor/ U12-type) is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying and cataloging minor introns in > 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers of minor introns in some fungi and green algae, repeated massive loss, as well as several general biases in the positional and genic distributions of minor introns. We estimate that ancestral minor intron densities were comparable to those of the most minor intron-rich species, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, we find highly similar patterns of minor and major intron evolution, in contrast to the predictions of both functionalist and deleterious models. Second, we find that observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, we find no association of intron splicing with cell proliferation in a minor intron-rich fungus, suggesting that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view to date of modern minor introns, their evolutionary history, and the forces shaping minor splicing, and provide a foundation for future studies of these remarkable genomic elements.

  12. Bioinformatics In IVD Testing Market By The type of test (blood based tests...

    • zionmarketresearch.com
    pdf
    Updated Nov 22, 2025
    Cite
    Zion Market Research (2025). Bioinformatics In IVD Testing Market By The type of test (blood based tests and tissue based tests), By Application (cancer, chronic diseases, cardiovascular diseases, diabetes, and others), By Type (hardware and software) And By Region: - Global and Regional Industry Overview, Market Intelligence, Comprehensive Analysis, Historical Data, and Forecasts, 2024-2032 [Dataset]. https://www.zionmarketresearch.com/report/bioinformatics-in-ivd-testing-market
    Available download formats: pdf
    Dataset updated
    Nov 22, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The Bioinformatics In IVD Testing Market was valued at USD 97.51 billion in 2023 and is projected to reach USD 171.91 billion by 2032, at a CAGR of 6.44% from 2023 to 2032.

  13. Data from: Graph splitting: a graph-based approach for superfamily-scale...

    • search.dataone.org
    • data.niaid.nih.gov
    • +2more
    Updated Jun 14, 2025
    Cite
    Motomu Matsui; Wataru Iwasaki (2025). Graph splitting: a graph-based approach for superfamily-scale phylogenetic tree reconstruction [Dataset]. http://doi.org/10.5061/dryad.ps0qf4r
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Motomu Matsui; Wataru Iwasaki
    Time period covered
    Jan 1, 2019
    Description

    A protein superfamily contains distantly related proteins that have acquired diverse biological functions through a long evolutionary history. Phylogenetic analysis of the early evolution of protein superfamilies is a key challenge because existing phylogenetic methods show poor performance when protein sequences are too diverged to construct an informative multiple sequence alignment. Here, we propose the Graph Splitting (GS) method, which rapidly reconstructs a protein superfamily-scale phylogenetic tree using a graph-based approach. Evolutionary simulation showed that the GS method can accurately reconstruct phylogenetic trees and be robust to major problems in phylogenetic estimation, such as biased taxon sampling, heterogeneous evolutionary rates, and long-branch attraction when sequences are substantially diverged. Its application to an empirical dataset of the triosephosphate isomerase (TIM)-barrel superfamily suggests rapid evolution of protein-mediated pyrimidine biosynthesis, ...

  14. Data_Sheet_1_Sequence Capture From Historical Museum Specimens: Maximizing...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    xlsx
    Updated Jun 14, 2023
    + more versions
    Cite
    Emily Roycroft; Craig Moritz; Kevin C. Rowe; Adnan Moussalli; Mark D. B. Eldridge; Roberto Portela Miguez; Maxine P. Piggott; Sally Potter (2023). Data_Sheet_1_Sequence Capture From Historical Museum Specimens: Maximizing Value for Population and Phylogenomic Studies.XLSX [Dataset]. http://doi.org/10.3389/fevo.2022.931644.s002
    Available download formats: xlsx
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Frontiers
    Authors
    Emily Roycroft; Craig Moritz; Kevin C. Rowe; Adnan Moussalli; Mark D. B. Eldridge; Roberto Portela Miguez; Maxine P. Piggott; Sally Potter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The application of high-throughput, short-read sequencing to degraded DNA has greatly increased the feasibility of generating genomic data from historical museum specimens. While many published studies report successful sequencing results from historical specimens; in reality, success and quality of sequence data can be highly variable. To examine predictors of sequencing quality, and methodological approaches to improving data accuracy, we generated and analyzed genomic sequence data from 115 historically collected museum specimens up to 180 years old. Data span both population genomic and phylogenomic scales, including historically collected specimens from 34 specimens of four species of Australian rock-wallabies (genus Petrogale) and 92 samples from 79 specimens of Australo-Papuan murine rodents (subfamily Murinae). For historical rodent specimens, where the focus was sampling for phylogenomics, we found that regardless of specimen age, DNA sequence libraries prepared from toe pad or bone subsamples performed significantly better than those taken from the skin (in terms of proportion of reads on target, number of loci captured, and data accuracy). In total, 93% of DNA libraries from toe pad or bone subsamples resulted in reliable data for phylogenetic inference, compared to 63% of skin subsamples. For skin subsamples, proportion of reads on target weakly correlated with collection year. Then using population genomic data from rock-wallaby skins as a test case, we found substantial improvement in final data quality by mapping to a high-quality “closest sister” de novo assembly from fresh tissues, compared to mapping to a sample-specific historical de novo assembly. Choice of mapping approach also affected final estimates of the number of segregating sites and Watterson's θ, both important parameters for population genomic inference. 
The incorporation of accurate and reliable sequence data from historical specimens has important outcomes for evolutionary studies at both population and phylogenomic scales. By assessing the outcomes of different approaches to specimen subsampling, library preparation and bioinformatic processing, our results provide a framework for increasing sequencing success for irreplaceable historical specimens.
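
    The description above notes that the choice of mapping approach changed both the number of segregating sites and Watterson's θ. The two are directly linked: in its textbook form, Watterson's estimator is θ_W = S / a_n, where S is the number of segregating sites, n is the number of sampled sequences, and a_n is the sum of 1/i for i from 1 to n - 1. The sketch below is a minimal illustration of that standard formula, not the authors' pipeline; the function names and the example numbers are hypothetical.

```python
def harmonic_number(n: int) -> float:
    """Sum of 1/i for i = 1..n (returns 0.0 when n is 0)."""
    return sum(1.0 / i for i in range(1, n + 1))


def watterson_theta(segregating_sites: int, n_sequences: int) -> float:
    """Watterson's estimator: theta_W = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    if n_sequences < 2:
        raise ValueError("at least two sequences are required")
    if segregating_sites < 0:
        raise ValueError("segregating_sites must be non-negative")
    a_n = harmonic_number(n_sequences - 1)
    return segregating_sites / a_n


# Hypothetical example: 10 segregating sites observed among 5 sequences.
theta = watterson_theta(10, 5)
print(f"theta_W = {theta:.3f}")
```

    Because a_n depends only on sample size, any segregating sites gained or lost through the mapping strategy shift θ_W proportionally, which is why the two quantities move together in the comparison described above.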

  15. GOTrack

    • neuinfo.org
    Updated Oct 18, 2024
    Cite
    (2024). GOTrack [Dataset]. http://identifiers.org/RRID:SCR_016399
    Dataset updated
    Oct 18, 2024
    Description

    Open source web-based system and database that provides access to historical records and trends in the Gene Ontology (GO) and GO annotations (GOA). Used for monitoring changes in the Gene Ontology and their impact on genomic data analysis.

  16. Data from: An improved hypergeometric probability method for identification...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Apr 16, 2025
    Cite
    Appala Raju Kotaru; Khader Shameer; Pandurangan Sundaramurthy; Ramesh Chandra Joshi (2025). An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles [Dataset]. http://doi.org/10.5061/dryad.m6t4j
    Dataset updated
    Apr 16, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Appala Raju Kotaru; Khader Shameer; Pandurangan Sundaramurthy; Ramesh Chandra Joshi
    Time period covered
    Jun 6, 2013
    Description

    Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Due to the practical limitation of experimental characterization of all proteins encoded in a genome using biochemical studies, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help minimize the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method which considers the co-evolution of the reference genome to derive the basic similarity measure, the background phylogeny of target genomes for profile generation and assigning weights to target genomes. The ordering of genomes and the runs of consecutive matches between the proteins were used to...
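
    For orientation, the baseline that hypergeometric phylogenetic-profile methods build on can be sketched as follows. Given N target genomes, a protein A present in n_a of them, and a protein B present in n_b, the chance of seeing at least k shared genomes under independence is a hypergeometric tail probability. This is only the plain baseline under those assumptions; it does not implement the genome weighting, ordering, or run-based refinements that the truncated description attributes to the improved method, and all names below are hypothetical.

```python
from math import comb


def profile_cooccurrence_pvalue(n_genomes: int, n_a: int, n_b: int, n_both: int) -> float:
    """Hypergeometric tail P(X >= n_both) for two presence/absence profiles.

    n_genomes: number of target genomes in the profile
    n_a, n_b:  number of genomes containing protein A and protein B
    n_both:    number of genomes containing both proteins
    """
    if not (0 <= n_a <= n_genomes and 0 <= n_b <= n_genomes):
        raise ValueError("n_a and n_b must lie between 0 and n_genomes")
    upper = min(n_a, n_b)
    if not (0 <= n_both <= upper):
        raise ValueError("n_both must lie between 0 and min(n_a, n_b)")
    total = comb(n_genomes, n_b)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_a, k) * comb(n_genomes - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return tail / total


# Hypothetical example: two proteins, each in 5 of 10 genomes, always together.
p_value = profile_cooccurrence_pvalue(10, 5, 5, 5)
print(f"p = {p_value:.5f}")
```

    A small tail probability suggests the two profiles overlap more than chance would predict, which is the signal used to propose a functional link between the proteins.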

  17. Supplementary material 4 from: Marquina D, Roslin T, Łukasik P, Ronquist F...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Jul 16, 2024
    + more versions
    Cite
    Marquina, Daniel; Roslin, Tomas; Łukasik, Piotr; Ronquist, Fredrik (2024). Supplementary material 4 from: Marquina D, Roslin T, Łukasik P, Ronquist F (2022) Evaluation of non-destructive DNA extraction protocols for insect metabarcoding: gentler and shorter is better. Metabarcoding and Metagenomics 6: e78871. https://doi.org/10.3897/mbmg.6.78871 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6658749
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
    University of Helsinki, Helsinki, Finland|Swedish University of Agricultural Sciences, Uppsala, Sweden
    Jagiellonian University, Krakow, Poland|Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
    Stockholm University, Stockholm, Sweden|Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
    Authors
    Marquina, Daniel; Roslin, Tomas; Łukasik, Piotr; Ronquist, Fredrik
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Bioinformatic pipeline

  18. A systematic review of the ecological literature on cushion plants

    • commons.datacite.org
    • figshare.com
    Updated Jun 5, 2014
    Cite
    Anya Reid; Laurent Lamarque; Ecoblender (2014). A systematic review of the ecological literature on cushion plants [Dataset]. http://doi.org/10.6084/m9.figshare.1047279
    Dataset updated
    Jun 5, 2014
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Figshare (http://figshare.com/)
    Authors
    Anya Reid; Laurent Lamarque; Ecoblender
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cushion-forming plant species are found in alpine and polar environments around the world. They modify the microclimate, thereby facilitating other plant species. Similar to the effectiveness of shrubs as a means to study facilitation in arid and semi-arid environments, we explore the potential for cushion plant species to expand the generality of research on this contemporary ecological interaction. A systematic review was conducted to determine the number of publications and citation frequency on relevant ecological topics whilst using shrub literature as a baseline to assess the relative importance of cushions as a focal point for future ecological research. Although there are forty times more shrub articles, mean citations per paper are comparable between cushion and shrub literature. Furthermore, the scope of ecological research topics studied using cushions is broad, including facilitation, competition, environmental gradients, life history, genetics, reproduction, community, ecosystem and evolution. The preliminary ecological evidence to date also strongly suggests that cushion plants can be keystone species in their ecosystems. Hence, ecological research on net interactions including facilitation and patterns of diversity can be successfully examined using cushion plants, and this is particularly timely given expectations associated with a changing climate in these regions.

  19. Cell_Gene_Expression_Metadata

    • kaggle.com
    zip
    Updated Sep 24, 2025
    Cite
    Kazi Aishikuzzaman (2025). Cell_Gene_Expression_Metadata [Dataset]. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata
    Available download formats: zip (845887409 bytes)
    Dataset updated
    Sep 24, 2025
    Authors
    Kazi Aishikuzzaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview This dataset contains comprehensive metadata from single-cell gene expression studies, providing researchers with structured information about cellular phenotypes, experimental conditions, and sample characteristics. The data is particularly valuable for bioinformatics research, machine learning applications in genomics, and comparative studies across different cell types and conditions.

    Dataset Description: The dataset comprises metadata associated with single-cell RNA sequencing (scRNA-seq) experiments, including:
    • Cell Type Information: Classification of different cell types and subtypes
    • Experimental Metadata: Details about experimental conditions, protocols, and methodologies
    • Sample Characteristics: Information about biological samples, including tissue origin, developmental stages, and treatment conditions
    • Quality Metrics: Data quality indicators and filtering parameters
    • Annotation Details: Standardized cell type annotations and biological classifications

    Data Source and Licensing: This dataset is derived from publicly available single-cell gene expression data, potentially sourced from:
    • CELLxGENE Data Portal (https://cellxgene.cziscience.com/)
    • Gene Expression Omnibus (GEO)
    • European Bioinformatics Institute (EBI)
    • Other public genomics repositories

    License: Creative Commons CC BY 4.0 (or specify the actual license)
    • ✅ Commercial use allowed
    • ✅ Modification allowed
    • ✅ Distribution allowed
    • ✅ Private use allowed
    • ❗ Attribution required

    Research Applications:
    • Cell Type Discovery: Identify novel cell types and subtypes
    • Comparative Genomics: Study cellular differences across conditions, tissues, or species
    • Disease Research: Investigate cellular changes in disease states
    • Developmental Biology: Analyze cellular differentiation and development patterns

    Machine Learning Applications:
    • Classification Tasks: Predict cell types from gene expression data
    • Clustering Analysis: Discover cellular subpopulations and states
    • Dimensionality Reduction: Apply PCA, t-SNE, UMAP for visualization
    • Biomarker Discovery: Identify genes characteristic of specific cell types

    Educational Use:
    • Teaching bioinformatics and computational biology concepts
    • Demonstrating single-cell analysis workflows
    • Training in data preprocessing and quality control

    Data Quality and Preprocessing:
    • Quality Control: Metadata has been curated and standardized
    • Missing Values: [Specify how missing values are handled]
    • Standardization: Cell type annotations follow established ontologies (e.g., Cell Ontology)
    • Validation: Data has been cross-referenced with original publications

    Usage Guidelines: Getting Started
    • Load the metadata files using pandas or your preferred data analysis tool.
    • Explore the cell type distributions and experimental conditions.
    • Filter data based on quality metrics as needed.
    • Join with corresponding gene expression data for comprehensive analysis.
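    The getting-started steps above (load, explore, filter, join) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the dataset author's workflow. The inline CSV text stands in for files such as metadata_summary.csv and quality_metrics.csv, and the column names (cell_id, cell_type, tissue, n_genes) and the min_genes cutoff are assumptions made up for the example; the real files may use different headers.

```python
import csv
import io
from collections import Counter

# Toy stand-ins for metadata_summary.csv and quality_metrics.csv.
# All column names and values here are hypothetical.
METADATA_CSV = """cell_id,cell_type,tissue
c1,T cell,blood
c2,B cell,blood
c3,T cell,lung
c4,NA,lung
"""
QUALITY_CSV = """cell_id,n_genes
c1,1800
c2,150
c3,2400
c4,900
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def cell_type_counts(rows):
    """Count cells per annotated type, skipping missing annotations."""
    return Counter(
        row["cell_type"] for row in rows if row["cell_type"] not in ("", "NA")
    )


def join_and_filter(metadata, quality, min_genes):
    """Inner-join on cell_id, keeping cells with at least min_genes genes."""
    genes_by_cell = {q["cell_id"]: int(q["n_genes"]) for q in quality}
    return [
        {**m, "n_genes": genes_by_cell[m["cell_id"]]}
        for m in metadata
        if m["cell_id"] in genes_by_cell
        and genes_by_cell[m["cell_id"]] >= min_genes
    ]


metadata = read_rows(METADATA_CSV)
quality = read_rows(QUALITY_CSV)
counts = cell_type_counts(metadata)
kept = join_and_filter(metadata, quality, min_genes=500)
print(counts.most_common())
print([row["cell_id"] for row in kept])
```

    With pandas, the same steps would typically map to read_csv, value_counts, boolean filtering, and merge on the shared identifier column.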

    Best Practices:
    • Always cite original data sources and publications.
    • Consider batch effects when combining data from different experiments.
    • Validate findings with independent datasets when possible.
    • Follow established bioinformatics workflows for single-cell analysis.

    Citation and Acknowledgments: If you use this dataset in your research, please cite it as: [Kazi Aishikuzzaman]. (2024). Cell Gene Expression Metadata. Kaggle. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata

    File Structure:
    dataset
    ─ metadata_summary.csv # Main metadata file
    ─ cell_type_annotations.csv # Detailed cell type information
    ─ experimental_conditions.csv # Experiment-specific metadata
    ─ quality_metrics.csv # Data quality indicators
    ─ README.txt # Detailed file descriptions

    Technical Specifications:
    • File Encoding: UTF-8
    • Separator: Comma-separated values (CSV)
    • Missing Values: Represented as 'NA' or empty cells
    • Data Types: Mixed (categorical, numerical, text)
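    Because missing values appear either as 'NA' or as empty cells, it helps to normalize both to a single marker at load time. The sketch below does this with Python's csv module. The sample rows and column names (cell_id, tissue, stage) are invented for illustration, and the commented open() call shows how a real UTF-8 file would be passed in instead.

```python
import csv
import io

# Tokens the listing says may stand for a missing value.
MISSING_TOKENS = {"", "NA"}


def normalize(value):
    """Map 'NA' and empty cells to None; leave other values unchanged."""
    if isinstance(value, str) and value.strip() in MISSING_TOKENS:
        return None
    return value


def load_metadata(stream):
    """Read comma-separated rows from a text stream with missing values as None."""
    reader = csv.DictReader(stream)
    return [{key: normalize(value) for key, value in row.items()} for row in reader]


# For a real file, assuming it exists locally:
#   with open("metadata_summary.csv", encoding="utf-8", newline="") as handle:
#       rows = load_metadata(handle)
sample = io.StringIO("cell_id,tissue,stage\nc1,blood,NA\nc2,,adult\n")
rows = load_metadata(sample)

# Count missing entries per column as a quick completeness check.
missing_per_column = {
    column: sum(row[column] is None for row in rows) for column in rows[0]
}
print(missing_per_column)
```

    Treating both spellings as one marker keeps later filters and counts from silently treating the string 'NA' as a real category.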

    Contact and Support: For questions about this dataset:
    • Kaggle Profile: @kaziaishikuzzaman
    • Dataset Issues: Use Kaggle's discussion section
    • Collaboration: Open to research collaborations and improvements

    Version History:
    • v1.0: Initial release with comprehensive metadata collection
    • [Future versions]: Updates and additional annotations as available

    Related Datasets: Consider exploring these complementary datasets:
    • Single-cell gene expression data (companion to this metadata)
    • Cell atlas datasets from major consortiums
    • Disease-specific single-cell studies
    • Multi-omics datasets with matching cell types

    Keywords: single-cell, RNA-seq, genomics, cell types, metadata, bioinformatics, machine learning, computational biology
    Category: Biology > Genomics

  20. Supplementary Material to the Publication: Clonal relation between...

    • data.europa.eu
    • zenodo.org
    unknown
    Updated Jul 3, 2025
    Cite
    Zenodo (2025). Supplementary Material to the Publication: Clonal relation between Salmonella enterica subspecies enterica serovar Dublin strains of bovine and food origin in Germany [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-8009331?locale=el
    Explore at:
    Available download formats: unknown (4092)
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    OHEJP Project: BeOne. Salmonella enterica serovar Dublin (S. Dublin) is a host-adapted serovar that causes enteritis and/or systemic disease in cattle. Because the serovar is not host-specific, it can infect other species, including humans, causing severe disease and a higher mortality rate than other non-typhoidal serovars. Given that human illnesses are primarily caused by contaminated milk, milk products, and beef, data on the genetic connection between S. Dublin strains from livestock and food should be analyzed. Whole genome sequencing (WGS) was performed on 144 S. Dublin strains from cattle and 30 strains from food. Multilocus sequence typing (MLST) found that the majority of livestock and food isolates were of sequence type ST-10. Core-genome single-nucleotide polymorphism (SNP) typing and core-genome MLST showed that 14 of the 30 strains of food origin were clonally related to at least one strain from cattle. The remaining 16 food-borne strains fit into the genomic structure of S. Dublin in Germany without outliers. WGS proved to be an effective method not only for learning about the epidemiology of Salmonella strains, but also for detecting clonal relationships between organisms isolated at different stages of production. This study found a strong genetic link between S. Dublin strains from cattle and food, and thus the potential to cause human infections. S. Dublin strains from both origins carry a nearly comparable set of virulence factors, which underlines their ability to produce severe clinical symptoms in animals as well as humans and the importance of effective S. Dublin management in a farm-to-fork strategy.
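    To illustrate the idea behind core-genome MLST comparison, the sketch below counts allele differences between strain profiles and flags food strains that fall within a distance cutoff of any cattle strain. It is a toy illustration only: the description does not state the study's pipeline, loci, or clonality threshold, so the profiles, strain names, and max_distance value here are all invented assumptions.

```python
def allele_distance(profile_a, profile_b):
    """Count loci where both profiles have a called allele and the calls differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def food_strains_linked_to_cattle(food, cattle, max_distance):
    """Return names of food strains within max_distance of any cattle strain."""
    return sorted(
        name
        for name, profile in food.items()
        if any(
            allele_distance(profile, cattle_profile) <= max_distance
            for cattle_profile in cattle.values()
        )
    )


# Hypothetical five-locus profiles; None marks a locus with no allele call.
cattle = {
    "cattle_1": [1, 4, 2, 7, 3],
    "cattle_2": [1, 5, 2, 9, 3],
}
food = {
    "food_1": [1, 4, 2, 7, 3],
    "food_2": [2, 6, 8, 1, 5],
    "food_3": [1, 5, None, 9, 4],
}

linked = food_strains_linked_to_cattle(food, cattle, max_distance=1)
print(linked)
```

    Real cgMLST schemes compare thousands of loci, and the choice of cutoff and the handling of missing calls both affect which strains are reported as related.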
