100+ datasets found

Data Object 1-1 (Supplemental Data 1-S1)
figshare.com
xlsx
Updated Nov 12, 2023
Cite
Colbie Reed (2023). Data Object 1-1 (Supplemental Data 1-S1) [Dataset]. http://doi.org/10.6084/m9.figshare.24548935.v1
Available download formats
xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.24548935.v1
Dataset updated
Nov 12, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Colbie Reed
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplemental Data 1-S1. Timeline of important events shaping contemporary bioinformatics and comparative genomics. The timeline is not intended to be comprehensive of each of the fields covered or of their respective histories. See footnotes for key review publications and sources in addition to those listed in the Reference column. Fields of contribution are color-coded: purple = computer science/engineering, blue = legislation/government action, green = biology, orange = economics/markets, pink = academic institution.
blog-bioinformatics.science - Historical whois Lookup
whoisdatacenter.com
csv
+ more versions
Cite
AllHeart Web Inc, blog-bioinformatics.science - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/blog-bioinformatics.science/
Available download formats
csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Oct 6, 2025
Description
Explore the historical Whois records related to blog-bioinformatics.science (Domain). Get insights into ownership history and changes over time.
Biodiversity Informatics at the Natural History Museum
figshare.com
pptx
Updated Jun 19, 2023
Cite
Edward Baker (2023). Biodiversity Informatics at the Natural History Museum [Dataset]. http://doi.org/10.6084/m9.figshare.722897.v1
Available download formats
pptx
Unique identifier
https://doi.org/10.6084/m9.figshare.722897.v1
Dataset updated
Jun 19, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Edward Baker
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview of the NHM Informatics Initiative, organized around the data life cycle.
Test dataset from: GenErode: a bioinformatics pipeline to investigate genome...
figshare.scilifelab.se
datasetcatalog.nlm.nih.gov
+3 more
application/x-gzip
Updated Jan 15, 2025
Cite
Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino (2025). Test dataset from: GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species [Dataset]. http://doi.org/10.17044/scilifelab.19248172.v2
Available download formats
application/x-gzip
Unique identifier
https://doi.org/10.17044/scilifelab.19248172.v2
Dataset updated
Jan 15, 2025
Dataset provided by
National Bioinformatics Infrastructure Sweden (Stockholm University & Science for Life Laboratory)
Authors
Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
This item contains a test dataset based on Sumatran rhinoceros (Dicerorhinus sumatrensis) whole-genome re-sequencing data that we publish along with the GenErode pipeline (https://github.com/NBISweden/GenErode; Kutschera et al. 2022) and that we reduced in size so that users have the possibility to get familiar with the pipeline before analyzing their own genome-wide datasets. We extracted scaffold ‘Sc9M7eS_2_HRSCAF_41’ of size 40,842,778 bp from the Sumatran rhinoceros genome assembly (Dicerorhinus sumatrensis harrissoni; GenBank accession number GCA_014189135.1) to be used as reference genome in GenErode. Some GenErode steps require the reference genome of a closely related species, so we additionally provide three scaffolds from the White rhinoceros genome assembly (Ceratotherium simum simum; GenBank accession number GCF_000283155.1) with a combined length of 41,195,616 bp that are putatively orthologous to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with gene predictions in GTF format. The repository also contains a Sumatran rhinoceros mitochondrial genome (GenBank accession number NC_012684.1) to be used as reference for the optional mitochondrial mapping step in GenErode. The test dataset contains whole-genome re-sequencing data from three historical and three modern Sumatran rhinoceros samples from the now-extinct Malay Peninsula population from von Seth et al. (2021) that was subsampled to paired-end reads that mapped to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with a small proportion of randomly selected reads that mapped to the Sumatran rhinoceros mitochondrial genome or elsewhere in the genome. For GERP analyses, scaffolds from the genome assemblies of 30 mammalian outgroup species are provided that had reciprocal blast hits to gene predictions from Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’. 
Further, a phylogeny of the White rhinoceros and the 30 outgroup species including divergence time estimates (in billions of years) from timetree.org is available. Finally, the item contains configuration and metadata files that were used for three separate runs of GenErode to generate the results presented in Kutschera et al. (2022). Bash scripts and a workflow description for the test dataset generation are available in the GenErode GitHub repository (https://github.com/NBISweden/GenErode/docs/extras/test_dataset_generation).

References:
Kutschera VE, Kierczak M, van der Valk T, von Seth J, Dussex N, Lord E, et al. GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species. BMC Bioinformatics 2022;23:228. https://doi.org/10.1186/s12859-022-04757-0
von Seth J, Dussex N, Díez-Del-Molino D, van der Valk T, Kutschera VE, Kierczak M, et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nature Communications 2021;12:2393.
[DATA_SCIENCE] Interviews PomBase Users, January-February 2016
figshare.com
data.niaid.nih.gov
+2 more
doc
Updated Jun 3, 2023
Cite
Sabina Leonelli (2023). [DATA_SCIENCE] Interviews PomBase Users, January-February 2016 [Dataset]. http://doi.org/10.6084/m9.figshare.5484010.v1
Available download formats
doc
Unique identifier
https://doi.org/10.6084/m9.figshare.5484010.v1
Dataset updated
Jun 3, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Sabina Leonelli
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here you find the transcripts of interviews collected by Sabina Leonelli as part of the ERC project "The Epistemology of Data-Intensive Science". You also find the information sheet provided to interviewees, which gives you the context for this project. Further information and related publications can be found at www.datastudies.eu. One paper that specifically makes use of these interviews was published by Sabina Leonelli in the journal Philosophy of Science in 2018, under the title "Data in Time: Time-Scales of Data Use in the Life Sciences." The transcripts document yeast researchers' attitudes to data curation and the use of databases in their field. Researchers have consented to have these transcripts made available as Open Data. Other interviewees did not give consent, so those transcripts are held securely by the research team in Exeter.
Swiss-Institute-of-Bioinformatics (Company) - Reverse Whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, Swiss-Institute-of-Bioinformatics (Company) - Reverse Whois Lookup [Dataset]. https://whoisdatacenter.com/company/Swiss-Institute-of-Bioinformatics/
Available download formats
csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Nov 4, 2025
Description
Uncover ownership history and changes over time by performing a reverse Whois lookup for the company Swiss-Institute-of-Bioinformatics.
Variation data of pan-genome in 1913-based allotetraploid cottons
figshare.com
txt
Updated Oct 28, 2020
Cite
Jianying Li (2020). Variation data of pan-genome in 1913-based allotetraploid cottons [Dataset]. http://doi.org/10.6084/m9.figshare.13014314.v4
Available download formats
txt
Unique identifier
https://doi.org/10.6084/m9.figshare.13014314.v4
Dataset updated
Oct 28, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jianying Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Variation data of pan-genome in 1913-based allotetraploid cottons. The variome data sets (SNPs, InDels, SVs, CNVs) in 1,913 cotton accessions, plus non-reference genome sequences and annotated genes of the G. hirsutum and G. barbadense pan-genomes.
1. The SNP and InDel calls in hapmap format for 1,913 cotton accessions.
2. The SVs and CNVs in VCF format for 742 cotton accessions.
3. The non-reference genome sequences and gene annotations of G. hirsutum and G. barbadense accessions.
4. Gene number and presence frequency in the G. hirsutum and G. barbadense pan-genomes.
NCBI Nt (Nucleotide) database FASTA file from 2017-10-26
zenodo.org
application/gzip
Updated Dec 23, 2020
Cite
James Fellows Yates (2020). NCBI Nt (Nucleotide) database FASTA file from 2017-10-26 [Dataset]. http://doi.org/10.5281/zenodo.4382154
Available download formats
application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.4382154
Dataset updated
Dec 23, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
James Fellows Yates
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
This FASTA file is the NCBI Nt (Nucleotide) database (public domain) used for holistic metagenomic screening of ancient DNA data at the Department of Archaeogenetics at the Max Planck Institute for the Science of Human History. We offer here the FASTA file used to construct MALT databases (https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/malt/), which are generally too large for uploading. Please see the relevant publications that use the database for MALT database construction commands.

NCBI does not retain older versions of this database which is why this has been uploaded here. It was downloaded on 2017-10-26 12:39 from: ftp://ftp-trace.ncbi.nih.gov/blast/db/FASTA/nt.gz. The NCBI Nt database is released into the public domain as per https://www.ncbi.nlm.nih.gov/home/about/policies/.
Data from: Bioinformatics is a BLAST: Engaging First-Year Biology Students...
qubeshub.org
Updated Oct 4, 2022
Cite
Shem Unger*; Mark Rollins (2022). Bioinformatics is a BLAST: Engaging First-Year Biology Students on Campus Biodiversity Using DNA Barcoding [Dataset]. https://qubeshub.org/community/groups/coursesource/publications?id=3520
Dataset updated
Oct 4, 2022
Dataset provided by
QUBES
Authors
Shem Unger*; Mark Rollins
Description
In order to introduce students to the concept of molecular diversity, we developed a short, engaging online lesson using basic bioinformatics techniques. Students were introduced to basic bioinformatics while learning about local on-campus species diversity by 1) identifying species based on a given sequence (performing Basic Local Alignment Search Tool [BLAST] analysis) and 2) researching and documenting the natural history of each species identified in a concise write-up. To assess students’ perceptions of this lesson, we surveyed students using a Likert scale and asked them to elaborate on the activity in a written reflection. When combined, student responses indicated that 94% of students agreed this lesson helped them understand DNA barcoding and how it is used to identify species. The majority of students, 89.5%, reported they enjoyed the lesson and mainly provided positive feedback, including “It really opened my eyes to different species on campus by looking at DNA sequences”, “I loved searching information and discovering all this new information from a DNA sequence”, and finally, “the database was fun to navigate and identifying species felt like a cool puzzle.” Our results indicate this lesson both engaged and informed students on the use of DNA barcoding as a tool to identify local species biodiversity.

Primary Image: DNA Barcoded Specimens. Crane fly, dragonfly, ant, and spider identified using DNA barcoding.
Bioinformatics Services Market by Type (Sequencing Services, Data Analysis,...
zionmarketresearch.com
pdf
Updated Nov 12, 2025
Cite
Zion Market Research (2025). Bioinformatics Services Market by Type (Sequencing Services, Data Analysis, Drug Discovery Services, Differential Gene Expression Analysis, Database and Management Services, and Other Services), By Application Type (Genomics, Chemoinformatics and Drug Design, Proteomics, Transcriptomics, Metabolomics, and Others), and By End-users (Research Centers & Academic Institutes, Hospitals, Pharmaceutical & Biotechnology Companies, and Others), And By Region - Global And Regional Industry Overview, Market Intelligence, Comprehensive Analysis, Historical Data, And Forecasts 2024 - 2032 [Dataset]. https://www.zionmarketresearch.com/report/bioinformatics-services-market
Available download formats
pdf
Dataset updated
Nov 12, 2025
Dataset authored and provided by
Zion Market Research
License
https://www.zionmarketresearch.com/privacy-policy
Time period covered
2022 - 2030
Area covered
Global
Description
The global Bioinformatics Services market size was USD 3.12 billion in 2023 and is projected to grow to around USD 10.87 billion by 2032, a CAGR of roughly 14.86%.
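The three figures in this description can be cross-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is illustrative only: the function names are hypothetical, and it assumes the growth period is the nine years from 2023 to 2032, which the listing does not state explicitly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value, and a span in years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` once per year for `years` years."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    # Figures quoted in the listing (USD billion); the nine-year span is an assumption.
    start, end, years = 3.12, 10.87, 2032 - 2023
    print(f"implied CAGR: {cagr(start, end, years):.2%}")
    print(f"value projected at 14.86%: {project(start, 0.1486, years):.2f}")
```

Under the nine-year assumption, both results should land close to the quoted rate and end value; a different span (for example, counting from 2024) would give a noticeably different rate.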
Data from: where the minor things are: a pan-eukaryotic survey suggests...
zenodo.org
datadryad.org
application/gzip, bin
Updated Sep 23, 2023
Cite
Graham Larue; Scott Roy (2023). Data from: where the minor things are: a pan-eukaryotic survey suggests neutral processes may dominate minor spliceosomal intron evolution [Dataset]. http://doi.org/10.6071/m36q39
Available download formats
application/gzip, bin
Unique identifier
https://doi.org/10.6071/m36q39
Dataset updated
Sep 23, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Graham Larue; Scott Roy
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Spliceosomal introns are gene segments removed ("spliced") from RNA transcripts by large ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second spliceosome (the minor/ U12-type) is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying and cataloging minor introns in > 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers of minor introns in some fungi and green algae, repeated massive loss, as well as several general biases in the positional and genic distributions of minor introns. We estimate that ancestral minor intron densities were comparable to those of the most minor intron-rich species, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, we find highly similar patterns of minor and major intron evolution, in contrast to the predictions of both functionalist and deleterious models. Second, we find that observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, we find no association of intron splicing with cell proliferation in a minor intron-rich fungus, suggesting that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view to date of modern minor introns, their evolutionary history, and the forces shaping minor splicing, and provide a foundation for future studies of these remarkable genomic elements.
Bioinformatics In IVD Testing Market By The type of test (blood based tests...
zionmarketresearch.com
pdf
Updated Nov 22, 2025
Cite
Zion Market Research (2025). Bioinformatics In IVD Testing Market By The type of test (blood based tests and tissue based tests), By Application (cancer, chronic diseases, cardiovascular diseases, diabetes, and others), By Type (hardware and software) And By Region: - Global and Regional Industry Overview, Market Intelligence, Comprehensive Analysis, Historical Data, and Forecasts, 2024-2032 [Dataset]. https://www.zionmarketresearch.com/report/bioinformatics-in-ivd-testing-market
Available download formats
pdf
Dataset updated
Nov 22, 2025
Dataset authored and provided by
Zion Market Research
License
https://www.zionmarketresearch.com/privacy-policy
Time period covered
2022 - 2030
Area covered
Global
Description
The Bioinformatics In IVD Testing Market was valued at USD 97.51 billion in 2023 and is projected to reach USD 171.91 billion by 2032, at a CAGR of 6.44% from 2023 to 2032.
Data from: Graph splitting: a graph-based approach for superfamily-scale...
search.dataone.org
data.niaid.nih.gov
+2 more
Updated Jun 14, 2025
Cite
Motomu Matsui; Wataru Iwasaki (2025). Graph splitting: a graph-based approach for superfamily-scale phylogenetic tree reconstruction [Dataset]. http://doi.org/10.5061/dryad.ps0qf4r
Unique identifier
https://doi.org/10.5061/dryad.ps0qf4r
Dataset updated
Jun 14, 2025
Dataset provided by
Dryad Digital Repository
Authors
Motomu Matsui; Wataru Iwasaki
Time period covered
Jan 1, 2019
Description
A protein superfamily contains distantly related proteins that have acquired diverse biological functions through a long evolutionary history. Phylogenetic analysis of the early evolution of protein superfamilies is a key challenge because existing phylogenetic methods show poor performance when protein sequences are too diverged to construct an informative multiple sequence alignment. Here, we propose the Graph Splitting (GS) method, which rapidly reconstructs a protein superfamily-scale phylogenetic tree using a graph-based approach. Evolutionary simulation showed that the GS method can accurately reconstruct phylogenetic trees and be robust to major problems in phylogenetic estimation, such as biased taxon sampling, heterogeneous evolutionary rates, and long-branch attraction when sequences are substantially diverged. Its application to an empirical dataset of the triosephosphate isomerase (TIM)-barrel superfamily suggests rapid evolution of protein-mediated pyrimidine biosynthesis, ...
Data_Sheet_1_Sequence Capture From Historical Museum Specimens: Maximizing...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
+1 more
xlsx
Updated Jun 14, 2023
+ more versions
Cite
Emily Roycroft; Craig Moritz; Kevin C. Rowe; Adnan Moussalli; Mark D. B. Eldridge; Roberto Portela Miguez; Maxine P. Piggott; Sally Potter (2023). Data_Sheet_1_Sequence Capture From Historical Museum Specimens: Maximizing Value for Population and Phylogenomic Studies.XLSX [Dataset]. http://doi.org/10.3389/fevo.2022.931644.s002
Available download formats
xlsx
Unique identifier
https://doi.org/10.3389/fevo.2022.931644.s002
Dataset updated
Jun 14, 2023
Dataset provided by
Frontiers
Authors
Emily Roycroft; Craig Moritz; Kevin C. Rowe; Adnan Moussalli; Mark D. B. Eldridge; Roberto Portela Miguez; Maxine P. Piggott; Sally Potter
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The application of high-throughput, short-read sequencing to degraded DNA has greatly increased the feasibility of generating genomic data from historical museum specimens. While many published studies report successful sequencing results from historical specimens, in reality, success and quality of sequence data can be highly variable. To examine predictors of sequencing quality, and methodological approaches to improving data accuracy, we generated and analyzed genomic sequence data from 115 historically collected museum specimens up to 180 years old. Data span both population genomic and phylogenomic scales, including 34 historically collected specimens of four species of Australian rock-wallabies (genus Petrogale) and 92 samples from 79 specimens of Australo-Papuan murine rodents (subfamily Murinae). For historical rodent specimens, where the focus was sampling for phylogenomics, we found that regardless of specimen age, DNA sequence libraries prepared from toe pad or bone subsamples performed significantly better than those taken from the skin (in terms of proportion of reads on target, number of loci captured, and data accuracy). In total, 93% of DNA libraries from toe pad or bone subsamples resulted in reliable data for phylogenetic inference, compared to 63% of skin subsamples. For skin subsamples, proportion of reads on target weakly correlated with collection year. Then, using population genomic data from rock-wallaby skins as a test case, we found substantial improvement in final data quality by mapping to a high-quality “closest sister” de novo assembly from fresh tissues, compared to mapping to a sample-specific historical de novo assembly. Choice of mapping approach also affected final estimates of the number of segregating sites and Watterson's θ, both important parameters for population genomic inference.
The incorporation of accurate and reliable sequence data from historical specimens has important outcomes for evolutionary studies at both population and phylogenomic scales. By assessing the outcomes of different approaches to specimen subsampling, library preparation and bioinformatic processing, our results provide a framework for increasing sequencing success for irreplaceable historical specimens.
GOTrack
neuinfo.org
Updated Oct 18, 2024
Cite
(2024). GOTrack [Dataset]. http://identifiers.org/RRID:SCR_016399
Unique identifier
https://identifiers.org/RRID:SCR_016399 https://identifiers.org/RRID:SCR_016399/resolver/mentions
Dataset updated
Oct 18, 2024
Description
Open source web-based system and database that provides access to historical records and trends in the Gene Ontology (GO) and GO annotations (GOA). Used for monitoring changes in the Gene Ontology and their impact on genomic data analysis.
Data from: An improved hypergeometric probability method for identification...
search.dataone.org
data.niaid.nih.gov
+1 more
Updated Apr 16, 2025
Cite
Appala Raju Kotaru; Khader Shameer; Pandurangan Sundaramurthy; Ramesh Chandra Joshi (2025). An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles [Dataset]. http://doi.org/10.5061/dryad.m6t4j
Unique identifier
https://doi.org/10.5061/dryad.m6t4j
Dataset updated
Apr 16, 2025
Dataset provided by
Dryad Digital Repository
Authors
Appala Raju Kotaru; Khader Shameer; Pandurangan Sundaramurthy; Ramesh Chandra Joshi
Time period covered
Jun 6, 2013
Description
Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Due to the practical limitation of experimental characterization of all proteins encoded in a genome using biochemical studies, bioinformatics methods provide powerful tools for function annotation and prediction. These methods also help minimize the growing sequence-to-function gap. Phylogenetic profiling is a bioinformatics approach to identify the influence of a trait across species and can be employed to infer the evolutionary history of proteins encoded in genomes. Here we propose an improved phylogenetic profile-based method which considers the co-evolution of the reference genome to derive the basic similarity measure, the background phylogeny of target genomes for profile generation and assigning weights to target genomes. The ordering of genomes and the runs of consecutive matches between the proteins were used to...
Supplementary material 4 from: Marquina D, Roslin T, Łukasik P, Ronquist F...
data.niaid.nih.gov
data-staging.niaid.nih.gov
Updated Jul 16, 2024
+ more versions
Cite
Marquina, Daniel; Roslin, Tomas; Łukasik, Piotr; Ronquist, Fredrik (2024). Supplementary material 4 from: Marquina D, Roslin T, Łukasik P, Ronquist F (2022) Evaluation of non-destructive DNA extraction protocols for insect metabarcoding: gentler and shorter is better. Metabarcoding and Metagenomics 6: e78871. https://doi.org/10.3897/mbmg.6.78871 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6658749
Dataset updated
Jul 16, 2024
Dataset provided by
Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
University of Helsinki, Helsinki, Finland|Swedish University of Agricultural Sciences, Uppsala, Sweden
Jagiellonian University, Krakow, Poland|Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Stockholm University, Stockholm, Sweden|Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
Authors
Marquina, Daniel; Roslin, Tomas; Łukasik, Piotr; Ronquist, Fredrik
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Bioinformatic pipeline
A systematic review of the ecological literature on cushion plants
commons.datacite.org
figshare.com
Updated Jun 5, 2014
Cite
Anya Reid; Laurent Lamarque; Ecoblender (2014). A systematic review of the ecological literature on cushion plants [Dataset]. http://doi.org/10.6084/m9.figshare.1047279
Unique identifier
https://doi.org/10.6084/m9.figshare.1047279
Dataset updated
Jun 5, 2014
Dataset provided by
DataCite (https://www.datacite.org/)
Figshare (http://figshare.com/)
Authors
Anya Reid; Laurent Lamarque; Ecoblender
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cushion-forming plant species are found in alpine and polar environments around the world. They modify the microclimate, thereby facilitating other plant species. Similar to the effectiveness of shrubs as a means to study facilitation in arid and semi-arid environments, we explore the potential for cushion plant species to expand the generality of research on this contemporary ecological interaction. A systematic review was conducted to determine the number of publications and citation frequency on relevant ecological topics whilst using shrub literature as a baseline to assess relative importance of cushions as a focal point for future ecological research. Although there are forty times more shrub articles, mean citations per paper are comparable between cushion and shrub literature. Furthermore, the scope of ecological research topics studied using cushions is broad, including facilitation, competition, environmental gradients, life history, genetics, reproduction, community, ecosystem and evolution. The preliminary ecological evidence to date also strongly suggests that cushion plants can be keystone species in their ecosystems. Hence, ecological research on net interactions including facilitation and patterns of diversity can be successfully examined using cushion plants, and this is particularly timely given expectations associated with a changing climate in these regions.
Cell_Gene_Expression_Metadata
kaggle.com
zip
Updated Sep 24, 2025
Cite
Kazi Aishikuzzaman (2025). Cell_Gene_Expression_Metadata [Dataset]. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata
Available download formats
zip (845887409 bytes)
Dataset updated
Sep 24, 2025
Authors
Kazi Aishikuzzaman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview This dataset contains comprehensive metadata from single-cell gene expression studies, providing researchers with structured information about cellular phenotypes, experimental conditions, and sample characteristics. The data is particularly valuable for bioinformatics research, machine learning applications in genomics, and comparative studies across different cell types and conditions.

Dataset Description: The dataset comprises metadata associated with single-cell RNA sequencing (scRNA-seq) experiments, including: Cell Type Information: Classification of different cell types and subtypes Experimental Metadata: Details about experimental conditions, protocols, and methodologies Sample Characteristics: Information about biological samples, including tissue origin, developmental stages, and treatment conditions Quality Metrics: Data quality indicators and filtering parameters Annotation Details: Standardized cell type annotations and biological classifications

Data Source and Licensing This dataset is derived from publicly available single-cell gene expression data, potentially sourced from: CELLxGENE Data Portal (https://cellxgene.cziscience.com/) Gene Expression Omnibus (GEO) European Bioinformatics Institute (EBI) Other public genomics repositories

License: Creative Commons CC BY 4.0 (or specify the actual license) ✅ Commercial use allowed ✅ Modification allowed ✅ Distribution allowed ✅ Private use allowed ❗ Attribution required

Research Applications Cell Type Discovery: Identify novel cell types and subtypes Comparative Genomics: Study cellular differences across conditions, tissues, or species Disease Research: Investigate cellular changes in disease states Developmental Biology: Analyze cellular differentiation and development patterns

Machine Learning Applications Classification Tasks: Predict cell types from gene expression data Clustering Analysis: Discover cellular subpopulations and states Dimensionality Reduction: Apply PCA, t-SNE, UMAP for visualization Biomarker Discovery: Identify genes characteristic of specific cell types

Educational Use:
- Teaching bioinformatics and computational biology concepts
- Demonstrating single-cell analysis workflows
- Training in data preprocessing and quality control

Data Quality and Preprocessing:
- Quality Control: Metadata has been curated and standardized
- Missing Values: [Specify how missing values are handled]
- Standardization: Cell type annotations follow established ontologies (e.g., Cell Ontology)
- Validation: Data has been cross-referenced with original publications

Usage Guidelines: Getting Started
- Load the metadata files using pandas or your preferred data analysis tool.
- Explore the cell type distributions and experimental conditions.
- Filter data based on quality metrics as needed.
- Join with corresponding gene expression data for comprehensive analysis.
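The getting-started steps could be sketched roughly as follows. This is a minimal illustration, not the dataset author's workflow: the inline CSV text stands in for metadata_summary.csv and quality_metrics.csv, and the column names (cell_id, cell_type, condition, n_genes) and the MIN_GENES threshold are assumptions made up for the example. It uses only the Python standard library so that it runs without dependencies; the same steps map onto pandas (read_csv, value_counts, merge) if you prefer it.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-ins for metadata_summary.csv and quality_metrics.csv.
# Column names and values are assumptions, not taken from the dataset.
METADATA_CSV = """cell_id,cell_type,condition
c1,T cell,control
c2,B cell,treated
c3,T cell,treated
c4,NK cell,control
"""
QUALITY_CSV = """cell_id,n_genes
c1,1500
c2,300
c3,2100
c4,900
"""


def load(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Step 1: load the metadata and the quality metrics.
metadata = load(METADATA_CSV)
quality = {row["cell_id"]: int(row["n_genes"]) for row in load(QUALITY_CSV)}

# Step 2: explore the cell type distribution.
cell_type_counts = Counter(row["cell_type"] for row in metadata)

# Steps 3 and 4: join on the shared cell_id key and keep cells that pass
# an arbitrary, illustrative quality threshold.
MIN_GENES = 500
passing = [
    {**row, "n_genes": quality[row["cell_id"]]}
    for row in metadata
    if quality.get(row["cell_id"], 0) >= MIN_GENES
]

print(cell_type_counts.most_common())
print([row["cell_id"] for row in passing])
```

The join with gene expression data would follow the same pattern, matching on whatever identifier the two files actually share.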

Best Practices:
- Always cite original data sources and publications.
- Consider batch effects when combining data from different experiments.
- Validate findings with independent datasets when possible.
- Follow established bioinformatics workflows for single-cell analysis.

Citation and Acknowledgments: If you use this dataset in your research, please cite it as: Kazi Aishikuzzaman (2024). Cell Gene Expression Metadata. Kaggle. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata

File Structure:
dataset
─ metadata_summary.csv # Main metadata file
─ cell_type_annotations.csv # Detailed cell type information
─ experimental_conditions.csv # Experiment-specific metadata
─ quality_metrics.csv # Data quality indicators
─ README.txt # Detailed file descriptions

Technical Specifications:
- File Encoding: UTF-8
- Separator: Comma-separated values (CSV)
- Missing Values: Represented as 'NA' or empty cells
- Data Types: Mixed (categorical, numerical, text)
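A reader for files with these specifications might look like the sketch below. It treats both 'NA' and empty cells as missing, as the specification describes. The sample text and its columns (cell_id, cell_type, n_genes) are hypothetical and only there to make the example runnable; for a real file you would pass open(path, encoding="utf-8", newline="") instead of the in-memory buffer.

```python
import csv
import io

# Hypothetical sample standing in for one of the dataset's CSV files;
# the column names are illustrative, not taken from the dataset.
SAMPLE = "cell_id,cell_type,n_genes\nc1,T cell,1200\nc2,NA,\nc3,B cell,NA\n"

# Both representations of a missing value named in the specification.
MISSING = {"NA", ""}


def read_rows(handle):
    """Parse comma-separated text, mapping 'NA' and empty cells to None."""
    reader = csv.DictReader(handle)
    return [
        {key: (None if value in MISSING else value) for key, value in row.items()}
        for row in reader
    ]


rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

All values stay as text here; because the data types are mixed, converting numeric columns is left as an explicit, per-column step. In pandas, the na_values argument of read_csv serves the same purpose.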

Contact and Support: For questions about this dataset:
- Kaggle Profile: @kaziaishikuzzaman
- Dataset Issues: Use Kaggle's discussion section
- Collaboration: Open to research collaborations and improvements

Version History:
- v1.0: Initial release with comprehensive metadata collection
- [Future versions]: Updates and additional annotations as available

Related Datasets: Consider exploring these complementary datasets:
- Single-cell gene expression data (companion to this metadata)
- Cell atlas datasets from major consortiums
- Disease-specific single-cell studies
- Multi-omics datasets with matching cell types

Keywords: single-cell, RNA-seq, genomics, cell types, metadata, bioinformatics, machine learning, computational biology
Category: Biology > Genomics
Supplementary Material to the Publication: Clonal relation between...
data.europa.eu
zenodo.org
unknown
Updated Jul 3, 2025
Cite
Zenodo (2025). Supplementary Material to the Publication: Clonal relation between Salmonella enterica subspecies enterica serovar Dublin strains of bovine and food origin in Germany [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-8009331?locale=el
Explore at:
unknown(4092)Available download formats
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Germany
Description
OHEJP Project: BeOne

Salmonella enterica serovar Dublin (S. Dublin) is a host-adapted serovar that causes enteritis and/or systemic disease in cattle. Because the serovar is not host-specific, it can infect other species, including humans, causing severe disease and a higher mortality rate than other non-typhoidal serovars. Given that human illnesses are primarily caused by contaminated milk, milk products, and beef, the genetic relationship between S. Dublin strains from livestock and food should be analyzed. Whole genome sequencing (WGS) was performed on 144 S. Dublin strains from cattle and 30 strains from food. Multilocus sequence typing (MLST) showed that the majority of livestock and food isolates were of sequence type ST-10. Core-genome single-nucleotide polymorphism typing and core-genome MLST showed that 14 of the 30 food strains were clonally related to at least one strain from cattle. The remaining 16 food strains fit into the genomic structure of S. Dublin in Germany without outliers. WGS proved to be an effective method not only for studying the epidemiology of Salmonella strains, but also for detecting clonal relationships between organisms isolated at different stages of production. This study found a strong genetic link between S. Dublin strains from cattle and food, and thus the potential to cause human infections. S. Dublin strains from both origins carry a closely comparable collection of virulence factors, underscoring their ability to produce severe clinical symptoms in animals as well as humans and the importance of effective S. Dublin management in a farm-to-fork strategy.


Data Object 1-1 (Supplemental Data 1-S1)

blog-bioinformatics.science - Historical whois Lookup

Biodiversity Informatics at the Natural History Museum

Test dataset from: GenErode: a bioinformatics pipeline to investigate genome...

[DATA_SCIENCE] Interviews PomBase Users, January-February 2016

Swiss-Institute-of-Bioinformatics (Company) - Reverse Whois Lookup

Variation data of pan-genome in 1913-based allotetraploid cottons

NCBI Nt (Nucleotide) database FASTA file from 2017-10-26

Data from: Bioinformatics is a BLAST: Engaging First-Year Biology Students...

Bioinformatics Services Market by Type (Sequencing Services, Data Analysis,...

Data from: where the minor things are: a pan-eukaryotic survey suggests...

Bioinformatics In IVD Testing Market By The type of test (blood based tests...

Data from: Graph splitting: a graph-based approach for superfamily-scale...

Data_Sheet_1_Sequence Capture From Historical Museum Specimens: Maximizing...

GOTrack

Data from: An improved hypergeometric probability method for identification...

Supplementary material 4 from: Marquina D, Roslin T, Łukasik P, Ronquist F...

A systematic review of the ecological literature on cushion plants

Cell_Gene_Expression_Metadata

Supplementary Material to the Publication: Clonal relation between...
