100+ datasets found

Protein identifiers and segment boundaries of protein segments similar to...
plos.figshare.com
txt
Updated Nov 14, 2025
Cite
Ami G. Sangster; Cameron Dufault; Haoning Qu; Denise Le; Julie D. Forman-Kay; Alan M. Moses (2025). Protein identifiers and segment boundaries of protein segments similar to FUS’s SYGQ-rich prion-like domain. [Dataset]. http://doi.org/10.1371/journal.pcbi.1012929.s005
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.1371/journal.pcbi.1012929.s005
Dataset updated
Nov 14, 2025
Dataset provided by
PLOS (http://plos.org/)
Authors
Ami G. Sangster; Cameron Dufault; Haoning Qu; Denise Le; Julie D. Forman-Kay; Alan M. Moses
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identifiers are UniProt IDs and boundaries use zero-based indexing. These segments have been corrected for over-segmentation, meaning “POS” contains a list of start and stop boundaries of each segment for each protein. (TSV)
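To make the indexing convention concrete, here is a minimal Python sketch of slicing a sequence by zero-based segment boundaries. The textual form of the POS field (a list of [start, stop] pairs), the toy sequence, and the default of treating the stop boundary as exclusive are assumptions for illustration; the description does not specify them, so the actual TSV should be checked before relying on this.

```python
import ast


def extract_segments(sequence, pos_field, stop_inclusive=False):
    """Return the subsequences named by a POS-style list of boundaries.

    pos_field is assumed (hypothetically) to look like "[[0, 5], [11, 16]]".
    Boundaries are zero-based, as the dataset description states.
    """
    boundaries = ast.literal_eval(pos_field)
    # If the file turns out to use inclusive stops, extend each slice by one.
    offset = 1 if stop_inclusive else 0
    return [sequence[start:stop + offset] for start, stop in boundaries]


# Made-up sequence and boundaries, not taken from the dataset.
toy_sequence = "MASNDYTQQATQSYGAYPTQ"
segments = extract_segments(toy_sequence, "[[0, 5], [11, 16]]")
print(segments)
```

The `stop_inclusive` switch is there because zero-based files differ on whether the stop position belongs to the segment; comparing one segment against its source protein settles which convention applies.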
NCBIFAM
ebi.ac.uk
Updated Aug 6, 2025
Cite
(2025). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Aug 6, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).
Bioinformatics Market Growth Analysis - Size and Forecast 2025-2029 |...
technavio.com
pdf
Updated Jun 18, 2025
Cite
Technavio (2025). Bioinformatics Market Growth Analysis - Size and Forecast 2025-2029 | Technavio | Technavio [Dataset]. https://www.technavio.com/report/bioinformatics-market-industry-analysis
Explore at:
Available download formats: pdf
Dataset updated
Jun 18, 2025
Dataset provided by
TechNavio
Authors
Technavio
License
https://www.technavio.com/content/privacy-notice
Time period covered
2025 - 2029
Description
Bioinformatics Market Size 2025-2029

The bioinformatics market size is valued to increase by USD 15.98 billion, at a CAGR of 17.4% from 2024 to 2029. Reduction in cost of genetic sequencing will drive the bioinformatics market.

Market Insights

North America dominated the market and accounted for 43% growth during 2025-2029.
By Application - Molecular phylogenetics segment was valued at USD 4.48 billion in 2023
By Product - Platforms segment accounted for the largest market revenue share in 2023

Market Size & Forecast

Market Opportunities: USD 309.88 million
Market Future Opportunities 2024: USD 15978.00 million
CAGR from 2024 to 2029: 17.4%

Market Summary

The market is a dynamic and evolving field that plays a pivotal role in advancing scientific research and innovation in various industries, including healthcare, agriculture, and academia. One of the primary drivers of this market's growth is the rapid reduction in the cost of genetic sequencing, making it increasingly accessible to researchers and organizations worldwide. This affordability has led to an influx of large-scale genomic data, necessitating the development of sophisticated bioinformatics tools for Next-Generation Sequencing (NGS) data analysis. Another significant trend in the market is the shortage of trained laboratory professionals capable of handling and interpreting complex genomic data.

This skills gap creates a demand for user-friendly bioinformatics software and services that can streamline data analysis and interpretation, enabling researchers to focus on scientific discovery rather than data processing. For instance, a leading pharmaceutical company could leverage bioinformatics tools to optimize its drug discovery pipeline by analyzing large genomic datasets to identify potential drug targets and predict their efficacy. By integrating these tools into its workflow, the company can reduce the time and cost associated with traditional drug discovery methods, ultimately bringing new therapies to market more efficiently. Despite its numerous benefits, the market faces challenges such as data security and privacy concerns, data standardization, and the need for interoperability between different software platforms.

Addressing these challenges will require collaboration between industry stakeholders, regulatory bodies, and academic institutions to establish best practices and develop standardized protocols for data sharing and analysis.

What will be the size of the Bioinformatics Market during the forecast period?

Bioinformatics, a dynamic and evolving market, is witnessing significant growth as businesses increasingly rely on high-performance computing, gene annotation, and bioinformatics software to decipher regulatory elements, gene expression regulation, and genomic variation. Machine learning algorithms, phylogenetic trees, and ontology development are integral tools for disease modeling and protein interactions. Cloud computing platforms facilitate the storage and analysis of vast biological databases and sequence data, enabling data mining techniques and statistical modeling for sequence assembly and drug discovery pipelines. Proteomic analysis, protein folding, and computational biology are crucial components of this domain, with biomedical ontologies and data integration platforms enhancing research efficiency.

The integration of gene annotation and machine learning algorithms, for instance, has led to a 25% increase in accurate disease diagnosis within leading healthcare organizations. This trend underscores the importance of investing in advanced bioinformatics solutions for improved regulatory compliance, budgeting, and product strategy.

Unpacking the Bioinformatics Market Landscape

Bioinformatics, an essential discipline at the intersection of biology and computer science, continues to revolutionize the scientific landscape. Evolutionary bioinformatics, with its molecular dynamics simulation and systems biology approaches, enables a deeper understanding of biological processes, leading to improved ROI in research and development. For instance, next-generation sequencing technologies have reduced sequencing costs by a factor of ten, enabling genome-wide association studies and transcriptome sequencing on a previously unimaginable scale. In clinical bioinformatics, homology modeling techniques and protein-protein interaction analysis facilitate drug target identification, enhancing compliance with regulatory requirements. Phylogenetic analysis tools and comparative genomics studies contribute to the discovery of novel biomarkers and the development of personalized treatments. Bioimage informatics and proteomic data integration employ advanced sequence alignment algorithms and fun
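The headline figures in this listing (a USD 15.98 billion increase at a 17.4% CAGR from 2024 to 2029) can be related with compound-growth arithmetic. The sketch below is only a reading aid: it assumes five annual compounding periods and that the increase and the CAGR refer to the same 2024-2029 window, neither of which the listing confirms, so the implied base-year size is an inference and not a figure from the report.

```python
def growth_multiple(cagr, years):
    """Total growth factor after compounding `cagr` for `years` periods."""
    return (1.0 + cagr) ** years


def implied_base_size(increment, cagr, years):
    """Base-year size implied by an absolute increase and a CAGR.

    Solves base * ((1 + cagr) ** years - 1) = increment for base.
    """
    return increment / (growth_multiple(cagr, years) - 1.0)


increment_usd_bn = 15.98  # increase quoted in the listing
cagr = 0.174              # 17.4% quoted in the listing
years = 5                 # assumption: 2024 -> 2029 as five annual periods

base_usd_bn = implied_base_size(increment_usd_bn, cagr, years)
final_usd_bn = base_usd_bn + increment_usd_bn
print(f"implied 2024 size: {base_usd_bn:.2f} bn, implied 2029 size: {final_usd_bn:.2f} bn")
```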
Bioinformatics Links Directory
neuinfo.org
scicrunch.org
+3 more
Updated Jan 29, 2022
Cite
(2022). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008018
Dataset updated
Jan 29, 2022
Description
Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The following types of information are available in this portal:

* Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
* Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
* DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
* Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
* Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
* Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
* Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
* Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
* Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
* Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
* RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.
Predefined workflows in the ZBIT Bioinformatics Toolbox.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Michael Römer; Johannes Eichner; Andreas Dräger; Clemens Wrzodek; Finja Wrzodek; Andreas Zell (2023). Predefined workflows in the ZBIT Bioinformatics Toolbox. [Dataset]. http://doi.org/10.1371/journal.pone.0149263.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0149263.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Michael Römer; Johannes Eichner; Andreas Dräger; Clemens Wrzodek; Finja Wrzodek; Andreas Zell
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Predefined workflows in the ZBIT Bioinformatics Toolbox.
PROSITE profiles
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). PROSITE profiles [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family a new sequence belongs. PROSITE is based at the Swiss Institute of Bioinformatics (SIB), Geneva, Switzerland.
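As an illustration of how a PROSITE-style pattern identifies a site in a sequence, the following Python sketch translates the pattern notation into a regular expression. It is a simplified reading of the syntax (elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition, `<` and `>` as terminal anchors) and does not cover every corner of the notation; the function name and the test sequences are made up for this example. Profiles, the other signature type mentioned above, are weight matrices and are not handled here.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")  # patterns conventionally end with a period
    regex_parts = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")
        anchored_end = element.endswith(">")
        element = element.strip("<>")
        # Split off an optional repetition suffix such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        # [ABC] classes and single letters are already valid regex syntax.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        regex_parts.append(
            ("^" if anchored_start else "") + core + ("$" if anchored_end else "")
        )
    return "".join(regex_parts)


# A commonly cited N-glycosylation motif, searched in made-up sequences.
glyco_regex = prosite_to_regex("N-{P}-[ST]-{P}.")
hit = re.search(glyco_regex, "AANGTLQ")
miss = re.search(glyco_regex, "AANPTLQ")
print(glyco_regex, hit.start() if hit else None, miss)
```

Note that `re.search` reports non-overlapping matches only; scanning for overlapping sites would need a lookahead-based search.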
Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Mar 4, 2025
Cite
LaPorte, Mary-Francis; Nag, Ambarish; Clark, Struan; Arora, Neha (2025). Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog inference results in algae.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002081624
Explore at:
Dataset updated
Mar 4, 2025
Authors
LaPorte, Mary-Francis; Nag, Ambarish; Clark, Struan; Arora, Neha
Description
Introduction

Microalgae constitute a prominent feedstock for producing biofuels and biochemicals by virtue of their prolific reproduction, high bioproduct accumulation, and the ability to grow in brackish and saline water. However, naturally occurring wild type algal strains are rarely optimal for industrial use; therefore, bioengineering of algae is necessary to generate superior performing strains that can address production challenges in industrial settings, particularly the bioenergy and bioproduct sectors. One of the crucial steps in this process is deciding on a bioengineering target: namely, which gene/protein to differentially express. These targets are often orthologs, which are defined as genes/proteins originating from a common ancestor in divergent species. Although bioinformatics tools for the identification of protein orthologs already exist, processing the output from such tools is nontrivial, especially for a researcher with little or no bioinformatics experience.

Methods

The present study introduces AlgaeOrtho, a user-friendly tool that builds upon the SonicParanoid orthology inference tool (based on an algorithm that identifies potential protein orthologs based on amino acid sequences) and the PhycoCosm database from JGI (Joint Genome Institute) to help researchers identify orthologs of their proteins of interest in multiple diverse algal species.

Results

The output of this application includes a table of the putative orthologs of their protein of interest, a heatmap showing sequence similarity (%), and an unrooted tree of the putative protein orthologs. Notably, the tool would be instrumental in identifying novel bioengineering targets in different algal strains, including targets in not-fully annotated algal species, since it does not depend on existing protein annotations. We tested AlgaeOrtho using three case studies, for which orthologs of proteins relevant to bioengineering targets were identified from diverse algal species, demonstrating its ease of use and utility for bioengineering researchers.

Discussion

This tool is unique in the protein ortholog identification space as it can visualize putative orthologs, as desired by the user, across several algal species.
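The sequence-similarity heatmap mentioned in the results is, at its core, a matrix of pairwise scores. The description does not say which similarity measure AlgaeOrtho uses, so the sketch below substitutes a generic percent identity over sequences that are assumed to be already aligned; the function names, species labels, and sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Pairwise percent identity for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {(p, q): percent_identity(aligned[p], aligned[q])
            for p in names for q in names}


# Made-up aligned orthologs, for illustration only.
toy_alignment = {"algaA": "MKT-LLV", "algaB": "MKTALLV", "algaC": "MRT-LIV"}
matrix = identity_matrix(toy_alignment)
print(matrix[("algaA", "algaC")])
```

The resulting dictionary can be handed to any plotting library to draw the heatmap; other tools may normalize by alignment length or by the shorter sequence, which changes the percentages.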
Molecular Biology Software Report
marketresearchforecast.com
doc, pdf, ppt
Updated Oct 26, 2025
Cite
Market Research Forecast (2025). Molecular Biology Software Report [Dataset]. https://www.marketresearchforecast.com/reports/molecular-biology-software-531059
Explore at:
Available download formats: doc, pdf, ppt
Dataset updated
Oct 26, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2026 - 2034
Area covered
Global
Variables measured
Market Size
Description
Explore the booming Molecular Biology Software market, projected to reach $3.7 billion by 2033. Discover key drivers, trends in bioinformatics, DNA analysis, and drug discovery.
Bioinformatics for Researchers in Life Sciences: Tools and Learning...
data.iadb.org
csv, pdf
Updated Apr 10, 2025
Cite
IDB Datasets (2025). Bioinformatics for Researchers in Life Sciences: Tools and Learning Resources [Dataset]. http://doi.org/10.60966/kwvb-wr19
Explore at:
Available download formats: csv (276253), pdf (2989058), csv (355108)
Unique identifier
https://doi.org/10.60966/kwvb-wr19
Dataset updated
Apr 10, 2025
Dataset provided by
IDB Datasets
License
Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0): https://creativecommons.org/licenses/by-nc-nd/3.0/
License information was derived automatically
Time period covered
Jan 1, 2020 - Jan 1, 2021
Description
The COVID-19 pandemic has shown that bioinformatics--a multidisciplinary field that combines biological knowledge with computer programming concerned with the acquisition, storage, analysis, and dissemination of biological data--has a fundamental role in scientific research strategies in all disciplines involved in fighting the virus and its variants. It aids in sequencing and annotating genomes and their observed mutations; analyzing gene and protein expression; simulation and modeling of DNA, RNA, proteins and biomolecular interactions; and mining of biological literature, among many other critical areas of research. Studies suggest that bioinformatics skills in the Latin American and Caribbean region are relatively incipient, and thus its scientific systems cannot take full advantage of the increasing availability of bioinformatic tools and data. This dataset is a catalog of bioinformatics software for researchers and professionals working in life sciences. It includes more than 300 different tools for varied uses, such as data analysis, visualization, repositories and databases, data storage services, scientific communication, marketplace and collaboration, and lab resource management. Most tools are available as web-based or desktop applications, while others are programming libraries. It also includes 10 suggested entries for other third-party repositories that could be of use.
Data from: MetaLab 2.0 Enables Accurate Post-Translational Modifications...
acs.figshare.com
xlsx
Updated Jun 1, 2023
Cite
Kai Cheng; Zhibin Ning; Xu Zhang; Leyuan Li; Bo Liao; Janice Mayne; Daniel Figeys (2023). MetaLab 2.0 Enables Accurate Post-Translational Modifications Profiling in Metaproteomics [Dataset]. http://doi.org/10.1021/jasms.0c00083.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/jasms.0c00083.s002
Dataset updated
Jun 1, 2023
Dataset provided by
ACS Publications
Authors
Kai Cheng; Zhibin Ning; Xu Zhang; Leyuan Li; Bo Liao; Janice Mayne; Daniel Figeys
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Studying the structure and function of microbiomes is an emerging research field. Metaproteomic approaches focusing on the characterization of expressed proteins and post-translational modifications (PTMs) provide a deeper understanding of microbial communities. Previous research has highlighted the value of examining microbiome-wide protein expression in studying the roles of the microbiome in human diseases. Nevertheless, the regulation of protein functions in complex microbiomes remains underexplored. This is mainly due to the lack of efficient bioinformatics tools to identify and quantify PTMs in the microbiome. We have developed comprehensive software termed MetaLab for the data analysis of metaproteomic data sets. Here, we build an open search workflow within MetaLab for unbiased identification and quantification of unmodified peptides as well as peptides with various PTMs from microbiome samples. This bioinformatics platform provides information about proteins, PTMs, taxa, functions, and pathways of microbial communities. The performance of the workflow was evaluated using conventional proteomics, metaproteomics from mouse and human gut microbiomes, and modification-specific enriched data sets. Superior accuracy and sensitivity were obtained simultaneously by using our method compared with the traditional closed search strategy.
iPTMnet
scicrunch.org
rrid.site
Updated Dec 4, 2023
Cite
(2023). iPTMnet [Dataset]. http://identifiers.org/RRID:SCR_014416
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_014416
Dataset updated
Dec 4, 2023
Description
A protein database which connects multiple disparate bioinformatics tools and systems text mining, data mining, analysis and visualization tools, and databases and ontologies.
Data from: Using Open-Source Bioinformatics and Visualization Tools to...
qubeshub.org
Updated Mar 18, 2022
Cite
Laura Listenberger; Cassandra Joiner; Cassidy Terrell* (2022). Using Open-Source Bioinformatics and Visualization Tools to Explore the Structure and Function of SARS-CoV-2 Spike Protein [Dataset]. https://qubeshub.org/community/groups/coursesource/publications?id=2862
Explore at:
Dataset updated
Mar 18, 2022
Dataset provided by
QUBES
Authors
Laura Listenberger; Cassandra Joiner; Cassidy Terrell*
Description
The relationship between protein structure and function is a foundational concept in undergraduate biochemistry. We find this theme is best presented with assignments that encourage exploration and analysis. Here, we share a series of four assignments that use open-source, online molecular visualization and bioinformatics tools to examine the interaction between the SARS-CoV-2 spike protein and the ACE2 receptor. The interaction between these two proteins initiates SARS-CoV-2 infection of human host cells and is the cause of COVID-19. In assignment I, students identify sequences with homology to the SARS-CoV-2 spike protein and use them to build a primary sequence alignment. Students make connections to a linked primary research article as an example of how scientists use molecular and phylogenetic analysis to explore the origins of a novel virus. Assignments II through IV teach students to use an online molecular visualization tool for analysis of secondary, tertiary, and quaternary structure. Emphasis is placed on identification of noncovalent interactions that stabilize the SARS-CoV-2 spike protein and mediate its interaction with ACE2. We assigned this project to upper-level undergraduate biochemistry students at a public university and liberal arts college. Students in our courses completed the project as individual homework assignments. However, we can easily envision implementation of this project during multiple in-class sessions or in a biochemistry laboratory using in-person or remote learning. We share this project as a resource for instructors who aim to teach protein structure and function using inquiry-based molecular visualization activities.

Primary image: Exploration of SARS-CoV-2 spike protein: student generated data from assignments I - IV. Includes examples of figures submitted by students, including a sequence alignment and representations of 3D protein structure generated using UCSF Chimera. The primary image includes student generated data and a cartoon from Pixabay, an online repository of copyright free art.
coronavirus_2_isolate_Wuhan-Hu-1_complete_genome
kaggle.com
zip
Updated Feb 28, 2024
Cite
Oscar Yáñez Feijóo (2024). coronavirus_2_isolate_Wuhan-Hu-1_complete_genome [Dataset]. https://www.kaggle.com/datasets/oscaryezfeijo/coronavirus-2-isolate-wuhan-hu-1-complete-genome
Explore at:
Available download formats: zip (836184 bytes)
Dataset updated
Feb 28, 2024
Authors
Oscar Yáñez Feijóo
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Area covered
Wuhan
Description
3D SARS-CoV-2 Protein Visualization with Biopython

Overview

This dataset provides a detailed Jupyter notebook guide on leveraging Biopython for bioinformatics, specifically focusing on the visualization of the SARS-CoV-2 protein structure. It covers fundamental bioinformatics concepts and applications, including sequence manipulation, transcription, translation, local alignment using NCBI-BLAST, and reading PDB files for 3D visualization.

Contents

* Introduction to Biopython and its applications in bioinformatics.
* Detailed guide on understanding and manipulating FASTA file formats.
* Sequence manipulation techniques including indexing, slicing, concatenation, and codon search.
* Analysis of genetic material including DNA and RNA structures.
* Transcription and translation studies to understand protein synthesis.
* Basic Local Alignment Search Tool (BLAST) for identifying protein structures.
* Reading and visualizing PDB files to explore the 3D structure of the SARS-CoV-2 protein.
* Observations including sequence length, GC content, protein content, and structural insights into the SARS-CoV-2 protein.
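Several of the operations listed in the contents (GC content, transcription, translation) reduce to short string manipulations. The notebook itself uses Biopython, whose sequence objects offer these operations; the sketch below deliberately uses only the Python standard library to show the underlying logic, and is not the notebook's code. The helper names and the toy sequence are assumptions made for this example, and transcription follows the coding-strand convention of replacing T with U.

```python
from itertools import product

# Standard genetic code, laid out in the usual T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def gc_content(dna):
    """Percentage of G and C bases in a DNA string."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def translate(dna, to_stop=True):
    """Translate complete codons of a DNA string, optionally stopping at '*'."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*" and to_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_dna = "ATGTTTGTTTTTCTTGTTTAA"  # short toy sequence for illustration
print(gc_content(toy_dna), transcribe(toy_dna), translate(toy_dna))
```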

Key Learnings

* Understanding of Biopython for sequence analysis, manipulation, and visualization.
* Insights into the genetic makeup of SARS-CoV-2 and its protein structure.
* Practical experience with bioinformatics tools and techniques for research and analysis.

Ideal for

Bioinformatics students, researchers, and enthusiasts looking to deepen their understanding of viral protein structures and the application of Biopython in real-world scenarios.

Column descriptors

Header Information: The beginning of a PDB file usually contains metadata about the molecule, including the PDB ID (in this case, 7D4F), the title of the experiment, authors, and publication details if available.

ATOM Records: These lines provide detailed information about each atom in the molecule, including its serial number, atom name, residue name, chain identifier, residue sequence number, coordinates (X, Y, Z), occupancy, and temperature factor. This section is crucial for reconstructing the 3D structure.

HETATM Records: Similar to ATOM records, but for atoms that are not part of standard amino acids or nucleotides, such as ligands, solvent molecules, or metal ions.

Connectivity Records: These include CONECT records detailing the bonding between atoms, crucial for understanding the molecular connectivity.

Example Column Descriptors for PDB Data: When converting PDB data into a structured format like a CSV or DataFrame for analysis, you might consider the following "columns" based on ATOM/HETATM records:

AtomID: Unique identifier for each atom within the molecule.

AtomName: Name of the atom (e.g., CA for alpha carbon).

ResidueName: Name of the residue (amino acid or nucleotide) to which the atom belongs.

ChainID: Identifier for the protein chain.

ResidueID: Sequence number of the residue within the chain.

X, Y, Z Coordinates: The 3D coordinates of the atom in space.

Occupancy: The occupancy factor of the atom, indicating the proportion of time the atom is in the observed location.

TempFactor: Temperature factor (B-factor), indicating the motion of the atom within the crystal structure.
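The column descriptors above map onto fixed character positions in each ATOM/HETATM line of a PDB file. The following Python sketch shows one way to pull them out with plain string slicing, using the column positions of the legacy PDB format; the function name and the sample record are made up for illustration and are not taken from entry 7D4F. Libraries such as Biopython provide full parsers, which are the safer choice for real files.

```python
def parse_atom_record(line):
    """Parse one fixed-width ATOM/HETATM line into the columns listed above."""
    return {
        "RecordType": line[0:6].strip(),
        "AtomID": int(line[6:11]),          # atom serial number
        "AtomName": line[12:16].strip(),
        "ResidueName": line[17:20].strip(),
        "ChainID": line[21],
        "ResidueID": int(line[22:26]),      # residue sequence number
        "X": float(line[30:38]),
        "Y": float(line[38:46]),
        "Z": float(line[46:54]),
        "Occupancy": float(line[54:60]),
        "TempFactor": float(line[60:66]),   # B-factor
    }


# Build a made-up record with explicit field widths so the columns line up.
sample_line = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format("ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614, 1.00, 9.67)

atom = parse_atom_record(sample_line)
print(atom)
```

A list of such dictionaries can be passed directly to a DataFrame constructor or written out with `csv.DictWriter` to obtain the tabular form described above.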

Usage:

Structural Biology and Bioinformatics: Understanding protein structures, including the folding, function, and interaction with other molecules.

Drug Design: Identifying potential binding sites for drug molecules by analyzing the structure.

Educational Purposes: Teaching molecular structure and function in biochemistry and molecular biology courses.

Links:

Biopython: https://biopython.org/DIST/docs/tutorial/Tutorial.html

Nature review article: Translation: DNA to mRNA to Protein: https://www.nature.com/scitable/topicpage/translation-dna-to-mrna-to-protein-393/
FASTA sequences of 6 different proteins of SARS-CoV-2
search.dataone.org
datadryad.org
Updated May 4, 2025
Cite
Umama Khan; Md. Salauddin Khan; Pinky Debnath (2025). FASTA sequences of 6 different proteins of SARS-CoV-2 [Dataset]. http://doi.org/10.5061/dryad.7pvmcvdt0
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.7pvmcvdt0
Dataset updated
May 4, 2025
Dataset provided by
Dryad Digital Repository
Authors
Umama Khan; Md. Salauddin Khan; Pinky Debnath
Time period covered
Jan 1, 2021
Description
The acute respiratory disease induced by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) became a global epidemic in less than a year, by the first half of 2020. The subsequent efficient human-to-human transmission of this virus eventually affected millions of people worldwide. The virulence of SARS-CoV-2 is mostly regulated by its proteins, but very little is known about the protein structures and functionalities. Therefore, the main purpose of this study is to learn more about these proteins through bioinformatics approaches. In this study, ORF10, ORF7b, ORF7a, ORF6, membrane glycoprotein, and envelope protein were selected from a Bangladeshi coronavirus strain, G039392, and a number of bioinformatics tools and strategies were applied for multiple sequence alignment and phylogeny analysis with 9 different variants, and for predicting hydropathicity, amino acid composition, protein-binding propensity, protein disorder, and 2D and 3D protein models.
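Two of the predictions named above, amino acid composition and hydropathicity, are simple per-residue statistics. The description does not say which tools or scales the study used, so the sketch below assumes the widely used Kyte-Doolittle hydropathy scale and computes the grand average of hydropathicity (GRAVY) as the mean per-residue value; the function names and test peptides are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; the study's choice is not stated).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(protein):
    """Percentage of each residue type present in the sequence."""
    counts = Counter(protein)
    return {aa: 100.0 * n / len(protein) for aa, n in counts.items()}


def gravy(protein):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


toy_peptide = "MKLV"  # made-up peptide, not one of the study's proteins
print(amino_acid_composition(toy_peptide), gravy(toy_peptide))
```

A positive GRAVY value suggests an overall hydrophobic sequence and a negative one an overall hydrophilic sequence.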
HAMAP
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
Explore at:
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.
Genetic Analysis Software Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jan 16, 2026
Cite
Data Insights Market (2026). Genetic Analysis Software Report [Dataset]. https://www.datainsightsmarket.com/reports/genetic-analysis-software-1443466
Explore at:
Available download formats: doc, pdf, ppt
Dataset updated
Jan 16, 2026
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2026 - 2034
Area covered
Global
Variables measured
Market Size
Description
Explore the booming Genetic Analysis Software market, projected to hit USD 304 million by 2025 and grow at a 4.9% CAGR. Discover key drivers like personalized medicine, genomics research, and cloud adoption.
Data from: Alignment-based protein mutational landscape prediction: doing...
search.dataone.org
data.niaid.nih.gov
+3 more
Updated Jul 26, 2025
Cite
Marina Abakarova; Céline Marquet; Michael Rera; Burkhard Rost; Elodie Laine (2025). Alignment-based protein mutational landscape prediction: doing more with less [Dataset]. http://doi.org/10.5061/dryad.vdncjsz1s
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.vdncjsz1s
Dataset updated
Jul 26, 2025
Dataset provided by
Dryad Digital Repository
Authors
Marina Abakarova; Céline Marquet; Michael Rera; Burkhard Rost; Elodie Laine
Time period covered
Jan 1, 2023
Description
The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the whole human proteome mutational landscape and simplified access to our predictive pipeline.

Alignment-based protein mutational landscape prediction: doing more with less

Access this dataset on Dryad

This dataset contains the data and tools associated with Alignment-based protein mutational landscape prediction: doing more with less, Abakarova et al., Genome Biology and Evolution, 2023. doi:

Description of the data and file structure

We provide the community with data associated with our assessment of four different multiple sequence alignment (MSA) resources and protocols, as well as the complete single-mutational landscape of the human proteome predicted by combining the MSA protocol implemented in ColabFold and the variant effect predictor GEMME.

ProteinGym_assessment.tgz contains the data and scripts associated with our assessment of the four different MSA generation protocols (ColabFold, ProteinGym, ProteinNet, Pfam) against the ProteinGym substitution benchmark. This archive is organised as follo...
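Since the archive layout is cut off in this listing, one way to see it is to list the archive members before extracting anything. This is a minimal sketch using Python's standard `tarfile` module; the local path is an assumption (the file must be downloaded from Dryad first), and the helper name is hypothetical.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the member names of a gzip-compressed tar archive without extracting it."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Assumption: the archive was saved to the current working directory.
    path = Path("ProteinGym_assessment.tgz")
    if path.exists():
        for name in list_archive_members(path):
            print(name)
    else:
        print(f"{path} not found; download it from the Dryad record first.")
```

Listing members first avoids unpacking a large archive just to learn its directory structure.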
Data from: PROSITE
prosite.expasy.org
toothandnail-mailorder.com
+7 more
Updated Oct 15, 2025
Cite
(2025). PROSITE [Dataset]. https://prosite.expasy.org/
Dataset updated
Oct 15, 2025
Description
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. PROSITE is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids.
Protein Sequence Analysis Tool Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jan 5, 2026
Cite
Data Insights Market (2026). Protein Sequence Analysis Tool Report [Dataset]. https://www.datainsightsmarket.com/reports/protein-sequence-analysis-tool-1941839
Available download formats: doc, pdf, ppt
Dataset updated
Jan 5, 2026
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2026 - 2034
Area covered
Global
Variables measured
Market Size
Description
The Protein Sequence Analysis Tool market is booming, projected to reach $7.8B by 2033 (CAGR 12%). This in-depth analysis explores market drivers, trends, restraints, and key players, including Waters Corp and Thermo Fisher. Discover insights into software, services, and regional market shares for biopharma, clinical diagnostics, and research.
Computational and bioinformatics tools for personalized cancer medicine.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Ruth Nussinov; Hyunbum Jang; Chung-Jung Tsai; Feixiong Cheng (2023). Computational and bioinformatics tools for personalized cancer medicine. [Dataset]. http://doi.org/10.1371/journal.pcbi.1006658.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1006658.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Ruth Nussinov; Hyunbum Jang; Chung-Jung Tsai; Feixiong Cheng
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Computational and bioinformatics tools for personalized cancer medicine.

