Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Health sciences research is increasingly focusing on big data applications, such as genomic technologies and precision medicine, to address key issues in human health. These approaches rely on biological data repositories and bioinformatic analyses, both of which are growing rapidly in size and scope. Libraries play a key role in supporting researchers in navigating these and other information resources. Methods: With the goal of supporting bioinformatics research in the health sciences, the University of Arizona Health Sciences Library established a Bioinformation program. To shape the support provided by the library, I developed and administered a needs assessment survey to the University of Arizona Health Sciences campus in Tucson, Arizona. The survey was designed to identify the training topics of interest to health sciences researchers and the preferred modes of training. Results: Survey respondents expressed an interest in a broad array of potential training topics, including "traditional" information seeking as well as interest in analytical training. Of particular interest were training in transcriptomic tools and the use of databases linking genotypes and phenotypes. Staff were most interested in bioinformatics training topics, while faculty were the least interested. Hands-on workshops were significantly preferred over any other mode of training. The University of Arizona Health Sciences Library is meeting those needs through internal programming and external partnerships. Conclusion: The results of the survey demonstrate a keen interest in a variety of bioinformatic resources; the challenge to the library is how to address those training needs. The mode of support depends largely on library staff expertise in the numerous subject-specific databases and tools. Librarian-led bioinformatic training sessions provide opportunities for engagement with researchers at multiple points of the research life cycle. When training needs exceed library capacity, partnering with intramural and extramural units will be crucial in library support of health sciences bioinformatic research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: Public databases are essential to the development of multi-omics resources. The amount of data created by biological technologies needs a systematic and organized form of storage that can be quickly accessed and managed. This is the objective of a biological database. Here, we present an overview of human databases with web applications. The databases and tools allow the search of biological sequences, genes and genomes, gene expression patterns, epigenetic variation, protein-protein interactions, variant frequency, regulatory elements, and comparative analysis between human and model organisms. Our goal is to give users with little or no programming skills an opportunity to explore large datasets and analyze the data. Public user-friendly web-based databases facilitate data mining and the search for information applicable to healthcare professionals. In addition, biological databases are essential to improve biomedical search sensitivity and efficiency and to merge the multiple datasets needed to share data and build global initiatives for the diagnosis, prognosis, and discovery of new treatments for genetic diseases. To show the databases at work, we present a case study using ACE2 as an example of a gene to be investigated. The analysis and the complete list of databases are available at the following website.
In the last decade, High-Throughput Sequencing (HTS) has revolutionized biology and medicine. This technology allows the sequencing of huge amounts of DNA and RNA fragments at a very low price. In medicine, HTS tests for disease diagnostics have already been brought into routine practice. However, adoption in plant health diagnostics is still limited. One of the main bottlenecks is the lack of expertise and consensus on the standardization of the data analysis. The Plant Health Bioinformatic Network (PHBN) is an Euphresco project aiming to build a community network of bioinformaticians/computational biologists working in plant health. One of the main goals of the project is to develop reference datasets that can be used for validation of bioinformatics pipelines and for standardization purposes. Semi-artificial datasets have been created for this purpose (Datasets 1 to 10). They are composed of a "real" HTS dataset spiked with artificial viral reads. This allows researchers to adjust their pipelines and parameters so that the results approximate the actual viral composition of the semi-artificial datasets as closely as possible. Each semi-artificial dataset makes it possible to test one or several limitations that could prevent virus detection or correct virus identification from HTS data (i.e. low viral concentration, new viral species, incomplete genome). Eight artificial datasets composed only of viral reads (no background data) have also been created (Datasets 11 to 18). Each dataset consists of a mix of several isolates from the same viral species present at different frequencies. The viral species were selected to be as divergent as possible. These datasets can be used to test haplotype reconstruction software, the goal being to reconstruct all the isolates present in a dataset. A GitLab repository (https://gitlab.com/ilvo/VIROMOCKchallenge) is available and provides a complete description of the composition of each dataset, the methods used to create them and their goals.
Dataset_x.fastq.gz: These are the FASTQ files of the 18 datasets.
Description of the datasets: This is a Word document describing each dataset.
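As an illustration of how a pipeline's output might be compared with the known composition of a semi-artificial dataset, the following is a minimal sketch. The source does not prescribe a scoring method, so the set-based sensitivity and precision used here are an assumption, and the function name and the virus labels in the usage note are hypothetical placeholders rather than the real dataset contents.

```python
def score_detection(expected, detected):
    """Compare detected virus names against the known composition.

    Both arguments are iterables of virus names (hypothetical labels;
    the real compositions are documented in the GitLab repository).
    """
    expected, detected = set(expected), set(detected)
    true_pos = expected & detected    # present and reported
    false_neg = expected - detected   # present but missed
    false_pos = detected - expected   # reported but not present
    return {
        "true_positives": sorted(true_pos),
        "false_negatives": sorted(false_neg),
        "false_positives": sorted(false_pos),
        # None when the denominator is empty, to avoid dividing by zero
        "sensitivity": len(true_pos) / len(expected) if expected else None,
        "precision": len(true_pos) / len(detected) if detected else None,
    }
```

For example, `score_detection(["virus_A", "virus_B"], ["virus_A", "virus_C"])` reports one true positive, one missed virus and one false positive. Re-running the same comparison after each parameter change gives a simple way to track whether a pipeline is moving closer to the known composition.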
https://www.bccresearch.com/aboutus/terms-conditions
Explore BCC Research's comprehensive report on the bioinformatics technologies market. This report studies current and historical market revenues, estimated based on services & platforms, solutions, and application type.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is the accompanying data for the submitted manuscript:
"An ANI-2 Enabled Open-Source Protocol To Estimate Ligand Strain After Docking"
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In Brazil, capable bioinformaticians are trained mostly in graduate programs, sometimes with earlier exposure during the undergraduate period. However, this training path tends to be inefficient in attracting students to the area and, above all, in attracting professionals to support research projects in research groups. To address these issues, participation in short courses is important for training students and professionals in the usage of tools for specific areas that use bioinformatics, as well as in ways to develop solutions tailored to the local needs of academic institutions or research groups. To this end, the project “Bioinformática na Estrada” (Bioinformatics on the Road) proposed improving bioinformaticians’ skills in undergraduate and graduate courses, primarily in the countryside of the State of Pará, in the Amazon region of Brazil. The project's scope is practical courses focused on the areas of interest of the location where the courses take place, to train and encourage students and researchers to work in this field and so reduce the existing gap caused by the lack of qualified bioinformatics professionals. Theoretical and practical workshops took place, such as Introduction to Bioinformatics, Computer Science Basics, Applications of Computational Intelligence applied to Bioinformatics and Biotechnology, Computational Tools for Bioinformatics, Soil Genomics and Research Perspectives and Horizons in the Amazon Region. In the end, 444 undergraduate and graduate students from higher education institutions in the state of Pará and other Brazilian states attended the events of the Bioinformatics on the Road project.
Over the past year, biology educators and staff at the Department of Energy Systems Biology Knowledgebase (KBase) initiated a collaborative effort to develop a curriculum for bioinformatics education. KBase is a free and easily accessible data science platform that integrates many bioinformatics resources into a graphical user interface built upon reproducible analysis notebooks. KBase held conversations with college and high school instructors to understand how KBase could potentially support their educational goals. These conversations morphed into a working group of biological and data science instructors that adapted the KBase platform to their curriculum needs, specifically around concepts in Genomics, Metagenomics, Pangenomics, and Phylogenetics. The KBase Educators Working Group developed modular, adaptable, and customizable instructional units. Each instructional module contains teaching resources, publicly available data, analysis tools, and markdown capability to tailor instructions and learning goals for each class. The online user interface enables students to conduct hands-on data science research and analyses without requiring programming skills or their own computational resources (these are provided by KBase). Alongside these resources, KBase continues to work with instructors, supporting the development of additional curriculum modules. For anyone new to the platform, KBase, and the growing KBase Educators Organization, provides a community network, accompanied by community-sourced guidelines, instructional templates, and peer support to use KBase within a classroom whether virtual or in-person.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Protein Structure Initiative - TargetTrack protein target registration database (795 MB, gzipped tarball)
The Protein Structure Initiative was a high-throughput structural genomics effort from 2000-2015 focused on developing technologies to enable greater coverage of protein structure space. Over its 15-year tenure, over 100 investigators at 35 centers (see ContributingCenters.xls) declared over 350,000 protein sequences (targets) that they would study using state-of-the-art protein production and structure determination methods. Many of these targets were selected through bioinformatics-based methods to serve as representatives for sequence and structure clusters.
From 2003-2010, these selected sequences and some basic identifying metadata were kept in a database called TargetDB, created at the Research Collaboratory for Structural Bioinformatics at Rutgers University. In 2008, a second database named PepcDB was created to track detailed experimental trial history and the standard protocols used by the PSI centers. These two databases became the principal structural genomics target databases, and were rolled into the PSI Structural Biology Knowledgebase in 2008.
As part of the third phase of the PSI, TargetDB and PepcDB were merged into a single resource, TargetTrack, to facilitate one-stop access to the data as well as expanding the schema to include new required data items. Participating centers deposited the latest status on their active targets and the protocols that were used (along with any deviations) on a weekly or quarterly basis. TargetTrack provided a variety of pre-computed data downloads on a weekly basis as well.
In July 2017, the Structural Biology Knowledgebase ceased operations. The files provided in this tarball represent the final datafiles generated by TargetTrack (timestamp June 30, 2017). Please read the README included in this dataset for descriptions of each file.
The entire TargetTrack datafile in XML format can be found in /TargetTrack XML files/tt.xml.gz
Key documentation can be found in the /Documentation folder.
TargetTrack schema: targetTrack-v1.4.1.pdf
Spreadsheet with TargetTrack enumerations for relevant fields: targetTrackEnumeratedDataItems-v1.4.1-1.xls
Image depicting the XML data schema: targetTrack-v1.4.1.jpg
These files are 868 MB in total size, uncompressed.
To open the tarball, use the command 'tar -zxvf TargetTrack-1Jul2017.tar.gz'
-- created by the PSI Structural Biology Knowledgebase, July 5, 2017
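Once the tarball is extracted, the gzipped XML datafile can be inspected without decompressing it to disk. The following is a minimal sketch using only the Python standard library. It makes no assumption about the TargetTrack element names (those are defined in the schema PDF) and simply counts how often each tag occurs; the function name is hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter


def count_element_tags(gz_path, max_elements=None):
    """Stream a gzipped XML file and count occurrences of each element tag.

    max_elements optionally stops early, which is handy for a quick look
    at a very large file.
    """
    counts = Counter()
    seen = 0
    with gzip.open(gz_path, "rb") as handle:
        # "end" events fire once an element has been read completely
        for _event, elem in ET.iterparse(handle, events=("end",)):
            # Drop any "{namespace}" prefix so tags are easier to read
            tag = elem.tag.rsplit("}", 1)[-1]
            counts[tag] += 1
            # Discard the element's text and children to limit memory use
            elem.clear()
            seen += 1
            if max_elements is not None and seen >= max_elements:
                break
    return counts
```

A call such as `count_element_tags("TargetTrack XML files/tt.xml.gz", max_elements=100000)` would give a first overview of the structure, assuming the path is given relative to the directory the tarball was extracted into.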
The VBRC provides bioinformatics resources to support scientific research directed at viruses belonging to the Arenaviridae, Bunyaviridae, Filoviridae, Flaviviridae, Paramyxoviridae, Poxviridae, and Togaviridae families. The Center consists of a relational database and web application that support the data storage, annotation, analysis, and information exchange goals of this work. Each data release contains the complete genomic sequences for all viral pathogens and related strains that are available for species in the above-named families. In addition to sequence data, the VBRC provides a curation for each virus species, resulting in a searchable, comprehensive mini-review of gene function relating genotype to biological phenotype, with special emphasis on pathogenesis.
The Paired Omics Data Platform is a community-based initiative standardizing links between genomic and metabolomics data in a computer readable format to further the field of natural products discovery. The goals are to link molecules to their producers, find large scale genome-metabolome associations, use genomic data to assist in structural elucidation of molecules, and provide a centralized database for paired datasets. This dataset contains the projects in http://pairedomicsdata.bioinformatics.nl/. The JSON documents adhere to the http://pairedomicsdata.bioinformatics.nl/schema.json JSON schema.
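Because the project documents are stated to adhere to a JSON schema, a quick sanity check is possible before further processing. The sketch below uses only the standard library and checks only the top-level `required` keyword of a schema; it is not a full JSON Schema validation, and the function name is hypothetical. The example schema and document in the usage note are made up for illustration and do not reflect the real platform schema.

```python
import json


def missing_required_keys(document, schema):
    """Return the top-level keys the schema requires but the document lacks.

    Only the standard JSON Schema "required" keyword at the top level is
    inspected; nested objects, types and formats are not checked.
    """
    required = schema.get("required", [])
    return [key for key in required if key not in document]


def check_document_text(document_text, schema_text):
    """Parse both JSON strings and run the top-level required-key check."""
    return missing_required_keys(json.loads(document_text), json.loads(schema_text))
```

For instance, `check_document_text('{"version": "1"}', '{"required": ["version", "metadata"]}')` returns `["metadata"]`. A dedicated validator library would be the better choice when complete schema conformance matters.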
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ameloblastoma is a highly aggressive odontogenic tumor, and its pathogenesis is associated with multiple participating genes. Objective: Our aim was to identify and validate new critical genes of conventional ameloblastoma using microarray and bioinformatics analysis. Methods: Gene expression microarray and bioinformatic analyses were performed using CHIP H10KA and DAVID software for enrichment. Protein-protein interactions (PPI) were visualized using STRING-Cytoscape with the MCODE plugin, followed by Kaplan-Meier and GEPIA analyses to nominate candidate genes. RT-qPCR and IHC assays were performed to validate the bioinformatic approach. Results: 376 upregulated genes were identified. PPI analysis revealed 14 genes that were evaluated by Kaplan-Meier and GEPIA analyses, resulting in PDGFA and IL2RA as candidate genes. The RT-qPCR analysis confirmed their intense expression. Immunohistochemistry analysis showed that PDGFA expression is located in the parenchyma. Conclusion: With bioinformatics methods, we can identify upregulated genes in conventional ameloblastoma, and RT-qPCR and immunoexpression analysis indicate that PDGFA could be a more specific and localized therapeutic target.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There is a growing collection of genomics data sets generated for identifying the gene targets under control of transcription regulators (TRs). TR ChIP-seq and RNA expression experiments that perturb TR activity are the most common strategies for mapping TRs to genes at a genomic scale. However, the collection, preprocessing, summarization, and integration of these data sets requires a non-trivial degree of bioinformatics experience. In this study, we set out a framework to accomplish these tasks. We focus on eight TRs in both mouse and human, encompassing nearly 500 experiments, with two main objectives. The first is a detailed examination of the properties of the contributing experiments, to better learn of potential biases and pitfalls when aggregating diverse data sets. The second is to provide summarized, transparent, and convenient TR-target rankings based upon these genomic data sets for community use. Our work thus catalogues the state of the literature for a subset of important mammalian TRs, prioritizes gene targets based upon available empirical evidence, and provides a framework for ready expansion to more TR data sets.
A database of information on poxviruses. The goals of this project are to acquire and annotate data on poxviruses, and to develop and utilize new tools to facilitate the study of this group of organisms. This basic research is being undertaken with an eye toward the development of novel antiviral therapies, vaccines against human orthopoxvirus infections, new approaches for the environmental detection of virions, and methods to accomplish more rapid diagnosis of disease.
The purpose of the meeting described in this review was to decide how best to ensure the sustainability of the Network for Integrating Bioinformatics into Life Science Education (NIBLSE; pronounced “nibbles”). Biology research today generates large and complex datasets, and the analysis of these datasets is becoming increasingly critical to progress in the field. The long-term goal of NIBLSE is to address this need and achieve the full integration of bioinformatics into undergraduate life sciences education. Meeting participants supported several next steps for NIBLSE, including further development and dissemination of bioinformatics learning resources through our novel incubators and Faculty Mentoring Networks, vigorously pursuing assessment strategies for our learning resources, connecting learning resources with open educational resource (OER) textbooks, learning more about barriers to bioinformatics implementation for underrepresented groups, and developing future workshops and meetings. About half the participants at the meeting were newcomers to NIBLSE, a positive sign for the future. NIBLSE has many exciting opportunities available, and we welcome life science educators with any level of bioinformatics expertise as new members.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The CABANA project (Capacity Building for Bioinformatics in Latin America) was funded by the UK’s Global Challenges Research Fund in 2017 with the aim to strengthen the bioinformatics capacity and extend its applications in Latin America focused on three challenge areas – communicable diseases, sustainable food production and protection of biodiversity. For 5 years, the project executed activities including data analysis workshops, train-the-trainer workshops, secondments, eLearning development, knowledge exchange meetings, and research projects in 10 countries. The project was successful in accomplishing all its goals with a major impact on the region. It became a model by which the research needs determined the training that was delivered. Multiple publications and over 800 trainees are part of the legacy of the project.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
targets of A-7-O-G obtained from PharmMapper
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The research hypothesised that miR-206 and miR-383 act as tumour suppressors in medulloblastoma (MB) and that their downregulation contributes to the aggressiveness of MB and glioblastoma (GB). By identifying and targeting the genes regulated by these microRNAs (CORO1C and SV2B), new therapeutic approaches could be developed for treating these aggressive brain tumours.
The study employed high-throughput small-RNA sequencing to analyse the expression profiles of microRNAs in MB samples. Bioinformatics tools were used to predict the target genes of the significantly downregulated miRNAs. The expression levels of the identified targets, CORO1C and SV2B, were validated through various molecular biology techniques, including Reverse Transcription-quantitative Polymerase Chain Reaction (RT-qPCR), western blotting, and immunohistochemistry. Functional assays were also performed to validate the regulatory effect of miR-206 and miR-383 on their target genes.
Both miR-206 and miR-383 were found to be significantly downregulated in MB samples, suggesting their potential role as tumour suppressors. Bioinformatics analysis identified CORO1C and SV2B as the target genes of miR-206 and miR-383, respectively. RT-qPCR, western blotting, and immunohistochemistry confirmed the overexpression of CORO1C and SV2B, at both the transcript and protein levels, in MB and GB cells and tissue samples. Functional assays validated that miR-206 and miR-383 directly regulate the expression of CORO1C and SV2B, respectively. The data suggested that the miR-206/CORO1C and miR-383/SV2B axes play a crucial role in the pathogenesis of MB and GB. The downregulation of these miRNAs leads to the overexpression of their target genes, contributing to the aggressiveness of these tumours. These findings indicate that restoring the levels of miR-206 and miR-383, or directly targeting CORO1C and SV2B, could be a promising therapeutic strategy for treating aggressive brain malignancies in both paediatric and adult patients.
The identification of miR-206 and miR-383 as tumour suppressors and their target genes as therapeutic targets provides a foundation for the development of novel treatments for MB and GB. Researchers and clinicians can use this data to:
- Develop miRNA mimics or gene therapy approaches to restore the levels of miR-206 and miR-383 in tumour cells.
- Design small molecule inhibitors or antibodies to specifically target CORO1C and SV2B proteins.
- Explore combination therapies that incorporate these new targets to improve treatment efficacy and reduce side effects.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S2. All of the hub genes and their scores in two groups.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Objective: The study aimed to elucidate the significance of CLEC4G, CAMK2β, SLC22A1, CBFA2T3, and STAB2 in the prognosis of hepatocellular carcinoma (HCC) patients and their associated molecular biological characteristics. Additionally, the research sought to identify new potential biomarkers with therapeutic and diagnostic relevance for clinical applications. Methods and Materials: We utilized a publicly available high-throughput phosphoproteomics and proteomics data set of HCC to focus on the analysis of 12 downregulated phosphoproteins in HCC. Our approach integrates bioinformatic analysis with pathway analysis, encompassing gene ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the construction of a protein–protein interaction (PPI) network. Results: In total, we quantified 11547 phosphorylation sites associated with 4043 phosphoproteins from a cohort of 159 HCC patients. Within this extensive data set, our specific focus was on 19 phosphorylation sites displaying significant downregulation (log2 FC ≤ −2 with p-values < 0.0001). Our investigation revealed distinct pathways exhibiting differential regulation across multiple dimensions, including the genomic, transcriptomic, proteomic, and phosphoproteomic levels. These pathways encompass a wide range of critical cellular processes, including cellular component organization, cell cycle control, signaling pathways, transcriptional and translational control, and metabolism. Furthermore, our bioinformatics analysis provided insights into the subcellular localizations, biological processes, and molecular functions associated with these proteins and phosphoproteins. Within the context of the PPI network, we identified 12 key genes (CLEC4G, STAB2, ADH1A, ADH1B, CAMK2B, ADH4, CHGB, PYGL, ADH1C, AKAP12, CBFA2T3, and SLC22A1) as the most highly interconnected hub genes.
Conclusions: The findings related to CLEC4G, ADH1B, SLC22A1, CAMK2β, CBFA2T3, and STAB2 indicate their reduced expression in HCC, which is associated with an unfavorable prognosis. Furthermore, the results of KEGG and GO pathway analyses suggest that these genes may impact liver cancer by engaging various targets and pathways, ultimately promoting the progression of hepatocellular carcinoma. These results underscore the significant potential of CLEC4G, ADH1B, SLC22A1, CAMK2β, CBFA2T3, and STAB2 as key contributors to HCC development and advancement. This insight holds promise for identifying therapeutic targets and charting research avenues to enhance our understanding of the intricate molecular mechanisms underlying hepatocellular carcinoma.
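The downregulation criterion stated in the Results (log2 FC ≤ −2 with p-values < 0.0001) can be expressed as a simple filter. This is a minimal sketch and not the authors' code: the record layout with the keys `site`, `log2_fc` and `p_value`, the function name, and the values in the usage note are all hypothetical.

```python
def select_downregulated(sites, max_log2_fc=-2.0, max_p_value=1e-4):
    """Keep records with log2 fold change <= max_log2_fc and p-value < max_p_value.

    Each record is a dict with hypothetical keys "site", "log2_fc" and
    "p_value"; the default thresholds follow the criterion quoted above.
    """
    return [
        record
        for record in sites
        if record["log2_fc"] <= max_log2_fc and record["p_value"] < max_p_value
    ]
```

Given records such as `{"site": "S1", "log2_fc": -2.4, "p_value": 2e-5}`, the function returns only those that pass both thresholds. Note that the fold-change bound is inclusive while the p-value bound is strict, matching the inequality signs in the text.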