Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: Public databases are essential to the development of multi-omics resources. The amount of data created by biological technologies requires a systematic and organized form of storage that can be quickly accessed and managed. This is the objective of a biological database. Here, we present an overview of human databases with web applications. The databases and tools allow the search of biological sequences, genes and genomes, gene expression patterns, epigenetic variation, protein-protein interactions, variant frequency, regulatory elements, and comparative analysis between human and model organisms. Our goal is to give users with little or no programming skills an opportunity to explore large datasets and analyze the data. Public user-friendly web-based databases facilitate data mining and the search for information applicable to healthcare professionals. In addition, biological databases are essential for improving biomedical search sensitivity and efficiency and for merging the multiple datasets needed to share data and build global initiatives for the diagnosis, prognosis, and discovery of new treatments for genetic diseases. To show the databases at work, we present a case study using ACE2 as an example of a gene to be investigated. The analysis and the complete list of databases are available on the following website.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Most of the proteins that are specifically turned over by selective autophagy are recognized by the presence of short Atg8 interacting motifs (AIMs) that facilitate their association with the autophagy apparatus. Such AIMs can be identified by bioinformatics methods based on their defined degenerate consensus F/W/Y-X-X-L/I/V sequences, in which X represents any amino acid. Achieving reliability and/or fidelity in the prediction of such AIMs on a genome-wide scale represents a major challenge. Here, we present a bioinformatics approach, high fidelity AIM (hfAIM), which uses additional sequence requirements (the presence of acidic amino acids and the absence of positively charged amino acids in certain positions) to reliably identify AIMs in proteins. We demonstrate that the hfAIM method allows for in silico high fidelity prediction of AIMs in AIM-containing proteins (ACPs) on a genome-wide scale in various organisms. Furthermore, by using hfAIM to identify putative AIMs in the Arabidopsis proteome, we illustrate a potential contribution of selective autophagy to various biological processes. More specifically, we identified 9 peroxisomal PEX proteins that contain hfAIM motifs, among which AtPEX1, AtPEX6 and AtPEX10 possess evolutionarily conserved AIMs. Bimolecular fluorescence complementation (BiFC) results verified that AtPEX6 and AtPEX10 indeed interact with Atg8 in planta. In addition, we show that mutations occurring within or near hfAIMs in PEX1, PEX6 and PEX10 caused defects in the growth and development of various organisms. Taken together, the above results suggest that the hfAIM tool can be used to effectively perform genome-wide in silico screens of proteins that are potentially regulated by selective autophagy. The hfAIM system is a web tool that can be accessed at http://bioinformatics.psb.ugent.be/hfAIM/.
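The degenerate consensus F/W/Y-X-X-L/I/V mentioned in the abstract can be scanned for with a regular expression. The following is only a minimal sketch of that core consensus search; it does not implement the additional hfAIM requirements on acidic and positively charged residues, whose exact positions are not given here. The function name `find_core_aims` and the example sequences are hypothetical.

```python
import re

# Core AIM consensus only: [F/W/Y]-X-X-[L/I/V], where X is any residue.
# A lookahead is used so that overlapping candidate motifs are all reported.
# NOTE: this is NOT the hfAIM method; the extra hfAIM rules are not modeled.
AIM_CORE = re.compile(r"(?=([FWY][A-Z]{2}[LIV]))")


def find_core_aims(sequence):
    """Return (1-based start position, motif) for every core-consensus match."""
    seq = sequence.upper()
    return [(match.start() + 1, match.group(1)) for match in AIM_CORE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the call.
    for position, motif in find_core_aims("MDDWEELQFYALLV"):
        print(position, motif)
```

Because the core consensus is short and degenerate, a scan like this is expected to return many candidates, which is the reliability problem the abstract says hfAIM addresses with its extra sequence requirements.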
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data were generated using the AIMS interaction scoring function as outlined in the manuscript "A Systematic Characterization of Germline-Encoded Contacts Identifies the Source of Bias in TCR-MHC Interactions". They accompany the AIMS version 0.7 software available on GitHub: https://github.com/ctboughter/AIMS . These files are meant to be loaded into the mhc_germline_analysis.ipynb file, but are too large to be included on the GitHub page itself.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Xie Q (2016): hfAIM: A reliable bioinformatics approach for in silico genome-wide identification of autophagy-associated Atg8-interacting motifs in various organisms. Curated by BioGRID (https://thebiogrid.org).
In the last decade, High-Throughput Sequencing (HTS) has revolutionized biology and medicine. This technology allows the sequencing of huge numbers of DNA and RNA fragments at a very low price. In medicine, HTS tests for disease diagnostics have already been brought into routine practice. However, adoption in plant health diagnostics is still limited. One of the main bottlenecks is the lack of expertise and consensus on the standardization of the data analysis. The Plant Health Bioinformatic Network (PHBN) is a Euphresco project aiming to build a community network of bioinformaticians/computational biologists working in plant health. One of the main goals of the project is to develop reference datasets that can be used for validation of bioinformatics pipelines and for standardization purposes.
Semi-artificial datasets have been created for this purpose (Datasets 1 to 10). They are composed of a “real” HTS dataset spiked with artificial viral reads. It will allow researchers to adjust ...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ameloblastoma is a highly aggressive odontogenic tumor, and its pathogenesis is associated with multiple participating genes. Objective: Our aim was to identify and validate new critical genes of conventional ameloblastoma using microarray and bioinformatics analysis. Methods: Gene expression microarray and bioinformatic analyses were performed using CHIP H10KA and DAVID software for enrichment. Protein-protein interactions (PPI) were visualized using STRING-Cytoscape with the MCODE plugin, followed by Kaplan-Meier and GEPIA analyses, which were used to propose candidate genes. RT-qPCR and IHC assays were performed to validate the bioinformatic approach. Results: 376 upregulated genes were identified. PPI analysis revealed 14 genes that were validated by Kaplan-Meier and GEPIA, resulting in PDGFA and IL2RA as candidate genes. The RT-qPCR analysis confirmed their intense expression. Immunohistochemistry analysis showed that PDGFA expression is located in the parenchyma. Conclusion: With bioinformatics methods, we can identify upregulated genes in conventional ameloblastoma, and with RT-qPCR and immunoexpression analysis we can validate that PDGFA could be a more specific and localized therapeutic target.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Objective: To use bioinformatics methods to screen potential miRNAs as biomarkers for postmenopausal osteoporosis (PMO). Methods: The miRNA chip expression profile of PMO peripheral blood was obtained from the GEO public database. First, the chip was re-annotated using the R language, and the clinical typing significance of the data was then determined using analysis of similarities (ANOSIM). Next, weighted gene co-expression network analysis (WGCNA), multi-scale embedded gene co-expression network analysis (MEGCNA), and nonnegative matrix factorization (NMF) were used to screen miRNAs related to PMO. Finally, the diagnostic efficacy of the miRNAs was evaluated using ROC curves, the target genes of the miRNAs were predicted using a database, and functional enrichment of the target genes was performed using Metascape. Results: miR-223-3p has a high predictive diagnostic value for PMO. GO and KEGG enrichment analysis was conducted on 34 target genes potentially regulated by miR-223-3p, and the results showed that multiple pathways were associated with bone development. Conclusion: miR-223-3p has significance in the diagnosis of PMO and may regulate bone development by regulating downstream target genes.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Objective: Bioinformatics methods were used to investigate the pathogenesis, disease-characteristic genes, and immune infiltration of obesity (OB) and nonalcoholic steatohepatitis (NASH), and to explore the correlation between disease-characteristic genes and immune cells. Methods: OB- and NASH-related chips were obtained from the GEO database. The R language was used for differential gene analysis and WGCNA; GO and KEGG enrichment analyses were performed on the intersecting genes, and a protein-protein interaction network was constructed at the same time. Key genes were selected using 12 cytoHubba methods; ROC curves and sample chips were used to assess the accuracy of the key genes, and the disease-characteristic genes with the best performance were selected. The CIBERSORT algorithm was then used to analyze immune infiltration in OB and NASH, and the correlation between disease-characteristic genes and immune cells was analyzed. Results: A total of 235 differential genes were obtained in the obesity training group (GSE25401 and GSE151839), and 804 differential genes were obtained in the nonalcoholic steatohepatitis training group (GSE63067 and GSE89632). GO analysis mainly pointed to the regulation of interleukin 8. KEGG analysis showed that the Polycomb repressive complex and other pathways were closely related to OB and NASH. The key genes IL6, IL1B, IL1RN, VCAN and TNFAIP6 were selected by the 12 cytoHubba methods. ROC curves and sample chips were used to test the disease-characteristic genes, and VCAN and IL1RN performed best. Conclusion: The OB and NASH characteristic genes VCAN and IL1RN are significantly correlated with immune cells, which provides a preliminary basis for further research on targeted diagnosis and treatment of OB and NASH.
Output files from the No 4. Taxonomic Workflow page of the SWELTR high-temp study. In this workflow we used the microeco package for taxonomic assessment. We first converted each phyloseq object into a microtable object using the file2meco package.
taxa_wf.rdata : contains all variables and phyloseq objects from the 16S rRNA and ITS ASV taxonomic assessment. To see the objects, in R run load("taxa_wf.rdata", verbose=TRUE)
Additional files:
For convenience, we also include individual phyloseq and microtable objects (collected in zip files).
_**ITS (its_taxa_objects.zip):**_
its18_ps_work_me.rds : microtable object for the FULL (unfiltered) ITS data.
its18_ps_filt_me.rds : microtable object for the Arbitrary filtered ITS data.
its18_ps_perfect_me.rds : microtable object for the PERfect ITS data.
its18_ps_pime_me.rds : microtable object for the PIME ITS data.
_**16S rRNA (ssu_taxa_objects.zip):**_
ssu18_ps_work_me.rds : microtable object for the FULL (unfiltered) 16S rRNA data.
ssu18_ps_filt_me.rds : microtable object for the Arbitrary filtered 16S rRNA data.
ssu18_ps_perfect_me.rds : microtable object for the PERfect 16S rRNA data.
ssu18_ps_pime_me.rds : microtable object for the PIME 16S rRNA data.
For one of the 16S rRNA analyses we looked at family-level diversity of major bacterial phyla. For this analysis, we renamed NA ranks by the next highest named rank. For example, ASV13884 was unclassified at family level, so the NA was replaced with the next highest named rank (in this case order). Therefore the family-level classification for this ASV was changed to _o_Polyangiales_. Doing this allowed us to include unclassified abundance in our analyses. We include the following phyloseq objects containing the modified taxonomies.
ssu18_ps_work_clean.rds : modified phyloseq object for the FULL (unfiltered) 16S rRNA data.
ssu18_ps_filt_clean.rds : modified phyloseq object for the Arbitrary filtered 16S rRNA data.
ssu18_ps_perfect_clean.rds : modified phyloseq object for the PERfect filtered 16S rRNA data.
ssu18_ps_pime_clean.rds : modified phyloseq object for the PIME filtered 16S rRNA data.
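The renaming of NA ranks by the next highest named rank, described above for ASV13884, can be sketched as follows. This is a Python illustration of the logic only, not the workflow's own R code. The rank list, the function name `fill_unclassified`, the upper-rank names in the example, and the behavior for ranks below family are assumptions; only the family-level result `o_Polyangiales` comes from the description.

```python
# Rank order assumed for illustration; the actual objects may use other names.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]


def fill_unclassified(taxonomy, ranks=RANKS):
    """Replace missing (None) ranks with '<initial>_<name>' of the nearest
    higher rank that has a name, e.g. Family None -> 'o_Polyangiales'."""
    filled = {}
    last_named = None  # (rank, name) of the closest named rank seen so far
    for rank in ranks:
        name = taxonomy.get(rank)
        if name is not None:
            filled[rank] = name
            last_named = (rank, name)
        elif last_named is not None:
            prefix = last_named[0][0].lower()  # 'Order' -> 'o'
            filled[rank] = f"{prefix}_{last_named[1]}"
        else:
            filled[rank] = None  # nothing named above this rank
    return filled


if __name__ == "__main__":
    # Upper ranks here are illustrative placeholders for the example ASV.
    asv = {"Kingdom": "Bacteria", "Phylum": "Myxococcota", "Class": "Polyangia",
           "Order": "Polyangiales", "Family": None, "Genus": None}
    print(fill_unclassified(asv)["Family"])
```

Carrying the named rank downward keeps unclassified reads in family-level summaries instead of dropping them as NA.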
Source code for the workflow can be found here:
https://github.com/sweltr/high-temp/blob/master/taxa.Rmd
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Objective: This study aims to construct a recombinant expression plasmid using methods such as PCR and double enzyme digestion. By conducting single-factor experiments to adjust IPTG concentration, expression time, and expression temperature, the research seeks to optimize the expression conditions of the hnRNP A1 protein in BL21 competent cells. The goal is to obtain high-concentration, high-quality purified protein and to prepare high-titer polyclonal antibodies against hnRNP A1. Methods: Bioinformatics tools were used to analyze the physicochemical properties and structure of hnRNP A1. The pET-28a-hnRNP A1 recombinant plasmid was constructed and transformed into BL21 cells. After optimizing expression conditions, hnRNP A1 protein was purified using nickel column chromatography and identified by Western Blot. The purified protein was used to immunize C57BL/6 mice to produce polyclonal antibodies, and the antibody titer was determined by indirect ELISA. Results: The highest expression of hnRNP A1 was achieved under conditions of 0.4 mM IPTG, induction temperature of 42°C, and induction time of 8 hours. The purified protein concentration reached 2.0563 μg/μl, and Western Blot confirmed the target protein. The antibody titer detected by indirect ELISA was 1:409,600. Conclusion: The physicochemical properties of hnRNP A1 were successfully analyzed, high-efficiency expression of hnRNP A1 protein was achieved, and high-titer mouse-derived polyclonal antibodies against hnRNP A1 were prepared, providing a valuable tool for further research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: Eating disorders develop through a combination of genetic vulnerability and environmental stress; however, the genetic basis of this risk is unknown. Methods: To understand the genetic basis of this risk, we performed whole exome sequencing on 93 unrelated individuals with eating disorders (38 restricted-eating and 55 binge-eating) to identify novel damaging variants. Candidate genes with an excessive burden of predicted damaging variants were then prioritized based upon an unbiased, data-driven bioinformatic analysis. One top candidate pathway was empirically tested for therapeutic potential in a mouse model of binge-like eating. Results: An excessive burden of novel damaging variants was identified in 186 genes in the restricted-eating group and 245 genes in the binge-eating group. This list is significantly enriched (OR = 4.6, p
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This workflow adapts the approach and parameter settings of Trans-Omics for Precision Medicine (TOPMed). The RNA-seq pipeline originated from the Broad Institute. There are in total five steps in the workflow starting from:
For testing and analysis, the workflow authors provided example data created by down-sampling the read files of a TOPMed public-access dataset. Chromosome 12 was extracted from the Homo sapiens assembly 38 reference sequence and provided by the workflow authors. The required GTF and RSEM reference data files are also provided. The workflow is well documented, and a detailed set of instructions for the steps performed to down-sample the data is also provided for transparency. The availability of example input data, the use of containerization for the underlying software, and the detailed documentation are important factors in choosing this specific CWL workflow for CWLProv evaluation.
This dataset folder is a CWLProv Research Object that captures the Common Workflow Language execution provenance, see https://w3id.org/cwl/prov/0.5.0 or use https://pypi.org/project/cwl
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset folder is a CWLProv Research Object that captures the Common Workflow Language execution provenance, see CWLProv 0.6.0 or use the cwlprov Python tool to explore.
The CWL alignment workflow included in this case study was designed by Data Biosphere. It adapts the alignment pipeline originally developed at the Abecasis Lab, University of Michigan. This workflow is part of the NIH Data Commons initiative and comprises four stages.
The first step, Pre-align, accepts a Compressed Alignment Map (CRAM) file (a compressed format for BAM files developed by the European Bioinformatics Institute (EBI)) and the human genome reference sequence as input. Using SAMtools utilities such as view, sort and fixmate, it returns a list of fastq files that can be used as input for the next step.
The next step, Align, also accepts the human reference genome as input along with the output files from Pre-align and uses BWA-mem to generate aligned reads as BAM files. SAMBLASTER is used to mark duplicate reads and SAMtools view to convert read files from SAM to BAM format.
The BAM files generated after Align are sorted with SAMtools sort.
Finally, in the Post-align step, these sorted alignment files are merged into a single sorted BAM file using SAMtools merge.
Steps to reproduce
This analysis was run using a 16-core Linux cloud instance with 64GB RAM and pre-installed docker.
Install gsutil:
export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)"
echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | \
sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
sudo apt-key add -
sudo apt-get update && sudo apt-get install google-cloud-sdk
Get the data and make the analysis environment ready:
git clone https://github.com/FarahZKhan/topmed-workflows.git
cd topmed-workflows
git checkout cwlprov_testing
cd aligner/sbg-alignment-cwl
# This custom script downloads the Google bucket files listed in a JSON file and creates a local JSON.
# It needs gsutil to be installed.
git clone https://github.com/DailyDreaming/fetch_gs_frm_json.git
# Wait... this should download ~18Gb.
python2.7 fetch_gs_frm_json/dl_gsfiles_frm_json.py topmed-alignment.sample.json
Run the following commands to create the CWLProv Research Object:
time cwltool --no-match-user --provenance alignmnentwf0.6.0 --tmp-outdir-prefix=/CWLProv_workflow_testing/intermediate_temp/temp --tmpdir-prefix=/CWLProv_workflow_testing/intermediate_temp/temp topmed-alignment.cwl topmed-alignment.sample.json.new
zip -r alignment_0.6.0_linux.zip alignment_0.6.0_linux
sha256sum alignment_0.6.0_linux.zip > alignment_0.6.0_linux.zip.sha25
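The final sha256sum step records a checksum for the zipped Research Object. As a hedged illustration of how such a checksum could later be re-verified without the command-line tool, here is a minimal Python sketch. The function names and the assumption that the checksum file has the usual `<hex digest>  <file name>` layout written by sha256sum are mine, not part of the original steps.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks so
    that large archives do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(archive_path, checksum_path):
    """Compare an archive against a checksum file whose first
    whitespace-separated field is the expected hex digest (assumed layout)."""
    with open(checksum_path) as handle:
        expected = handle.read().split()[0].lower()
    return sha256_of_file(archive_path) == expected
```

Calling `matches_checksum_file` with the archive and its checksum file would return True only if the archive is byte-for-byte unchanged since the checksum was written.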
https://www.promarketreports.com/privacy-policy
Reagents and Consumables: This segment includes consumables such as enzymes, buffers, and columns.
Instruments: This segment includes instruments such as sequencers, mass spectrometers, and chromatographs.
Others: This segment includes services such as sample preparation and data analysis.
Recent developments include: BSI's objective is to make a positive influence in proteomic research, particularly by offering professionally supported software. Bioinformatics Solutions Inc. creates powerful algorithms based on cutting-edge research to solve basic bioinformatics difficulties. This small, agile team is dedicated to meeting the demands of pharmaceutical, biotechnological, and academic scientists, as well as advancing drug discovery research. The firm, started in 2000 in Waterloo, Canada, comprises a bright, award-winning team of developers, scientists, and salespeople. Charles River Laboratories International, Inc. offers drug discovery and development solutions such as research models and related services, as well as outsourced preclinical services. The business is divided into two divisions. The firm produces and sells research models, mostly genetically and virally specified purpose-bred rats and mice, with roughly 150 distinct strains. It also offers a variety of complementary services to help clients support the usage of research models in medication development.
Notable trends are: Increased use of digital manufacturing processes to propel market growth.
BioCompute is shorthand for the IEEE 2791-2020 standard for Bioinformatics Analyses Generated by High-Throughput Sequencing (HTS) to facilitate communication. This pipeline documentation approach has been adopted by a few FDA centers. The goal is to ease the communication burdens between research centers, organizations, and industries. This web portal allows users to build BioCompute Objects through the interface in a human- and machine-readable format.
Output files from the No 5. Alpha diversity Workflow page of the SWELTR high-temp study. In this workflow we used Hill numbers to assess alpha diversity across temperature treatments.
alpha_wf.rdata : contains all variables and phyloseq objects from the 16S rRNA and ITS ASV alpha diversity assessment. To see the objects, in R run load("alpha_wf.rdata", verbose=TRUE)
Additional files:
For convenience, we also include individual phyloseq objects (collected in zip files) where Hill numbers have been added to the sample data tables.
_**ITS (its_alpha_objects.zip):**_
its18_ps_work.rds : phyloseq object for the FULL (unfiltered) ITS data.
its18_ps_filt.rds : phyloseq object for the Arbitrary filtered ITS data.
its18_ps_perfect.rds : phyloseq object for the PERfect ITS data.
its18_ps_pime.rds : phyloseq object for the PIME ITS data.
_**16S rRNA (ssu_alpha_objects.zip):**_
ssu18_ps_work.rds : phyloseq object for the FULL (unfiltered) 16S rRNA data.
ssu18_ps_filt.rds : phyloseq object for the Arbitrary filtered 16S rRNA data.
ssu18_ps_perfect.rds : phyloseq object for the PERfect 16S rRNA data.
ssu18_ps_pime.rds : phyloseq object for the PIME 16S rRNA data.
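For readers unfamiliar with Hill numbers, the sketch below shows the standard definition: for relative abundances p_i and order q, the diversity is (sum of p_i^q)^(1/(1-q)), with the q = 1 case defined as the exponential of Shannon entropy. This is a generic Python illustration, not the workflow's R code; the function name and the example counts are hypothetical.

```python
import math


def hill_number(counts, q):
    """Hill number (effective number of taxa) of order q for a vector of
    abundance counts. q=0 gives richness, q=1 the exponential of Shannon
    entropy, q=2 the inverse Simpson index."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    props = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # Limit of the general formula as q approaches 1.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    sample = [90, 5, 5]  # hypothetical ASV counts for one sample
    for order in (0, 1, 2):
        print(order, hill_number(sample, order))
```

Higher orders weight abundant taxa more heavily, so an uneven community shows a falling profile from q = 0 to q = 2, while a perfectly even one gives the same value at every order.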
Source code for the workflow can be found here:
https://github.com/sweltr/high-temp/blob/master/alpha.Rmd
https://spdx.org/licenses/CC0-1.0.html
Libraries of structural prototypes that abstract protein local structures are known as structural alphabets and have proven to be very useful in various aspects of protein structure analysis and prediction. One such library, Protein Blocks, is composed of 16 standard 5-residue-long structural prototypes. This form of analysis involves describing a protein's structure as a string of Protein Blocks. Predicting the local structure of a protein in terms of Protein Blocks is the general objective of this work. A new approach, PB-kPRED, is proposed towards this aim. It involves (i) organizing the structural knowledge in the form of a database of pentapeptide fragments extracted from all protein structures in the PDB and (ii) applying a knowledge-based algorithm that does not rely on any secondary structure predictions and/or sequence alignment profiles to scan this database and predict the most probable backbone conformations for the protein local structures. PB-kPRED nevertheless preferentially uses structural information from homologues when it is available. The predictions were evaluated rigorously on 15,544 query proteins representing a non-redundant subset of the PDB filtered at a 30% sequence identity cut-off. We have shown that the kPRED method was able to achieve mean accuracies ranging from 40.8% to 66.3% depending on the availability of homologues. The impact of the different strategies for scanning the database on the prediction was evaluated and is discussed. Our results highlight the usefulness of the method in the context of proteins without any known structural homologues. A scoring function that gives a good estimate of the accuracy of prediction was further developed. This score estimates the accuracy of the algorithm very well (R² of 0.82). An online version of the tool is provided freely for non-commercial usage at http://www.bo-protscience.fr/kpred/.
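The idea of scanning a database of pentapeptide fragments can be illustrated with a toy lookup. The sketch below is a strong simplification under stated assumptions and is not the PB-kPRED algorithm: the database is a hypothetical dictionary mapping each pentapeptide to counts of observed Protein Block letters, the most frequent letter is assigned to the central residue of each window, and homologue preference and scoring are not modeled.

```python
from collections import Counter


def pentapeptides(sequence):
    """Yield (index of central residue, 5-residue fragment) for each
    overlapping window along the sequence (0-based indices)."""
    for start in range(len(sequence) - 4):
        yield start + 2, sequence[start:start + 5]


def predict_blocks(sequence, fragment_db):
    """Assign to each central residue the most frequently observed Protein
    Block letter for its pentapeptide; '-' marks residues with no window
    or no database hit. fragment_db is a hypothetical stand-in database."""
    prediction = ["-"] * len(sequence)
    for center, fragment in pentapeptides(sequence):
        observed = fragment_db.get(fragment)
        if observed:
            prediction[center] = observed.most_common(1)[0][0]
    return "".join(prediction)


if __name__ == "__main__":
    # Toy database with made-up counts, for illustration only.
    toy_db = {"ACDEF": Counter({"m": 3, "d": 1}), "CDEFG": Counter({"m": 2})}
    print(predict_blocks("ACDEFG", toy_db))
```

The first and last two residues never sit at the center of a 5-residue window, which is why they stay unassigned in this sketch.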
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rabies is caused by lyssaviruses and is one of the oldest known zoonoses. In recent years, more than 21,000 nucleotide sequences of rabies viruses (RABV), from the prototype species rabies lyssavirus, have been deposited in public databases. Subsequent phylogenetic analyses in combination with metadata suggest geographic distributions of RABV. However, these analyses face technical difficulties in defining verifiable criteria for cluster allocations in phylogenetic trees, which calls for a more rational approach. Therefore, we applied a relatively new mathematical clustering algorithm named ‘affinity propagation clustering’ (AP) to propose a standardized sub-species classification utilizing full-genome RABV sequences. Because AP is computationally fast and works for any meaningful measure of similarity between data samples, it has previously been applied successfully in bioinformatics for the analysis of microarray and gene expression data; however, cluster analysis of sequences is still in its infancy. Existing (516) and original (46) full-genome RABV sequences were used to demonstrate the application of AP for RABV clustering. On a global scale, AP proposed four clusters, i.e. New World cluster, Arctic/Arctic-like, Cosmopolitan, and Asian, as previously assigned by phylogenetic studies. By combining AP with established phylogenetic analyses, it is possible to resolve phylogenetic relationships between verifiably determined clusters and sequences. This workflow will be useful in confirming cluster distributions in a uniform, transparent manner, not only for RABV but also for other comparative sequence analyses.
Objective: The objective of the present study was to determine a target gene and explore the molecular mechanisms involved in the pathogenesis of HER-2-positive breast cancer. Methods: Three RNA expression profiles obtained from the Gene Expression Omnibus (GEO) and the Cancer Genome Atlas (TCGA) were used to identify differentially expressed genes (DEGs) using the R software. A protein-protein interaction network was then constructed, and hub genes were determined. Subsequently, the relationship between clinical parameters and hub genes was examined to screen for target genes. Next, DNA methylation and genomic alterations of the target gene were evaluated. To further explore potential molecular mechanisms, a functional enrichment analysis of genes coexpressed with the target gene was performed. Results: The differential expression analysis revealed 217 DEGs in HER-2-positive breast cancer samples compared to normal breast tissues. RRM2 was the only hub gene closely associated with lymphatic metastasis and the patients’ prognosis. Additionally, RRM2 was found to be consistently amplified and negatively associated with the level of methylation. Functional enrichment analysis showed that the coexpressed genes were mainly involved in cell cycle regulation. Conclusions: RRM2 was identified as a target gene associated with the initiation, progression, and prognosis of HER-2-positive breast cancer, which may be considered as a new biomarker and therapeutic target.
Objectives: The goal of our bioinformatics study was to comprehensively analyze the association between all calpain family members and the progression and prognosis of hepatocellular carcinoma (HCC). Methods: The data were collected from The Cancer Genome Atlas (TCGA). The landscape of gene expression, copy number variation (CNV), mutation, and DNA methylation of calpain members was analyzed. Clustering analysis was performed to stratify the calpain-related groups. The least absolute shrinkage and selection operator (LASSO)-based Cox model was used to select hub survival genes. Results: We found that 14 out of 16 calpain members were differentially expressed between tumor and normal tissues of HCC. The clustering analyses revealed high- and low-risk calpain groups that differed in prognosis. We found that the high-risk calpain group had higher B cell infiltration and higher expression of the immune checkpoint genes HAVCR2, PDCD1, and TIGIT. The CMap analysis found that the histone deacetylase (HDAC) inhibitor trichostatin A and the PI3K-AKT-mTOR pathway inhibitors LY-294002 and wortmannin might have a therapeutic effect on the high-risk calpain group. The DEGs between calpain groups were identified. Subsequent univariate Cox analysis of each DEG and the LASSO-based Cox model yielded a calpain-related prognostic signature. The risk score model of this signature showed good ability to predict the overall survival of HCC patients in TCGA datasets and in external validation datasets from the Gene Expression Omnibus database and the International Cancer Genome Consortium database. Conclusion: We found that calpain family members were associated with the progression, prognosis, and drug response of HCC. Our results require further studies to confirm.