100+ datasets found
  1. STRING Network Analysis

    • figshare.com
    Updated May 22, 2025
    Cite
    Dain Lee (2025). STRING Network Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.29126396.v2
    Dataset updated
    May 22, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Dain Lee
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains the protein-protein interaction analysis dataset that was used in the unpublished manuscript and was further analyzed with the STRING online software. Significantly upregulated mRNAs (2,777 genes; p < 0.05) identified by bulk RNA-seq were analyzed using the STRING module in Cytoscape v.2.2.0 (Institute for Systems Biology; WA; USA). A cluster network was constructed using the MCL algorithm with a granularity parameter of 4, followed by filtering nodes with mcl.cluster > 10. The resulting 1,848 nodes were processed through STRING v12.0 (Swiss Institute of Bioinformatics; Lausanne; Switzerland) to generate a protein–protein interaction (PPI) network, incorporating evidence from text mining, genomic neighborhood, experimental data, curated databases, co-expression, gene fusion, and co-occurrence, with a minimum confidence score threshold of 0.40. Network modules were defined using the DBSCAN clustering algorithm with an ε parameter of 2. Cluster 1, representing the largest gene set (101 genes), was further analyzed by sorting the top 20 nodes with the highest node degree, resulting in a network comprising 101 nodes and 756 edges. Global network metrics indicated an average node degree of 15, a local clustering coefficient of 0.600, and a PPI enrichment p-value of < 1 × 10⁻¹⁶. The average values of coexpression, experimentally determined interactions, automated text mining, and combined scores were calculated.
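
    As a quick consistency check on the figures above, the average node degree of an undirected network is twice the edge count divided by the node count. The sketch below applies that to the reported 101 nodes and 756 edges and also shows one possible way to rank nodes by degree. The function names and the toy edge list are illustrative assumptions and are not part of the dataset.

```python
from collections import Counter


def average_node_degree(num_nodes: int, num_edges: int) -> float:
    """Mean degree of an undirected graph: every edge touches two nodes."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return 2 * num_edges / num_nodes


def top_degree_nodes(edges, k=20):
    """Return up to k node names ordered by decreasing degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return [node for node, _ in degree.most_common(k)]


# Node and edge counts reported in the description for Cluster 1.
cluster1_avg_degree = average_node_degree(num_nodes=101, num_edges=756)
print(round(cluster1_avg_degree, 2))  # 14.97

# Toy edge list (hypothetical node names) to illustrate the ranking step.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(top_degree_nodes(toy_edges, k=1))  # ['A']
```

    The computed value of about 14.97 agrees with the reported average node degree of 15.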

  2. STRING-Protein-Protein-Interactions-Network

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). STRING-Protein-Protein-Interactions-Network [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/string-protein-protein-interactions-network
    Available download formats: zip (6368384 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset representing a Protein-Protein Interaction (PPI) network of human proteins. Data generated and scored using the comprehensive STRING database resource. Focuses on analyzing functional and physical associations between proteins. Includes confidence scores (e.g., text-mining, experimental) for each interaction. A foundational resource for systems biology and identifying molecular hubs in disease pathways.

  3. PPI prediction data (STRING 12.0 based)

    • zenodo.org
    bin, tsv
    Updated Oct 15, 2024
    Cite
    Konstantin Volzhenin (2024). PPI prediction data (STRING 12.0 based) [Dataset]. http://doi.org/10.5281/zenodo.13936160
    Available download formats: bin, tsv
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Konstantin Volzhenin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An extensive dataset of binary physical protein-protein interactions extracted from STRING 12.0 (>12,000 organisms) with artificially generated negatives. The dataset includes 72M positive pairs with STRING confidence scores > 0.9 and 720M negative pairs. The corresponding protein sequences are located in the .fasta files. The procedure for generating the negatives was derived from https://doi.org/10.1016/j.isci.2024.110371

  4. STRING database ver.10.5 archive 9606 data sets

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Elena Sugis (2023). STRING database ver.10.5 archive 9606 data sets [Dataset]. http://doi.org/10.6084/m9.figshare.7857185.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Elena Sugis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data collection contains the data sets related to human (9606) that were previously deposited as separate datasets in STRING ver.10.5, before the structure of the download files changed with the release of ver.11.0.

  5. Protein interaction data for 222 BM zone components

    • figshare.com
    xlsx
    Updated Feb 6, 2022
    Cite
    Mychel Morais; Ranjay Jayadev; Rachel Lennon; David Sherwood; Jamie Ellingford; Craig Lawless (2022). Protein interaction data for 222 BM zone components [Dataset]. http://doi.org/10.6084/m9.figshare.19127504.v1
    Available download formats: xlsx
    Dataset updated
    Feb 6, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mychel Morais; Ranjay Jayadev; Rachel Lennon; David Sherwood; Jamie Ellingford; Craig Lawless
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All human protein interactions were obtained from STRING (https://string-db.org/, version 11.0). Interactions were then filtered to those involving only BM zone proteins. Related to Fig. S6B.

  6. Classification results of PPI predictions on the STRING database.

    • datasetcatalog.nlm.nih.gov
    Updated Apr 23, 2020
    Cite
    Lee, Kyubum; Chen, Qingyu; Wei, Chih-Hsuan; Yan, Shankai; Kim, Sun; Lu, Zhiyong (2020). Classification results of PPI predictions on the STRING database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000489034
    Dataset updated
    Apr 23, 2020
    Authors
    Lee, Kyubum; Chen, Qingyu; Wei, Chih-Hsuan; Yan, Shankai; Kim, Sun; Lu, Zhiyong
    Description

    Combined-scores: PPIs that have combined scores are considered positive cases. Experimental-700: PPIs that have experimental scores over 700 are considered positive cases. Direct comparison: the results of embeddings using the same method (cbow) and same hyperparameters. Different embedding methods: the results of BioConceptVec (skip-gram), BioConceptVec (GloVe) and BioConceptVec (fastText). The highest results of each section are marked as bold.

  7. Data for RAPPPID: Towards Generalisable Protein Interaction Prediction with...

    • explore.openaire.eu
    Updated Jun 23, 2022
    Cite
    Joseph Szymborski; Amin Emad (2022). Data for RAPPPID: Towards Generalisable Protein Interaction Prediction with AWD-LSTM Twin Networks [Dataset]. http://doi.org/10.5281/zenodo.6709789
    Dataset updated
    Jun 23, 2022
    Authors
    Joseph Szymborski; Amin Emad
    Description

    Data for RAPPPID, a method for the Regularised Automative Prediction of Protein-Protein Interactions using Deep Learning. These datasets are in a format that RAPPPID is ready to read.

    Comparatives Dataset: These datasets were derived from the STRING v11 H. sapiens dataset, according to the C1, C2, and C3 procedures outlined by Park and Marcotte, 2012. Negative samples are sampled randomly from the space of proteins not known to interact. See Szymborski & Emad for details.

    Repeatability Datasets: The following datasets are all derived from STRING in the same manner as the comparatives dataset, but three different random seeds are used for drawing proteins.

    References

    Park, Y. and Marcotte, E.M. (2012) Flaws in evaluation schemes for pair-input computational predictions. Nat Methods, 9, 1134–1136.

    Szklarczyk, D., Gable, A. L., Lyon, D., Junge, A., Wyder, S., Huerta-Cepas, J., Simonovic, M., Doncheva, N. T., Morris, J. H., Bork, P., Jensen, L. J., and Mering, C. (2019). String v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research, 47(D1), D607–D613.

    Szymborski, J. and Emad, A. (2021) RAPPPID: Towards Generalisable Protein Interaction Prediction with AWD-LSTM Twin Networks. bioRxiv https://doi.org/10.1101/2021.08.13.456309
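
    The C1, C2, and C3 classes mentioned in this description are commonly summarised by how many proteins of a test pair also occur in the training set: both (C1), exactly one (C2), or neither (C3). The sketch below illustrates that summary on toy data. The function name and protein identifiers are hypothetical, and the exact construction used for these datasets is the one described by Szymborski & Emad, not this snippet.

```python
def park_marcotte_class(pair, train_proteins):
    """Label a test pair by how many of its proteins were seen in training."""
    seen = sum(protein in train_proteins for protein in pair)
    return {2: "C1", 1: "C2", 0: "C3"}[seen]


# Toy training pairs and test pairs (hypothetical identifiers).
train_pairs = [("P1", "P2"), ("P2", "P3")]
train_proteins = {protein for pair in train_pairs for protein in pair}

test_pairs = [("P1", "P3"), ("P1", "P9"), ("P8", "P9")]
classes = [park_marcotte_class(pair, train_proteins) for pair in test_pairs]
print(classes)  # ['C1', 'C2', 'C3']
```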

  8. Validation of the total new predicted links and the new predicted links...

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Wei Zhang; Jia Xu; Yuanyuan Li; Xiufen Zou (2023). Validation of the total new predicted links and the new predicted links associated with the 10 proteins by STRING database for the 14317_PPI data. [Dataset]. http://doi.org/10.1371/journal.pone.0177029.t009
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Wei Zhang; Jia Xu; Yuanyuan Li; Xiufen Zou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Validation of the total new predicted links and the new predicted links associated with the 10 proteins by STRING database for the 14317_PPI data.

  9. Evaluating homophily of human PPI with respect to chromosomes

    • data.niaid.nih.gov
    Updated Jul 30, 2022
    Cite
    Apollonio, Nicola; Blankenberg, Daniel; Cumbo, Fabio; Franciosa, Paolo Giulio; Santoni, Daniele (2022). Evaluating homophily of human PPI with respect to chromosomes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6941314
    Dataset updated
    Jul 30, 2022
    Dataset provided by
    Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council of Italy, Via dei Taurini 19, 00185 Rome, Italy
    Department of Statistical Science, University of Rome "La Sapienza", Piazzale Aldo Moro 5, 00185 Rome, Italy
    Institute for applied mathematics "Mauro Picone", National Research Council of Italy, Via dei Taurini 19, 00185 Rome, Italy
    Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA
    Authors
    Apollonio, Nicola; Blankenberg, Daniel; Cumbo, Fabio; Franciosa, Paolo Giulio; Santoni, Daniele
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Homophily/heterophily evaluation, expressed in terms of z-score values, is related to the human Protein-Protein Interaction Network (PPI), obtained from the STRING v11.5 database (https://string-db.org) setting standard threshold on edge score (T=700). Each protein occurring in the PPI was assigned to a class corresponding to the chromosome the related gene belongs to.

    A total of 23 classes (chr1, chr2, ..., chr22, chrX) were considered (excluding the class corresponding to chromosome Y because of the small number of genes occurring in the network).

    The homophily/heterophily nature of the network, with respect to chromosome classes, was evaluated through HONTO tool (https://github.com/cumbof/honto).

    In other words, the tendency of proteins to preferentially interact with proteins whose genes are physically located on the same chromosome (homophily) or on different chromosomes (heterophily) was investigated and evaluated in terms of z-scores.

    Values related to intra (along the diagonal) and inter chromosomal interactions (other than the diagonal) are also reported as a heatmap.

    Values on the diagonal are clearly higher than off-diagonal values, which indicates a homophilic network and confirms the link between shared chromosome and interaction in the PPI.
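
    To make the setup concrete, the sketch below filters a toy edge list at the stated score threshold (T=700) and reports the fraction of retained edges that join two proteins of the same chromosome class. This raw fraction is only an illustration and is not the z-score computed by HONTO. Treating the threshold as inclusive, along with the function name and toy data, is an assumption.

```python
def intra_class_fraction(edges, node_class, threshold=700):
    """Fraction of edges at or above the threshold that join same-class nodes."""
    kept = [(a, b) for a, b, score in edges if score >= threshold]
    if not kept:
        return 0.0
    same = sum(node_class[a] == node_class[b] for a, b in kept)
    return same / len(kept)


# Toy network: (protein_a, protein_b, edge score on a 0-1000 scale).
toy_edges = [
    ("A", "B", 900),
    ("A", "C", 750),
    ("B", "D", 400),  # dropped: below the threshold
    ("C", "D", 700),
]
toy_classes = {"A": "chr1", "B": "chr1", "C": "chr2", "D": "chr2"}

fraction = intra_class_fraction(toy_edges, toy_classes)
print(round(fraction, 3))  # 0.667
```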

  10. Selection of 30 central genes from PPI network, including 17 upregulated and...

    • datasetcatalog.nlm.nih.gov
    Updated Jun 11, 2021
    Cite
    Li, Ping; Wang, Xiaoming; Chen, Xuewei; Dong, Lijin; Fan, Rong (2021). Selection of 30 central genes from PPI network, including 17 upregulated and 13 downregulated genes, by using the STRING and Cytoscape software. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000899333
    Dataset updated
    Jun 11, 2021
    Authors
    Li, Ping; Wang, Xiaoming; Chen, Xuewei; Dong, Lijin; Fan, Rong
    Description

    Selection of 30 central genes from PPI network, including 17 upregulated and 13 downregulated genes, by using the STRING and Cytoscape software.

  11. Flow edges file

    • figshare.com
    txt
    Updated Nov 21, 2022
    Cite
    Aaron Baker (2022). Flow edges file [Dataset]. http://doi.org/10.6084/m9.figshare.21588270.v1
    Available download formats: txt
    Dataset updated
    Nov 21, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aaron Baker
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Snapshot of 9606.protein.links.full.v10.5.experiments.abc.txt from https://string-db.org/

  12. Supplementary data 5. Autoimmune protein-protein interaction network...

    • catalogue.data.govt.nz
    Cite
    Supplementary data 5. Autoimmune protein-protein interaction network (Ai-PPIN) data - Dataset - data.govt.nz - discover and use data [Dataset]. https://catalogue.data.govt.nz/dataset/oai-figshare-com-article-14274659
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PPI network was constructed using the genes that are regulated by the SNPs associated with 18 AiDs. STRING PPI data was used for building the network. The list of proteins present in the Ai-PPIN and the edgelist of the network is provided here.

  13. clustered_ppi_string_dedup

    • huggingface.co
    Updated Feb 5, 2026
    Cite
    Synthyra (2026). clustered_ppi_string_dedup [Dataset]. https://huggingface.co/datasets/Synthyra/clustered_ppi_string_dedup
    Dataset updated
    Feb 5, 2026
    Dataset authored and provided by
    Synthyra
    Description

    Clustered PPI datasets (BIOGRID + STRING) with sequence-disjoint splits

    This dataset repo contains multiple dataset variants of protein–protein interactions (PPIs), built by clustering proteins by sequence similarity and then constructing train/valid/test splits that are intended to be disjoint at the protein level (and thus hard to memorize via near-identical sequences). Artifacts are stored as compressed pickles (*.pkl.gz). A helper downloader exists in this repo:… See the full description on the dataset page: https://huggingface.co/datasets/Synthyra/clustered_ppi_string_dedup.
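
    As a rough illustration of working with such artifacts, the sketch below reads a gzip-compressed pickle and checks that three splits share no protein identifiers. The internal layout of the real files is not stated here, so representing a split as a list of (protein_a, protein_b) pairs is an assumption, as are all function names. Note that an identifier-level check is weaker than the sequence-cluster disjointness the description refers to.

```python
import gzip
import pickle


def load_pickle_gz(path):
    """Load one compressed pickle. Only unpickle files from a source you trust."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


def proteins_in(pairs):
    """Collect every protein identifier that appears in a list of (a, b) pairs."""
    return {protein for pair in pairs for protein in pair}


def splits_are_protein_disjoint(train, valid, test):
    """True when no protein identifier is shared between any two splits."""
    tr, va, te = proteins_in(train), proteins_in(valid), proteins_in(test)
    return tr.isdisjoint(va) and tr.isdisjoint(te) and va.isdisjoint(te)


# Toy splits standing in for the real artifacts (hypothetical identifiers).
train = [("P1", "P2"), ("P2", "P3")]
valid = [("P4", "P5")]
test = [("P6", "P7")]
print(splits_are_protein_disjoint(train, valid, test))  # True
```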

  14. Supplementary Materials to Untargeted metabolomics and label-free...

    • figshare.com
    docx
    Updated Jan 19, 2022
    Cite
    Adhita Sri Prabakusuma (2022). Supplementary Materials to Untargeted metabolomics and label-free quantitative proteomics analysis of whole milk protein from Chinese Binglangjiang and Dehong buffalo breeds [Dataset]. http://doi.org/10.6084/m9.figshare.18488000.v2
    Available download formats: docx
    Dataset updated
    Jan 19, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Adhita Sri Prabakusuma
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Dehong Dai and Jingpo Autonomous Prefecture
    Description

    This study aimed to analyze metabolite abundances and proteome differences between Binglangjiang buffalo milk (BBM) and Dehong buffalo milk (DBM). Untargeted ultraperformance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS), label-free quantitative proteomics approaches, and bioinformatics analyses including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and protein-protein interaction (PPI) were performed.

  15. CreativeWork

    • pfocr.wikipathways.org
    Updated Oct 12, 2024
    Cite
    WikiPathways (2024). CreativeWork [Dataset]. https://pfocr.wikipathways.org/figures/PMC10372325_gr6.html
    Dataset updated
    Oct 12, 2024
    Dataset authored and provided by
    WikiPathways
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Transcription factor-protein-protein interaction networks (TF-PPI) key pathway modulators in diabetes. A network of significantly modulated TF-PPIs for intact (A) and injured vessels at different timepoints - 20 hours (B), 2 weeks (C), and 6 weeks (D). Significantly up- and down-regulated genes from each timepoint, comparing Goto-Kakizaki (GK) vs Wistar rats, were used to obtain TF-PPI, and this information was fed into the STRING database to generate the network. The top 10 up- and 10 down-regulated TFs are shown in the network above. Up- and down-regulated TFs are indicated by green and red nodes, respectively. The size of the nodes indicates the level of the P-value. All the interactions were predicted with the adjusted P-value < .05.

  16. Data from: Identification of transcription factor co-binding patterns with...

    • zenodo.org
    zip
    Updated Feb 2, 2024
    Cite
    Ieva Rauluseviciute; Timothée Launay; Guido Barzaghi; Sarvesh Nikumbh; Boris Lenhard; Arnaud Krebs; Jaime A. Castro-Mondragon; Anthony Mathelier (2024). Identification of transcription factor co-binding patterns with non-negative matrix factorization [Dataset]. http://doi.org/10.5281/zenodo.10609241
    Available download formats: zip
    Dataset updated
    Feb 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ieva Rauluseviciute; Timothée Launay; Guido Barzaghi; Sarvesh Nikumbh; Boris Lenhard; Arnaud Krebs; Jaime A. Castro-Mondragon; Anthony Mathelier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains pre-processed data required to reproduce the results of the paper "Identification of transcription factor co-binding patterns with non-negative matrix factorization". The repository with the code can be found here: https://bitbucket.org/CBGR/cobind_manuscript/src/master/. Data include transcription factor binding sites (TFBSs) for 7 species from UniBind 2021 database, joined motif collection from CIS-BP and JASPAR 2022 databases and corresponding physical protein-protein interaction (PPI) data from STRING database.

  17. Supplementary resources to: “Using the Gene Ontology tool to produce de novo...

    • data.mendeley.com
    Updated Feb 4, 2017
    Cite
    Anderson Santos (2017). Supplementary resources to: “Using the Gene Ontology tool to produce de novo protein-protein interaction networks with IS_A relationship” [Dataset]. http://doi.org/10.17632/f5nbt7gfvp.1
    Dataset updated
    Feb 4, 2017
    Authors
    Anderson Santos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the file “Gene_Ontology_de_novo_PPI.zip”, I present data extracted from the database used to produce the article "Using the Gene Ontology tool to produce de novo protein-protein interaction networks with IS_A relationship". There are protein-protein interaction (PPI) networks for all ten organisms mentioned in the article, as well as their respective plasmids. However, the PPIs available for download differ from those published, since I did not restrict them to true positives according to the STRING database. Instead, I considered as candidate relationships all protein pairs possessing commonality between all three Gene Ontology categories. The edge weight reflects a logarithmic distance between protein pairs measured over the gene position within a chromosome. According to the methodology applied in the paper, a pair of genes separated by five loci has the weight=(1+(MAX-log(pos(locus_00006)-pos(locus_00001)))). In this formula, pos extracts the locus_tag index; one is added to avoid edge weights smaller than one, which some visualization tools such as Gephi may not accept; MAX creates thicker lines for closer pairs. Figure 1 depicts data from the file "Escherichia_coli_S88_p1.dot".
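
    The weight formula quoted in this description can be written out as a small function. This is a minimal sketch under stated assumptions: the description gives neither the logarithm base nor the value of MAX, so the natural logarithm is assumed and MAX is passed in as a parameter. The function names are illustrative only.

```python
import math


def locus_index(locus_tag: str) -> int:
    """Extract the numeric index from a tag such as 'locus_00006'."""
    return int(locus_tag.rsplit("_", 1)[1])


def edge_weight(locus_a: str, locus_b: str, max_value: float) -> float:
    """weight = 1 + (MAX - log(pos(b) - pos(a))), with the natural log assumed."""
    distance = abs(locus_index(locus_b) - locus_index(locus_a))
    if distance == 0:
        raise ValueError("the two loci must differ")
    return 1 + (max_value - math.log(distance))


# Pair of genes separated by five loci, as in the description.
# max_value=10.0 is an arbitrary placeholder, not a value from the article.
example_weight = edge_weight("locus_00001", "locus_00006", max_value=10.0)
print(round(example_weight, 4))  # 9.3906
```

    Closer pairs have a smaller logarithmic term and therefore a larger weight, which matches the statement that MAX creates thicker lines for closer pairs.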

  18. The analysis of SNT differential proteins after intervention UC based on...

    • data.niaid.nih.gov
    xml
    Updated Sep 9, 2021
    Cite
    Enhui Ji; Hongjun Yang (2021). The analysis of SNT differential proteins after intervention UC based on proteomics [Dataset]. https://data.niaid.nih.gov/resources?id=pxd018509
    Available download formats: xml
    Dataset updated
    Sep 9, 2021
    Dataset provided by
    Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences
    Authors
    Enhui Ji; Hongjun Yang
    Variables measured
    Proteomics
    Description

    Orbitrap Fusion (Thermo Fisher Scientific) LC-MS/MS analyses were performed on an Easy-nLC 1000 liquid chromatography system (Thermo Fisher Scientific) coupled to an Orbitrap Fusion via a nano-electrospray ion source. Tryptic peptides were dissolved in a loading buffer (acetonitrile and 0.1% formic acid) and eluted at a flow rate of 350 nL/min. Survey scans were acquired after an accumulation of 5×10⁵ ions in the Orbitrap for m/z 300-1,400 using a resolution of 120,000 at m/z. The top speed data-dependent mode was selected for fragmentation in the cell at a normalized collision energy of 32%, and fragment ions were then transferred into the ion trap analyzer with the AGC target at 5×10³ and maximum injection time at 35 ms. The dynamic exclusion of previously acquired precursor ions was enabled at 18 s. Proteome Discoverer 1.4.1.14 was used for analysis of the protein spectrum. Oxidation (Methionine) and acetylation (Protein-N term) were chosen as variable modifications, and cysteine carbamidomethylation was chosen as a fixed modification. Two missed cleavage sites for trypsin were allowed. The intensity-based absolute quantification (iBAQ)-based protein quantification was performed with in-house software. The interaction of SNT-related differentially expressed proteins was investigated by STRING 11.0 (https://string-db.org). The differentially expressed protein interaction network (high reliability, interaction score > 0.4, PPI enrichment P-value < 1.0×10⁻¹⁶) was selected for the analysis.

  19. Table_1_Protein-Protein Interactions in Candida albicans.xlsx

    • datasetcatalog.nlm.nih.gov
    Updated Aug 7, 2019
    Cite
    Schoeters, Floris; Van Dijck, Patrick (2019). Table_1_Protein-Protein Interactions in Candida albicans.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000118969
    Dataset updated
    Aug 7, 2019
    Authors
    Schoeters, Floris; Van Dijck, Patrick
    Description

    Despite being one of the most important human fungal pathogens, Candida albicans has not been studied extensively at the level of protein-protein interactions (PPIs) and data on PPIs are not readily available in online databases. In January 2018, the database called “Biological General Repository for Interaction Datasets (BioGRID)” that contains the most PPIs for C. albicans, only documented 188 physical or direct PPIs (release 3.4.156) while several more can be found in the literature. Other databases such as the String database, the Molecular INTeraction Database (MINT), and the Database for Interacting Proteins (DIP) database contain even fewer interactions or do not even include C. albicans as a searchable term. Because of the non-canonical codon usage of C. albicans where CUG is translated as serine rather than leucine, it is often problematic to use the yeast two-hybrid system in Saccharomyces cerevisiae to study C. albicans PPIs. However, studying PPIs is crucial to gain a thorough understanding of the function of proteins, biological processes and pathways. PPIs can also be potential drug targets. To aid in creating PPI networks and updating the BioGRID, we performed an exhaustive literature search in order to provide, in an accessible format, a more extensive list of known PPIs in C. albicans.

  20. Additional file 2: Table S1. of Changes in selective pressures associated...

    • datasetcatalog.nlm.nih.gov
    Updated Dec 14, 2016
    Cite
    Bazin, Eric; Vatsiou, Alexandra; Gaggiotti, Oscar (2016). Additional file 2: Table S1. of Changes in selective pressures associated with human population expansion may explain metabolic and immune related pathways enriched for signatures of positive selection [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001514366
    Dataset updated
    Dec 14, 2016
    Authors
    Bazin, Eric; Vatsiou, Alexandra; Gaggiotti, Oscar
    Description

    Table S1. Significant gene sets found using iHS scores and Daub et al. [19] approach. Threshold is set to <0.09.
    Table S2. Significant gene sets found using iHS scores and Gowinda approach. Threshold is set to <0.09.
    Table S3. Significant gene sets found using XPCLR scores and Daub et al. [19] approach. Threshold is set to <0.09. For all the population pairs, the first population is the objective one and the second, the reference.
    Table S4. Significant gene sets found using XPCLR scores and Gowinda approach. Threshold is set to 0.09. For all the population pairs, the first population is the objective one and the second, the reference.
    Table S5. Immunity related gene sets detected with the GSEA approaches. Q-value threshold is set to <0.09.
    Table S6. 17 significant genes related to obesity, diabetes and metabolic syndrome that were found to be under positive selection with XPCLR and iHS analysis using the list of genes derived from Bio4j. Some of them (indicated with *) have been detected in previous studies to be under positive selection, too. The threshold was calculated based on the 1% cut off level. Genes are categorized in groups of potential risk factors, potential protective and indirect associations.
    Table S7. Significant genes related to obesity, diabetes and metabolic syndrome that we found to be under positive selection from XPCLR and iHS analysis using the Protein-Protein Interaction (PPI) networks from the STRING database. Five of them (indicated with *) have been detected in previous studies to be under positive selection. (XLSX 29 kb)
