100+ datasets found
  1. Data from: Cytoscape StringApp: Network Analysis and Visualization of...

    • acs.figshare.com
    xlsx
    Updated May 31, 2023
    Cite
    Nadezhda T. Doncheva; John H. Morris; Jan Gorodkin; Lars J. Jensen (2023). Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data [Dataset]. http://doi.org/10.1021/acs.jproteome.8b00702.s002
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Nadezhda T. Doncheva; John H. Morris; Jan Gorodkin; Lars J. Jensen
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Protein networks have become a popular tool for analyzing and visualizing the often long lists of proteins or genes obtained from proteomics and other high-throughput technologies. One of the most popular sources of such networks is the STRING database, which provides protein networks for more than 2000 organisms, including both physical interactions from experimental data and functional associations from curated pathways, automatic text mining, and prediction methods. However, its web interface is mainly intended for inspection of small networks and their underlying evidence. The Cytoscape software, on the other hand, is much better suited for working with large networks and offers greater flexibility in terms of network analysis, import, and visualization of additional data. To include both resources in the same workflow, we created stringApp, a Cytoscape app that makes it easy to import STRING networks into Cytoscape, retains the appearance and many of the features of STRING, and integrates data from associated databases. Here, we introduce many of the stringApp features and show how they can be used to carry out complex network analysis and visualization tasks on a typical proteomics data set, all through the Cytoscape user interface. stringApp is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/stringapp.

  2. STRING protein-protein interaction networks for WT-C vs. WT-D.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1 more
    Updated Aug 19, 2022
    Cite
    Chakrabarti, Subrata; Lin, Hanxin; Su, Zhaoliang; Biswas, Saumik; Feng, Biao; Levy, Michael; Sooshtari, Parisa (2022). STRING protein-protein interaction networks for WT-C vs. WT-D. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000392311
    Dataset updated
    Aug 19, 2022
    Authors
    Chakrabarti, Subrata; Lin, Hanxin; Su, Zhaoliang; Biswas, Saumik; Feng, Biao; Levy, Michael; Sooshtari, Parisa
    Description

    STRING protein-protein interaction networks for WT-C vs. WT-D.

  3. The optimal dimensions of raw network embedding representations and the...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Cen Wan; Domenico Cozzetto; Rui Fa; David T. Jones (2023). The optimal dimensions of raw network embedding representations and the corresponding 3rd hidden layer outputs (a.k.a. the STRING2GO-learnt functional representations) with their corresponding predictive power for biological process terms prediction, and the main characteristics of different STRING networks. [Dataset]. http://doi.org/10.1371/journal.pone.0209958.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Cen Wan; Domenico Cozzetto; Rui Fa; David T. Jones
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The optimal dimensions of raw network embedding representations and the corresponding 3rd hidden layer outputs (a.k.a. the STRING2GO-learnt functional representations) with their corresponding predictive power for biological process terms prediction, and the main characteristics of different STRING networks.

  4. Unknown genes and genes without any interactions in STRING in predicted T....

    • datasetcatalog.nlm.nih.gov
    Updated Sep 28, 2016
    Cite
    Jäntti, Jussi; Castillo, Sandra; Pakula, Tiina; Oja, Merja; Penttilä, Merja; Arvas, Mikko; Kludas, Jana; Rousu, Juho; Brouard, Céline (2016). Unknown genes and genes without any interactions in STRING in predicted T. reesei secretion network. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001517561
    Dataset updated
    Sep 28, 2016
    Authors
    Jäntti, Jussi; Castillo, Sandra; Pakula, Tiina; Oja, Merja; Penttilä, Merja; Arvas, Mikko; Kludas, Jana; Rousu, Juho; Brouard, Céline
    Description

    Column ‘Gene’ contains the T. reesei gene ID. ‘In STRING’ tells if the gene has interactions in STRING. Columns ‘Btw’ and ‘Deg’ denote the betweenness and degree network statistics of the corresponding gene. Columns ‘Class’ and ‘Putative secretion pathway component’ are author assigned classifications. ‘Taxon specificity’ gives the largest taxonomic group the gene was found in.
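
As a rough illustration of what the 'Deg' and 'Btw' columns measure, the sketch below computes degree and unnormalized betweenness centrality (using Brandes' algorithm) on a small undirected toy graph. The gene IDs and edges are invented for illustration, and the dataset's authors may have used a different tool or normalization.

```python
from collections import deque

def degree(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                      # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical gene IDs forming a simple chain g1 - g2 - g3 - g4.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
adj = {}
for a, b in toy_edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

deg = degree(adj)
btw = betweenness(adj)
```

In this chain the two inner genes have degree 2 and lie on every shortest path that crosses the middle, so they receive the highest betweenness, while the end genes score zero.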

  5. Statistics of the genes in the protein interaction network constructed based...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Shunyao Wu; Fengjing Shao; Jun Ji; Rencheng Sun; Rizhuang Dong; Yuanke Zhou; Shaojie Xu; Yi Sui; Jianlong Hu (2023). Statistics of the genes in the protein interaction network constructed based on the STRING database. [Dataset]. http://doi.org/10.1371/journal.pone.0116505.t003
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Shunyao Wu; Fengjing Shao; Jun Ji; Rencheng Sun; Rizhuang Dong; Yuanke Zhou; Shaojie Xu; Yi Sui; Jianlong Hu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics of the genes in the protein interaction network constructed based on the STRING database.

  6. Hepatitis C and hepatocellular carcinoma

    • wikipathways.org
    Updated Feb 16, 2023
    Cite
    WikiPathways (2023). Hepatitis C and hepatocellular carcinoma [Dataset]. https://www.wikipathways.org/pathways/WP3646.html
    Dataset updated
    Feb 16, 2023
    Dataset authored and provided by
    WikiPathways (http://wikipathways.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Pathway model based on hub miRNAs and their putative targets from network analysis. * From a set of differentially expressed genes in both chronic HCV (hepatitis C virus) and HCC (hepatocellular carcinoma) samples, a protein-protein network was constructed using STRING and GeneMANIA. * After topological analysis and network visualization in Cytoscape, the top hub genes were identified. * miRNAs related to hub genes were identified using the miRTarBase server and combined with the PPI network to construct a miRNA-Hubgene network. Based on Figure 4 from Poortahmasebi et al., How Hepatitis C Virus Leads to Hepatocellular Carcinoma: A Network-Based Study. Proteins on this pathway have targeted assays available via the CPTAC Assay Portal.

  7. STRING

    • integbio.jp
    • opendatalab.com
    Updated Jun 17, 2013
    Cite
    STRING Consortium (2013). STRING [Dataset]. https://integbio.jp/dbcatalog/en/record/nbdc00690?jtpl=56
    Dataset updated
    Jun 17, 2013
    Dataset provided by
    STRING Consortium
    License

    http://string-db.org/newstring_cgi/show_download_page.pl

    Description

    STRING is a database of known and predicted protein interactions, including both physical and functional interactions. It contains data derived from four sources: genomic context, high-throughput experiments, coexpression and previous knowledge. The database quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable. It performs iterative searches and visualizes the results in their genomic context. Much of the data is downloadable, including protein sequences, protein networks, interaction types for protein links, orthologous groups and full database dumps (license required).

  8. Basic information of the four original networks (HIPPIE, HumanNet, FunCoup...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 22, 2017
    Cite
    Yang, Jian; Lin, Limei; Yang, Fan; Wu, Duzhi; Yang, Tinghong; Zhao, Jing (2017). Basic information of the four original networks (HIPPIE, HumanNet, FunCoup and STRING) and the GO network. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001798563
    Dataset updated
    Dec 22, 2017
    Authors
    Yang, Jian; Lin, Limei; Yang, Fan; Wu, Duzhi; Yang, Tinghong; Zhao, Jing
    Description

    Basic information of the four original networks (HIPPIE, HumanNet, FunCoup and STRING) and the GO network.

  9. Unclassified proteins in the STRING network analysis.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Han Tian; Leilei Wang; Ruiqi Cai; Ling Zheng; Lin Guo (2023). Unclassified proteins in the STRING network analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0116453.t001
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Han Tian; Leilei Wang; Ruiqi Cai; Ling Zheng; Lin Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Unclassified proteins in the STRING network analysis.

  10. Selection of 30 central genes from PPI network, including 17 upregulated and...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jun 11, 2021
    Cite
    Li, Ping; Wang, Xiaoming; Chen, Xuewei; Dong, Lijin; Fan, Rong (2021). Selection of 30 central genes from PPI network, including 17 upregulated and 13 downregulated genes, by using the STRING and Cytoscape software. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000899333
    Dataset updated
    Jun 11, 2021
    Authors
    Li, Ping; Wang, Xiaoming; Chen, Xuewei; Dong, Lijin; Fan, Rong
    Description

    Selection of 30 central genes from PPI network, including 17 upregulated and 13 downregulated genes, by using the STRING and Cytoscape software.

  11. CreativeWork

    • pfocr.wikipathways.org
    Updated Oct 12, 2024
    Cite
    WikiPathways (2024). CreativeWork [Dataset]. https://pfocr.wikipathways.org/figures/PMC10242111_fcell-11-1165308-g004.html
    Dataset updated
    Oct 12, 2024
    Dataset authored and provided by
    WikiPathways
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Protein–protein interaction network of the top differentially expressed genes between the patient’s samples and the Ctrl cohort. Edges represent protein–protein associations. Confidence ≥0.700; maximum number of interactors ≤20. Edge confidence: high (0.700) and highest (0.900) (see https://string-db.org/cgi/network).

  12. Supplemental Table S5

    • figshare.com
    docx
    Updated Jun 2, 2023
    Cite
    Abraham Moller (2023). Supplemental Table S5 [Dataset]. http://doi.org/10.6084/m9.figshare.13355945.v1
    Available download formats: docx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Abraham Moller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary statistics for protein-protein interaction networks identified with STRING amongst genes corresponding to significant SNPs or k-mers (inside or adjacent to genes). PPI enrichment p-value corresponds to the likelihood nodes and edges would be selected from the S. aureus database by chance.

  13. European Power Grid Network Dataset

    • kaggle.com
    zip
    Updated Mar 2, 2024
    Cite
    Afroz (2024). European Power Grid Network Dataset [Dataset]. https://www.kaggle.com/datasets/pythonafroz/european-power-grid-network-dataset/discussion
    Available download formats: zip (92071 bytes)
    Dataset updated
    Mar 2, 2024
    Authors
    Afroz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The European Power Grid Network dataset contains anonymized data that sheds light on the intricate connections between nodes within Europe’s electricity grid. Researchers and policymakers can leverage this dataset to gain valuable insights into energy trading patterns, nodal prices, and the stability of energy supply.

    1. Network Structure and Insights:

    o The dataset provides detailed information about the interconnections between nodes across the European power grid. Researchers can analyze these links to understand how electricity flows between different regions.
    o By examining nodal prices, researchers can uncover pricing dynamics. This includes variations based on geographical location, demand, and supply.
    o Geospatial analysis facilitated by this dataset allows researchers to identify patterns in power market behavior, congestion points, and reliability challenges.

    2. Critical Energy Supplies and Stability:

    o Identifying critical energy supplies is essential for maintaining grid stability. Policymakers can use this dataset to inform decisions related to energy security and resilience.
    o Additionally, the dataset enables cross-state comparisons of power price competitiveness, aiding policymakers in designing effective energy policies.

    This dataset contains anonymized information about the European power grid network, providing insights on the connections between nodes and their pricing. To use this dataset, one must identify the source and destination nodes of the power grid along with associated features such as prices and country information.

    Firstly, it is important to understand the meaning of each column in order to navigate the data effectively:

    1. from: The source node of the power grid. (Integer)

    2. to: The destination node of the power grid. (Integer)

    3. name: Name of the node in European Power Grid Network. (String)

    4. price: Price of electricity at each node. (Float)

    5. country: Country in which a particular node is located. (String).

    Secondly, it helps to explore the dataset visually. Possible analyses include plotting the geographical locations of nodes on maps, comparing electricity prices between countries or regions, and assessing economic relationships through trade flows or supply-chain networks in the energy market. All of these can be done with simple analyses of the european_power_grid dataset.
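
A minimal sketch of reading the five columns described above, using only the Python standard library. The sample rows are invented for illustration, and the assumption that each row describes one edge together with the name, price and country of a node is not confirmed by the description; the real file may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; the real file's values and exact layout may differ.
SAMPLE = """from,to,name,price,country
1,2,Node A,41.5,DE
2,3,Node B,39.0,FR
3,1,Node C,44.5,DE
"""

def load_grid(handle):
    """Return (edges, prices_by_country) from a CSV with the five described columns."""
    edges = []
    prices = defaultdict(list)
    for row in csv.DictReader(handle):
        edges.append((int(row["from"]), int(row["to"])))
        prices[row["country"]].append(float(row["price"]))
    return edges, prices

edges, prices = load_grid(io.StringIO(SAMPLE))

# Average electricity price per country, one of the comparisons mentioned above.
mean_price = {country: sum(p) / len(p) for country, p in prices.items()}
```

To run this against the downloaded file, pass an open file handle to `load_grid` instead of the `io.StringIO` sample.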

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    https://zenodo.org/records/7037956#.Y9Y6yNJBwUE

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

    https://creativecommons.org/publicdomain/zero/1.0/

  14. OGBN-Proteins (Processed for PyG)

    • kaggle.com
    zip
    Updated Feb 27, 2021
    + more versions
    Cite
    Redao da Taupl (2021). OGBN-Proteins (Processed for PyG) [Dataset]. https://www.kaggle.com/dataup1/ogbn-proteins
    Available download formats: zip (677947148 bytes)
    Dataset updated
    Feb 27, 2021
    Authors
    Redao da Taupl
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    OGBN-Proteins

    Webpage: https://ogb.stanford.edu/docs/nodeprop/#ogbn-proteins

    Usage in Python

    import os.path as osp
    import datatable as dt  # assumption: `dt` in the original listing is the datatable package
    import pandas as pd
    import torch
    import torch_geometric.transforms as T
    # assumption: this helper lives in ogb.io.read_graph_raw, as in the upstream ogb loader
    from ogb.io.read_graph_raw import read_nodesplitidx_split_hetero
    from ogb.nodeproppred import PygNodePropPredDataset
    
    class PygOgbnProteins(PygNodePropPredDataset):
      def __init__(self, meta_csv = None):
        # Point the loader at the pre-processed copy in the Kaggle input directory.
        root, name, transform = '/kaggle/input', 'ogbn-proteins', T.ToSparseTensor()
        if meta_csv is None:
          meta_csv = osp.join(root, name, 'ogbn-master.csv')
        master = pd.read_csv(meta_csv, index_col = 0)
        meta_dict = master[name]
        meta_dict['dir_path'] = osp.join(root, name)
        super().__init__(name = name, root = root, transform = transform, meta_dict = meta_dict)
      def get_idx_split(self, split_type = None):
        if split_type is None:
          split_type = self.meta_info['split']
        path = osp.join(self.root, 'split', split_type)
        # Prefer a cached split dictionary when one exists.
        if osp.isfile(osp.join(path, 'split_dict.pt')):
          return torch.load(osp.join(path, 'split_dict.pt'))
        if self.is_hetero:
          train_idx_dict, valid_idx_dict, test_idx_dict = read_nodesplitidx_split_hetero(path)
          for nodetype in train_idx_dict.keys():
            train_idx_dict[nodetype] = torch.from_numpy(train_idx_dict[nodetype]).to(torch.long)
            valid_idx_dict[nodetype] = torch.from_numpy(valid_idx_dict[nodetype]).to(torch.long)
            test_idx_dict[nodetype] = torch.from_numpy(test_idx_dict[nodetype]).to(torch.long)
          # Return only after every node type has been converted.
          return {'train': train_idx_dict, 'valid': valid_idx_dict, 'test': test_idx_dict}
        else:
          # Each split file holds one node index per row.
          train_idx = dt.fread(osp.join(path, 'train.csv'), header = None).to_numpy().T[0]
          train_idx = torch.from_numpy(train_idx).to(torch.long)
          valid_idx = dt.fread(osp.join(path, 'valid.csv'), header = None).to_numpy().T[0]
          valid_idx = torch.from_numpy(valid_idx).to(torch.long)
          test_idx = dt.fread(osp.join(path, 'test.csv'), header = None).to_numpy().T[0]
          test_idx = torch.from_numpy(test_idx).to(torch.long)
          return {'train': train_idx, 'valid': valid_idx, 'test': test_idx}
    
    dataset = PygOgbnProteins()
    split_idx = dataset.get_idx_split()
    train_idx, valid_idx, test_idx = split_idx['train'], split_idx['valid'], split_idx['test']
    graph = dataset[0] # PyG Graph object
    

    Description

    Graph: The ogbn-proteins dataset is an undirected, weighted, and typed (according to species) graph. Nodes represent proteins, and edges indicate different types of biologically meaningful associations between proteins, e.g., physical interactions, co-expression or homology [1,2]. All edges come with 8-dimensional features, where each dimension represents the strength of a single association type and takes values between 0 and 1 (the larger the value is, the stronger the association is). The proteins come from 8 species.

    Prediction task: The task is to predict the presence of protein functions in a multi-label binary classification setup, where there are 112 kinds of labels to predict in total. The performance is measured by the average of ROC-AUC scores across the 112 tasks.
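
The metric described above, a per-task ROC-AUC averaged over all tasks, can be sketched in plain Python as follows. This is an illustrative implementation based on the rank-sum formulation, not the OGB Evaluator itself; the two-task toy data is invented, and how the official evaluator treats tasks that lack one of the two classes is not covered here.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    pairs = sorted(zip(y_score, y_true))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, t in pairs if t == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, (_, t) in zip(ranks, pairs) if t == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def mean_roc_auc(labels, scores):
    """labels and scores are lists of rows, with one column per task."""
    n_tasks = len(labels[0])
    aucs = [
        roc_auc([row[t] for row in labels], [row[t] for row in scores])
        for t in range(n_tasks)
    ]
    return sum(aucs) / n_tasks

# Toy example with 4 proteins and 2 tasks (the real dataset has 112 tasks).
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.4]]
score = mean_roc_auc(toy_labels, toy_scores)
```

Here the first task is ranked perfectly (AUC 1.0) and the second gets three of four positive-negative pairs right (AUC 0.75), so the averaged score is 0.875.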

    Dataset splitting: The authors split the protein nodes into training/validation/test sets according to the species which the proteins come from. This enables the evaluation of the generalization performance of the model across different species.

    Note: For undirected graphs, the loaded graphs will have the doubled number of edges because the bidirectional edges will be added automatically.
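
The doubling mentioned in the note can be illustrated with a small sketch: each undirected edge is stored as two directed entries, one per direction. The function name and toy edges are illustrative only and do not reproduce the loader's internal code.

```python
def to_bidirectional(edges):
    """Store each undirected edge (u, v) as two directed entries: (u, v) and (v, u)."""
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    return directed

undirected = [(0, 1), (1, 2), (0, 2)]
directed = to_bidirectional(undirected)
```

Three undirected edges become six directed entries, so an edge count read from a loaded graph should be halved before comparing it with a count of undirected edges.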

    Summary

    Package: ogb>=1.1.1
    #Nodes: 132,534
    #Edges: 39,561,252
    Split Type: Species
    Task Type: Multi-label binary classification
    Metric: ROC-AUC

    Open Graph Benchmark

    Website: https://ogb.stanford.edu

    The Open Graph Benchmark (OGB) [3] is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. OGB datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can be evaluated using the OGB Evaluator in a unified manner.

    References

    [1] Damian Szklarczyk, Annika L Gable, David Lyon, Alexander Junge, Stefan Wyder, Jaime Huerta-Cepas, Milan Simonovic, Nadezhda T Doncheva, John H Morris, Peer Bork, et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Research, 47(D1):D607–D613, 2019.

    [2] Gene Ontology Consortium. The gene ontology resource: 20 years and still going strong. Nucleic Acids Research, 47(D1):D330–D338, 2018.

    [3] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. Open graph benchm...

  15. Evaluating homophily of human PPI with respect to chromosomes

    • data.niaid.nih.gov
    Updated Jul 30, 2022
    Cite
    Apollonio, Nicola; Blankenberg, Daniel; Cumbo, Fabio; Franciosa, Paolo Giulio; Santoni, Daniele (2022). Evaluating homophily of human PPI with respect to chromosomes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6941314
    Dataset updated
    Jul 30, 2022
    Dataset provided by
    Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, Ohio 44195, USA
    Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council of Italy, Via dei Taurini 19, 00185 Rome, Italy
    Department of Statistical Science, University of Rome "La Sapienza", Piazzale Aldo Moro 5, 00185 Rome, Italy
    Institute for applied mathematics "Mauro Picone", National Research Council of Italy, Via dei Taurini 19, 00185 Rome, Italy
    Authors
    Apollonio, Nicola; Blankenberg, Daniel; Cumbo, Fabio; Franciosa, Paolo Giulio; Santoni, Daniele
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The homophily/heterophily evaluation, expressed as z-scores, concerns the human protein-protein interaction (PPI) network obtained from the STRING v11.5 database (https://string-db.org) with the standard threshold on the edge score (T=700). Each protein occurring in the PPI network was assigned to the class corresponding to the chromosome that its gene belongs to.

    A total of 23 classes (chr1, chr2, ..., chr22, chrX) were considered (excluding the class corresponding to chromosome Y because of the small number of genes occurring in the network).

    The homophily/heterophily nature of the network, with respect to chromosome classes, was evaluated through HONTO tool (https://github.com/cumbof/honto).

    In other words, the tendency of proteins to preferentially interact with proteins whose genes are physically located on the same chromosome (homophily) or on different chromosomes (heterophily) was investigated and evaluated in terms of z-scores.

    Values related to intra (along the diagonal) and inter chromosomal interactions (other than the diagonal) are also reported as a heatmap.

    Values on the diagonal are clearly higher than off-diagonal values, which indicates that the network is homophilic and confirms the link between a shared chromosome and interaction in the PPI network.
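
To make the idea of a homophily z-score concrete, the sketch below filters scored edges at a threshold of 700, counts edges whose endpoints share a class, and compares that count with a null distribution obtained by shuffling class labels. This permutation scheme is an assumption chosen for illustration and is not necessarily how HONTO computes its z-scores; the protein IDs, scores and chromosome labels are invented.

```python
import random
import statistics

def filter_edges(scored_edges, threshold=700):
    """Keep edges whose score reaches the threshold."""
    return [(u, v) for u, v, score in scored_edges if score >= threshold]

def intra_class_count(edges, label):
    """Number of edges whose two endpoints carry the same class label."""
    return sum(1 for u, v in edges if label[u] == label[v])

def homophily_z(edges, label, n_perm=200, seed=0):
    """z-score of the observed intra-class edge count against shuffled labels."""
    rng = random.Random(seed)
    observed = intra_class_count(edges, label)
    nodes = list(label)
    classes = [label[n] for n in nodes]
    null = []
    for _ in range(n_perm):
        rng.shuffle(classes)
        null.append(intra_class_count(edges, dict(zip(nodes, classes))))
    spread = statistics.pstdev(null)
    if spread == 0:
        return float("nan")   # the null never varies, so a z-score is undefined
    return (observed - statistics.mean(null)) / spread

# Toy network: two triangles, each within one chromosome, plus a weak cross link.
toy_scored = [
    ("p1", "p2", 900), ("p2", "p3", 850), ("p1", "p3", 720),
    ("p4", "p5", 910), ("p5", "p6", 800), ("p4", "p6", 705),
    ("p3", "p4", 400),
]
toy_label = {"p1": "chr1", "p2": "chr1", "p3": "chr1",
             "p4": "chr2", "p5": "chr2", "p6": "chr2"}

kept = filter_edges(toy_scored)
z = homophily_z(kept, toy_label)
```

A positive z-score means more same-chromosome edges than expected under shuffled labels (homophily), while a negative one would point to heterophily.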

  16. Table4_Predicting Human Protein Subcellular Locations by Using a Combination...

    • datasetcatalog.nlm.nih.gov
    Updated Nov 5, 2021
    + more versions
    Cite
    Zhang, ShiQi; Zhang, Yu-Hang; Huang, Tao; Cai, Yu-Dong; Zeng, Tao; Li, ZhanDong; Chen, Lei (2021). Table4_Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features.XLSX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000819826
    Dataset updated
    Nov 5, 2021
    Authors
    Zhang, ShiQi; Zhang, Yu-Hang; Huang, Tao; Cai, Yu-Dong; Zeng, Tao; Li, ZhanDong; Chen, Lei
    Description

    Given the limitation of technologies, the subcellular localizations of proteins are difficult to identify. Predicting the subcellular localization and the intercellular distribution patterns of proteins in accordance with their specific biological roles, including validated functions, relationships with other proteins, and even their specific sequence characteristics, is necessary. The computational prediction of protein subcellular localizations can be performed on the basis of the sequence and the functional characteristics. In this study, the protein–protein interaction network, functional annotation of proteins and a group of direct proteins with known subcellular localization were used to construct models. To build efficient models, several powerful machine learning algorithms, including two feature selection methods, four classification algorithms, were employed. Some key proteins and functional terms were discovered, which may provide important contributions for determining protein subcellular locations. Furthermore, some quantitative rules were established to identify the potential subcellular localizations of proteins. As the first prediction model that uses direct protein annotation information (i.e., functional features) and STRING-based protein–protein interaction network (i.e., network features), our computational model can help promote the development of predictive technologies on subcellular localizations and provide a new approach for exploring the protein subcellular localization patterns and their potential biological importance.

  17. Topology of TCR signaling networks and effects of LAT deficiency.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1 more
    Updated Oct 30, 2013
    Cite
    Acuto, Oreste; Dushek, Omer; Salek, Mogjiborahman; Efstathiou, Georgios; de Wet, Ben; Trudgian, David C.; McGowan, Simon (2013). Topology of TCR signaling networks and effects of LAT deficiency. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001632564
    Dataset updated
    Oct 30, 2013
    Authors
    Acuto, Oreste; Dushek, Omer; Salek, Mogjiborahman; Efstathiou, Georgios; de Wet, Ben; Trudgian, David C.; McGowan, Simon
    Description

    (A and B) Phosphorylation-specific networks based on integration of our phosphoproteomic data with protein-protein interactions in the STRING database. Note that both networks show a constellation of hubs characterized by the function of the interacting proteins. Zooming into the signaling hub of the LAT-efficient (LAT+/+) or LAT-deficient (LAT−/−) cell lines shows the first neighbors (pistachio-green or orange circles) of CD3ζ, LCK and ZAP-70 (red or blue circles). (C) The number of edges and (D) the degree distribution for experimental and random networks. Orange and pistachio-green triangles correspond to the degree distribution in networks based on data from two replicates in the Jurkat CL20 cell line. Blue triangles and purple crosses correspond to networks based on data from the LAT-deficient cell line and from randomly selected proteins, respectively.

  18. Validation of the total new predicted links and the new predicted links...

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Wei Zhang; Jia Xu; Yuanyuan Li; Xiufen Zou (2023). Validation of the total new predicted links and the new predicted links associated with the 10 proteins by STRING database for the 14317_PPI data. [Dataset]. http://doi.org/10.1371/journal.pone.0177029.t009
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Wei Zhang; Jia Xu; Yuanyuan Li; Xiufen Zou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Validation of the total new predicted links and the new predicted links associated with the 10 proteins by STRING database for the 14317_PPI data.

  19. Data from: A Comparison of Computational Methods for Identifying Virulence...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 3, 2012
    Cite
    Hu, Le-Le; Zheng, Lu-Lu; Ding, Juan; Hao, Pei; Feng, Kai-Yan; Li, Yi-Xue; Wang, Ya-Jun; Guo, Xiao-Kui; Cai, Yu-Dong; Chou, Kuo-Chen (2012). A Comparison of Computational Methods for Identifying Virulence Factors [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001164979
    Dataset updated
    Aug 3, 2012
    Authors
    Hu, Le-Le; Zheng, Lu-Lu; Ding, Juan; Hao, Pei; Feng, Kai-Yan; Li, Yi-Xue; Wang, Ya-Jun; Guo, Xiao-Kui; Cai, Yu-Dong; Chou, Kuo-Chen
    Description

    Bacterial pathogens continue to threaten public health worldwide today. Identification of bacterial virulence factors can help to find novel drug/vaccine targets against pathogenicity. It can also help to reveal the mechanisms of the related diseases at the molecular level. With the explosive growth in protein sequences generated in the postgenomic age, it is highly desired to develop computational methods for rapidly and effectively identifying virulence factors according to their sequence information alone. In this study, based on the protein-protein interaction networks from the STRING database, a novel network-based method was proposed for identifying the virulence factors in the proteomes of UPEC 536, UPEC CFT073, P. aeruginosa PAO1, L. pneumophila Philadelphia 1, C. jejuni NCTC 11168 and M. tuberculosis H37Rv. Evaluated on the same benchmark datasets derived from the aforementioned species, the identification accuracies achieved by the network-based method were around 0.9, significantly higher than those by the sequence-based methods such as BLAST, feature selection and VirulentPred. Further analysis showed that the functional associations such as the gene neighborhood and co-occurrence were the primary associations between these virulence factors in the STRING database. The high success rates indicate that the network-based method is quite promising. The novel approach holds high potential for identifying virulence factors in many other various organisms as well because it can be easily extended to identify the virulence factors in many other bacterial species, as long as the relevant significant statistical data are available for them.

  20. CreativeWork

    • pfocr.wikipathways.org
    Updated Oct 12, 2024
    Cite
    WikiPathways (2024). CreativeWork [Dataset]. https://pfocr.wikipathways.org/figures/PMC10372325_gr6.html
    Dataset updated
    Oct 12, 2024
    Dataset authored and provided by
    WikiPathways (http://wikipathways.org/)
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Transcription factor-protein-protein interaction networks (TF-PPI) key pathway modulators in diabetes. A network of significantly modulated TF-PPIs for intact (A) and injured vessels at different timepoints - 20 hours (B), 2 weeks (C), and 6 weeks (D). Significantly up- and down-regulated genes from each timepoint comparing Goto-Kakizaki (GK) vs Wistar rats were used to obtain TF-PPI, and this information was fed into the STRING database to generate the network. The top 10 up- and 10 down-regulated TFs are shown in the network above. Up- and down-regulated TFs are indicated by green and red nodes, respectively. The size of the nodes indicates the level of the P-value. All the interactions were predicted with an adjusted P-value < .05.
