8 datasets found
  1. Geo-location Graphs

    • kaggle.com
    Updated Nov 11, 2021
    Cite
    Subhajit Sahu (2021). Geo-location Graphs [Dataset]. https://www.kaggle.com/wolfram77/graphs-geo-location/code
    Dataset updated
    Nov 11, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Gowalla is a location-based social networking website where users share their locations by checking-in. The friendship network is undirected and was collected using their public API, and consists of 196,591 nodes and 950,327 edges. We have collected a total of 6,442,890 check-ins of these users over the period of Feb. 2009 - Oct. 2010.

    Brightkite was once a location-based social networking service provider where users shared their locations by checking in. The friendship network was collected using their public API, and consists of 58,228 nodes and 214,078 edges. The network is originally directed, but we have constructed a network with undirected edges, keeping an edge only when the friendship exists in both directions. We have also collected a total of 4,491,143 check-ins of these users over the period of Apr. 2008 - Oct. 2010.
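The construction described above (keep an undirected edge only when the directed friendship exists in both directions) can be sketched in a few lines. This is a minimal illustration, not the code used to build the dataset; the function name `reciprocated_edges` and the toy edge list are assumptions made for the example.

```python
def reciprocated_edges(directed_edges):
    """Return undirected edges (u, v) with u < v for which both
    (u, v) and (v, u) appear in the directed edge list."""
    edge_set = set(directed_edges)
    undirected = set()
    for u, v in edge_set:
        # Skip self-loops and keep only pairs that point both ways.
        if u != v and (v, u) in edge_set:
            undirected.add((min(u, v), max(u, v)))
    return sorted(undirected)


# Hypothetical toy input: 0<->1 and 2<->3 are mutual, the rest are one-way.
example = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2), (4, 0)]
mutual = reciprocated_edges(example)
print(mutual)
```

One-way links such as (1, 2) and (4, 0) are dropped, which is why the undirected network can have fewer edges than the directed one it came from.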

    Stanford Network Analysis Platform (SNAP) is a general-purpose, high-performance system for analysis and manipulation of large networks. Graphs consist of nodes and directed/undirected/multiple edges between the graph nodes. Networks are graphs with data on nodes and/or edges of the network.

    The core SNAP library is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. Besides scalability to large graphs, an additional strength of SNAP is that nodes, edges and attributes in a graph or a network can be changed dynamically during the computation.

    SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in November 2009. SNAP uses GLib, a general-purpose STL (Standard Template Library)-like library developed at the Jozef Stefan Institute. SNAP and GLib are being actively developed and used in numerous academic and industrial projects.

    https://snap.stanford.edu/data/index.html

  2. Epinions Signed Social Network (SNAP)

    • kaggle.com
    Updated Dec 16, 2021
    Cite
    Subhajit Sahu (2021). Epinions Signed Social Network (SNAP) [Dataset]. https://www.kaggle.com/datasets/wolfram77/graphs-snap-soc-sign-epinions
    Dataset updated
    Dec 16, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Epinions social network

    Dataset information

    This is the who-trusts-whom online social network of the general consumer review site Epinions.com. Members of the site can decide whether to "trust" each other. All the trust relationships interact and form the Web of Trust, which is then combined with review ratings to determine which reviews are shown to the user.

    Dataset statistics

    Nodes 131828
    Edges 841372
    Nodes in largest WCC 119130 (0.904)
    Edges in largest WCC 833695 (0.991)
    Nodes in largest SCC 41441 (0.314)
    Edges in largest SCC 693737 (0.825)
    Average clustering coefficient 0.2424
    Number of triangles 4910076
    Fraction of closed triangles 0.08085
    Diameter (longest shortest path) 14
    90-percentile effective diameter 4.9
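The parenthesised values in the statistics above are the share of all nodes or edges that fall inside each component. A short check, using only the counts listed in this entry (the variable names are illustrative), reproduces them:

```python
# Counts copied from the "Dataset statistics" list above.
total_nodes, total_edges = 131828, 841372
components = {
    "largest WCC": (119130, 833695),
    "largest SCC": (41441, 693737),
}

# Fraction of all nodes and of all edges inside each component,
# rounded to three decimals as in the listing.
fractions = {
    name: (round(nodes / total_nodes, 3), round(edges / total_edges, 3))
    for name, (nodes, edges) in components.items()
}

for name, (node_share, edge_share) in fractions.items():
    print(f"{name}: nodes {node_share}, edges {edge_share}")
```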

    Source (citation)

    J. Leskovec, D. Huttenlocher, J. Kleinberg: Signed Networks in Social Media.
    28th ACM Conference on Human Factors in Computing Systems (CHI), 2010.
    http://cs.stanford.edu/people/jure/pubs/triads-chi10.pdf

    Files
    File                        Description
    soc-sign-epinions.txt.gz    Directed Epinions signed social network
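A minimal sketch of reading such a file follows. It assumes, without confirmation from this listing, that each data line holds three whitespace-separated integers (source node, target node, sign) and that lines starting with `#` are comments. The function name `read_signed_edges` and the in-memory sample are hypothetical.

```python
import gzip
import io


def read_signed_edges(fileobj):
    """Parse (source, target, sign) triples from a text file object.

    Assumed layout: one edge per line, whitespace-separated integers,
    with '#' marking comment lines.
    """
    edges = []
    for raw in fileobj:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        src, dst, sign = (int(field) for field in line.split())
        edges.append((src, dst, sign))
    return edges


# Hypothetical sample compressed in memory so the sketch runs without the real file.
sample = "# FromNodeId\tToNodeId\tSign\n0\t1\t-1\n1\t128552\t1\n"
compressed = gzip.compress(sample.encode("utf-8"))
with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as handle:
    sample_edges = read_signed_edges(handle)
print(sample_edges)
```

For the downloaded file, the same function could be called on `gzip.open("soc-sign-epinions.txt.gz", "rt")`, provided the layout assumption holds.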

  3. Repository URL

    • datadiscoverystudio.org
    resource url
    Updated 2009
    Cite
    Repository URL [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/26e3ea4c55a94e8994f43c72689851fa/html
    Available download formats: resource url
    Dataset updated
    2009
    Description

    Link Function: information

  4. Data from: Youtube social network

    • kaggle.com
    zip
    Updated Sep 1, 2019
    Cite
    Lorenzo De Tomasi (2019). Youtube social network [Dataset]. https://www.kaggle.com/datasets/lodetomasi1995/youtube-social-network
    Available download formats: zip (10604317 bytes)
    Dataset updated
    Sep 1, 2019
    Authors
    Lorenzo De Tomasi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    Youtube social network and ground-truth communities

    Dataset information

    Youtube is a video-sharing web site that includes a social network. In the Youtube social network, users form friendships with each other and can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al.

    We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, which are described in our paper. As for the network, we provide the largest connected component.
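The rule above (one community per connected component of a group, dropping components with fewer than 3 nodes) can be sketched as follows. This is an illustration under an assumption: connectivity is evaluated only among the members of the group, which the description does not state explicitly. The function name and the toy graph are hypothetical.

```python
from collections import deque


def split_group_into_communities(adjacency, group, min_size=3):
    """Split one user-defined group into its connected components and
    keep only the components with at least `min_size` nodes."""
    unvisited = set(group)
    communities = []
    while unvisited:
        start = unvisited.pop()
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                # Only group members are ever in `unvisited`, so the
                # search never leaves the group.
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) >= min_size:
            communities.append(sorted(component))
    return sorted(communities)


# Toy graph: path 1-2-3, edge 4-5, isolated node 6, and node 7 outside the group.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 7}, 4: {5}, 5: {4}, 6: set(), 7: {3}}
communities = split_group_into_communities(adjacency, [1, 2, 3, 4, 5, 6])
print(communities)
```

Here the group splits into {1, 2, 3}, {4, 5} and {6}; only the first survives the size filter.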

    More info: https://snap.stanford.edu/data/com-Youtube.html

  5. gowalla-dataset

    • huggingface.co
    Updated Dec 30, 2024
    Cite
    Hassan Abedi (2024). gowalla-dataset [Dataset]. https://huggingface.co/datasets/habedi/gowalla-dataset
    Dataset updated
    Dec 30, 2024
    Authors
    Hassan Abedi
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Gowalla Dataset

    The Gowalla dataset, sourced from the Stanford Network Analysis Project (SNAP), contains user check-ins and social network information from the now-defunct location-based social networking platform Gowalla.

    Key features:

    • Check-in data: records of user check-ins at various locations with timestamps and geographical coordinates (latitude, longitude).
    • Social graph: user relationships represented as a graph, where edges denote friendships between…

    See the full description on the dataset page: https://huggingface.co/datasets/habedi/gowalla-dataset.

  6. Graph theory indicators for e-mail network

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Christidis, Panayotis (2023). Graph theory indicators for e-mail network [Dataset]. https://search.dataone.org/view/sha256%3A92e96d219a464f9de7291122dc651b84bd01e8b96c0ca02d760f3c5de5bb4b5e
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Christidis, Panayotis
    Description

    This dataset includes graph theory indicators (centrality and clustering coefficients) for the Stanford Network Analysis Project (SNAP) "email-Eu-core-temporal" network, a well-known reference dataset for Social Network Analysis (SNA) of e-mail traffic.
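As a rough illustration of the kind of indicators mentioned above, the sketch below computes degree centrality and the local clustering coefficient on a small undirected graph. The listing does not say which centrality measures the dataset contains or how the temporal, directed e-mail data were simplified, so the choice of degree centrality, the undirected simplification, and the toy graph are all assumptions.

```python
from itertools import combinations


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def local_clustering(adjacency):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    result = {}
    for node, nbrs in adjacency.items():
        k = len(nbrs)
        if k < 2:
            result[node] = 0.0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adjacency[a])
        result[node] = 2 * links / (k * (k - 1))
    return result


# Toy undirected graph: triangle 0-1-2 with a pendant node 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
centrality = degree_centrality(graph)
clustering = local_clustering(graph)
print(centrality)
print(clustering)
```

Node 2 touches every other node, so its degree centrality is 1, but only one of its three neighbour pairs is linked, so its clustering coefficient is 1/3.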

  7. Data from: A Hybrid Matheuristic for the Spread of Influence on Social...

    • data.mendeley.com
    • redu.unicamp.br
    Updated Nov 11, 2024
    Cite
    Felipe Pereira (2024). A Hybrid Matheuristic for the Spread of Influence on Social Networks - Complementary Data [Dataset]. http://doi.org/10.17632/f4kyk7vkst.1
    Dataset updated
    Nov 11, 2024
    Authors
    Felipe Pereira
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This dataset contains complementary data to the paper "A Hybrid Matheuristic for the Spread of Influence on Social Networks" [1], which proposes a matheuristic for combinatorial optimization problems involving the spread of information in social networks.

    For the computational experiments discussed in that paper, we provide:

    • Two sets of instances, originally obtained from [2-6];
    • The solutions attained by exact and heuristic methods;
    • The collected results;
    • The matheuristic source code;

    The directories "benchmark_*/instances/" contain files that describe the sets of instances. Each instance is associated with a graph containing

    The first

    where and

    The next line contains

    The last line contains

    The directories "benchmark_*/solutions_*/" contain files describing feasible solutions for the corresponding sets of instances.

    The first line of each file contains:

    where is the number of vertices in the target set. Each of the next lines contains:

    where

    The last line contains an integer that represents the target set cost.

    The directory "hmf_source_code/" contains an implementation of the matheuristic framework proposed in [1], namely, HMF.

    This work was supported by grants from Santander Bank, the Brazilian National Council for Scientific and Technological Development (CNPq), the São Paulo Research Foundation (FAPESP), the Fund for Support to Teaching, Research and Outreach Activities (FAEPEX), and the Coordination for the Improvement of Higher Education Personnel (CAPES), all in Brazil.

    Caveat: The opinions, hypotheses and conclusions or recommendations expressed in this material are the sole responsibility of the authors and do not necessarily reflect the views of Santander, CNPq, FAPESP, FAEPEX, or CAPES.

    References

    [1] F. C. Pereira, P. J. de Rezende, and T. Yunes. A Hybrid Matheuristic for the Spread of Influence on Social Networks. 2024. Submitted.

    [2] S. Raghavan and R. Zhang. A branch-and-cut approach for the weighted target set selection problem on social networks. 2024. https://doi.org/10.1287/ijoo.2019.0012

    [3] J. Leskovec and A. Krevl. SNAP Datasets: Stanford Large Network Dataset Collection. 2024. https://snap.stanford.edu/data

    [4] R. A. Rossi and N. K. Ahmed. The Network Data Repository with Interactive Graph Analytics and Visualization. 2022. https://networkrepository.com

    [5] J. Kunegis. KONECT – The Koblenz Network Collection. 2013. http://dl.acm.org/citation.cfm?id=2488173

    [6] O. Lesser, L. Tenenboim-Chekina, L. Rokach, and Y. Elovici. Intruder or Welcome Friend: Inferring Group Membership in Online Social Networks. 2013. https://doi.org/10.1007/978-3-642-37210-0_40

  8. A Row Generation Algorithm for Finding Optimal Burning Sequences of Large...

    • data.mendeley.com
    • redu.unicamp.br
    Updated Nov 11, 2024
    Cite
    Felipe Pereira (2024). A Row Generation Algorithm for Finding Optimal Burning Sequences of Large Graphs - Complementary Data [Dataset]. http://doi.org/10.17632/c95hp3m4mz.2
    Dataset updated
    Nov 11, 2024
    Authors
    Felipe Pereira
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This dataset contains complementary data to the paper "A Row Generation Algorithm for Finding Optimal Burning Sequences of Large Graphs" [1], which proposes an exact algorithm for the Graph Burning Problem, an NP-hard optimization problem that models a form of contagion diffusion on social networks.
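For readers unfamiliar with graph burning, the sketch below checks whether a given sequence of fire sources burns an entire graph. It uses the usual coverage characterisation: with a sequence of length k, the i-th source (counting from 1) has k - i rounds to spread, so every vertex must lie within distance k - i of some source i. This is only a feasibility check written for illustration, not the PRYM algorithm from the paper, and it does not read the dataset's instance files; all names are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def burns_whole_graph(adjacency, sequence):
    """True if every vertex is within distance k - i of the i-th source."""
    k = len(sequence)
    burned = set()
    for i, source in enumerate(sequence, start=1):
        radius = k - i  # rounds left for this fire to spread
        dist = bfs_distances(adjacency, source)
        burned.update(v for v, d in dist.items() if d <= radius)
    return burned == set(adjacency)


# Toy instance: the path 0-1-2-3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(burns_whole_graph(path, [1, 3]))
print(burns_whole_graph(path, [0, 3]))
```

On this path, starting at vertex 1 and then lighting vertex 3 burns everything in two rounds, while starting at vertex 0 leaves vertex 2 unburned.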

    Concerning the computational experiments discussed in that paper, we make available:

    • Four sets of instances;
    • The optimal (or best known) solutions obtained;
    • The source code;
    • An Appendix with additional details about the results.

    The "delta" input sets include graphs that are real-world networks [1,2], while the "grid" input set contains graphs that are square grids.

    The directories "delta_10K_instances", "delta_100K_instances", "delta_4M_instances" and "grid_instances" contain files that describe the sets of instances. The first two lines of each file contain:

    where

    where and

    The directories "delta_10K_solutions", "delta_100K_solutions", "delta_4M_solutions" and "grid_solutions" contain files that describe the optimal (or best known) solutions for the corresponding sets of instances.

    The first line of each file contains:

    where is the number of vertices in the burning sequence. Each of the next lines contains:

    where

    The directory "source_code" contains the implementations of the exact algorithm proposed in the paper [1], namely, PRYM.

    Lastly, the file "appendix.pdf" presents additional details on the results reported in the paper.

    This work was supported by grants from Santander Bank, the Brazilian National Council for Scientific and Technological Development (CNPq), the São Paulo Research Foundation (FAPESP), and the Fund for Support to Teaching, Research and Outreach Activities (FAEPEX), all in Brazil.

    Caveat: the opinions, hypotheses and conclusions or recommendations expressed in this material are the sole responsibility of the authors and do not necessarily reflect the views of Santander, CNPq, FAPESP or FAEPEX.

    References

    [1] F. C. Pereira, P. J. de Rezende, T. Yunes and L. F. B. Morato. A Row Generation Algorithm for Finding Optimal Burning Sequences of Large Graphs. Submitted. 2024.

    [2] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford Large Network Dataset Collection. 2024. https://snap.stanford.edu/data

    [3] Ryan A. Rossi and Nesreen K. Ahmed. The Network Data Repository with Interactive Graph Analytics and Visualization. In: AAAI, 2022. https://networkrepository.com
