3 datasets found

Data from: Youtube social network
kaggle.com
zip
Updated Sep 1, 2019
Lorenzo De Tomasi (2019). Youtube social network [Dataset]. https://www.kaggle.com/datasets/lodetomasi1995/youtube-social-network
Available download formats: zip (10,604,317 bytes)
Dataset updated
Sep 1, 2019
Authors
Lorenzo De Tomasi
License
CC0 1.0 Universal (Public Domain Dedication) https://creativecommons.org/publicdomain/zero/1.0/
Area covered
YouTube
Description
Youtube social network and ground-truth communities

Dataset information

Youtube is a video-sharing web site that includes a social network. In the Youtube social network, users form friendships with each other, and users can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al.

We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, which are described in our paper. As for the network, we provide the largest connected component.

More info: https://snap.stanford.edu/data/com-Youtube.html
YouTube Social Network with Communities (SNAP)
kaggle.com
zip
Updated Dec 16, 2021
Subhajit Sahu (2021). YouTube Social Network with Communities (SNAP) [Dataset]. https://www.kaggle.com/wolfram77/graphs-snap-com-youtube
Available download formats: zip (13,777,811 bytes)
Dataset updated
Dec 16, 2021
Authors
Subhajit Sahu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Youtube social network and ground-truth communities

https://snap.stanford.edu/data/com-Youtube.html

Dataset information

Youtube (http://www.youtube.com/) is a video-sharing web site that includes a
social network. In the Youtube social network, users form friendships with
each other, and users can create groups which other users can join. We
consider such user-defined groups as ground-truth communities. This data is
provided by Alan Mislove et al.
(http://socialnetworks.mpi-sws.org/data-imc2007.html)

We regard each connected component in a group as a separate ground-truth
community. We remove the ground-truth communities which have fewer than 3
nodes. We also provide the top 5,000 communities with the highest quality,
which are described in our paper (http://arxiv.org/abs/1205.6233). As for
the network, we provide the largest connected component.
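The rule described above (one community per connected component of a group, dropping components with fewer than 3 nodes) can be sketched as follows. This is a minimal illustration under assumptions, not the procedure the dataset authors actually ran: the function name, the `min_size` parameter, and the toy edges and group are all hypothetical.

```python
from collections import defaultdict, deque


def split_group_into_communities(edges, group, min_size=3):
    """Split one user-defined group into connected components of the
    subgraph induced by its members, keeping only components with at
    least `min_size` nodes. (Hypothetical helper, for illustration.)"""
    members = set(group)

    # Keep only edges whose two endpoints both belong to the group.
    adj = defaultdict(set)
    for u, v in edges:
        if u in members and v in members:
            adj[u].add(v)
            adj[v].add(u)

    seen = set()
    communities = []
    for start in sorted(members):
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.append(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        if len(component) >= min_size:
            communities.append(sorted(component))
    return communities


# Toy data (hypothetical, not taken from the dataset).
toy_edges = [(1, 2), (2, 3), (4, 5), (6, 7), (7, 8), (8, 6), (3, 9)]
toy_group = [1, 2, 3, 4, 5, 6, 7, 8]
print(split_group_into_communities(toy_edges, toy_group))
```

In the toy group, members 4 and 5 form a component of only two nodes, so it is discarded, and the edge to node 9 is ignored because node 9 is not a group member.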

Network statistics
  Nodes                               1,134,890
  Edges                               2,987,624
  Nodes in largest WCC                1,134,890 (1.000)
  Edges in largest WCC                2,987,624 (1.000)
  Nodes in largest SCC                1,134,890 (1.000)
  Edges in largest SCC                2,987,624 (1.000)
  Average clustering coefficient      0.0808
  Number of triangles                 3,056,386
  Fraction of closed triangles        0.002081
  Diameter (longest shortest path)    20
  90-percentile effective diameter    6.5

Community statistics
  Number of communities               8,385
  Average community size              13.50
  Average membership size             0.10
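To make a few of these statistics concrete, here is a small sketch that computes a triangle count, an average local clustering coefficient, and an average community size on a toy graph. It assumes the clustering coefficient is the mean of per-node local coefficients with nodes of degree below 2 counted as 0; the listing does not state how its figures were computed, so the definitions, the helper names, and the toy data here are assumptions.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def linked_neighbor_pairs(adj, node):
    """Number of neighbor pairs of `node` that are themselves adjacent."""
    return sum(1 for a, b in combinations(adj[node], 2) if b in adj[a])


def triangle_count(adj):
    # Every triangle is counted once at each of its three vertices.
    return sum(linked_neighbor_pairs(adj, n) for n in adj) // 3


def average_clustering(adj):
    # Assumed definition: mean local coefficient, 0 for degree < 2.
    coeffs = []
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
        else:
            coeffs.append(2.0 * linked_neighbor_pairs(adj, node) / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


def average_community_size(communities):
    return sum(len(c) for c in communities) / len(communities)


# Toy data (hypothetical): one triangle plus a pendant node.
toy_adj = build_adjacency([(1, 2), (2, 3), (1, 3), (3, 4)])
toy_communities = [[1, 2, 3], [1, 2, 3, 4]]
print(triangle_count(toy_adj))
print(round(average_clustering(toy_adj), 4))
print(average_community_size(toy_communities))
```

The pairwise neighbor check is quadratic in node degree, which is fine for a toy graph but would likely need a more careful approach on a network of this size.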

Source (citation)
J. Yang and J. Leskovec. Defining and Evaluating Network Communities based on Ground-truth. ICDM, 2012. http://arxiv.org/abs/1205.6233

Files
  File                               Description
  com-youtube.ungraph.txt.gz         Undirected Youtube network
  com-youtube.all.cmty.txt.gz        Youtube communities
  com-youtube.top5000.cmty.txt.gz    Youtube communities (Top 5,000)
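A minimal sketch of reading these files, assuming the layout commonly used for SNAP text files: lines starting with `#` are comments, each remaining line of the graph file holds one whitespace-separated pair of node ids, and each line of a community file lists the member ids of one community. The listing itself does not spell out the layout, so treat it as an assumption; the function names are hypothetical, and the sample lines below are invented rather than copied from the data.

```python
import gzip


def parse_edge_lines(lines):
    """Parse 'u v' pairs, skipping blank lines and '#' comment lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        edges.append((int(u), int(v)))
    return edges


def parse_community_lines(lines):
    """Parse one community per line: whitespace-separated member ids."""
    return [[int(tok) for tok in line.split()] for line in lines if line.strip()]


def read_gz_lines(path):
    """Yield text lines from a gzip-compressed file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from handle


# In-memory samples (hypothetical content, same assumed layout).
sample_graph = ["# Undirected graph\n", "# FromNodeId\tToNodeId\n", "1\t2\n", "1\t5\n", "2\t5\n"]
sample_cmty = ["1\t2\t5\n", "2\t5\t9\n"]

edges = parse_edge_lines(sample_graph)
communities = parse_community_lines(sample_cmty)
print(edges)
print(communities)
```

With the real files downloaded, the same parsers could be fed from disk, for example `parse_edge_lines(read_gz_lines("com-youtube.ungraph.txt.gz"))`.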

Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:

The graph in the SNAP data set is 1-based, with nodes numbered 1 to
1,157,827.

In the SuiteSparse Matrix Collection, Problem.A is the undirected Youtube
network, a matrix of size n-by-n with n=1,134,890, which is the number of
unique user id's appearing in any edge.

Problem.aux.nodeid is a list of the node id's that appear in the SNAP data set. A(i,j)=1 if person nodeid(i) is friends with person nodeid(j). The
node id's are the same as the SNAP data set (1-based).

C = Problem.aux.Communities_all is a sparse matrix of size n by 16,386
which represents the communities in the com-youtube.all.cmty.txt file.
The kth line in that file defines the kth community, and is the column
C(:,k), where C(i,k)=1 if person nodeid(i) is in the kth community. Row
C(i,:) and row/column i of the A matrix thus refer to the same person,
nodeid(i).
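The relationship between nodeid, A, and C described above can be sketched in plain Python. This is only an illustration of the indexing idea, not the SuiteSparse code: it assumes nodeid lists the unique ids in ascending order (the notes do not state the order), it uses 0-based positions where the notes use 1-based MATLAB indices, and it stores the nonzero entries of each matrix as a set of (row, column) pairs. All function names and the toy data are hypothetical.

```python
def build_index(edges):
    """Return (nodeid, position): the unique ids appearing in any edge,
    sorted ascending (an assumption), and a map from id to row index."""
    nodeid = sorted({node for edge in edges for node in edge})
    position = {node: i for i, node in enumerate(nodeid)}
    return nodeid, position


def adjacency_entries(edges, position):
    """Nonzero (i, j) entries of a symmetric 0/1 adjacency matrix A."""
    entries = set()
    for u, v in edges:
        i, j = position[u], position[v]
        entries.add((i, j))
        entries.add((j, i))
    return entries


def community_entries(communities, position):
    """Nonzero (i, k) entries of C: row i is nodeid[i], column k is the
    k-th line of the community file. Ids absent from the graph are skipped."""
    entries = set()
    for k, members in enumerate(communities):
        for node in members:
            if node in position:
                entries.add((position[node], k))
    return entries


# Toy data (hypothetical): ids have gaps, as in the SNAP numbering.
toy_edges = [(1, 2), (2, 5), (5, 9)]
toy_communities = [[1, 2, 5], [5, 9, 2]]

nodeid, position = build_index(toy_edges)
A = adjacency_entries(toy_edges, position)
C = community_entries(toy_communities, position)
print(nodeid)
print(sorted(A))
print(sorted(C))
```

Because both A and C are indexed through the same `position` map, row i of C and row/column i of A refer to the same person, which mirrors the property the notes describe.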

Ctop = Problem.aux.Communities_top5000 is n-by-5000, with the same
structure as the C array above, with the content of the
com-youtube.top5000.cmty.txt.gz file.
Most frequent words appearing in each feed.
plos.figshare.com
xls
Updated Nov 5, 2024
Andrea Failla; Giulio Rossetti (2024). Most frequent words appearing in each feed. [Dataset]. http://doi.org/10.1371/journal.pone.0310330.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0310330.t004
Dataset updated
Nov 5, 2024
Dataset provided by
PLOS ONE
Authors
Andrea Failla; Giulio Rossetti
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pollution of online social spaces caused by rampaging d/misinformation is a growing societal concern. However, recent decisions to reduce access to social media APIs are causing a shortage of publicly available, recent, social media data, thus hindering the advancement of computational social science as a whole. We present a large, high-coverage dataset of social interactions and user-generated content from Bluesky Social to address this pressing issue. The dataset contains the complete post history of over 4M users (81% of all registered accounts), totalling 235M posts. We also make available social data covering follow, comment, repost, and quote interactions. Since Bluesky allows users to create and like feed generators (i.e., content recommendation algorithms), we also release the full output of several popular algorithms available on the platform, along with their timestamped “like” interactions. This dataset allows novel analysis of online behavior and human-machine engagement patterns. Notably, it provides ground-truth data for studying the effects of content exposure and self-selection and performing content virality and diffusion analysis.