10 datasets found
  1. incels.is forum dataset

    • search.gesis.org
    • pollux-fid.de
    • +1more
    Updated Sep 27, 2023
    Cite
    Wedel, Lion (2023). incels.is forum dataset [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-2485
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    GESIS, Köln
    GESIS search
    Authors
    Wedel, Lion
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    For legal reasons, the study data (data_incels_gesis.zip) are no longer available as of 30.01.2024 until further notice.

    The dataset consists of all publicly visible posts, and the data that comes with each post, of the online forum incels.is during the first week of November 2022. As of 2022, the forum is the largest communication platform within the incel community. Incels (involuntary celibates) are of interest to researchers because of several cases of offline and online violence. Self-directed violence is also accepted within the community. This dataset is a large-scale collection of digital behavioral data that allows researchers to investigate questions surrounding, e.g., incels, hate speech, communication in online forums, the emergence of acts of terrorism, and suicide prevention. The dataset was used for a master's thesis and is the basis for upcoming publications. This repository contains the dataset and the scripts that showcase its analysis.

  2. Links to websites from the biggest incel forum 2021-2022, by platform

    • statista.com
    Updated Dec 1, 2022
    Cite
    Statista (2022). Links to websites from the biggest incel forum 2021-2022, by platform [Dataset]. https://www.statista.com/statistics/1345861/links-to-sites-from-biggest-incel-forum-by-platform/
    Dataset updated
    Dec 1, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2021 - Jul 7, 2022
    Area covered
    Worldwide
    Description

    According to a study conducted between January 2021 and July 2022, links to the video-hosting website YouTube were posted over 14 thousand times on the largest dedicated incel forum in the world, making it by far the most linked-to site on the platform. Reddit ranked second with over five thousand links on the incel website. Links to popular social media networks were also posted often, with 1,149 links to Twitter and 862 to TikTok.

  3. Monthly visits to forums related to biggest global incel forum 2022

    • statista.com
    Updated Nov 28, 2022
    Cite
    Statista (2022). Monthly visits to forums related to biggest global incel forum 2022 [Dataset]. https://www.statista.com/statistics/1345796/monthly-visits-forums-related-biggest-global-incel-forum/
    Dataset updated
    Nov 28, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Apr 2022 - Jun 2022
    Area covered
    Worldwide
    Description

    The largest incel forum in the world is supported and directed by a number of websites that are founded and managed by the same individuals, serving as part of the wider incel community. According to a 2022 report, an average of 4.4 million people per month visited a body image forum that encourages "looksmaxing," the incel term for constantly trying to improve one's physical appearance. The incel forum itself averaged 2.4 million visits per month. The body image, suicide, and unemployment forums promote incel ideology in differing forms. With the exception of the suicide forum, which attracts 750 thousand monthly visits, all of the websites are male-only forums.

  4. Global share of traffic to the biggest dedicated incel forum 2022, by...

    • statista.com
    Updated Nov 28, 2022
    Cite
    Statista (2022). Global share of traffic to the biggest dedicated incel forum 2022, by country [Dataset]. https://www.statista.com/statistics/1345054/share-of-global-traffic-incel-internet-forum-by-country/
    Dataset updated
    Nov 28, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Apr 2022 - Jun 2022
    Area covered
    Worldwide
    Description

    According to a 2022 report, the United States accounted for the largest share of web traffic to the world's largest dedicated incel forum, at 43.8 percent of all visits. The United Kingdom and Poland followed, with 7.5 percent and 4.2 percent of all web traffic, respectively. The site receives around 2.6 million visitors each month, and although the forum is an American-based, English-language platform, it draws users from all over the world.

  5. Share of posts on the biggest global incel forum including offensive slurs...

    • statista.com
    Updated Nov 28, 2022
    Cite
    Statista (2022). Share of posts on the biggest global incel forum including offensive slurs 2021-2022 [Dataset]. https://www.statista.com/statistics/1345126/share-of-posts-on-biggest-global-incel-forum-include-slurs/
    Dataset updated
    Nov 28, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2021 - Jul 7, 2022
    Area covered
    Worldwide
    Description

    According to a 2022 report, 16 percent of posts on the largest dedicated incel forum included misogynistic slurs, whilst one in twenty posts on the forum contained racist or antisemitic slurs. Additionally, three percent of all posts contained offensive homophobic slurs. On average, hate speech of some form was present in 21 percent of all postings on the incel forum.

  6. Monthly posts including violent terms on the biggest incel forum 2021-2022,...

    • statista.com
    Updated Dec 1, 2022
    Cite
    Statista (2022). Monthly posts including violent terms on the biggest incel forum 2021-2022, by word [Dataset]. https://www.statista.com/statistics/1345878/posts-including-violent-terms-on-incel-forum-by-word/
    Dataset updated
    Dec 1, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2021 - Jul 7, 2022
    Area covered
    Worldwide
    Description

    According to a study conducted between January 2021 and July 2022, the word "kill", and variations of the word, were used on average 1,181 times every month in posts on the world's largest incel forum - once every 37 minutes. Per month, on average, 255 postings contained the word "shot," while 209 posts contained the word "murder."
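
The "once every 37 minutes" figure follows from the monthly count by simple division. The check below is a minimal sketch that assumes an average month of 365 / 12 days; the source does not state which month length the study used, and the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60
# Assumption: an average month of 365 / 12 days; the source does not say
# which month length the study used.
AVERAGE_DAYS_PER_MONTH = 365 / 12


def minutes_between_posts(posts_per_month: float) -> float:
    """Average gap in minutes between posts at a given monthly count."""
    return AVERAGE_DAYS_PER_MONTH * MINUTES_PER_DAY / posts_per_month


print(round(minutes_between_posts(1181)))  # prints 37
```

The same function can be applied to the other monthly counts quoted in the description.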

  7. Users of the biggest global incel forum stance on rape 2022

    • statista.com
    Updated Dec 1, 2022
    Cite
    Statista (2022). Users of the biggest global incel forum stance on rape 2022 [Dataset]. https://www.statista.com/statistics/1345809/users-of-largest-global-incel-forum-on-rape-2022/
    Dataset updated
    Dec 1, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2021 - Jul 7, 2022
    Area covered
    Worldwide
    Description

    According to a study conducted throughout 2021 and the first half of 2022, 89 percent of users of the largest dedicated incel internet forum who commented on posts about rape backed the original post, demonstrating their support. Overall, five percent of users were against rape on moral grounds, and five percent were against it for non-moral reasons.

  8. Dataset for: The Evolution of the Manosphere Across the Web

    • data.niaid.nih.gov
    Updated Aug 30, 2020
    Cite
    Gianluca Stringhini (2020). Dataset for: The Evolution of the Manosphere Across the Web [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4007912
    Dataset updated
    Aug 30, 2020
    Dataset provided by
    Stephanie Greenberg
    Savvas Zannettou
    Manoel Horta Ribeiro
    Jeremy Blackburn
    Emiliano De Cristofaro
    Barry Bradlyn
    Summer Long
    Gianluca Stringhini
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Evolution of the Manosphere Across the Web

    We make available data related to subreddit and standalone forums from the manosphere.

    We also make available Perspective API annotations for all posts.

    You can find the code in GitHub.

    Please cite this paper if you use this data:

    @article{ribeiroevolution2021,
      title={The Evolution of the Manosphere Across the Web},
      author={Ribeiro, Manoel Horta and Blackburn, Jeremy and Bradlyn, Barry and De Cristofaro, Emiliano and Stringhini, Gianluca and Long, Summer and Greenberg, Stephanie and Zannettou, Savvas},
      booktitle={{Proceedings of the 15th International AAAI Conference on Weblogs and Social Media (ICWSM'21)}},
      year={2021}
    }

    1. Reddit data

    We make available data for forums and for relevant subreddits (56 of them, as described in subreddit_descriptions.csv). The Reddit data are available, one line per post, in /ndjson/reddit.ndjson. A sample record:

    { "author": "Handheld_Gaming", "date_post": 1546300852, "id_post": "abcusl", "number_post": 9.0, "subreddit": "Braincels", "text_post": "Its been 2019 for almost 1 hour And I am at a party with 120 people, half of them being foids. The last year had been the best in my life. I actually was happy living hope because I was redpilled to the death.

    Now that I am blackpilled I see that I am the shortest of all men and that I am the only one with a recessed jaw.

    Its over. Its only thanks to my age old friendship with chads and my social skills I had developed in the past year that a lot of men like me a lot as a friend.

    No leg lengthening syrgery is gonna save me. Ignorance was a bliss. Its just horror now seeing that everyone can make out wirth some slin hoe at the party.

    I actually feel so unbelivably bad for turbomanlets. Life as an unattractive manlet is a pain, I cant imagine the hell being an ugly turbomanlet is like. I would have roped instsntly if I were one. Its so unfair.

    Tallcels are fakecels and they all can (and should) suck my cock.

    If I were 17cm taller my life would be a heaven and I would be the happiest man alive.

    Just cope and wait for affordable body tranpslants.", "thread": "t3_abcusl" }
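
A file with one JSON object per line can be read with the standard library alone. The following is a minimal sketch, assuming the field names shown in the sample record ("subreddit", "date_post") and that "date_post" in the Reddit data is a Unix timestamp in seconds, as the sample value suggests. The helper names are illustrative and not part of the dataset's own code.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def post_datetime(record: dict) -> datetime:
    """Convert "date_post" to a UTC datetime (assumes Unix seconds)."""
    return datetime.fromtimestamp(record["date_post"], tz=timezone.utc)


def posts_per_subreddit(records: Iterable[dict]) -> Counter:
    """Count records by their "subreddit" field."""
    return Counter(record["subreddit"] for record in records)
```

Because an open text file is itself an iterable of lines, a file handle opened on /ndjson/reddit.ndjson can be passed straight to `iter_ndjson`, which avoids loading the whole file into memory.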

    2. Forums

    Here we describe the .sqlite and .ndjson files that contain the data from the following forums:

    (avfm) --- https://d2ec906f9aea-003845.vbulletin.net
    (incels) --- https://incels.co/
    (love_shy) --- http://love-shy.com/lsbb/
    (redpilltalk) --- https://redpilltalk.com/
    (mgtow) --- https://www.mgtow.com/forums/
    (rooshv) --- https://www.rooshvforum.com/
    (pua_forum) --- https://www.pick-up-artist-forum.com/
    (the_attraction) --- http://www.theattractionforums.com/

    The files are in the folders /sqlite/ and /ndjson/.

    2.1 .sqlite

    All tables in the .sqlite datasets follow a very simple {key: value} format. Each key is a thread name (for example /threads/housewife-is-like-a-job.123835/) and each value is a Python dictionary or a list. Each file contains three tables:

    idx: each key is the relative address of a thread and maps to a post. Each post is represented by a dict:

    "type": (list) in some forums you can add a descriptor such as [RageFuel] to each topic, and there may also be special types of posts, such as sticky, poll, or locked posts
    "title": (str) title of the thread
    "link": (str) link to the thread
    "author_topic": (str) username that created the thread
    "replies": (int) number of replies; may differ from the number of posts because of differences in crawling date
    "views": (int) number of views
    "subforum": (str) name of the subforum
    "collected": (bool) indicates whether raw posts have been collected
    "crawled_idx_at": (str) datetime of the collection

    processed_posts: each key is the relative address of a thread and maps to a list of posts (in order). Each post is represented by a dict:

    "author": (str) author's username
    "resume_author": (str) author's short description
    "joined_author": (str) date the author joined
    "messages_author": (int) number of messages the author has
    "text_post": (str) text of the main post
    "number_post": (int) number of the post in the thread
    "id_post": (str) post identifier (forum-dependent); always unique within a thread
    "id_post_interaction": (list) IDs of the other posts this post quoted
    "date_post": (str) datetime of the post
    "links": (tuple) tuple with the parsed URL, e.g. ('https', 'www.youtube.com', '/S5t6K9iwcdw')
    "thread": (str) same as the key
    "crawled_at": (str) datetime of the collection

    raw_posts: each key is the relative address of a thread and maps to a list of unprocessed posts (in order). Each post is represented by a dict:

    "post_raw": (binary) raw HTML binary
    "crawled_at": (str) datetime of the collection

    2.2 .ndjson

    Each line consists of a JSON object representing a different comment, with the following fields:

    "author": (str) author's username
    "resume_author": (str) author's short description
    "joined_author": (str) date the author joined
    "messages_author": (int) number of messages the author has
    "text_post": (str) text of the main post
    "number_post": (int) number of the post in the thread
    "id_post": (str) post identifier (forum-dependent); always unique within a thread
    "id_post_interaction": (list) IDs of the other posts this post quoted
    "date_post": (str) datetime of the post
    "links": (tuple) tuple with the parsed URL, e.g. ('https', 'www.youtube.com', '/S5t6K9iwcdw')
    "thread": (str) relative address of the thread
    "crawled_at": (str) datetime of the collection

    3. Perspective

    We also ran each forum post and Reddit post through the Perspective API. The output files are located in the /perspective/ folder and are compressed with gzip. One example output:

    { "id_post": 5200, "hate_output": { "text": "I still can\u2019t wrap my mind around both of those articles about these c~~~s sleeping with poor Haitian Men. Where\u2019s the uproar?, where the hell is the outcry?, the \u201cpig\u201d comments or the \u201ccreeper comments\u201d. F~~~ing hell, if roles were reversed and it was an article about Men going to Europe where under 18 sex in legal, you better believe they would crucify the writer of that article and DEMAND an apology by the paper that wrote it.. This is exactly what I try and explain to people about the double standards within our modern society. A bunch of older women, wanna get their kicks off by sleeping with poor Men, just before they either hit or are at menopause age. F~~~ing unreal, I\u2019ll never forget going to Sweden and Norway a few years ago with one of my buddies and his girlfriend who was from there, the legal age of consent in Norway is 16 and in Sweden it\u2019s 15. I couldn\u2019t believe it, but my friend told me \u201c hey, it\u2019s normal here\u201d . Not only that but the age wasn\u2019t a big different in other European countries as well. One thing i learned very quickly was how very Misandric Sweden as well as Denmark were.", "TOXICITY": 0.6079781, "SEVERE_TOXICITY": 0.53744453, "INFLAMMATORY": 0.7279288, "PROFANITY": 0.58842486, "INSULT": 0.5511079, "OBSCENE": 0.9830818, "SPAM": 0.17009115 } }
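
Gzip-compressed annotation files can be streamed without decompressing them to disk first. The sketch below assumes that each file holds one JSON object per line with the shape of the example above; the description does not state the exact layout or file names inside /perspective/, so treat that as an assumption. The function names and the 0.8 default threshold are illustrative choices, not values from the paper.

```python
import gzip
import json
from typing import Iterator


def iter_perspective(path: str) -> Iterator[dict]:
    """Stream records from a gzip file, assuming one JSON object per line."""
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def is_above(record: dict, attribute: str = "TOXICITY", threshold: float = 0.8) -> bool:
    """True if the attribute score in "hate_output" reaches the threshold.

    The 0.8 default is an arbitrary illustrative cutoff.
    """
    return record["hate_output"].get(attribute, 0.0) >= threshold
```

A typical use would be a generator expression such as `(r["id_post"] for r in iter_perspective(path) if is_above(r))`, where `path` points at one of the files in /perspective/.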

    4. Working with sqlite

    A nice way to read some of the files of the dataset is using SqliteDict, for example:

    from sqlitedict import SqliteDict

    processed_posts = SqliteDict("./data/forums/incels.sqlite", tablename="processed_posts")

    for key, posts in processed_posts.items():
        for post in posts:
            # here you could do something with each post in the dataset
            pass
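
Since the idx and processed_posts tables share the same thread keys, the two can be combined per thread. The following is a minimal sketch that works on any dict-like objects, so it should accept SqliteDict tables as well as plain dicts. The function name `summarize_thread` and the sample values are made up for illustration; only the field names ("title", "replies", "author") come from the description above.

```python
from typing import Any, Mapping, Sequence


def summarize_thread(
    thread_key: str,
    idx: Mapping[str, Mapping[str, Any]],
    processed_posts: Mapping[str, Sequence[Mapping[str, Any]]],
) -> dict:
    """Combine a thread's idx entry with its list of processed posts."""
    meta = idx[thread_key]
    posts = processed_posts.get(thread_key, [])
    return {
        "title": meta["title"],
        # Reply count recorded when the index page was crawled.
        "replies_at_index_time": meta["replies"],
        # Posts actually collected; may differ because of crawl dates.
        "posts_collected": len(posts),
        "distinct_authors": len({post["author"] for post in posts}),
    }


# Made-up in-memory stand-ins for the two tables.
example_idx = {"/threads/example.1/": {"title": "Example thread", "replies": 3}}
example_posts = {
    "/threads/example.1/": [
        {"author": "user_a", "number_post": 1},
        {"author": "user_b", "number_post": 2},
        {"author": "user_a", "number_post": 3},
    ]
}
print(summarize_thread("/threads/example.1/", example_idx, example_posts))
```

Reporting both the indexed reply count and the number of collected posts makes the crawl-date discrepancy mentioned for the "replies" field easy to spot.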

    5. Helpers

    Additionally, we provide two .sqlite files that are helpers used in the analyses. These are related to Reddit, and not to the forums! They are:

    channel_dict.sqlite: a sqlite file where each key corresponds to a subreddit and the values are lists of dictionaries of the users who posted on it, along with timestamps.

    author_dict.sqlite: a sqlite file where each key corresponds to an author and the values are lists of dictionaries of the subreddits they posted on, along with timestamps.

    These are used in the paper for the migration analyses.
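
As a rough illustration of how an author's entries could feed a migration analysis, the sketch below finds the first and last subreddit an author posted on. The description does not give the key names inside each dictionary, so "subreddit" and "timestamp" are assumptions exposed as parameters; the function is hypothetical and is not the method used in the paper.

```python
from typing import Any, Mapping, Optional, Sequence, Tuple


def first_and_last_subreddit(
    entries: Sequence[Mapping[str, Any]],
    subreddit_key: str = "subreddit",  # assumed key name, adjust to the data
    time_key: str = "timestamp",       # assumed key name, adjust to the data
) -> Optional[Tuple[Any, Any]]:
    """Return the subreddits of an author's earliest and latest entries."""
    if not entries:
        return None
    ordered = sorted(entries, key=lambda entry: entry[time_key])
    return ordered[0][subreddit_key], ordered[-1][subreddit_key]
```

An author whose first and last subreddits differ is a candidate for closer inspection, although a real migration analysis would need to look at the full posting history rather than just its endpoints.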

    6. Examples and particularities for forums

    Although we did our best to clean the data and be consistent across forums, this was not always possible. The following subsections cover the particularities of each forum, note directions for improving the parsing that were not pursued, and give some examples of how things work in each forum.

    6.1 incels

    Check out an archived version of the front page, the thread page and a post page, as well as a dump of the data stored for a thread page and a post page.

    types: for the incels forum, the special types associated with each thread in the idx table are “Sticky”, “Pool”, “Closed”, and the custom types added by users, such as [LifeFuel]. These last ones are all in brackets. You can see some examples of these on the example thread page.

    quotes: quotes in this forum are well structured, so all quotations are resolved deterministically.

    6.2 LoveShy

    Check out an archived version of the front page, the thread page and a post page, as well as a dump of the data stored for a thread page and a post page.

    types: no types were parsed. There are some rules in the forum, but not significant.

    quotes: quotes were obtained from exact text+author match, or author match + a jaccard

  9. Radicalization and Deradicalization in Online Communities

    • search.dataone.org
    Updated Jan 12, 2024
    Cite
    Jennifer Golbeck (2024). Radicalization and Deradicalization in Online Communities [Dataset]. http://doi.org/10.7910/DVN/MS9ODP
    Dataset updated
    Jan 12, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Jennifer Golbeck
    Description

    A collection of incel forum data from /r/Incels, /r/Braincels, /r/IncelExit, saidit /s/Incels, and incels.is

  10. COPE

    • zenodo.org
    Updated Jul 14, 2023
    Cite
    Paolo Gajo (2023). COPE [Dataset]. http://doi.org/10.5281/zenodo.7933656
    Dataset updated
    Jul 14, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Paolo Gajo
    Description

    Unsupervised datasets of posts scraped from Incels.is and Il forum dei brutti.


39 scholarly articles cite the incels.is forum dataset (Wedel, 2023).