100+ datasets found
  1. Kaggle: Forum Discussions

    • kaggle.com
    zip
    Updated Nov 8, 2025
    Cite
    Nicolás Ariel González Muñoz (2025). Kaggle: Forum Discussions [Dataset]. https://www.kaggle.com/datasets/nicolasgonzalezmunoz/kaggle-forum-discussions
    Explore at:
    Available download formats: zip (542099 bytes)
    Dataset updated
    Nov 8, 2025
    Authors
    Nicolás Ariel González Muñoz
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Note: This is a work in progress, and not all Kaggle forums are included in this dataset yet. The remaining forums will be added once I finish solving some issues with the data generators for those forums.

    Summary

    Welcome to the Kaggle Forum Discussions dataset! This dataset contains curated data about recent discussions opened in the different forums on Kaggle. The data is obtained through web scraping with the selenium library, and the text is converted to Markdown using the markdownify package.

    This dataset contains information about each discussion's main topic, topic title, comments, votes, medals and more, and is designed to complement the data available in the Kaggle meta dataset, specifically for recent discussions. Keep reading for the details.

    Extraction Technique

    Because Kaggle is a dynamic website that relies heavily on JavaScript (JS), I extracted the data in this dataset through web scraping with the selenium library.

    The functions and classes used to scrape the data on Kaggle are stored in a utility script that is publicly available here. Because JS-generated pages like Kaggle's are unstable when you try to scrape them, the script implements retrying of connections and waiting for elements to appear.
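
The retry behaviour described above can be sketched roughly as follows. This is not the author's utility script: the `retry` helper, its parameters, and the linear backoff are assumptions made for illustration. For waiting on page elements specifically, selenium ships its own `WebDriverWait` class, which is not shown here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    delay: float = 0.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay * attempt)  # linear backoff between tries
    raise AssertionError("unreachable")
```

A scraping step that sometimes fails (a page load, an element lookup) would be wrapped as `retry(lambda: load_page(url), attempts=5, delay=1.0)`, where `load_page` stands in for whatever call is flaky.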

    Each forum was scraped using its own notebook, and those notebooks were then connected to a central notebook that generates this dataset. The discussions are also scraped in parallel to improve speed. This dataset represents all the data that can be gathered in a single notebook session, from the most recent discussions to the oldest.

    If you need more control over the data you want to research, feel free to import whatever you need from the utility script mentioned above.

    Structure

    This dataset contains several folders, each named after the discussion forum it contains data about. For example, the 'competition-hosting' folder contains data about the Competition Hosting forum. Inside each folder you'll find two files: a csv file and a json file.
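
A minimal sketch of discovering that folder layout after unzipping the download. The file names inside each folder are not stated in the description, so this hypothetical `find_forum_files` helper matches on extension only, and the root path is whatever directory you extracted the archive to.

```python
from pathlib import Path


def find_forum_files(root):
    """Map each forum folder name to its csv and json paths (None if absent)."""
    forums = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # skip loose files at the top level
        csv_files = sorted(folder.glob("*.csv"))
        json_files = sorted(folder.glob("*.json"))
        forums[folder.name] = {
            "csv": csv_files[0] if csv_files else None,
            "json": json_files[0] if json_files else None,
        }
    return forums
```

Calling `find_forum_files("path/to/unzipped/dataset")` returns a dictionary keyed by forum folder name, for example `'competition-hosting'`.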

    The json file (in Python, represented as a dictionary) is indexed by the ID that Kaggle assigns to each discussion. Each ID is paired with its corresponding discussion, represented as a nested dictionary (the discussion dict) with the following fields:

    - title: The title of the main topic.
    - content: Content of the main topic.
    - tags: List containing the discussion's tags.
    - datetime: Date and time at which the discussion was published (in ISO 8601 format).
    - votes: Number of votes received by the discussion.
    - medal: Medal awarded to the main topic (if any).
    - user: User that published the main topic.
    - expertise: Publisher's expertise, measured by the Kaggle progression system.
    - n_comments: Total number of comments in the current discussion.
    - n_appreciation_comments: Total number of appreciation comments in the current discussion.
    - comments: Dictionary containing data about the comments in the discussion. Each comment is indexed by an ID assigned by Kaggle and contains the following fields:
        - content: Comment's content.
        - is_appreciation: Whether the comment is an appreciation comment.
        - is_deleted: Whether the comment was deleted.
        - n_replies: Number of replies to the comment.
        - datetime: Date and time at which the comment was published (in ISO 8601 format).
        - votes: Number of votes received by the current comment.
        - medal: Medal awarded to the comment (if any).
        - user: User that published the comment.
        - expertise: Publisher's expertise, measured by the Kaggle progression system.
        - n_deleted: Total number of deleted replies (including self).
        - replies: A dict following this same format.
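
Because replies nest inside comments using the same format, a recursive walk is a natural way to visit them. The sketch below assumes only the field names listed above; the `sample` dictionary, its IDs, and the helper names are invented for illustration and are not taken from the dataset.

```python
def iter_comments(comments, depth=0):
    """Yield (comment_id, depth, comment) for every comment and nested reply."""
    for comment_id, comment in (comments or {}).items():
        yield comment_id, depth, comment
        # 'replies' follows the same format, so recurse one level deeper.
        yield from iter_comments(comment.get("replies"), depth + 1)


def count_comments(discussion):
    """Count all comments and replies found in one discussion dict."""
    return sum(1 for _ in iter_comments(discussion.get("comments")))


# Hypothetical data shaped like the description (IDs are made up).
sample = {
    "1001": {
        "title": "Example topic",
        "content": "Main post body",
        "comments": {
            "2001": {
                "content": "First comment",
                "replies": {"2002": {"content": "A reply", "replies": {}}},
            },
            "2003": {"content": "Second comment", "replies": {}},
        },
    }
}
```

With a real file, `json.load` on the forum's json file yields a dictionary of this shape; JSON object keys are always strings, so the discussion and comment IDs arrive as strings.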

    On the other hand, the csv file serves as a summary of the json file; its information about comments is limited to the hottest and most-voted comments.

    Note: Only the 'content' field is mandatory for each discussion. The availability of the other fields is subject to the stability of the scraping tasks, which may also affect the update frequency.
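
Since only 'content' is guaranteed, code that reads a discussion dict should tolerate missing fields. A minimal sketch, assuming the field names from the list above; `summarize` and `parse_iso` are hypothetical helpers, and missing values are kept as `None` rather than replaced with defaults such as 0, so that "absent" and "zero" stay distinguishable.

```python
from datetime import datetime

OPTIONAL_FIELDS = (
    "title", "tags", "votes", "medal", "user", "expertise",
    "n_comments", "n_appreciation_comments",
)


def parse_iso(value):
    """Return a datetime for an ISO 8601 string, or None if missing or unparseable."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def summarize(discussion):
    """Flatten one discussion dict, leaving absent optional fields as None."""
    row = {"content": discussion["content"]}  # the only mandatory field
    for field in OPTIONAL_FIELDS:
        row[field] = discussion.get(field)
    row["datetime"] = parse_iso(discussion.get("datetime"))
    return row
```

Note that `datetime.fromisoformat` in Python versions before 3.11 accepts only a subset of ISO 8601 (for example, it rejects a trailing `Z`), which is why unparseable values fall back to `None` here.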

  2. Forum Locations Data for United States

    • poidata.io
    csv, json
    Updated Oct 25, 2025
    + more versions
    Cite
    Business Data Provider (2025). Forum Locations Data for United States [Dataset]. https://poidata.io/brand-report/forum/united-states
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Oct 25, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Comprehensive dataset containing 20 verified Forum locations in United States with complete contact information, ratings, reviews, and location data.

  3. Data For Forum Participation Analysis

    • figshare.com
    txt
    Updated Sep 15, 2022
    Cite
    Harrison Fried (2022). Data For Forum Participation Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.16992223.v3
    Explore at:
    Available download formats: txt
    Dataset updated
    Sep 15, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Harrison Fried
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    All of the data files (.csv) required for the Forum Navigation analysis. Includes an actor-actor edgelist, an actor-forum edgelist, an actor-issue edgelist, a forum-issue edgelist, an issue-issue matrix, an isolates dataset, an actor attributes data frame (actor_orgtype.csv) and a forum attributes data frame (ForumSponsorship.csv).

  4. Internet news, forum text data

    • opendata.pku.edu.cn
    bin, docx, txt
    Updated Nov 23, 2017
    Cite
    Peking University Open Research Data Platform (2017). Internet news, forum text data [Dataset]. http://doi.org/10.18170/DVN/HQN48I
    Explore at:
    Available download formats: txt (76), docx (26637), bin (769687963)
    Dataset updated
    Nov 23, 2017
    Dataset provided by
    Peking University Open Research Data Platform
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Data Description: Internet news, forum text data. Time Range: 2017-10-16 to 2017-11-17. Data Volume: 482400. Data Format: json. Author: State Information Center.

  5. Pressure data for forum

    • kaggle.com
    zip
    Updated Jun 10, 2022
    Cite
    Jabbar (2022). Pressure data for forum [Dataset]. https://www.kaggle.com/datasets/jabbarbayramov/pressure-data-for-forum
    Explore at:
    Available download formats: zip (211457 bytes)
    Dataset updated
    Jun 10, 2022
    Authors
    Jabbar
    Description

    Dataset

    This dataset was created by Jabbar

    Contents

  6. Grant Giving Statistics for Cross Border Data Forum Inc

    • instrumentl.com
    Updated Feb 12, 2024
    Cite
    (2024). Grant Giving Statistics for Cross Border Data Forum Inc [Dataset]. https://www.instrumentl.com/990-report/cross-border-data-forum-inc
    Explore at:
    Dataset updated
    Feb 12, 2024
    Variables measured
    Total Assets, Total Giving
    Description

    Financial overview and grant giving statistics of Cross Border Data Forum Inc

  7. Women Veterans Forum - ACS Data for Story

    • catalog.data.gov
    • data.va.gov
    • +2 more
    Updated Nov 23, 2021
    Cite
    Department of Veterans Affairs (2021). Women Veterans Forum - ACS Data for Story [Dataset]. https://catalog.data.gov/dataset/women-veterans-forum-acs-data-for-story
    Explore at:
    Dataset updated
    Nov 23, 2021
    Dataset provided by
    United States Department of Veterans Affairs (http://va.gov/)
    Description

    Data pulled from ACS, used to power certain visualizations for the "Women Veterans Forum" Story

  8. Forums and Discussion Board Data from Reddit, Quora, Stack Overflow + More

    • openwebninja.com
    json
    Updated Sep 22, 2024
    Cite
    OpenWeb Ninja (2024). Forums and Discussion Board Data from Reddit, Quora, Stack Overflow + More [Dataset]. https://www.openwebninja.com/api/real-time-forums-search
    Explore at:
    Available download formats: json
    Dataset updated
    Sep 22, 2024
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Worldwide
    Description

    This dataset provides real-time Google search results for forum and discussion board content aggregated from across the web. The data is continuously updated to provide the most current discussions and conversations. Users can leverage this dataset for community research, social listening, market research, and trend analysis, or to build a forum aggregator. The dataset is delivered in JSON format via a REST API.

  9. Forum Locations Data for France

    • poidata.io
    csv, json
    Updated Nov 10, 2025
    Cite
    Business Data Provider (2025). Forum Locations Data for France [Dataset]. https://poidata.io/brand-report/forum/france
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Nov 10, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    2025
    Area covered
    France
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Comprehensive dataset containing 36 verified Forum locations in France with complete contact information, ratings, reviews, and location data.

  10. Program with Abstracts from the Second Polar Data Forum

    • search.dataone.org
    • arcticdata.io
    • +1 more
    Updated May 19, 2020
    Cite
    Julie Friddell (2020). Program with Abstracts from the Second Polar Data Forum [Dataset]. http://doi.org/10.18739/A21W9V
    Explore at:
    Dataset updated
    May 19, 2020
    Dataset provided by
    Arctic Data Center
    Authors
    Julie Friddell
    Time period covered
    Mar 1, 2015 - Feb 29, 2016
    Area covered
    Description

    The polar science community has unprecedented opportunities for science based on open, networked, digital, and ubiquitous communication technologies. This presents an urgent need for the polar science community, Arctic residents, and other stakeholders to establish a clear global vision, strategy, and action plan to ensure effective stewardship of and access to valuable Arctic and Antarctic data resources. The Second Polar Data Forum (PDF II) built on the successes of the first Polar Data Forum (PDF I) in Tokyo, Japan, October 2013. PDF II further refined relevant themes and priorities and is accelerating progress by establishing clear actions to address the target issues. This includes meeting the needs of society and science through promotion of open data and effective data stewardship, establishing sharing and interoperability of data at a variety of levels, developing trusted data management systems, and ensuring long-term data preservation.

  11. City of Pittsburgh Capital Budget Deliberative Forum Survey Responses

    • data.wprdc.org
    • s.cnmilf.com
    • +2 more
    csv, xlsx
    Updated May 26, 2023
    Cite
    City of Pittsburgh (2023). City of Pittsburgh Capital Budget Deliberative Forum Survey Responses [Dataset]. https://data.wprdc.org/dataset/capital-budget-deliberative-forum-survey
    Explore at:
    Available download formats: xlsx (81797), csv (74769), xlsx (100409)
    Dataset updated
    May 26, 2023
    Dataset authored and provided by
    City of Pittsburgh
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Pittsburgh
    Description

    Note: This dataset will be used to post updated versions of capital budget deliberative forums into the future.

    Collected responses from surveys distributed at the City of Pittsburgh's Capital Budget Deliberative Forum Meetings. These surveys collected resident sentiment on what capital projects the City should prioritize in the upcoming year.

    The collected responses of residents who attended the deliberative meetings concerning the City's 2019 & 2020 Capital Budgets. Residents were asked to identify a specific capital project (or projects) that they felt needed to be completed in their neighborhood. They were asked to be specific as to work needed and the location. They were then asked to use a series of options to note how important they found a list of capital project priorities. Finally, they were asked to share their opinion of the Deliberative Budget Forum by choosing from a series of options.

  12. Forum Drive Cross Street Data in Bend, OR

    • ownerly.com
    Updated Dec 8, 2021
    Cite
    Ownerly (2021). Forum Drive Cross Street Data in Bend, OR [Dataset]. https://www.ownerly.com/or/bend/forum-dr-home-details
    Explore at:
    Dataset updated
    Dec 8, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Bend, Northeast Forum Drive, Oregon
    Description

    This dataset provides information about the number of properties, residents, and average property values for Forum Drive cross streets in Bend, OR.

  13. List of the forum data used in this study.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Nelleke H. J. Oostdijk; Mattijs S. Lambooij; Peter Beinema; Albert Wong; Florian A. Kunneman; Peter H. J. Keizers (2023). List of the forum data used in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0215858.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nelleke H. J. Oostdijk; Mattijs S. Lambooij; Peter Beinema; Albert Wong; Florian A. Kunneman; Peter H. J. Keizers
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    List of the forum data used in this study.

  14. Catchment Data and Evidence Forum 2019 Summary

    • monitoring.catchmentbasedapproach.org
    Updated May 26, 2020
    Cite
    The Rivers Trust (2020). Catchment Data and Evidence Forum 2019 Summary [Dataset]. https://monitoring.catchmentbasedapproach.org/documents/ab8e9d76050c4111aa79b2a950efdcdb
    Explore at:
    Dataset updated
    May 26, 2020
    Dataset authored and provided by
    The Rivers Trust
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    The Catchment Data & Evidence Forum was organised by the CaBA Catchment Data User Group (CDUG), which is a multi-sectoral CaBA working group consisting of data users, data providers and modellers. The focus of this year’s FORUM was on the collection and use of CaBA data. This is locally collected data, which is needed to complement the national evidence base from government agencies and research institutes. The enormous potential for CaBA data to contribute to the 25 Year Environment Plan was a key opportunity identified at the 2018 FORUM. A series of discussions, followed by interactive voting, was used to set a workplan for the CaBA National Support Group.

  15. Open Data in Global Environmental Research: The Belmont Forum’s Open Data...

    • plos.figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Birgit Schmidt; Birgit Gemeinholzer; Andrew Treloar (2023). Open Data in Global Environmental Research: The Belmont Forum’s Open Data Survey [Dataset]. http://doi.org/10.1371/journal.pone.0146695
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Birgit Schmidt; Birgit Gemeinholzer; Andrew Treloar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This paper presents the findings of the Belmont Forum’s survey on Open Data which targeted the global environmental research and data infrastructure community. It highlights users’ perceptions of the term “open data”, expectations of infrastructure functionalities, and barriers and enablers for the sharing of data. A wide range of good practice examples was pointed out by the respondents which demonstrates a substantial uptake of data sharing through e-infrastructures and a further need for enhancement and consolidation. Among all policy responses, funder policies seem to be the most important motivator. This supports the conclusion that stronger mandates will strengthen the case for data sharing.

  16. Forum Lane Cross Street Data in Iron Station, NC

    • ownerly.com
    Updated Mar 19, 2022
    Cite
    Ownerly (2022). Forum Lane Cross Street Data in Iron Station, NC [Dataset]. https://www.ownerly.com/nc/iron-station/forum-ln-home-details
    Explore at:
    Dataset updated
    Mar 19, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    North Carolina, Forum Lane, Iron Station
    Description

    This dataset provides information about the number of properties, residents, and average property values for Forum Lane cross streets in Iron Station, NC.

  17. forum-data-r-progamming-coursera

    • kaggle.com
    zip
    Updated Sep 9, 2019
    Cite
    Kelly Xu (2019). forum-data-r-progamming-coursera [Dataset]. https://www.kaggle.com/datasets/kkellyxfq/forumdatarprogammingcoursera
    Explore at:
    Available download formats: zip (425061 bytes)
    Dataset updated
    Sep 9, 2019
    Authors
    Kelly Xu
    Description

    This file is for my postgraduate study. The data comes from the Coursera forum for R Programming. All data has been anonymized for data privacy.

    The scraped data dates from September 2018 to September 2019.

  18. Dress Forum Inc Export Import Data | Eximpedia

    • eximpedia.app
    Updated Feb 7, 2025
    Cite
    (2025). Dress Forum Inc Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/dress-forum-inc/08714800
    Explore at:
    Dataset updated
    Feb 7, 2025
    Description

    Dress Forum Inc Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  19. Light novel forum dataset

    • kaggle.com
    zip
    Updated May 12, 2020
    Cite
    Manusha (2020). Light novel forum dataset [Dataset]. https://www.kaggle.com/manushadilan/light-novel-forum-dataset
    Explore at:
    Available download formats: zip (44833 bytes)
    Dataset updated
    May 12, 2020
    Authors
    Manusha
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    Context

    This dataset was created by web scraping with a Python tool called Scrapy.

    Content

    This contains user interactions with a light novel sharing forum. The Subject column gives the subject of each forum post; almost every subject is the name of the light novel being shared. The second column shows who created the post. The third column shows how many views the post received. The next column shows how many users replied to the post. The last two columns show who posted last on the thread and when that last post was made.

    Inspiration

    I hope this will unlock hidden secrets of light novel reading communities.

  20. Forum Locations Data for Mexico

    • poidata.io
    csv, json
    Updated Nov 14, 2025
    Cite
    Business Data Provider (2025). Forum Locations Data for Mexico [Dataset]. https://poidata.io/brand-report/forum/mexico
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Nov 14, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    2025
    Area covered
    Mexico
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Comprehensive dataset containing 13 verified Forum locations in Mexico with complete contact information, ratings, reviews, and location data.
