9 datasets found

steam-games-dataset
huggingface.co
Updated May 17, 2025
Cite
Martin Bustos (2025). steam-games-dataset [Dataset]. http://doi.org/10.57967/hf/0511
Explore at:
Unique identifier
https://doi.org/10.57967/hf/0511
Dataset updated
May 17, 2025
Authors
Martin Bustos
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview

Information on more than 110,000 games published on Steam. Maintained by Fronkon Games. This dataset was created with this code (MIT) and uses the API provided by Steam, the largest gaming platform on PC. Data is also collected from Steam Spy. Only published games are included: no DLCs, episodes, music, videos, etc. Here is a simple example of how to parse the JSON information:

Simple parse of the 'games.json' file.

import os import json

dataset = {} if… See the full description on the dataset page: https://huggingface.co/datasets/FronkonGames/steam-games-dataset.
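The snippet above is truncated in this listing and is left as-is. As a rough sketch only, assuming `games.json` is a single JSON object keyed by Steam app ID and that each record carries a `name` field (both are assumptions, not confirmed by this listing; check the dataset page), the file could be loaded along these lines:

```python
import json
import os


def load_games(path="games.json"):
    """Return the parsed dataset, or an empty dict if the file is missing or empty."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as fin:
        text = fin.read()
    return json.loads(text) if text.strip() else {}


def game_names(dataset):
    """Map app ID -> name, assuming each record is a dict with a 'name' key."""
    return {app_id: game.get("name", "") for app_id, game in dataset.items()}
```

Calling `game_names(load_games())` would then give a quick overview of the titles. The helper names `load_games` and `game_names` are hypothetical and are not part of the dataset's own code.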
Top 1500 games on steam by revenue 09-09-2024
kaggle.com
Updated Sep 11, 2024
Cite
Ali Cem Topcu (2024). Top 1500 games on steam by revenue 09-09-2024 [Dataset]. https://www.kaggle.com/datasets/alicemtopcu/top-1500-games-on-steam-by-revenue-09-09-2024
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Sep 11, 2024
Dataset provided by
Kaggle
Authors
Ali Cem Topcu
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This is the first dataset I have uploaded to Kaggle, and I hope people who collaborate on or use this data enjoy it. Until now I haven't seen many detailed datasets about games and the games industry. I felt I needed to start somewhere, so that gamers, game companies, and gaming enthusiasts who are interested in game data analytics can benefit from it. First, I must give a big thanks to gamalytic.com, from which I downloaded this data freely.

About this data set: This dataset contains comprehensive information on the top 1500 games released on Steam between January 1, 2024, and September 9, 2024. It was aggregated from 30 separate files and combined into a single dataset. Minor adjustments were made, such as aligning game release dates for consistency.

Key Features:

Game Details: Includes titles, release dates, and developer/publisher information.

Sales and Revenue: Tracks the number of copies sold, revenue generated, and pricing details.

Player Engagement: Provides average playtime, peak player counts, and other user engagement metrics.

Reviews and Scores: Features review scores and ratings.

Dynamic Market Data: Offers insights into game performance trends over time, such as sales rank and price fluctuations.

This dataset can be useful for:

Game Developers: Understanding market trends, competitor analysis, and consumer behavior.

Data Scientists: Exploring various data analysis techniques, including regression analysis, clustering, and time-series forecasting.

Researchers: Analyzing game industry patterns and the impact of game characteristics on sales and user engagement.
Steam Game Review Dataset
opendatabay.com
Updated Jun 23, 2025
Cite
Datasimple (2025). Steam Game Review Dataset [Dataset]. https://www.opendatabay.com/data/dataset/ca15fd2a-228a-4409-8c16-4aef376d7e2a
Explore at:
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Reviews & Ratings
Description
Context

Video games have greatly contributed, and continue to contribute to the expansion of the entertainment industry. When the first video game, Pong, was launched in an arcade machine in 1972, it ignited a video game craze that quickly swept over the youth. With this, businesses such as Atari Games and Nintendo saw the golden opportunity of investing in a developing entertainment sector and began churning out gaming software and hardware. This caused the rise of the video game industry, which has generated over $109 billion in revenue and 2.2 billion gamers since its conception 50 years ago.

In this industry with over 47 million daily active users, Steam has been operating for almost 16 years. Its constant improvement to better accommodate users has made its development notable in the video game industry.

Steam is a digital distribution platform tailored to gamers and game developers. While it initially catered to PC games, the platform soon expanded its availability to home video game consoles such as the Xbox and Sony PlayStation. In Steam, gamers can log in to the website to conveniently purchase and play games online, a better alternative to buying physical copies of the games and manually downloading it on the computer.


Content

A lot of gamers write reviews on the game page and have the option of choosing whether they would recommend the game to others or not. However, determining this sentiment automatically from text can help Steam to automatically tag such reviews extracted from other forums across the internet and can help them better judge the popularity of games.

Game overview information for both train and test is available in a single file, game_overview.csv, inside train.zip.

Acknowledgements

Steam digital distribution.

Inspiration

Predict whether the reviewer recommended the game titles available in the test set on the basis of review text and other information.

Original Data Source: Steam Game Review Dataset
Steam Review Dataset (2017)
zenodo.org
bz2
Updated Jan 24, 2020
Cite
Antoni Sobkowicz (2020). Steam Review Dataset (2017) [Dataset]. http://doi.org/10.5281/zenodo.1000885
Explore at:
Available download formats: bz2
Unique identifier
https://doi.org/10.5281/zenodo.1000885
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Antoni Sobkowicz
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The dataset contains over 6.4 million publicly available reviews in English from the Steam Reviews portion of the Steam store run by Valve. Each review is described by the review text, the ID of the game it belongs to, the review sentiment (positive or negative), and the number of users who thought the review was helpful. This is essentially an extension of the previously released Steam Review Dataset.

The resource is provided as a bzip2 compressed CSV file.
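Because the file is bzip2-compressed CSV, it can be streamed row by row without unpacking it to disk first. The following is a minimal sketch using only the Python standard library; the helper name `iter_reviews`, the UTF-8 encoding, and any file name passed to it are assumptions, and the column order is not stated in this listing, so rows are returned as plain lists:

```python
import bz2
import csv


def iter_reviews(path, limit=None):
    """Yield rows from a bzip2-compressed CSV without decompressing it to disk."""
    # bz2.open in text mode decompresses lazily as the csv reader pulls lines.
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as fin:
        for i, row in enumerate(csv.reader(fin)):
            if limit is not None and i >= limit:
                break
            yield row
```

Passing a small `limit` is a cheap way to inspect the first few rows before committing to a full pass over several million reviews.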

Steam Reviews and Steam are owned by Valve. The authors are not affiliated with and are not endorsed by Valve / Steam.
Sentiment Analysis for Steam Reviews
kaggle.com
Updated Sep 27, 2020
Cite
Piyush Agnihotri (2020). Sentiment Analysis for Steam Reviews [Dataset]. https://www.kaggle.com/datasets/piyushagni5/sentiment-analysis-for-steam-reviews
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Sep 27, 2020
Dataset provided by
Kaggle
Authors
Piyush Agnihotri
Description
Sentiment Analysis for Steam Reviews

Steam is a video game digital distribution service with a vast community of gamers globally. A lot of gamers write reviews on the game page and have the option of choosing whether they would recommend this game to others or not. However, determining this sentiment automatically from the text can help Steam to automatically tag such reviews extracted from other forums across the internet and can help them better judge the popularity of games.

Given the review text with user recommendation and other information related to each game for 64 game titles, the task is to create a test set by making a split from the training set and try to predict whether the reviewer recommended the game titles available in the test set on the basis of review text and other information.

Game overview information for the training set is available in a single file, game_overview.csv.

About Data Source: Steam Platform

train.csv

review_id --> Unique ID for each review

title --> Title of the game

year --> Year in which the review was posted

user_review --> Full Text of the review posted by a user

user_suggestion --> (Target) Game marked Recommended(1) and Not Recommended(0) by the user

game_overview.csv

title --> Title of the game

developer --> Name of the developer of the game

publisher --> Name of the publisher of the game

tags --> Popular user-defined tags for the game

overview --> Overview of the game provided by the publisher.
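Since both files share the `title` column, each review can be enriched with its game's metadata. The sketch below uses the column names listed above; it assumes both files have a header row and are comma-separated, and the function names `load_overviews` and `join_reviews` are hypothetical:

```python
import csv


def load_overviews(overview_file):
    """Index game_overview.csv rows by game title."""
    return {row["title"]: row for row in csv.DictReader(overview_file)}


def join_reviews(train_file, overviews):
    """Yield each review row merged with the overview fields for its title."""
    for review in csv.DictReader(train_file):
        extra = overviews.get(review["title"], {})
        # Review fields win on key collisions (only 'title' is shared).
        yield {**extra, **review}
```

Both functions take open file objects, so they would be called with, for example, `open("train.csv", newline="", encoding="utf-8")`. Reviews whose title has no overview entry are kept with only their own fields.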

Acknowledgements

The data was collected from the Analytics Vidhya JanataHack: NLP Hackathon.
Pinterest Fashion Compatibility
cseweb.ucsd.edu
beta.data.urbandatacentre.ca
json
Cite
UCSD CSE Research Project, Pinterest Fashion Compatibility [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Explore at:
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.

Metadata includes

product IDs

bounding boxes

Basic Statistics:

Scenes: 47,739

Products: 38,111

Scene-Product Pairs: 93,274
PDMX
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, PDMX [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Explore at:
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing, including over 250k musical scores in MusicXML format. PDMX is the largest publicly available, copyright-free MusicXML dataset in existence. PDMX includes genre, tag, description, and popularity metadata for every file.
Spiking Seizure Classification Dataset
data.niaid.nih.gov
Updated Jan 13, 2025
Cite
Gallou, Olympia (2025). Spiking Seizure Classification Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10800793
Explore at:
Dataset updated
Jan 13, 2025
Dataset provided by
Gallou, Olympia
Matthew, Cook
Ito, Hiroyuki
GHOSH, SAPTARSHI
Bartels, Jim
Sarnthein, Johannes
Indiveri, Giacomo
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset for event encoded analog EEG signals for detection of Epileptic seizures

This dataset contains events that are encoded from the analog signals recorded during pre-surgical evaluations of patients at the Sleep-Wake-Epilepsy-Center (SWEC) of the University Department of Neurology at the Inselspital Bern. The analog signals are sourced from the SWEC-ETHZ iEEG Database.

This database contains event streams for 10 seizures recorded from 5 patients and generated by the DYnamic Neuromorphic Asynchronous Processor (DYNAP-SE2) to demonstrate a proof of concept of encoding seizures with network synchronization. The pipeline consists of two parts: (I) an Analog Front End (AFE) and (II) an SNN termed the "Non-Local Non-Global" (NLNG) network.

In the first part of the pipeline, the digitally recorded signals from SWEC-ETHZ iEEG Database are converted to analog signals via an 18-bit Digital-to-Analog converter (DAC) and then amplified and encoded into events by an Asynchronous Delta Modulator (ADM). Then in the second part, the encoded event streams are fed into the SNN that extracts the features of the epileptic seizure by extracting the partial synchronous patterns intrinsic to the seizure dynamics.
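To give a feel for what delta-modulation encoding does, here is a simplified software sketch. It is not the hardware ADM used for this dataset: the threshold value, the absence of a refractory period, and the separate UP/DN polarities are all assumptions made for illustration, and the function name `delta_modulate` is hypothetical.

```python
def delta_modulate(samples, threshold):
    """Return a list of (index, polarity) events; polarity is +1 (UP) or -1 (DN)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    events = []
    if not samples:
        return events
    # The reference level tracks the signal in steps of one threshold.
    reference = samples[0]
    for i, value in enumerate(samples[1:], start=1):
        # Emit one event per threshold crossing so large jumps yield several events.
        while value - reference >= threshold:
            reference += threshold
            events.append((i, +1))
        while reference - value >= threshold:
            reference -= threshold
            events.append((i, -1))
    return events
```

The idea is that a slowly varying signal produces few events while rapid changes produce many, which is what makes the encoding sparse. How the dataset's single boolean `ADMspikes` column relates to UP and DN events is not stated here.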

Details about the neuromorphic processing pipeline and the encoding process are included in a manuscript under review. The preprint is available on bioRxiv.

Installation

The installation requires Python>=3.x and the conda (or py-venv) package. Users can then install the requirements inside a conda environment using

conda env create -f requirements.txt -n sez

Once created, the conda environment can be activated with

conda activate sez

The main files in the database are described in the hierarchy below.

EventSezDataset/
├─ data/
│  ├─ PxSx/
│  │  ├─ Patx_Sz_x_CHx.csv
├─ LSVM_Params/
│  ├─ opt_svm_params/
│  ├─ pat_x_features_SYNCH/
├─ fig_gen.py
├─ sync_mat_gen.py
├─ SeizDetection_FR.py
├─ SeizDetection_SYNCH.py
├─ support.py
├─ run.sh
├─ requirements.txt

where x represents the Patient ID and the Seizure ID respectively.

requirements.txt: This file lists the requirements for the execution of the Python code.

fig_gen.py: This file plots the analog signals and the associated AFE and NLNG event streams. The code is executed with `python fig_gen.py 1 1 13', which plots patient 1, seizure 1, and channel 13 of the recording.

sync_mat_gen.py: This file describes the function for plotting the synchronization matrices emerging from the ADM and the NLNG spikes with either a linear or a log colorbar. The code is executed with `python sync_mat_gen.py 1 1' or `python sync_mat_gen.py 1 1 log'. This generates four figures for patient 1 and seizure 1, covering the pre-seizure, first half of seizure, second half of seizure, and post-seizure time periods. The third option can either be left blank or given as lin or log, for the respective colorbar scale. The time is the signal time as mentioned in the table below.

run.sh: A simple Linux script to run the above code for all patients and seizures.

SeizDetection_FR.py: This file runs the LSVM on the ADM and NLNG spikes, using the firing rate (FR) as a feature. The code is currently set up to plot with pre-computed features (in the LSVM_Params/opt_svm_params/ folder). Users can also use the code to train the LSVM with different parameters.

SeizDetection_SYNCH.py: This file runs the LSVM on the kernelized ADM and NLNG spikes, using the flattened SYNC matrices as a feature. The code is currently set up to plot with pre-computed features (in the LSVM_Params/pat_x_features_SYNCH/ folder). Users can also use the code to train the LSVM with different parameters.

LSVM_Params: Folder containing LSVM features with different parameter combinations.

support.py: This file contains the necessary functions.

data/P1S1/: This folder, for example, contains the event streams for all channels for seizure 1 of patient 1.

Pat1_Sz_1_CH1.csv: This file contains the spikes of the AFE and the NLNG layers with the following tabular format (which can be extracted by the fig_gen.py)

Comments

# SStart: 180 // Start of the seizure in signal time
# SEnd: 276.0 // End of the seizure in signal time
# Pid: 2 // The patient ID as per the SWEC-ETHZ iEEG Database
# Sid: 1 // The seizure ID as per the SWEC-ETHZ iEEG Database
# Channel_No: 1 // The channel number

Columns:

SYS_time: The time from the interface FPGA.

signal_time: The time of the signal as per the SWEC-ETHZ iEEG Database.

dac_value: The value of the analog signals as recorded in the SWEC-ETHZ iEEG Database.

ADMspikes: The event stream output by the AFE, in boolean format. True represents a spike.

NLNGspikes: The spike stream output by the SNN, in boolean format. True represents a spike.
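A channel file in this layout could be read with a small parser such as the one sketched below. The exact on-disk format is not shown in this listing, so the sketch assumes that comment lines start with `#` and follow a `Key: value // explanation` pattern, that the data rows are comma-separated under a header row with the column names above, and that booleans are written as `True`/`False`. The function name `parse_event_file` is hypothetical; `fig_gen.py` in the dataset is the authoritative way to extract the data.

```python
import csv
import io


def parse_event_file(text):
    """Split a channel file into (metadata dict, list of row dicts)."""
    meta, body = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            # Drop the trailing '// explanation' and split 'Key: value'.
            key, _, value = line.lstrip("# ").partition("//")[0].partition(":")
            if value:
                meta[key.strip()] = value.strip()
        elif line.strip():
            body.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(body))))
    for row in rows:
        # Convert the two spike columns from text to real booleans.
        for col in ("ADMspikes", "NLNGspikes"):
            if col in row:
                row[col] = row[col].strip() == "True"
    return meta, rows
```

Metadata values and the remaining columns are left as strings so that no assumption is made about their numeric types.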
Data from: COMPUTER MODELING OF A THREE-DIMENSIONAL STEAM INJECTION...
cloud.csiss.gmu.edu
pdf
Updated Aug 8, 2019
Cite
Energy Data Exchange (2019). COMPUTER MODELING OF A THREE-DIMENSIONAL STEAM INJECTION EXPERIMENT [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/computer-modeling-of-a-three-dimensional-steam-injection-experiment
Explore at:
Available download formats: pdf (1838275)
Dataset updated
Aug 8, 2019
Dataset provided by
Energy Data Exchange
Description
The experimental results and CT scans obtained during a steam-flooding experiment with the SUPRI 3-D steam injection laboratory model are compared with the results obtained from a numerical simulator for the same experiment. Simulation studies were carried out using the STARS (Steam and Additives Reservoir Simulator) compositional simulator. The saturation and temperature distributions obtained and heat loss rates measured in the experimental model at different stages of steam-flooding were compared with those calculated from the numerical simulator. There is a fairly good agreement between the experimental results and the simulator output. However, the experimental scans show a greater degree of gravity override than that obtained with the simulator for the same heat-loss rates. Symmetric sides of the experimental 5-spot show asymmetric heat-loss rates contrary to theory and simulator results. Some utility programs have been written for extracting, processing and outputting the required grid data from the STARS simulator. These are general in nature and can be useful for other STARS users.