Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
Information on more than 110,000 games published on Steam. Maintained by Fronkon Games. This dataset was created with this code (MIT) and uses the API provided by Steam, the largest gaming platform on PC. Data is also collected from Steam Spy. Only published games are included: no DLCs, episodes, music, videos, etc. Here is a simple example of how to parse the JSON information:
import os import json
dataset = {} if… See the full description on the dataset page: https://huggingface.co/datasets/FronkonGames/steam-games-dataset.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is the first dataset I have uploaded to Kaggle, and I hope the people who collaborate on it or use it enjoy it. To this day I haven't seen many detailed datasets about games and the games industry, so I felt I needed to start somewhere that gamers, game companies, and gaming enthusiasts interested in game data analytics can benefit from. First, I must give a big thanks to gamalytic.com, from which I downloaded this data for free.
About this data set: This dataset contains comprehensive information on the top 1500 games released on Steam between January 1, 2024, and September 9, 2024. Aggregated from 30 separate files, and combined into a single dataset. Minor adjustments were made, such as aligning game release dates for consistency.
Key Features: Game Details: Includes titles, release dates, and developer/publisher information. Sales and Revenue: Tracks the number of copies sold, revenue generated, and pricing details. Player Engagement: Provides average playtime, peak player counts, and other user engagement metrics. Reviews and Scores: Features review scores and ratings. Dynamic Market Data: Offers insights into game performance trends over time, such as sales rank and price fluctuations.
This dataset can be useful for:
Game Developers: Understanding market trends, competitor analysis, and consumer behavior. Data Scientists: Exploring various data analysis techniques, including regression analysis, clustering, and time-series forecasting. Researchers: Analyzing game industry patterns and the impact of game characteristics on sales and user engagement.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Context Video games have greatly contributed, and continue to contribute to the expansion of the entertainment industry. When the first video game, Pong, was launched in an arcade machine in 1972, it ignited a video game craze that quickly swept over the youth. With this, businesses such as Atari Games and Nintendo saw the golden opportunity of investing in a developing entertainment sector and began churning out gaming software and hardware. This caused the rise of the video game industry, which has generated over $109 billion in revenue and 2.2 billion gamers since its conception 50 years ago.
In this industry with over 47 million daily active users, Steam has been operating for almost 16 years. Its constant improvement to better accommodate users has made its development notable in the video game industry.
Steam is a digital distribution platform tailored to gamers and game developers. While it initially catered to PC games, the platform soon expanded its availability to home video game consoles such as the Xbox and Sony PlayStation. In Steam, gamers can log in to the website to conveniently purchase and play games online, a better alternative to buying physical copies of the games and manually downloading it on the computer.
Content A lot of gamers write reviews at the game page and have an option of choosing whether they would recommend this game to others or not. However, determining this sentiment automatically from text can help Steam to automatically tag such reviews extracted from other forums across the internet and can help them better judge the popularity of games.
Game overview information for both train and test is available in a single file, game_overview.csv, inside train.zip.
Acknowledgements Steam digital distribution.
Inspiration Predict whether the reviewer recommended the game titles available in the test set on the basis of review text and other information.
Original Data Source: Steam Game Review Dataset
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The dataset contains over 6.4 million publicly available English-language reviews from the Steam Reviews portion of the Steam store run by Valve. Each review is described by the review text, the ID of the game it belongs to, the review sentiment (positive or negative), and the number of users who thought the review was helpful. This is essentially an extension of the previously released Steam Review Dataset.
The resource is provided as a bzip2 compressed CSV file.
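A bzip2-compressed CSV can be streamed without decompressing it to disk first. The following is a minimal sketch using only the Python standard library; the function name `iter_bz2_csv` is hypothetical, and the sketch makes no assumption about the column order or about whether the file has a header row, since neither is stated here.

```python
import bz2
import csv
import io


def iter_bz2_csv(path_or_file, encoding="utf-8"):
    """Yield each CSV row as a list of strings, decompressing on the fly.

    `path_or_file` may be a file path or a binary file object.
    """
    with bz2.open(path_or_file, mode="rt", encoding=encoding, newline="") as handle:
        yield from csv.reader(handle)


if __name__ == "__main__":
    # Self-contained demo on an in-memory archive (made-up rows, not real data).
    demo = io.BytesIO(bz2.compress(b'1,"Great game, would play again",1\n'))
    for row in iter_bz2_csv(demo):
        print(row)
```

Because the rows are yielded one at a time, the same loop should also work on a multi-million-row file without loading it into memory at once.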
Steam Reviews and Steam are owned by Valve. Authors are not affiliated with and are not endorsed by Valve / Steam
Steam is a video game digital distribution service with a vast community of gamers globally. A lot of gamers write reviews on the game page and have the option of choosing whether they would recommend this game to others or not. However, determining this sentiment automatically from the text can help Steam to automatically tag such reviews extracted from other forums across the internet and can help them better judge the popularity of games.
Given the review text with user recommendation and other information related to each game for 64 game titles, the task is to create a test set by making a split from the training set and try to predict whether the reviewer recommended the game titles available in the test set on the basis of review text and other information.
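One way to carve a held-out test set out of the training data, as the task describes, is a seeded random split. This is only a sketch: the function name, the 20% test fraction, and the seed are assumptions chosen for illustration, not part of the task definition.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train_rows, test_rows)."""
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

A split by game title rather than by individual review may be a better fit if the goal is to predict recommendations for unseen titles; the source does not say which is intended.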
Game overview information for the training set is available in a single file, game_overview.csv.
review_id --> Unique ID for each review
title --> Title of the game
year --> Year in which the review was posted
user_review --> Full Text of the review posted by a user
user_suggestion --> (Target) Game marked Recommended(1) and Not Recommended(0) by the user
title --> Title of the game
developer --> Name of the developer of the game
publisher --> Name of the publisher of the game
tags --> Popular user-defined tags for the game
overview --> Overview of the game provided by the publisher.
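Since both column lists share the title field, the review rows can be enriched with the game overview fields by joining on title. The sketch below assumes both files have already been read into lists of dictionaries (for example with `csv.DictReader`); the function name is hypothetical, and the assumption that title values match exactly between the two files is not confirmed by the source.

```python
def join_reviews_with_overview(review_rows, overview_rows):
    """Attach developer, publisher, tags, and overview to each review by title.

    Reviews whose title has no match get empty strings for the added fields.
    """
    overview_by_title = {row["title"]: row for row in overview_rows}
    joined = []
    for review in review_rows:
        game = overview_by_title.get(review["title"], {})
        merged = dict(review)  # copy so the input rows are left untouched
        for key in ("developer", "publisher", "tags", "overview"):
            merged[key] = game.get(key, "")
        joined.append(merged)
    return joined
```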
The data was collected from the Analytics Vidhya JanataHack: NLP Hackathon.
This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.
Metadata includes
product IDs
bounding boxes
Basic Statistics:
Scenes: 47,739
Products: 38,111
Scene-Product Pairs: 93,274
We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing, including over 250k musical scores in MusicXML format. PDMX is the largest publicly available, copyright-free MusicXML dataset in existence. PDMX includes genre, tag, description, and popularity metadata for every file.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset for event encoded analog EEG signals for detection of Epileptic seizures
This dataset contains events that are encoded from the analog signals recorded during pre-surgical evaluations of patients at the Sleep-Wake-Epilepsy-Center (SWEC) of the University Department of Neurology at the Inselspital Bern. The analog signals are sourced from the SWEC-ETHZ iEEG Database
This database contains event streams for 10 seizures recorded from 5 patients and generated by the DYnamic Neuromorphic Asynchronous Processor (DYNAP-SE2) to demonstrate a proof-of-concept of encoding seizures with network synchronization. The pipeline consists of two parts: (I) an Analog Front End (AFE) and (II) an SNN termed the "Non-Local Non-Global" (NLNG) network.
In the first part of the pipeline, the digitally recorded signals from SWEC-ETHZ iEEG Database are converted to analog signals via an 18-bit Digital-to-Analog converter (DAC) and then amplified and encoded into events by an Asynchronous Delta Modulator (ADM). Then in the second part, the encoded event streams are fed into the SNN that extracts the features of the epileptic seizure by extracting the partial synchronous patterns intrinsic to the seizure dynamics.
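The general idea of delta-modulation encoding can be illustrated in software: an event is emitted whenever the signal has moved by at least a threshold away from the value at the last event. The sketch below is an assumption-laden illustration of that principle only. It is not the DYNAP-SE2 hardware ADM, and the function name, the separate UP/DOWN outputs, and the threshold parameter are all hypothetical.

```python
def delta_modulate(samples, threshold):
    """Encode a sampled signal into UP and DOWN events.

    Returns two lists of booleans, one entry per sample. An UP event marks a
    rise of at least `threshold` since the last event; a DOWN event marks an
    equivalent fall. The reference level resets to the sample at each event.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    up_events, down_events = [], []
    reference = None
    for value in samples:
        if reference is None:
            reference = value  # the first sample only sets the reference
            up, down = False, False
        elif value - reference >= threshold:
            up, down = True, False
            reference = value
        elif reference - value >= threshold:
            up, down = False, True
            reference = value
        else:
            up, down = False, False
        up_events.append(up)
        down_events.append(down)
    return up_events, down_events
```

The dataset files expose a single boolean ADMspikes column, so how UP and DOWN events map onto it is not something this sketch can settle.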
Details about the neuromorphic processing pipeline and the encoding process are included in a manuscript under review. The preprint is available in bioRxiv
Installation
The installation requires Python>=3.x and the conda (or py-venv) package. Users can then install the requirements inside a conda environment using
conda env create -f requirements.txt -n sez
Once created the conda environment can be activated with conda activate sez
The main files in the database are described in the hierarchy below.
EventSezDataset/
├─ data/
│ ├─ PxSx/
│ │ ├─ Patx_Sz_x_CHx.csv
├─ LSVM_Params/
│ ├─ opt_svm_params/
│ ├─ pat_x_features_SYNCH/
├─ fig_gen.py
├─ sync_mat_gen.py
├─ SeizDetection_FR.py
├─ SeizDetection_SYNCH.py
├─ support.py
├─ run.sh
├─ requirements.txt
where x stands for the patient ID, the seizure ID, and the channel number, respectively.
requirements.txt: This file lists the requirements for the execution of the Python code.
fig_gen.py: This file plots the analog signals and the associated AFE and NLNG event streams. The code is run with `python fig_gen.py 1 1 13`, which plots patient 1, seizure 1, and channel 13 of the recording.
sync_mat_gen.py: This file describes the function for plotting the synchronization matrices emerging from the ADM and the NLNG spikes with either a linear or a log colorbar. The code is run with `python sync_mat_gen.py 1 1` or `python sync_mat_gen.py 1 1 log`. This generates four figures, for the pre-seizure, first half of seizure, second half of seizure, and post-seizure time periods, for patient 1 and seizure 1. The third option can either be left blank or given as lin or log, for the respective color bar scale. The time is the signal time as mentioned in the table below.
run.sh: A simple Linux script to run the above code for all patients and seizures.
SeizDetection_FR.py: This file runs the LSVM on the ADM and NLNG spikes, using the firing rate (FR) as a feature. The code is currently set up with plotting with pre-computed features (in the LSVM_Params/opt_svm_params/ folder). Users can use the code for training the LSVM with different parameters as well.
SeizDetection_SYNCH.py: This file runs the LSVM on the kernelized ADM and NLNG spikes, using the flattened SYNC matrices as a feature. The code is currently set up with plotting with pre-computed features (in the LSVM_Params/pat_x_features_SYNCH/ folder). Users can use the code for training the LSVM with different parameters as well.
LSVM_Params: Folder containing LSVM features with different parameter combinations.
support.py: This file contains the necessary functions.
data/P1S1/: This folder, for example, contains the event streams for all channels for seizure 1 of patient 1.
Pat1_Sz_1_CH1.csv: This file contains the spikes of the AFE and the NLNG layers in the following tabular format (which can be extracted by fig_gen.py):
SYS_time: The time from the interface FPGA.
signal_time: The time of the signal as per the SWEC-ETHZ Database.
dac_value: The value of the analog signals as recorded in the SWEC-ETHZ Database.
ADMspikes: The event stream that is the output of the AFE, in boolean format. True represents a spike.
NLNGspikes: The spike stream that is the output of the SNN, in boolean format. True represents a spike.
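Given the five columns above, a per-channel file can be summarized by counting spikes in each boolean column. The sketch below rests on assumptions the source does not confirm: that the file is comma-delimited, that it has a header row with exactly these column names, and that booleans are stored as the text True/False. The function names are hypothetical.

```python
import csv
import io


def parse_bool(text):
    """Interpret 'True'/'1' (case-insensitive) as a spike; anything else as no spike."""
    return text.strip().lower() in ("true", "1")


def count_spikes(csv_file):
    """Return (row_count, counts) where counts maps each spike column to its total.

    `csv_file` is any open text file object with a header row.
    """
    reader = csv.DictReader(csv_file)
    counts = {"ADMspikes": 0, "NLNGspikes": 0}
    row_count = 0
    for row in reader:
        row_count += 1
        for column in counts:
            if parse_bool(row[column]):
                counts[column] += 1
    return row_count, counts


if __name__ == "__main__":
    # Made-up rows for illustration only; not taken from the dataset.
    sample = io.StringIO(
        "SYS_time,signal_time,dac_value,ADMspikes,NLNGspikes\n"
        "0,0.0,10,False,False\n"
        "1,0.1,12,True,False\n"
    )
    print(count_spikes(sample))
```

For a real file, the same function could be called inside `with open(path, newline="") as f:`, provided the format assumptions hold.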
The experimental results and CT scans obtained during a steam-flooding experiment with the SUPRI 3-D steam injection laboratory model are compared with the results obtained from a numerical simulator for the same experiment. Simulation studies were carried out using the STARS (Steam and Additives Reservoir Simulator) compositional simulator. The saturation and temperature distributions obtained and heat loss rates measured in the experimental model at different stages of steam-flooding were compared with those calculated from the numerical simulator. There is a fairly good agreement between the experimental results and the simulator output. However, the experimental scans show a greater degree of gravity override than that obtained with the simulator for the same heat-loss rates. Symmetric sides of the experimental 5-spot show asymmetric heat-loss rates contrary to theory and simulator results. Some utility programs have been written for extracting, processing and outputting the required grid data from the STARS simulator. These are general in nature and can be useful for other STARS users.