Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
Information on more than 110,000 games published on Steam. Maintained by Fronkon Games. This dataset has been created with this code (MIT) and uses the API provided by Steam, the largest gaming platform on PC. Data is also collected from Steam Spy. Only published games are included; no DLCs, episodes, music, videos, etc. Here is a simple example of how to parse the JSON information:
import os
import json
dataset = {}
if… See the full description on the dataset page: https://huggingface.co/datasets/FronkonGames/steam-games-dataset.
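As a complement to the truncated snippet above, here is a minimal sketch (assuming the dataset has been downloaded as a single JSON file, hypothetically named games.json and keyed by Steam app ID; check the dataset page for the actual layout) of loading it and printing a few titles:

import json

# Hypothetical local copy of the dataset: one JSON object keyed by app ID.
with open("games.json", encoding="utf-8") as f:
    dataset = json.load(f)

# Print the first few entries; the exact field names should be verified
# against the dataset page.
for app_id, game in list(dataset.items())[:5]:
    print(app_id, game.get("name"))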
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides up-to-date information on the sales performance and popularity of various video games worldwide. The data includes the name, platform, year of release, genre, publisher, and sales in North America, Europe, Japan, and other regions. It also features scores and ratings from both critics and users, including average critic score, number of critics who reviewed, average user score, number of users who reviewed, developer, and rating. This comprehensive and essential dataset offers valuable insights into the global video game market and is a must-have tool for gamers, industry professionals, and market researchers.
Column Name | Description |
---|---|
Name | The name of the video game. |
Platform | The platform on which the game was released, such as PlayStation, Xbox, Nintendo, etc. |
Year of Release | The year in which the game was released. |
Genre | The genre of the video game, such as action, adventure, sports, etc. |
Publisher | The company responsible for publishing the game. |
NA Sales | The sales of the game in North America. |
EU Sales | The sales of the game in Europe. |
JP Sales | The sales of the game in Japan. |
Other Sales | The sales of the game in other regions. |
Global Sales | The total sales of the game across the world. |
Critic Score | The average score given to the game by professional critics. |
Critic Count | The number of critics who reviewed the game. |
User Score | The average score given to the game by users. |
User Count | The number of users who reviewed the game. |
Developer | The company responsible for developing the game. |
Rating | The rating assigned to the game by organizations such as the ESRB or PEGI. |
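As an illustration, a minimal pandas sketch (assuming the table above has been exported as a CSV file, hypothetically named video_game_sales.csv, with the column names listed) that aggregates global sales by genre:

import pandas as pd

# Hypothetical CSV export of the table described above.
sales = pd.read_csv("video_game_sales.csv")

# Total global sales per genre, highest first (units as reported by the source).
by_genre = sales.groupby("Genre")["Global Sales"].sum().sort_values(ascending=False)
print(by_genre.head(10))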
- Market Analysis: The video game sales data can be used to analyze market trends and identify popular genres, platforms, and publishers. This can be useful for industry professionals to make informed decisions about game development and marketing strategies.
- Sales Forecasting: The sales data can be used to forecast future trends and predict the success of upcoming games.
- Consumer Insights: The data can be analyzed to gain insights into consumer preferences and buying habits, which can be used to tailor marketing strategies and improve customer satisfaction.
- Comparison of Competitors: The data can be used to compare the sales performance of competing video games and identify market leaders.
- Gaming Industry Performance: The data can be used to evaluate the overall performance of the gaming industry and track its growth over time.
- Gaming Popularity by Region: The data can be analyzed to determine which regions are the largest markets for video games and which genres are most popular in each region.
- Impact of Reviews: The data can be used to study the impact of critic and user reviews on sales and the relationship between scores and sales performance.
- Gaming Trends over Time: The data can be used to identify trends in the gaming industry over time and to track the evolution of the market.
- Gaming Demographics: The data can be used to analyze the demographic makeup of the gaming audience, including age, gender, and income.
- Impact of Gaming Industry on the Economy: The data can be used to evaluate the impact of the gaming industry on the economy and to assess its contribution to job creation and economic growth.
If this dataset was used in your work or studies, please credit the original source.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Version 4 of the dataset is available (Sep 19 2019)!
Note this version has significantly more data than Version 2.
Dataset description paper (full version) is available!
https://arxiv.org/pdf/1903.06754.pdf (updated Sep 7 2019)
Tools for visualizing the data are available!
https://github.com/corgiTrax/Gaze-Data-Processor
=========================== Dataset Description ===========================
We provide a large-scale, high-quality dataset of human actions with simultaneously recorded eye movements while humans play Atari video games. The dataset consists of 117 hours of gameplay data from a diverse set of 20 games, with 8 million action demonstrations and 328 million gaze samples. We introduce a novel form of gameplay, in which the human plays in a semi-frame-by-frame manner. This leads to near-optimal game decisions and game scores that are comparable to or better than known human records. For every game frame, its corresponding image frame, the human keystroke action, the reaction time to make that action, the gaze positions, and the immediate reward returned by the environment were recorded.
Q & A: Why frame-by-frame game mode?
Resolving state-action mismatch: Closed-loop human visuomotor reaction time is around 250-300 milliseconds. Therefore, during gameplay, the state (image) and action that are simultaneously recorded at time step t could be mismatched: the action at time t could be intended for a state 250-300 ms earlier. This effect causes a serious issue for supervised learning algorithms, since the label a_t and the input s_t are no longer matched. Frame-by-frame gameplay ensures states and actions are matched at every timestep.
Maximizing human performance: Frame-by-frame mode makes gameplay more relaxing and reduces fatigue, which would otherwise cause blinking and corrupt the eye-tracking data. More importantly, this design reduces sub-optimal decisions caused by inattentional blindness.
Highlighting critical states that require multiple eye movements: Human decision time and all eye movements were recorded at every frame. States that could lead to a large reward or penalty, or that require sophisticated planning, will take longer and require multiple eye movements for the player to make a decision. Stopping gameplay means that the observer can use eye movements to resolve complex situations. This is important because if an algorithm is going to learn from eye movements, the dataset must contain all “relevant” eye movements.
============================ Readme ============================
GameName: String. Game name. e.g., “alien” indicates the trial is collected for game Alien (15 min time limit). “alien_highscore” is the trajectory collected from the best player’s highest score (2 hour limit). See dataset description paper for details.
trial_id: Integer. One can use this number to locate the associated .tar.bz2 file and label file.
subject_id: Char. Human subject identifiers.
load_trial: Integer. 0 indicates that the game starts from scratch. If this field is non-zero, it means that the current trial continues from a saved trial. The number indicates the trial number to look for.
highest_score: Integer. The highest game score obtained from this trial.
total_frame: Number of image frames in the .tar.bz2 repository.
total_game_play_time: Integer. Game time in ms.
total_episode: Integer. Number of episodes in the current trial. An episode terminates when all lives are consumed.
avg_error: Float. Average eye-tracking validation error at the end of each trial in visual degree (1 visual degree = 1.44 cm in our experiment). See our paper for the calibration/validation process.
max_error: Float. Max eye-tracking validation error.
low_sample_rate: Percentage. Percentage of frames with less than 10 gaze samples. The most common reason for this is blinking.
frame_averaging: Boolean. The game engine allows one to turn this on or off. When turned on (TRUE), two consecutive frames are averaged; this alleviates screen flickering in some games.
fps: Integer. Frames per second when an action key is held down.
*.tar.bz2 files: contain game image frames. The filename indicates the trial number.
*.txt files: label file for each trial, including:
frame_id: String. The ID of a frame, can be used to locate the corresponding image frame in .tar.bz2 file.
episode_id: Integer (not available for some trials). Episode number, starting from 0 for each trial. A trial could contain a single episode or multiple episodes.
score: Integer (not available for some trials). Current game score for that frame.
duration(ms): Integer. Time elapsed until the human player made a decision.
unclipped_reward: Integer. Immediate reward returned by the game engine.
action: Integer. See action_enums.txt for the mapping. This is consistent with the Arcade Learning Environment setup.
gaze_positions: Null/A list of integers: x0,y0,x1,y1,...,xn,yn. Gaze positions for the current frame. Could be null if no gaze. (0,0) is the top-left corner. x: horizontal axis. y: vertical.
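As an illustration, a minimal sketch (assuming the label .txt files are comma-separated with the fields in the order listed above and a possible header line; the file name trial_labels.txt is hypothetical, and the exact layout should be checked against the dataset description paper) of reading one trial's labels:

import csv

frames = []
with open("trial_labels.txt") as f:  # hypothetical label file name
    for row in csv.reader(f):
        if not row or row[0] == "frame_id":  # skip blank lines and a header, if present
            continue
        frame_id, episode_id, score, duration_ms, unclipped_reward, action = row[:6]
        gaze = row[6:]  # x0,y0,x1,y1,... or "null" when no gaze was recorded
        frames.append((frame_id, action, gaze))

print(f"{len(frames)} frames loaded")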
============================ Citation ============================
If you use the Atari-HEAD in your research, we ask that you please cite the following:
@misc{zhang2019atarihead,
title={Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset},
author={Ruohan Zhang and Calen Walshe and Zhuode Liu and Lin Guan and Karl S. Muller and Jake A. Whritner and Luxin Zhang and Mary M. Hayhoe and Dana H. Ballard},
year={2019},
eprint={1903.06754},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Zhang, Ruohan, Zhuode Liu, Luxin Zhang, Jake A. Whritner, Karl S. Muller, Mary M. Hayhoe, and Dana H. Ballard. "AGIL: Learning attention from human for visuomotor tasks." In Proceedings of the European Conference on Computer Vision (ECCV), pp. 663-679. 2018.
@inproceedings{zhang2018agil,
title={AGIL: Learning attention from human for visuomotor tasks},
author={Zhang, Ruohan and Liu, Zhuode and Zhang, Luxin and Whritner, Jake A and Muller, Karl S and Hayhoe, Mary M and Ballard, Dana H},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={663--679},
year={2018}
}
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Introduction
This dataset consists of human preferences over different trajectories in a game that can be framed as a Markov decision process. The game is grid-based, and in it, a car must move to a goal while avoiding obstacles and minimizing costs (e.g., by minimizing gas costs or collecting coins). Trajectories in this game consist of sequences of states and actions. The dataset collects human preferences over segments of these trajectories. For example, do humans prefer that the car drives out of its way to collect a coin, or that it drives directly to the goal? The data was collected to study how to learn a reward function from human subject preferences for use with reinforcement learning (RL). RL is a powerful tool that allows robots and other software agents to learn new behaviors through trial and error. Recent advancements in RL have significantly improved its effectiveness, making it increasingly applicable to real-world robotics challenges such as quadrupedal locomotion and autonomous driving. To increase the utility and alignment of RL agents, we study how to learn a reward function from human preferences between pairs of trajectory segments using this game. We provide the game code, which we created for this study, in our corresponding codebase. Also included is a data report file, entitled A_data_report.pdf, containing a detailed account of how the dataset was obtained and its content.
Data Collection
Subjects were shown various pairs of behaviors (i.e., player trajectories) in this game and asked to label which one they preferred. The game is designed such that the objective of the game is easy to understand, but identifying optimal behavior is difficult for the players. This serves as a non-trivial test bed for various preference learning algorithms, where a good reward function learned from preferences must correctly balance various reward features.
Game Design
We designed a simple grid-world style game to show subjects when eliciting preferences. The game consists of a grid of cells, each of a specific road surface type. The player can move one cell in one of the four cardinal directions, and the player's goal is to maximize the sum of rewards. The game can terminate either at the destination for +50 reward or in failure at a sheep for −50 reward. Cells contain other items which either result in a positive or negative reward, and the player is penalized -1 for every move they make. The implementation of this game is in our accompanying codebase. We chose one instantiation of this game for gathering our dataset of human preferences. This specific instantiation has a 10 × 10 grid. From every state, the highest return possible involves reaching the goal, rather than hitting a sheep or perpetually avoiding termination. Figure 1 shows this task.
Human Subjects
143 subjects were recruited via Amazon Mechanical Turk. We filtered workers based on task comprehension (see data report for more details) and required that all workers were located in the United States, had an approval rating of at least 99%, and completed at least 100 other MTurk HITs. The resulting dataset comprises data collected from 50 subjects. This filtered data consists of 1812 preferences over 1245 unique segment pairs. This data collection was IRB-approved.
Dataset Organization and Contents
The full dataset is organized in two directories.
The directory deliver_mdp contains all collected human preferences, as well as the corresponding game that subjects were shown and all additional data needed to learn a reward function from these preferences. The directory entitled random_mdps contains 200 additional game instantiations, as well as synthetically generated preferences for each of these games. For further information on the specific files and what they contain, refer to the data report located in this repository.
Results Summary
A preference model is a mathematical representation of a person's preferences over different trajectory segments. Preferences are expressed in terms of pairwise comparisons between segments, where the preference model takes in two segments and outputs the probability that a human would prefer one segment to the other. Given a preference model and a dataset of preferences generated by this model, one can then learn a reward function for an RL task. This dataset was used to evaluate two different preference models: the ubiquitously assumed partial return model and our proposed regret model. Our corresponding paper shows that the regret model is a better predictor of human preferences than the partial return model, and that it results in a learned reward function that, when optimized over, induces more performant behavior under the game's true reward function.
Code
We provide scripts for reproducing the experiments, which include learning and evaluating reward functions from the provided preference datasets. The code accompanying this dataset can be found here. This dataset contains a script...
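For reference, here is a minimal sketch (not the authors' code; the segment returns and the rationality coefficient are illustrative) of the widely assumed partial return preference model mentioned above, which turns the difference in summed reward of two segments into a preference probability:

import math

def partial_return_preference(return_a, return_b, rationality=1.0):
    # Probability that a human prefers segment A over segment B under the
    # partial return (Bradley-Terry style) model: a logistic function of the
    # difference between the segments' summed rewards.
    return 1.0 / (1.0 + math.exp(-rationality * (return_a - return_b)))

# Example: segment A detours to collect a coin (+1) at the cost of two -1 move
# penalties; segment B heads straight for the goal with three -1 move penalties.
print(partial_return_preference(return_a=-1.0, return_b=-3.0))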
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Con Espressione Game Dataset
A piece of music can be expressively performed, or interpreted, in a variety of ways. With the help of an online questionnaire, the Con Espressione Game, we collected some 1,500 descriptions of expressive character relating to 45 performances of 9 excerpts from classical piano pieces, played by different famous pianists. More specifically, listeners were asked to describe, using freely chosen words (preferably: adjectives), how they perceive the expressive character of the different performances. The aim of this research is to find the dimensions of musical expression (in Western classical piano music) that can be attributed to a performance, as perceived and described in natural language by listeners.
The Con Espressione Game was launched on the 3rd of April 2018.
Dataset structure
Listeners’ Descriptions of Expressive performance
piece_performer_data.csv: A comma-separated (CSV) file containing information about the pieces in the dataset. Strings are delimited with double quotes ("). The columns in this file are:
music_id: An integer ID for each performance in the dataset.
performer_name: (Last) name of the performer.
piece_name: (Short) name of the piece.
performance_name: Name of the performance. All files in different modalities (alignments, MIDI, loudness features, etc.) corresponding to a single performance will have the same name (but possibly different extensions).
composer: Name of the composer of the piece.
piece: Full name of the piece.
album: Name of the album.
performer_name_full: Full name of the performer.
year_of_CD_issue: Year of the issue of the CD.
track_number: Number of the track in the CD.
length_of_excerpt_seconds: Length of the excerpt in seconds.
start_of_excerpt_seconds: Start of the excerpt in its corresponding track (in seconds).
end_of_excerpt_seconds: End of the excerpt in its corresponding track (in seconds).
con_espressione_game_answers.csv: This is the main file of the dataset, which contains listeners’ descriptions of expressive character. This CSV file contains the following columns:
answer_id: An integer representing the ID of the answer. Each answer gets a unique ID.
participant_id: An integer representing the ID of a participant. Answers with the same ID come from the same participant.
music_id: An integer representing the ID of the performance. This is the same as the music_id in piece_performer_data.csv described above.
answer: (cleaned/formatted) participant description. All answers have been lower-cased, typos were corrected, spaces were replaced by underscores (_), and individual terms are separated by commas. See cleanup_rules.txt for a more detailed description of how the answers were formatted.
original_answer: Raw answers provided by the participants.
timestamp: Timestamp of the answer.
favorite: A boolean (0 or 1) indicating if this performance of the piece is the participant’s favorite.
translated_to_english: Raw translation (from German, Russian, Spanish and Italian).
performer: (Last) name of the performer. See piece_performer_data.csv described above.
piece_name: (Short) name of the piece. See piece_performer_data.csv described above.
performance_name: Name of the performance. See piece_performer_data.csv described above.
participant_profiles.csv: A CSV file containing musical background information of the participants. Empty cells mean that the participant did not provide an answer. This file contains the following columns:
participant_id: An integer representing the ID of a participant.
music_education_years: (Self-reported) number of years of musical education of the participant.
listening_to_classical_music: Answers to the question “How often do you listen to classical music?”. The possible answers are:
1: Never
2: Very rarely
3: Rarely
4: Occasionally
5: Frequently
6: Very frequently
registration_date: Date and time of registration of the participant.
playing_piano: Answer to the question “Do you play the piano?”. The possible answers are:
1: No
2: A little bit
3: Quite well
4: Very well
cleanup_rules.txt: Rules for cleaning/formatting the terms in the participant’s answers.
translations_GERMAN.txt: How the translations from German to English were made.
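A minimal pandas sketch (using the file and column names documented above, and assuming default comma-separated parsing) of joining the listeners' answers with the piece/performer information and the participant profiles:

import pandas as pd

answers = pd.read_csv("con_espressione_game_answers.csv")
pieces = pd.read_csv("piece_performer_data.csv")
profiles = pd.read_csv("participant_profiles.csv")

# Attach piece/performer metadata via music_id and the participant's musical
# background via participant_id; overlapping column names from the piece table
# get a "_piece" suffix.
merged = (answers
          .merge(pieces, on="music_id", how="left", suffixes=("", "_piece"))
          .merge(profiles, on="participant_id", how="left"))

print(merged[["piece_name", "performer_name", "answer"]].head())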
Metadata
Related meta data is stored in the MetaData folder.
Alignments. This folder contains the manually corrected score-to-performance alignments for each of the pieces in the dataset. Each of these alignments is a text file.
ApproximateMIDI. This folder contains reconstructed MIDI performances created from the alignments and the loudness curves. The onset and offset times of the notes were determined from the alignment times, and the MIDI velocity was computed from the loudness curves.
Match. This folder contains score-to-performance alignments in Matchfile format.
Scores_MuseScore. Manually encoded sheet music in MuseScore format (.mscz)
Scores_MusicXML. Sheet music in MusicXML format.
Scores_pdf. Images of the sheet music in pdf format.
Audio Features
Audio features computed from the audio files. These features are located in the AudioFeatures folder.
Loudness: Text files containing loudness curves in dB of the audio files. These curves were computed using code provided by Olivier Lartillot. Each of these files contains the following columns:
performance_time_(seconds): Performance time in seconds.
loudness_(db): Loudness curve in dB.
smooth_loudness_(db): Smoothed loudness curve.
Spectrograms. Numpy files (.npy) containing magnitude spectrograms (as Numpy arrays). The shape of each array is (149 frequency bands, number of frames of the performance). The spectrograms were computed from the audio files with the following parameters:
Sample rate (sr): 22050 samples per second
Window length: 2048
Frames per Second (fps): 31.3 fps
Hop size: sample_rate // fps = 704
Filterbank: log scaled filterbank with 24 bands per octave and min frequency 20 Hz
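A minimal sketch (the file name performance.npy is hypothetical) of loading one spectrogram and relating its shape to the parameters above:

import numpy as np

spec = np.load("performance.npy")  # hypothetical file from the Spectrograms folder
n_bands, n_frames = spec.shape     # expected: (149, number of frames of the performance)

fps = 31.3                          # frames per second, as documented above
hop_size = int(22050 // fps)        # = 704 samples, matching the documented hop size
duration_seconds = n_frames / fps

print(n_bands, n_frames, hop_size, round(duration_seconds, 2))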
MIDI Performances
Since the dataset consists of commercial recordings, we cannot include the audio files in the dataset. We can, however, share the 2 synthesized MIDI performances used in the Con Espressione game (for Bach’s Prelude in C and the second movement of Mozart’s Sonata in C K 545) in mp3 format. These performances can be found in the MIDIPerformances folder.
ViZDoom is an AI research platform based on the classic first-person shooter game Doom. The most popular game mode is probably the so-called Death Match, where several players join in a maze and fight against each other. After a fixed time, the match ends and all the players are ranked by their FRAG scores, defined as kills minus suicides. During the game, each player can access various observations, including the first-person view screen pixels, the corresponding depth map and segmentation map (pixel-wise object labels), the bird's-eye view maze map, etc. The valid actions include almost all the keyboard strokes and mouse controls a human player can take, covering moving, turning, jumping, shooting, changing weapons, etc. ViZDoom can run a game either synchronously or asynchronously, indicating whether the game core waits until all players' actions are collected or runs at a constant frame rate without waiting.
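A minimal synchronous-mode sketch (assuming the vizdoom Python package and a scenario configuration file, here a hypothetical path scenario.cfg defining three buttons) of stepping the game and reading the first-person observation:

import vizdoom as vzd

game = vzd.DoomGame()
game.load_config("scenario.cfg")   # hypothetical path to a scenario config
game.set_window_visible(False)
game.init()

game.new_episode()
while not game.is_episode_finished():
    state = game.get_state()            # screen buffer, plus depth/labels if enabled
    pixels = state.screen_buffer
    # One-hot action over the buttons defined in the config; here: press the first button.
    reward = game.make_action([1, 0, 0])

print("Episode finished, total reward:", game.get_total_reward())
game.close()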
RL Unplugged is a suite of benchmarks for offline reinforcement learning. RL Unplugged is designed around the following considerations: to facilitate ease of use, we provide the datasets with a unified API which makes it easy for the practitioner to work with all data in the suite once a general pipeline has been established.
The datasets follow the RLDS format to represent steps and episodes.
We are releasing a large and diverse dataset of gameplay following the protocol described by Agarwal et al. (2020), which can be used to evaluate several discrete offline RL algorithms. The dataset is generated by running an online DQN agent and recording transitions from its replay during training with sticky actions (Machado et al., 2018). As stated in Agarwal et al. (2020), for each game we use data from five runs with 50 million transitions each. We release datasets for 46 Atari games. For details on how the dataset was generated, please refer to the paper. Please see this note about the ROM versions used to generate the datasets.
Atari is a standard RL benchmark. We recommend trying offline RL methods on Atari if you are interested in comparing your approach to other state-of-the-art offline RL methods with discrete actions.
The reward of each step is clipped to [-1, 1], and each episode includes the sum of the clipped rewards over that episode.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('rlu_atari', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is designed to aid the development of object detection models in the sport of lacrosse. The task involves identifying various elements within the game, helping systems understand and analyze gameplay accurately. The classes in the dataset are as follows:
Players actively engaged in the game, wearing typical sports attire and often helmets, and usually observed holding a lacrosse stick.
Annotate individuals visibly participating in the game with a focus on players holding any type of stick. Exclude referees and goalies, even if they carry sticks. Draw the bounding box covering the entire body, including any stick held.
The player stationed in or around the goal area, typically wearing additional protective gear and often distinguished by a helmet with a throat guard.
Identify and mark the player in or near the goal area. Include their protective gear in the annotation. Ensure the bounding box includes the entirety of their body and equipment. Distinguish from regular players by proximity to the goal and additional gear.
A lacrosse stick used defensively, noticeably longer than an offensive stick, gripped by defensive players.
Focus on the longer sticks carried by defensive players. Annotate the length of the longpole from the player's hands to its end. Avoid labeling sticks of regular length carried by attacking players.
Officials tasked with overseeing the game, often distinguished by their unique uniform typically featuring black and white stripes.
Identify individuals in referee uniforms, ignoring the players' attire and positions. Essential to annotate only individuals clearly identifiable as game officials through their distinctive clothing and their position relative to the players.
The standard lacrosse stick used by most players, shorter in length compared to longpoles.
Label standard-length sticks grasped by players, ensuring the annotation fits the length from the player's grip to the stick's end. Distinguish from longpoles by the stick's relatively shorter length.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The UpStory dataset is an anonymized child-child interaction dataset, with an experimental manipulation for the level of rapport. It contains data pertaining to pairs of classmates (ages 8-10) playing a storytelling game in a naturalistic setting; pairs are selected to either promote close and friendly interactions (high-rapport condition), or promote distant interactions between acquaintances (low-rapport condition). Due to the experimental design, most children participated in two pairs: one high-rapport and one low-rapport.
A copy of this text is included in the ZIP file.
The dataset contains data for 35 pairs. Each pair is given an ID starting with P (high-rapport condition) or N (low-rapport condition), followed by the academic year (2 or 3), and 2 additional digits. E.g.: N251 is a low-rapport pair from year 2; P318 is a high-rapport pair from year 3. Similarly, each child is given a 2-digit ID. E.g.: child 17 participated in pairs P245 and N255.
Each pair played between 1 and 5 rounds of the game. Each round is provided as an individual sample, with its own associated time series as CSV files. In total, 106 rounds are provided.
The top-level CSV file child-info.csv offers child-level information, including the following items:
- child_id: the child's ID (a random 2-digit unique identifier).
- gender: boy or girl.
- year: academic year the child belonged to (2 or 3).
- age: the child's age in years at the beginning of the data collection effort. Either an exact value (9 or 10), or a range (8-9).
The top-level CSV file pair-info.csv offers pair-level information, including the following items:
- pair_id: the pair ID, as described above.
- condition: the experimental condition this pair belonged to (low_rapport or high_rapport).
- distance: the distance between the two participants in their year's friendship network (integer in range 2 <= n <= 56 for Year 2 pairs, and 2 <= n <= 20 for Year 3 pairs).
- year: academic year the children belonged to (2 or 3).
- rounds: number of game rounds the pair played (1 <= n <= 5).
- child_1: first child in the pair (lower ID; 2 digits).
- child_2: second child in the pair (higher ID; 2 digits).
The dataset contains time-series data extracted from two different video sources, each one overviewing the play area from one side: the left-camera and right-camera. Each video source has its own top-level folder, with data extracted from that source inside it.
In each source folder, you will find CSV files named by source, pair ID, round number, and data type (e.g., left-camera-N249-round-1-face.csv). There is a separate file for each round of the game; each pair typically played ~3 rounds (min: 1, max: 5). As the names suggest, face files contain information related to head pose and facial expression, while pose files contain information related to full body pose.
Face data was extracted with OpenFace, and contains most information that is produced by the tool. See the OpenFace documentation for more details. Time series are given at 25Hz; entries are indexed by frame (0-indexed) and child_id. Included data: confidence and success indicators.
Pose data was extracted with OpenPose. Time series are given at 25Hz; entries are indexed by frame (0-indexed), child_id, and joint (named body part that the row refers to). Data provided per row:
- x: horizontal position in the frame, in pixels, left-to-right (float; range 0-width).
- y: vertical position in the frame, in pixels, top-to-bottom (float; range 0-height).
- confidence: OpenPose's reported prediction confidence (float; range 0-1).
http://opendatacommons.org/licenses/dbcl/1.0/
Blackjack, also known as 21, is one of the most popular card games worldwide. It remains a favourite due to its mix of simplicity, luck, strategy, and fast-paced gameplay, making it a staple in casinos.
The casino typically has a small edge due to rules favouring the dealer (e.g., the player acts first, so they can bust before the dealer plays). Basic strategy can minimise the house edge:
- Strategy charts show the optimal play based on the player's hand and the dealer's up card.
- Advanced players use card counting to track high-value cards remaining in the deck, gaining an advantage.
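To make the ordering effect concrete, here is a rough Monte Carlo sketch (in Python, separate from the R code linked below; the hit-until-17 policy is a deliberate simplification, not basic strategy) estimating how often a player who draws to 17 busts before the dealer ever acts:

import random

def draw():
    # Infinite-deck approximation: ranks 1-13, with face cards counting as 10.
    return min(random.randint(1, 13), 10)

def hand_total(cards):
    # Count aces (rank 1) as 11 where possible, otherwise as 1.
    total = sum(11 if c == 1 else c for c in cards)
    soft_aces = cards.count(1)
    while total > 21 and soft_aces:
        total -= 10
        soft_aces -= 1
    return total

def player_busts():
    cards = [draw(), draw()]
    while hand_total(cards) < 17:  # simplified hit-until-17 policy
        cards.append(draw())
    return hand_total(cards) > 21

trials = 100_000
busts = sum(player_busts() for _ in range(trials))
print(f"Bust rate when hitting until 17: {busts / trials:.1%}")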
A Markdown document with the R code for the game of Blackjack (link).
The provided R code implements a simplified version of the game Blackjack. It includes f...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game "Quick, Draw!". The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.
Example drawings: https://raw.githubusercontent.com/googlecreativelab/quickdraw-dataset/master/preview.jpg
This physiological data was collected from pilot/copilot pairs in and out of a flight simulator. It was collected to train machine-learning models to aid in the detection of pilot attentive states. The benchmark training set comprises a set of controlled experiments collected in a non-flight environment, outside of a flight simulator. The test set (abbreviated LOFT = Line Oriented Flight Training) consists of a full flight (take off, flight, and landing) in a flight simulator. The pilots experienced distractions intended to induce one of the following three cognitive states:
- Channelized Attention (CA) is the state of being focused on one task to the exclusion of all others. This is induced in benchmarking by having the subjects play an engaging puzzle-based video game.
- Diverted Attention (DA) is the state of having one's attention diverted by actions or thought processes associated with a decision. This is induced by having the subjects perform a display monitoring task. Periodically, a math problem showed up which had to be solved before returning to the monitoring task.
- Startle/Surprise (SS) is induced by having the subjects watch movie clips with jump scares.
For each experiment, a pair of pilots (each with its own crew ID) was recorded over time and subjected to the CA, DA, or SS cognitive states. The training set contains three experiments (one for each state) in which the pilots experienced just one of the states. For example, in the experiment labelled CA, the pilots were either in a baseline state (no event) or the CA state. The test set contains a full flight simulation during which the pilots could experience any of the states (but never more than one at a time). Each sensor operated at a sample rate of 256 Hz. Please note that since this is physiological data from real people, there will be noise and artifacts in the data.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data-based effort documents and categorizes the relevant digital games in gaming history by analyzing two types of documental sources: (A) the most played games (based on reports on sales) and (B) the digital games most acclaimed by specialized critics. (A) The most played digital games were ascertained through analysis of the reports of the Entertainment Software Association (ESA) (ESA, 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2018). Regarding (B), six specialized critics' opinions on the most acclaimed digital games were selected for document analysis. From specialized digital game journalism, the lists of best games on the websites Eurogamer, GamesRadar, IGN, and Polygon were used. Two generalist news media sources were added: the British newspaper The Guardian and the American news magazine Time. The last source selected was Metacritic, a specialized entertainment evaluation and criticism website. (A) is justified considering that ESA is a nonprofit association whose goals are to observe, analyze, and unify the gaming industry, supporting its associates legally and by creating expansion opportunities for them. ESA has several associates such as Electronic Arts, Konami, Microsoft, Bandai Namco Entertainment, Nintendo, Sony Interactive Entertainment, Square Enix, Take-Two Interactive, Ubisoft, Warner Bros, and other companies with a smaller economic impact on the digital games business. Since 2008, ESA has published an annual report with data illustrating the industry's state in the corresponding year, such as its total revenue. With the digital games selected in (B), it is possible to construct a broader sample and bridge the gaps in the ESA reports. The sources chosen in (B) are based on geographical representation covering both digital and printed press: Eurogamer is a British digital game journalism website publishing in-depth analyses and criticism of digital games and gaming culture. Its list of the top 10 games of the generation was published in September 2020 by the editor-in-chief, who invited 19 people from the game business (game developers, critics, and journalists) to submit a list of their five favorite games, of any kind or platform. "GamesRadar", or "GamesRadar+" in its original brand denomination, is an entertainment website dedicated to digital game-related news, previews, and reviews, and is a member of the Independent Press Standards Organisation. In its list of the 100 best games ever, the editorial team (50 specialized journalists) picked the most important digital games, privileging fun and enjoyment for the player over the historical significance of the digital game. "Polygon" is an American website specializing in digital games that publishes reviews, guides, videos, and news on popular culture and entertainment. For their list, the editorial staff asked everyone (including the audience) to vote for the best digital game based on innovation, polish, and durability rather than personal taste. They excluded games released in 2017 (the year before the list was published) to eliminate recency bias, and left out sequels that were too similar to the games that came before them. In addition to gathering votes from the "Polygon" team, the editorial staff worked with a group of external and freelance writers to pull in their input. Collecting all those votes, they combined the data into the final list. "The Guardian" and "TIME" are generalist press companies from the United Kingdom and the United States of America, respectively.
The Guardian's list reflects the vision of its authors (journalists specializing in digital games) on the best 50 video games of the 21st century. TIME's list reports the picks of its tech team, made up of multiple generations of gamers, resulting in a list of 50 digital games. The last documentary source analyzed, "Metacritic", is a specialized critic website on entertainment (digital games, television, cinema, and music). To evaluate media content, "Metacritic" attributes a "Metascore"; any content featured on "Metacritic" gets a "Metascore" once it has collected at least four critics' reviews. The top 100 Metascores from Metacritic's top-ranked digital games were picked from the website. In addition, the digital games were selected by their titles, meaning that repeated games (on different platforms) were disregarded. The list culminated in 393 different digital games. In the database, it is possible to check: the documental sources where each digital game was cited; the year in which each digital game was first published; the series or franchise to which each game belongs; the playability mode of each game; the spatial dimensionality; the number of platforms on which it was launched; the name of the digital game protagonist (if applicable); the name of the main playable character (if applicable); the taxonomy of the playable character; the perspective of the playable character; whether there is a narrative; and the digital game genre.
RL Unplugged is a suite of benchmarks for offline reinforcement learning. RL Unplugged is designed around the following considerations: to facilitate ease of use, we provide the datasets with a unified API which makes it easy for the practitioner to work with all data in the suite once a general pipeline has been established.
The datasets follow the RLDS format to represent steps and episodes.
We are releasing a large and diverse dataset of gameplay following the protocol described by Agarwal et al. (2020), which can be used to evaluate several discrete offline RL algorithms. The dataset is generated by running an online DQN agent and recording transitions from its replay during training with sticky actions (Machado et al., 2018). As stated in Agarwal et al. (2020), for each game we use data from five runs with 50 million transitions each. We release datasets for 46 Atari games. For details on how the dataset was generated, please refer to the paper. Please see this note about the ROM versions used to generate the datasets.
Atari is a standard RL benchmark. We recommend trying offline RL methods on Atari if you are interested in comparing your approach to other state-of-the-art offline RL methods with discrete actions.
The reward of each step is clipped to [-1, 1], and each episode includes the sum of the clipped rewards over that episode.
Each of the configurations is broken into splits. Splits correspond to checkpoints of 1M steps (note that the number of episodes may differ). Checkpoints are ordered in time (so checkpoint 0 ran before checkpoint 1).
Episodes within each split are ordered. Check https://www.tensorflow.org/datasets/determinism if you want to ensure that you read episodes in order.
This dataset corresponds to the one used in the DQN replay paper. https://research.google/tools/datasets/dqn-replay/
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('rlu_atari_checkpoints_ordered', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset 'ds180_Chinook_pnts' is a product of the CalFish Adult Salmonid Abundance Database. Data in this shapefile are collected from point features, such as dams and hatcheries. Some escapement monitoring locations, such as spawning stock surveys, are logically represented by linear features. See the companion linear feature shapefile 'ds181_Chinook_ln' for information collected from stream reaches.
The CalFish Abundance Database contains a comprehensive collection of anadromous fisheries abundance information. Beginning in 1998, the Pacific States Marine Fisheries Commission, the California Department of Fish and Game, and the National Marine Fisheries Service, began a cooperative project aimed at collecting, archiving, and entering into standardized electronic formats, the wealth of information generated by fisheries resource management agencies and tribes throughout California.
The data format provides for sufficient detail to convey the relative accuracy of each population trend index record yet is simple and straight forward enough to be suited for public use. For those interested in more detail the database offers hyperlinks to digital copies of the original documents used to compile the information. In this way the database serves as an information hub directing the user to additional supporting information. This offers utility to field biologists and others interested in obtaining information for more in-depth analysis. Hyperlinks, built into the spatial data attribute tables used in the BIOS and CalFish I-map viewers, open the detailed index data archived in the on-line CalFish database application. The information can also be queried directly from the database via the CalFish Tabular Data Query. Once the detailed annual trend data are in view, another hyperlink opens a digital copy of the document used to compile each record.
During 2010, as a part of the Central Valley Chinook Comprehensive Monitoring Plan, the CalFish Salmonid Abundance Database was reorganized and updated. CalFish provides a central location for sharing Central Valley Chinook salmon escapement estimates and annual monitoring reports to all stakeholders, including the public. Annual Chinook salmon in-river escapement indices that were, in many cases, eight to ten years behind are now current through 2009. In some cases, multiple datasets were consolidated into a single, more comprehensive, dataset to more closely reflect how data are reported in the California Department of Fish and Game standard index, Grandtab.
Extensive data are currently available in the CalFish Abundance Database for California Chinook, coho, and steelhead. Major data categories include adult abundance population estimates, actual fish and/or carcass counts, counts of fish collected at dams, weirs, or traps, and redd counts. Harvest data has also been compiled for many streams.
This CalFish Abundance Database shapefile was generated from fully routed 1:100,000 hydrography. In a few cases streams had to be added to the hydrography dataset in order to provide a means to create shapefiles to represent abundance data associated with them. Streams added were digitized at no more than 1:24,000 scale based on stream line images portrayed in 1:24,000 Digital Raster Graphics (DRG).
The features in this layer represent the location for which abundance data records apply. In many cases there are multiple datasets associated with the same location, and so, features may overlap. Please view the associated datasets for detail regarding specific features. In CalFish these are accessed through the "link" field that is visible when performing an identify or query operation. A URL string is provided with each feature in the downloadable data which can also be used to access the underlying datasets.
The Chinook data that is available from the CalFish website is actually mirrored from the StreamNet website, where the CalFish Abundance Database's tabular data is currently stored. Additional information about StreamNet may be downloaded at http://www.streamnet.org. Complete documentation for the StreamNet database may be accessed at http://www.streamnet.org/online-data/data_develop.html.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Cricket Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/notkrishna/cricket-statistics-for-all-formats on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Cricket is a bat-and-ball game played between two teams of eleven players on a field at the centre of which is a 22-yard (20-metre) pitch with a wicket at each end, each comprising two bails balanced on three stumps. The game proceeds when a player on the fielding team, called the bowler, "bowls" (propels) the ball from one end of the pitch towards the wicket at the other end. The batting side's players score runs by striking the bowled ball with a bat and running between the wickets, while the fielding side tries to prevent this by keeping the ball within the field and getting it to either wicket, and also tries to dismiss each batter (so they are "out"). Means of dismissal include being bowled, when the ball hits the stumps and dislodges the bails, and by the fielding side either catching a hit ball before it touches the ground, or hitting a wicket with the ball before a batter can cross the crease line in front of the wicket to complete a run. When ten batters have been dismissed, the innings ends and the teams swap roles. The game is adjudicated by two umpires, aided by a third umpire and match referee in international matches.
Forms of cricket range from Twenty20, with each team batting for a single innings of 20 overs and the game generally lasting three hours, to Test matches played over five days. Traditionally cricketers play in all-white kit, but in limited overs cricket they wear club or team colours. In addition to the basic kit, some players wear protective gear to prevent injury caused by the ball, which is a hard, solid spheroid made of compressed leather with a slightly raised sewn seam enclosing a cork core layered with tightly wound string.
The earliest reference to cricket is in South East England in the mid-16th century. It spread globally with the expansion of the British Empire, with the first international matches in the second half of the 19th century. The game's governing body is the International Cricket Council (ICC), which has over 100 members, twelve of which are full members who play Test matches. The game's rules, the Laws of Cricket, are maintained by Marylebone Cricket Club (MCC) in London. The sport is followed primarily in South Asia, Australasia, the United Kingdom, southern Africa and the West Indies.[1] Women's cricket, which is organised and played separately, has also achieved international standard. The most successful side playing international cricket is Australia, which has won seven One Day International trophies, including five World Cups, more than any other country and has been the top-rated Test side more than any other country.
Cricket, like any sport, is full of important data and stats. The game is generally played in three different formats: One Day (50 overs for each team to score and bowl), Test (no limit on overs, but played over a maximum of 5 days with each team having two innings to score), and the newest format, Twenty20 (each team has 20 overs to score).
Dataset contains 9 files (3 for each format). Each group of three files contains best stats for batsmen, bowlers and series/tournaments.
Source https://www.espncricinfo.com/
Play with it as you like.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘LEC Regular Season 2021 / LOL’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/jordipompas/lec-regular-season-2021 on 28 January 2022.
--- Dataset description provided by original source is as follows ---
League of Legends is one of the world's most famous video games, played by over 100 million active users every month. It is perhaps the most prominent esports game. Nowadays, its competitions are becoming more and more professional, and little by little new information about the players is appearing.
This dataset includes the statistics of all LEC (League of Legends European Championship) players for the 2021 regular season.
The same variables are found in all 3 datasets:
--- Original source retains full ownership of the source dataset ---
The MagnaTagATune dataset contains 25,863 music clips. Each clip is a 29-second excerpt belonging to one of 5,223 songs, 445 albums and 230 artists. The clips span a broad range of genres like Classical, New Age, Electronica, Rock, Pop, World, Jazz, Blues, Metal, Punk, and more. Each audio clip is supplied with a vector of binary annotations of 188 tags. These annotations are obtained from humans playing the two-player online TagATune game. In this game, the two players are either presented with the same or a different audio clip. Subsequently, they are asked to come up with tags for their specific audio clip. Afterward, players view each other’s tags and are asked to decide whether they were presented the same audio clip. Tags are only assigned when more than two players agreed. The annotations include tags like ’singer’, ’no singer’, ’violin’, ’drums’, ’classical’, ’jazz’. The top 50 most popular tags are typically used for evaluation to ensure that there is enough training data for each tag. There are 16 parts, and researchers commonly use parts 1-12 for training, part 13 for validation and parts 14-16 for testing.
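A minimal sketch (assuming the binary tag annotations have been loaded into a clips-by-tags table; the file name annotations.csv and the clip_id index column are hypothetical) of selecting the top 50 most popular tags used for evaluation:

import pandas as pd

# Hypothetical binary annotation matrix: one row per clip, one 0/1 column per tag.
annotations = pd.read_csv("annotations.csv", index_col="clip_id")

# The 50 tags applied to the most clips, i.e. the usual evaluation subset.
top50 = annotations.sum(axis=0).sort_values(ascending=False).head(50).index
labels = annotations[top50]

print(list(top50)[:10])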
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Sports Training: The "basketball_child" model can be used to track and analyze the movements, shots, and gaming strategies of young basketball players, offering valuable insights to coaches and trainers to improve training programs based on individual performance.
Player Detection: For security or performance monitoring, this model can help detect particular players in a crowded playground by tracking the ball, rim and the player involved.
Augmented Reality Games: It can be incorporated in creating AR-based basketball games for children, providing a more interactive and immersive gaming experience by identifying the virtual ball and rim movements.
Video Analysis: It can be utilized to analyze and index basketball match videos, which could be valuable for scouting or performance review. It could evaluate makes and misses, contributing to statistics generation in real time.
Automated Video Production: In live broadcasts or recorded content of basketball matches, the model can be used for automatic camera control, following the ball, the rim and players across the court to ensure high-quality coverage.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ultimatum (UG) and dictator (DG) games are two tasks where a sum of money has to be divided between two players: a proposer and a receiver. Following the rational choice theory, proposers should offer the minimum in the UG and nothing in the DG, due to the presence/absence of the receivers’ bargaining power. The fact that people generally make non-negligible offers in both games has suggested divergent explicative hypotheses and has generated extensive research to examine exogenous and endogenous factors underlying such decisions. Among the contextual factors affecting the proposers’ offers, the sense of entitlement or of ownership has been shown to reduce offers significantly. A frequent way to induce the sense of entitlement/ownership has been to assign the role of proposer to the player who apparently has better scored in skill tasks executed before the UG or DG or has more contributed, through a previous luck game, to the amount to be shared. Such manipulations, however, could produce a possible overlapping between “ownership” and “merit,” that in this study we aimed to disentangle. We manipulated the participants’ initial endowment through a luck game, by increasing, decreasing or leaving it unchanged, to investigate whether winnings or losses by chance influenced offers in UG and DG in similar or different ways depending on their respective features. All participants played as proposers but this role was apparently random and disconnected from the outcomes of the luck game. Furthermore, we investigated whether the putative effect of experimental manipulation was mediated by the changes in emotions elicited by the luck game and/or by the emotions and beliefs related to decision-making. We used a non-economic version of the games, in which tokens were divided instead of money. In the study, 300 unpaid undergraduates (M = 152) from different degree programs, aged between 18 and 42 years, participated. The results revealed that the effect of outcome manipulation on offers was moderated by the specific structure of the UG and DG. Instead, emotional reactions barely mediated the effect of the experimental manipulation, suggesting that their role in those decisions is less relevant than is assumed in the literature.