Predicting the difficulty of playing a musical score plays a pivotal role in structuring and exploring score collections, with significant implications for music education. The automatic difficulty classification of piano scores, however, remains an unsolved challenge. This is largely due to the scarcity of annotated data and the inherent subjectiveness in the annotation process. The "Can I Play It?" (CIPI) dataset represents a substantial step forward in this domain, providing a machine-readable collection of piano scores paired with difficulty annotations from the esteemed Henle Verlag.
The CIPI dataset is meticulously assembled by aligning public domain scores with their corresponding difficulty labels sourced from Henle Verlag. This initial pairing was subsequently reviewed and refined by an expert pianist to ensure accuracy and reliability. The dataset is structured to facilitate easy access and interpretation, making it a valuable resource for researchers and educators alike.
Our work makes two primary contributions to the field of score difficulty classification. Firstly, we address the critical issue of data scarcity, introducing the CIPI dataset to the academic community. Secondly, we delve into various input representations derived from score information, utilizing pre-trained machine learning models tailored for piano fingering and expressiveness. These models draw inspiration from musicological definitions of performance, offering nuanced insights into score difficulty.
Through extensive experimentation, we demonstrate that an ensemble approach—combining outputs from multiple classifiers—yields superior results compared to individual classifiers. This highlights the diverse facets of difficulty captured by different representations. Our comprehensive experiments lay a robust foundation for future endeavors in score difficulty classification, and our best-performing model reports a balanced accuracy of 39.5% and a median square error of 1.1 across the nine difficulty levels introduced in this study.
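The two reported metrics can be illustrated with a minimal sketch. It assumes that balanced accuracy is the usual mean of per-class recall and that the median square error is the median of squared differences between integer difficulty levels; the paper's exact definitions may differ. The function names and the toy labels below are hypothetical and are not taken from the CIPI code.

```python
from collections import defaultdict
from statistics import median


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true (assumed definition)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def median_square_error(y_true, y_pred):
    """Median of squared differences between true and predicted levels (assumed definition)."""
    return median((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Toy labels for illustration only, not CIPI data.
y_true = [1, 1, 2, 3, 3, 3]
y_pred = [1, 2, 2, 3, 1, 5]

print(balanced_accuracy(y_true, y_pred))    # per-class recalls are 1/2, 1/1 and 1/3
print(median_square_error(y_true, y_pred))  # squared errors are 0, 1, 0, 0, 4, 4
```

Averaging recall per class keeps a frequent difficulty level from dominating the score, which matters when the nine levels are unevenly represented.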
The CIPI dataset, along with the associated code and models, is made publicly available to ensure reproducibility and to encourage further research in this domain. Users are encouraged to reference this resource in their work and to contribute to its ongoing development.
Ramoneda, P., Jeong, D., Eremenko, V., Tamer, N. C., Miron, M., & Serra, X. (2024). Combining Piano Performance Dimensions for Score Difficulty Classification. Expert Systems with Applications, 238, 121776. DOI: 10.1016/j.eswa.2023.121776
@article{Ramoneda2024,
author = {Pedro Ramoneda and Dasaem Jeong and Vsevolod Eremenko and Nazif Can Tamer and Marius Miron and Xavier Serra},
title = {Combining Piano Performance Dimensions for Score Difficulty Classification},
journal = {Expert Systems with Applications},
volume = {238},
pages = {121776},
year = {2024},
doi = {10.1016/j.eswa.2023.121776},
url = {https://doi.org/10.1016/j.eswa.2023.121776}
}
pedro.ramoneda@upf.edu
xavier.serra@upf.edu
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The NBA and WNBA dataset is a large-scale play-by-play and shot-detail dataset covering both NBA and WNBA games, collected from multiple public sources (e.g., official league APIs and stats sites). It provides every in-game event, from period starts, jump balls, fouls, turnovers, rebounds, and field-goal attempts through free throws, along with detailed shot metadata (shot location, distance, result, assisting player, etc.).
You can also download the dataset from GitHub or Google Drive.
Tutorials
I would be grateful for ratings and stars on GitHub, but the best thanks is using the dataset in your projects.
Useful links:
I made this dataset to simplify and speed up work with play-by-play data, so that researchers can spend their time studying the data rather than collecting it. Because the NBA and WNBA websites limit requests, and because each request returns the play-by-play of only one game, collecting this data is a very slow process.
Using this dataset, you can reduce the time to get information about one season from a few hours to a couple of seconds and spend more time analyzing data or building models.
I also added play-by-play information from other sources: pbpstats.com, data.nba.com, and cdnnba.com. This data adds detail about the course of each game and will hopefully open up opportunities for interesting analyses.
If you have any questions or suggestions about the dataset, you can write to me through whichever channel is most convenient for you:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Play is a dataset for instance segmentation tasks - it contains Basket Makes annotations for 4,881 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
If you use this dataset anywhere in your work, please cite it as follows: L. Gupta, "Google Play Store Apps," Feb 2019. [Online]. Available: https://www.kaggle.com/lava18/google-play-store-apps
While many public datasets (on Kaggle and the like) provide Apple App Store data, few counterpart datasets for Google Play Store apps are available anywhere on the web. On digging deeper, I found that the iTunes App Store page uses a nicely indexed, appendix-like structure that allows for simple and easy web scraping. Google Play Store, on the other hand, uses more sophisticated techniques (such as dynamic page loading with jQuery), which makes scraping more challenging.
Each app (row) has values for category, rating, size, and more.
This information is scraped from the Google Play Store. This app information would not be available without it.
The Play Store apps data has enormous potential to drive app-making businesses to success. Actionable insights can be drawn for developers to work on and capture the Android market!
A survey conducted at the end of 2020 and beginning of 2021 in Mexico found that 75 percent of video gaming children aged 7 and older said they played video games on the internet. This represents an increase of 39 percentage points in comparison to the previous measurement, when only 36 percent of the children responding to the survey claimed to play online.
According to a March 2024 survey, about seven in ten adults in the United States were not aware of play-to-earn games. These are games that allow users to earn cryptocurrency through gameplay. The age group most aware of such online games was 18 to 34-year-olds, with 50 percent of respondents in this age group stating that they knew of these games.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Source of the data: Sportradar API (https://developer.sportradar.com/docs/read/basketball/NBA_v8)
NBA Play-by-Play Data Extraction and Analysis
Overview
This project aims to retrieve play-by-play data for NBA matches in the 2023 season using the Sportradar API. The play-by-play data is fetched from the API, saved into JSON files, and then used to extract relevant features for analysis and other applications. The extracted data is saved in Parquet files for easy access… See the full description on the dataset page: https://huggingface.co/datasets/farazjawed/NBA_PLAY_BY_PLAY_DATA_2023.
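The feature-extraction step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the JSON layout and the field names (`periods`, `events`, `clock`, `event_type`, `description`) are hypothetical and are not claimed to match the Sportradar response schema.

```python
import json


def flatten_play_by_play(game):
    """Turn one nested game record into a flat list of event rows.

    The nesting (game -> periods -> events) is an assumed layout.
    """
    rows = []
    for period in game.get("periods", []):
        for event in period.get("events", []):
            rows.append({
                "game_id": game.get("id"),
                "period": period.get("number"),
                "clock": event.get("clock"),
                "event_type": event.get("event_type"),
                "description": event.get("description"),
            })
    return rows


# Hypothetical sample record, made up for illustration.
sample = json.loads("""
{
  "id": "game-001",
  "periods": [
    {"number": 1, "events": [
      {"clock": "12:00", "event_type": "jumpball", "description": "Jump ball"},
      {"clock": "11:42", "event_type": "twopointmade", "description": "Made 2-pt shot"}
    ]},
    {"number": 2, "events": [
      {"clock": "12:00", "event_type": "turnover", "description": "Bad pass"}
    ]}
  ]
}
""")

rows = flatten_play_by_play(sample)
for row in rows:
    print(row)
```

Flat rows like these can then be written to Parquet, for example with `pandas.DataFrame(rows).to_parquet(path)`, which requires pandas plus a Parquet engine such as pyarrow.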
Video gaming is a popular way for gamers to connect with friends and family. A February 2025 survey found that 72 percent of gamers in the United States played with others online or in person, up from 65 percent of U.S. gamers who did so in 2020. According to U.S. gamers, friends are the most popular group of people to play online with.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Reviews on Messengers Dataset - Review dataset
The Reviews on Messengers Dataset is a collection of the 200 most recent customer reviews of 6 messengers, obtained from the Google Play app store. See the list of the apps below. The dataset contains reviews written in 5 languages: English, French, German, Italian, and Japanese.
💴 For Commercial Usage: To discuss your requirements, learn about the price and buy the dataset, leave a request… See the full description on the dataset page: https://huggingface.co/datasets/TrainingDataPro/messengers-reviews-google-play.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Game Play Analysis is a dataset for object detection tasks - it contains Batter Fielder Umpire Keeper Bal annotations for 1,244 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
In April 2025, approximately ** thousand mobile apps were released through the Google Play Store. This figure indicates a notable decrease compared to the previous examined period. In the measured period, the highest number of app releases via Google Play Store was recorded in March 2019, with over *** thousand apps released.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The School of Education at the University of Cape Town (UCT) investigated children’s learning through digital play. The aim of the study was to explore the intersection between child play, technology, creativity and learning among children aged between 3 and 11 years. The study also identified skills and dispositions children develop through both digital and non-digital play. The data shared emerged from a survey of parents of children in the stated age group, with particular reference to the parents' views on children's play practices, including the time parents spent playing with their children, the concerns parents had about the time children spend playing on various technologies, the types of play children in South Africa engaged in, and the concerns of parents when children played with some electronic devices.
The following data files are shared:
- SA - Survey - Children, Technology and Play (CTAP) - Google Forms.pdf
- Descriptive Stats 2020.1.9 -Children Technology and Play SURVEY.xlsx
- Parent Survey RAW PUBLIC DATA 2020.2.29 - Children Technology and Play Project.xlsx
- Parent Survey RAW PUBLIC DATA 2020.2.29 - Children Technology and Play Project.csv
- Parent Survey REPORT DATA 2020.2.29 - Children Technology and Play Project.xlsx
- Parent Survey REPORT DATA 2020.2.29 - Children Technology and Play Project.csv
- Parent Survey RAW and REPORT DATA SYNTAX 2020.2.29 - Children Technology and Play Project.sps

NOTE: This survey was adapted from Marsh, J., Stjerne Thomsen, B., Parry, B., Scott, F., Bishop, J.C., Bannister, C., Driscoll, A., Margary, T., Woodgate, A. (2019). Children, Technology and Play. UK Survey Questions. LEGO Foundation.
The statistic shows the most popular app categories in the Google Play store ranked by number of downloads. In the second quarter of 2020, entertainment apps were the third-most popular category with 1.21 billion downloads during the measured period. Gaming apps were ranked first with 10.35 billion app downloads.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual reading and language arts proficiency from 2011 to 2022 for Fair Play Elementary School vs. Missouri and Fair Play R-II School District
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks the annual total number of students from 1987 to 2023 for Fair Play Elementary School
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual diversity score from 1993 to 2023 for Fair Play Elementary School vs. Missouri and Fair Play R-II School District
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
People have never played more video games and many stakeholders are worried that this activity might be bad for players. So far, research has not had adequate data to test whether these worries are justified and if policymakers should act to regulate video game play time. We attempt to provide much-needed evidence with adequate data. Whereas previous research had to rely on self-reported play behaviour, we collaborated with two games companies, Electronic Arts and Nintendo of America, to obtain players’ actual play behaviour. We surveyed players of Plants vs. Zombies: Battle for Neighborville and Animal Crossing: New Horizons for their well-being, motivations, and need satisfaction during play and merged their responses with telemetry data (i.e., logged game play). Contrary to many fears that excessive game time will lead to addiction and poor mental health, we found a small positive relation between game play and well-being. Need satisfaction and motivations during play did not interact with game time but were instead independently related to well-being. Our results advance the field in two important ways. First, we show that collaborations with industry partners can be done to high academic standards in an ethical and transparent fashion. Second, we deliver much-needed evidence to policymakers on the link between play and mental health.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project focused on defining geothermal play fairways and developing a detailed geothermal potential map of a large transect across the Great Basin region (96,000 km²), with the primary objective of facilitating discovery of commercial-grade, blind geothermal fields (i.e., systems with no surface hot springs or fumaroles) and thereby accelerating geothermal development in this promising region. Data included in this submission consists of: structural settings (target areas, recency of faulting, slip and dilation potential, slip rates, quality), regional-scale strain rates, earthquake density and magnitude, gravity data, temperature at 3 km depth, permeability models, favorability models, degree of exploration and exploration opportunities, data from springs and wells, transmission lines and wilderness areas, and published maps and theses for the Nevada Play Fairway area.
GPS-derived horizontal velocities on the island of Hawaii, provided by James Foster of the Pacific GPS Facility.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks the annual total number of classroom teachers from 2007 to 2023 for Wonderland Of Play Head Start