Predicting the difficulty of playing a musical score plays a pivotal role in structuring and exploring score collections, with significant implications for music education. The automatic difficulty classification of piano scores, however, remains an unsolved challenge. This is largely due to the scarcity of annotated data and the inherent subjectiveness in the annotation process. The "Can I Play It?" (CIPI) dataset represents a substantial step forward in this domain, providing a machine-readable collection of piano scores paired with difficulty annotations from the esteemed Henle Verlag.
The CIPI dataset is meticulously assembled by aligning public domain scores with their corresponding difficulty labels sourced from Henle Verlag. This initial pairing was subsequently reviewed and refined by an expert pianist to ensure accuracy and reliability. The dataset is structured to facilitate easy access and interpretation, making it a valuable resource for researchers and educators alike.
Our work makes two primary contributions to the field of score difficulty classification. Firstly, we address the critical issue of data scarcity, introducing the CIPI dataset to the academic community. Secondly, we delve into various input representations derived from score information, utilizing pre-trained machine learning models tailored for piano fingering and expressiveness. These models draw inspiration from musicological definitions of performance, offering nuanced insights into score difficulty.
Through extensive experimentation, we demonstrate that an ensemble approach—combining outputs from multiple classifiers—yields superior results compared to individual classifiers. This highlights the diverse facets of difficulty captured by different representations. Our comprehensive experiments lay a robust foundation for future endeavors in score difficulty classification, and our best-performing model reports a balanced accuracy of 39.5% and a median square error of 1.1 across the nine difficulty levels introduced in this study.
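The two figures reported above can be illustrated with a small worked example. This is a minimal sketch, not the authors' evaluation code: it assumes that balanced accuracy means the average of per-class recall, and that "median square error" means the median of the squared differences between predicted and annotated difficulty levels. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict
from statistics import median


def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def median_square_error(y_true, y_pred):
    """Median of squared differences between ordinal difficulty levels."""
    return median((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Toy difficulty levels (assumption: integers 1..9, as in the nine-level scale).
y_true = [1, 1, 2, 3, 3, 3]
y_pred = [1, 2, 2, 3, 1, 3]

print(balanced_accuracy(y_true, y_pred))    # per-class recalls: 1/2, 1/1, 2/3
print(median_square_error(y_true, y_pred))  # squared errors: 0, 1, 0, 0, 4, 0
```

Averaging recall per class keeps rarely annotated difficulty levels from being swamped by common ones, while the squared-error term rewards predictions that land close to the annotated level even when they are not exact.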
The CIPI dataset, along with the associated code and models, is made publicly available to ensure reproducibility and to encourage further research in this domain. Users are encouraged to reference this resource in their work and to contribute to its ongoing development.
Ramoneda, P., Jeong, D., Eremenko, V., Tamer, N. C., Miron, M., & Serra, X. (2024). Combining Piano Performance Dimensions for Score Difficulty Classification. Expert Systems with Applications, 238, 121776. DOI: 10.1016/j.eswa.2023.121776
@article{Ramoneda2024,
author = {Pedro Ramoneda and Dasaem Jeong and Vsevolod Eremenko and Nazif Can Tamer and Marius Miron and Xavier Serra},
title = {Combining Piano Performance Dimensions for Score Difficulty Classification},
journal = {Expert Systems with Applications},
volume = {238},
pages = {121776},
year = {2024},
doi = {10.1016/j.eswa.2023.121776},
url = {https://doi.org/10.1016/j.eswa.2023.121776}
}
pedro.ramoneda@upf.edu
xavier.serra@upf.edu
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The NBA and WNBA dataset is a large-scale play-by-play and shot-detail dataset covering both NBA and WNBA games, collected from multiple public sources (e.g., official league APIs and stats sites). It provides every in-game event, from period starts, jump balls, fouls, turnovers, rebounds, and field-goal attempts through free throws, along with detailed shot metadata (shot location, distance, result, assisting player, etc.).
You can also download the dataset from GitHub or Google Drive.
Tutorials
I would be grateful for ratings and stars on GitHub, but the best thanks is using the dataset in your projects.
Useful links:
I made this dataset to simplify and speed up work with play-by-play data, so that researchers spend their time studying data rather than collecting it. Because the NBA and WNBA websites limit requests, and each request returns the play-by-play of only one game, collecting this data is a very slow process.
Using this dataset, you can reduce the time to get information about one season from a few hours to a couple of seconds and spend more time analyzing data or building models.
I also added play-by-play information from other sources: pbpstats.com, data.nba.com, cdnnba.com. This data will enrich information about the progress of each game and hopefully add opportunities to do interesting things.
If you have any questions or suggestions about the dataset, you can write to me through whichever channel is most convenient for you:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The database contains information on 61 different value chains for circular bio-based fertilisers derived from 7 secondary raw materials. This database undergoes periodic revisions. Find the most recent version at https://fer-play.eu/resources/#1675863959450-3b58785e-842e
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Game Play Analysis is a dataset for object detection tasks - it contains Batter Fielder Umpire Keeper Bal annotations for 1,244 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Key Google Play Statistics; Google Play App and Game Revenue; Google Play Gaming App Revenue; Google Play App Revenue; Google Play App and Game Downloads; Google Play Game Downloads; Google Play App...
https://creativecommons.org/publicdomain/zero/1.0/
The NFL is one of the most popular sports leagues in the world. Many of us are stat geeks who want to understand not just what happened but also who and why. This NFL dataset provides a comprehensive view of NFL games, statistics, participation, the annual NFL combine, and the NFL draft. The dataset includes NFL play data from 2004 to the present.
This NFL dataset provides play-by-play data from the 2004 to 2019 seasons. The dataset also includes play and participation information for players, coaches, and game officials. Additional data tables in this file include the NFL Draft from 1989 to present, the NFL Combine from 1999 to present, NFL rosters from 1998 to present, NFL schedules, stadium information and much more. The granularity of NFL statistics varies by NFL season. The current version of NFL statistics has been collected since 2012. All information sources used to create this dataset are publicly accessible websites and the NFL GSIS dataset.
All information sources used to create this dataset are publicly accessible websites and NFL documentation. Although my current life is focused on data science, this project has a special place in my heart, since it links my previous profession in the NFL with my current passion for data analysis.
A survey conducted at the end of 2020 and beginning of 2021 in Mexico found that ** percent of video gaming children aged 7 and older said they played video games on the internet. This represents an increase of ** percentage points in comparison to the previous measurement when only ** of the children responding to the survey claimed to play online.
Estimated monthly production derived from state administrative data. Data go back to January 2000.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This submission includes raster datasets for each layer of evidence used for weights of evidence analysis as well as the deterministic play fairway analysis (PFA). Data representative of heat, permeability and groundwater comprises some of the raster datasets. Additionally, the final deterministic PFA model is provided along with a certainty model. All of these datasets are best used with an ArcGIS software package, specifically Spatial Data Modeler.
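For readers unfamiliar with weights of evidence, the core calculation for a single binary evidence layer can be sketched as follows. This is a minimal sketch of the textbook formulation (positive weight, negative weight, and contrast), not the workflow or code used for this submission. The function name, the flattened-cell representation, and the toy values are all hypothetical.

```python
import math


def weights_of_evidence(evidence, occurrences):
    """Return (W+, W-, contrast) for one binary evidence layer.

    evidence    -- per-cell flags: 1 where the evidence pattern is present
    occurrences -- per-cell flags: 1 where a known occurrence exists
    """
    b_d = sum(1 for e, d in zip(evidence, occurrences) if e and d)
    b_nd = sum(1 for e, d in zip(evidence, occurrences) if e and not d)
    nb_d = sum(1 for e, d in zip(evidence, occurrences) if not e and d)
    nb_nd = sum(1 for e, d in zip(evidence, occurrences) if not e and not d)
    if 0 in (b_d, b_nd, nb_d, nb_nd):
        raise ValueError("every evidence/occurrence combination needs a cell")
    n_d = b_d + nb_d      # cells with an occurrence
    n_nd = b_nd + nb_nd   # cells without an occurrence
    # W+ = ln[P(B|D) / P(B|not D)], W- = ln[P(not B|D) / P(not B|not D)]
    w_plus = math.log((b_d / n_d) / (b_nd / n_nd))
    w_minus = math.log((nb_d / n_d) / (nb_nd / n_nd))
    return w_plus, w_minus, w_plus - w_minus


# Toy raster flattened to eight cells (assumed values, for illustration only).
evidence = [1, 1, 1, 0, 0, 0, 0, 0]
occurrences = [1, 1, 0, 1, 0, 0, 0, 0]
w_plus, w_minus, contrast = weights_of_evidence(evidence, occurrences)
print(w_plus, w_minus, contrast)
```

A positive contrast indicates that the evidence pattern is spatially associated with known occurrences; in practice each evidence raster would be evaluated this way before the layers are combined.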
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Football Play Pre Post Snap is a dataset for classification tasks - it contains Football Plays annotations for 1,432 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Source of the data: Sportradar API (https://developer.sportradar.com/docs/read/basketball/NBA_v8)
NBA Play-by-Play Data Extraction and Analysis
Overview
This project aims to retrieve play-by-play data for NBA matches in the 2023 season using the Sportradar API. The play-by-play data is fetched from the API, saved into JSON files, and then used to extract relevant features for analysis and other applications. The extracted data is saved in Parquet files for easy access… See the full description on the dataset page: https://huggingface.co/datasets/farazjawed/NBA_PLAY_BY_PLAY_DATA_2023.
This project focused on defining geothermal play fairways and development of a detailed geothermal potential map of a large transect across the Great Basin region (96,000 km²), with the primary objective of facilitating discovery of commercial-grade, blind geothermal fields (i.e., systems with no surface hot springs or fumaroles) and thereby accelerating geothermal development in this promising region. Data included in this submission consists of: structural settings (target areas, recency of faulting, slip and dilation potential, slip rates, quality), regional-scale strain rates, earthquake density and magnitude, gravity data, temperature at 3 km depth, permeability models, favorability models, degree of exploration and exploration opportunities, data from springs and wells, transmission lines and wilderness areas, and published maps and theses for the Nevada Play Fairway area. Play Fairway Analysis Model Layer - Error of Quaternary fault slip rate distribution model
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
People have never played more video games and many stakeholders are worried that this activity might be bad for players. So far, research has not had adequate data to test whether these worries are justified and if policymakers should act to regulate video game play time. We attempt to provide much-needed evidence with adequate data. Whereas previous research had to rely on self-reported play behaviour, we collaborated with two games companies, Electronic Arts and Nintendo of America, to obtain players’ actual play behaviour. We surveyed players of Plants vs. Zombies: Battle for Neighborville and Animal Crossing: New Horizons for their well-being, motivations, and need satisfaction during play and merged their responses with telemetry data (i.e., logged game play). Contrary to many fears that excessive game time will lead to addiction and poor mental health, we found a small positive relation between game play and well-being. Need satisfaction and motivations during play did not interact with game time but were instead independently related to well-being. Our results advance the field in two important ways. First, we show that collaborations with industry partners can be done to high academic standards in an ethical and transparent fashion. Second, we deliver much-needed evidence to policymakers on the link between play and mental health.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The School of Education at the University of Cape Town (UCT) investigated children’s learning through digital play. The aim of the study was to explore the intersection between child play, technology, creativity and learning among children aged between 3 and 11 years. The study also identified skills and dispositions children develop through both digital and non-digital play. The data shared emerged from a survey of parents of children in the stated age group, with particular reference to the parents' views on children's play practices, including time parents spent playing with their children, concerns parents had about the time children spend playing on various technologies, the types of play children in South Africa engaged in, and the concerns of parents when children played with some electronic devices.
The following data files are shared:
SA - Survey - Children, Technology and Play (CTAP) - Google Forms.pdf
Descriptive Stats 2020.1.9 -Children Technology and Play SURVEY.xlsx
Parent Survey RAW PUBLIC DATA 2020.2.29 - Children Technology and Play Project.xlsx
Parent Survey RAW PUBLIC DATA 2020.2.29 - Children Technology and Play Project.csv
Parent Survey REPORT DATA 2020.2.29 - Children Technology and Play Project.xlsx
Parent Survey REPORT DATA 2020.2.29 - Children Technology and Play Project.csv
Parent Survey RAW and REPORT DATA SYNTAX 2020.2.29 - Children Technology and Play Project.sps
NOTE: This survey was adapted from Marsh, J., Stjerne Thomsen, B., Parry, B., Scott, F., Bishop, J.C., Bannister, C., Driscoll, A., Margary, T., Woodgate, A. (2019). Children, Technology and Play. UK Survey Questions. LEGO Foundation.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Just Play Bot V2 is a dataset for object detection tasks - it contains Carre annotations for 887 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
According to a March 2024 survey, about seven in ten adults in the United States were not aware of play-to-earn games, which allow users to earn cryptocurrency through gameplay. The age group most aware of such online games was 18 to 34-year-olds, with 50 percent of respondents in this age group stating that they knew of these games.
This project focused on defining geothermal play fairways and development of a detailed geothermal potential map of a large transect across the Great Basin region (96,000 km²), with the primary objective of facilitating discovery of commercial-grade, blind geothermal fields (i.e., systems with no surface hot springs or fumaroles) and thereby accelerating geothermal development in this promising region. Data included in this submission consists of: structural settings (target areas, recency of faulting, slip and dilation potential, slip rates, quality), regional-scale strain rates, earthquake density and magnitude, gravity data, temperature at 3 km depth, permeability models, favorability models, degree of exploration and exploration opportunities, data from springs and wells, transmission lines and wilderness areas, and published maps and theses for the Nevada Play Fairway area.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks the annual total number of classroom teachers from 2007 to 2023 for Wonderland Of Play Head Start.
Gravity model for the state of Hawaii. Data is from the following source: Flinders, A.F., Ito, G., Garcia, M.O., Sinton, J.M., Kauahikaua, J.P., and Taylor, B., 2013, Intrusive dike complexes, cumulate cores, and the extrusive growth of Hawaiian volcanoes: Geophysical Research Letters, v. 40, p. 3367-3373, doi:10.1002/grl.50633.
Final Report describing data collection, evaluation, modeling and analysis. Ranking of Cascade and Aleutian volcanic centers for geothermal potential.