100+ datasets found
  1. NBA-Player-Career-Stats

    • huggingface.co
    Updated May 19, 2024
    Cite
    Mr. Stack (2024). NBA-Player-Career-Stats [Dataset]. https://huggingface.co/datasets/Hatman/NBA-Player-Career-Stats
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 19, 2024
    Authors
    Mr. Stack
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    This dataset contains a single CSV file with lifetime statistics for NBA players. The data includes various box score stats and personal information for each player's career.

      Data Fields
    

    The CSV file contains the following columns:

    FULL_NAME: The player's full name
    AST: Total career assists
    BLK: Total career blocks
    DREB: Total career defensive rebounds
    FG3A: Total 3-point field goal attempts
    FG3M: Total 3-point field goals made
    FG3_PCT: 3-point field…

    See the full description on the dataset page: https://huggingface.co/datasets/Hatman/NBA-Player-Career-Stats.

  2. Tennis Weather

    • kaggle.com
    Updated Oct 2, 2018
    Cite
    Pranav Pandey (2018). Tennis Weather [Dataset]. https://www.kaggle.com/datasets/pranavpandey2511/tennis-weather
    Dataset updated
    Oct 2, 2018
    Dataset provided by
    Kaggle
    Authors
    Pranav Pandey
    Description

    Dataset

    This dataset was created by Pranav Pandey

    Contents

  3. Data_Sheet_1_Adolescent Exploratory Strategies and Behavioral Types in the...

    • figshare.com
    txt
    Updated Mar 4, 2019
    + more versions
    Cite
    Stina Lundberg; Cecilia Högman; Erika Roman (2019). Data_Sheet_1_Adolescent Exploratory Strategies and Behavioral Types in the Multivariate Concentric Square FieldTM Test.CSV [Dataset]. http://doi.org/10.3389/fnbeh.2019.00041.s001
    Explore at:
    Available download formats: txt
    Dataset updated
    Mar 4, 2019
    Dataset provided by
    Frontiers
    Authors
    Stina Lundberg; Cecilia Högman; Erika Roman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Adolescence is an important developmental phase with extensive changes in behavior due to remodeling of the brain and hormonal systems. Validation of animal behavioral tests in this age group is therefore of importance as differences to adult behavior are often not clarified. The aim of the present study was to investigate adolescent behavior in the multivariate concentric square field™ (MCSF) test and its relationship to other common behavioral tests as well as to a literature dataset of adult animals. Sixty adolescent male Wistar rats were tested in the MCSF and one of four reference tests: the elevated plus maze, the open field with or without start box, or the social play behavior test. Additionally, 12 animals were tested twice in the MCSF. When analyzing the first encounter with the MCSF test, a distinct grouping of the individuals into three behavioral types was observed. Approximately 20% of the animals had high levels of activity and an additional 20% had high levels of shelter-seeking behavior; these groups composed the outlying behavioral types named Explorers and Shelter seekers, respectively, which were distinct from the Main type of animals. When tested in the MCSF for a second time, the adolescent animals showed a recollection of the arena as they changed their behavior in relation to the first encounter. When comparing the MCSF performance to the reference tests, a relationship was found between the MCSF and the other behavioral tests entailing forced exploration, while no relationship was found between the MCSF and social play. The adolescent behavioral profile was characterized by decreased risk assessment and a different activity profile than adults. In conclusion, the MCSF test is useful for profiling adolescent rats but the behavioral interpretation differs from that of adults due to differences in behavioral manifestation during adolescence and the presence of natural subgroups. Adolescent exploration shows a relationship across tests, but the MCSF gives more information than any of the other behavioral tests based on forced exploration. Further studies into the neurobiology behind the behavioral types and how different manipulations affect the distribution into the behavioral types are of interest.

  4. Overwatch 2 statistics

    • kaggle.com
    Updated Jun 27, 2023
    Cite
    Mykhailo Kachan (2023). Overwatch 2 statistics [Dataset]. https://www.kaggle.com/datasets/mykhailokachan/overwatch-2-statistics
    Dataset updated
    Jun 27, 2023
    Dataset provided by
    Kaggle
    Authors
    Mykhailo Kachan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is built on data from Overbuff with the help of Python and Selenium. Development environment: Jupyter Notebook.

    The tables contain the data for competitive seasons 1-4 and for quick play for each hero and rank along with the standard statistics (common to each hero as well as information belonging to a specific hero).

    Note: data for some columns are missing on Overbuff site (there is '—' instead of a specific value), so they were dropped: Scoped Crits for Ashe and Widowmaker, Rip Tire Kills for Junkrat, Minefield Kills for Wrecking Ball. 'Self Healing' column for Bastion was dropped too as Bastion doesn't have this property anymore in OW2. Also, there are no values for "Javelin Spin Kills / 10min" for Orisa in season 1 (the column was dropped). Overall, all missing values were cleaned.

    Attention: Overbuff doesn't contain info about OW 1 competitive seasons (when you change a skill tier, the data isn't changed). If you know a site where it's possible to get this data, please, leave a comment. Thank you!

    The code is available on GitHub.

    The whole procedure is done in 5 stages:

    Stage 1:

    Data is retrieved directly from HTML elements on the page using Selenium in Python.

    Stage 2:

    After scraping, the data was cleansed: 1) Removed the comma thousands separator (e.g. 1,009 => 1009). 2) Converted time representations (e.g. '01:23') to seconds (1*60 + 23 => 83). 3) Replaced accented hero names with ASCII equivalents (Lúcio became Lucio, Torbjörn became Torbjorn).
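
    The three cleansing rules above can be sketched in Python as follows. This is a minimal illustration rather than the author's actual code: the helper names (strip_thousands, time_to_seconds, ascii_name) are hypothetical, and removing accents through Unicode decomposition is only an assumption about how names such as Lúcio were normalized.

```python
import unicodedata


def strip_thousands(value: str) -> int:
    """Remove the comma thousands separator, e.g. '1,009' -> 1009."""
    return int(value.replace(",", ""))


def time_to_seconds(value: str) -> int:
    """Convert an 'MM:SS' string to seconds, e.g. '01:23' -> 83."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)


def ascii_name(name: str) -> str:
    """Drop diacritics from a hero name (assumed approach, not confirmed by the source)."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

    For instance, time_to_seconds('01:23') reproduces the 83 seconds given in the description.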

    Stage 3:

    Data were arranged into a table and saved to CSV.

    Stage 4:

    Columns which are supposed to have only numeric values are checked. All non-numeric values are dropped. This stage helps to find missing values which contain '—' instead and delete them.
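
    A numeric check of this kind might look like the sketch below. It is an assumption-laden illustration: the column name 'Eliminations' and the helper names are hypothetical, and the sketch drops whole rows, whereas the original code may drop values or columns instead.

```python
def is_numeric(value: str) -> bool:
    """Return True if the string parses as a number; placeholders such as '—' do not."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def drop_non_numeric(rows, numeric_columns):
    """Keep only rows whose numeric columns all contain parseable numbers."""
    return [
        row for row in rows
        if all(is_numeric(row[col]) for col in numeric_columns)
    ]


# Hypothetical sample rows; 'Eliminations' is an illustrative column name.
sample = [
    {"Hero": "Ana", "Eliminations": "12.5"},
    {"Hero": "Ashe", "Eliminations": "—"},
]
cleaned = drop_non_numeric(sample, ["Eliminations"])
```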

    Stage 5:

    Additional missing values are searched for and dealt with. This results in either a column rename (as the program cannot infer the correct column name for missing values) or a column drop. This stage ensures all wrong data are truly fixed.

    The procedure to fetch the data takes 7 minutes on average.

    This project and code were born from this GitHub code.

  5. NBA Player Dataset & Prediction Model Artifacts

    • test.researchdata.tuwien.ac.at
    bin, csv, json, png +2
    Updated Apr 28, 2025
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Available download formats: json, png, csv, bin, txt, text/markdown
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    TU Wien
    Authors
    Burak Baltali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description

    This dataset contains end-of-season box-score aggregates for NBA players over the 2012–13 through 2023–24 seasons, split into training and test sets for both regular season and playoffs. Each CSV has one row per player per season with columns for points, rebounds, steals, turnovers, 3-pt attempts, FG attempts, plus identifiers.

    Brief overview of Files

    1. end-of-season box-score aggregates (2012–13 – 2023–24) split into train/test;

    2. the Jupyter notebook (Analysis.ipynb); all the code can be executed there;

    3. the trained model binary (nba_model.pkl), a serialized Random Forest model artifact;

    4. evaluation plots (LAL vs. whole-league) for regular-season and playoff predictions, uploaded as PNG outputs;

    5. FAIR4ML metadata (fair4ml_metadata.jsonld); see README.md and abbreviations.txt for file details;

    6. for further information, see the GitHub site (link below).

    File Details

    Notebook

    Analysis.ipynb: Contains the graphical output of the trained and tested data.

    Trained/ Test csv Data

    Name, description, and PID of each file:
    regular_train.csv: Training data for the regular season, covering the 2012-2013 through 2021-2022 seasons (PID: 4421e56c-4cd3-4ec1-a566-a89d7ec0bced)
    regular_test.csv: Test data for the regular season, covering the 2022-2023 season (PID: f9d84d5e-db01-4475-b7d1-80cfe9fe0e61)
    playoff_train.csv: Training data for the playoffs, covering the 2012-2013 through 2022-2023 seasons (PID: bcb3cf2b-27df-48cc-8b76-9e49254783d0)
    playoff_test.csv: Test data for the playoffs, covering the 2023-2024 season (PID: de37d568-e97f-4cb9-bc05-2e600cc97102)
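
    The season-based split described above can be sketched as follows. This is only an illustration under stated assumptions: the 'season' key, the 'YYYY-YY' label format, and the function name are hypothetical and are not taken from the dataset's notebook.

```python
def split_by_season(rows, test_season, season_key="season"):
    """Split rows into earlier seasons (train) and one held-out season (test).

    Assumes season labels such as '2021-22' that sort chronologically as strings.
    Rows from seasons after the test season are left out of both sets.
    """
    train = [row for row in rows if row[season_key] < test_season]
    test = [row for row in rows if row[season_key] == test_season]
    return train, test


# Hypothetical rows; the real CSV column names may differ.
rows = [
    {"season": "2012-13", "points": 1500},
    {"season": "2021-22", "points": 1800},
    {"season": "2022-23", "points": 1700},
    {"season": "2023-24", "points": 1600},
]
train_rows, test_rows = split_by_season(rows, "2022-23")
```

    Holding out the most recent season, instead of sampling rows at random, keeps later seasons out of the training data.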

    Others

    abbrevations.txt: Contains the fundamental abbreviations of the columns in the csv data

    Additional Notes

    Raw csv files are taken from Kaggle (Source: https://www.kaggle.com/datasets/shivamkumar121215/nba-stats-dataset-for-last-10-years/data)

    Some preprocessing has to be done before uploading into dbrepo

    Plots have also been uploaded as an output for visual purposes.

    A more detailed version can be found on github (Link: https://github.com/bubaltali/nba-prediction-analysis/)

  6. Data from: Hawaii Play Fairway Analysis: Hawaii Water Well Temperature and...

    • catalog.data.gov
    • data.openei.org
    • +2more
    Updated Jan 20, 2025
    Cite
    University of Hawaii (2025). Hawaii Play Fairway Analysis: Hawaii Water Well Temperature and Hydraulic Head [Dataset]. https://catalog.data.gov/dataset/hawaii-play-fairway-analysis-hawaii-water-well-temperature-and-hydraulic-head-10658
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    University of Hawaii
    Area covered
    Hawaii
    Description

    A .csv file consisting of the water well temperature and water table elevation for wells in the State of Hawaii. Data source: Hawaii Commission of Water Resources Management.

  7. ‘Playstore Analysis’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 12, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Playstore Analysis’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-playstore-analysis-2b2d/41638844/?iid=022-994&v=presentation
    Explore at:
    Dataset updated
    Nov 12, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Playstore Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/madhav000/playstore-analysis on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Google Play Store team had launched a new feature wherein, certain apps that are promising, are boosted in visibility. The boost will manifest in multiple ways including higher priority in recommendations sections (“Similar apps”, “You might also like”, “New and updated games”). These will also get a boost in search results visibility. This feature will help bring more attention to newer apps that have the potential.

    Analysis to be done:

    The problem is to identify the apps that are going to be good for Google to promote. App ratings, which are provided by the customers, is always a great indicator of the goodness of the app. The problem reduces to: predict which apps will have high ratings.

    Problem Statement:

    Google Play Store team is about to launch a new feature wherein, certain apps that are promising, are boosted in visibility. The boost will manifest in multiple ways including higher priority in recommendations sections (“Similar apps”, “You might also like”, “New and updated games”). These will also get a boost in search results visibility. This feature will help bring more attention to newer apps that have the potential.

    Content:

    Dataset: Google Play Store data (“googleplaystore.csv”)

    Fields in the data:

    App: Application name
    Category: Category to which the app belongs
    Rating: Overall user rating of the app
    Reviews: Number of user reviews for the app
    Size: Size of the app
    Installs: Number of user downloads/installs for the app
    Type: Paid or Free
    Price: Price of the app
    Content Rating: Age group the app is targeted at - Children / Mature 21+ / Adult
    Genres: An app can belong to multiple genres (apart from its main category). For example, a musical family game will belong to Music, Game, Family genres.
    Last Updated: Date when the app was last updated on Play Store
    Current Ver: Current version of the app available on Play Store
    Android Ver: Minimum required Android version

    --- Original source retains full ownership of the source dataset ---

  8. 1. data all field studies CSV.csv

    • psycharchives.org
    Updated Aug 5, 2022
    + more versions
    Cite
    (2022). 1. data all field studies CSV.csv [Dataset]. https://psycharchives.org/en/item/5bb80531-2812-4a0a-9b75-b396c8543d34
    Explore at:
    Dataset updated
    Aug 5, 2022
    License

    https://doi.org/10.23668/psycharchives.4988

    Description

    Citizen Science (CS) projects play a crucial role in engaging citizens in conservation efforts. While implicitly mostly considered as an outcome of CS participation, citizens may also have a certain attitude toward engagement in CS when starting to participate in a CS project. Moreover, there is a lack of CS studies that consider changes over longer periods of time. Therefore, this research presents two-wave data from four field studies of a CS project about urban wildlife ecology using cross-lagged panel analyses. We investigated the influence of attitudes toward engagement in CS on self-related, ecology-related, and motivation-related outcomes. We found that positive attitudes toward engagement in CS at the beginning of the CS project had positive influences on participants’ psychological ownership and pride in their participation, their attitudes toward and enthusiasm about wildlife, and their internal and external motivation two months later. We discuss the implications for CS research and practice. Dataset for: Greving, H., Bruckermann, T., Schumann, A., Stillfried, M., Börner, K., Hagen, R., Kimmig, S. E., Brandt, M., & Kimmerle, J. (2023). Attitudes Toward Engagement in Citizen Science Increase Self-Related, Ecology-Related, and Motivation-Related Outcomes in an Urban Wildlife Project. BioScience, 73(3), 206–219. https://doi.org/10.1093/biosci/biad003: Data (CSV format) collected for all field studies

  9. Game by Game MLB Batter Data (2017-2020)

    • kaggle.com
    Updated Aug 5, 2022
    Cite
    John Adamek (2022). Game by Game MLB Batter Data (2017-2020) [Dataset]. https://www.kaggle.com/datasets/johnadamek/game-by-game-mlb-batter-data-20172020
    Dataset updated
    Aug 5, 2022
    Dataset provided by
    Kaggle
    Authors
    John Adamek
    Description

    Content

    This dataset utilized raw data from Advanced Sports Analytics (https://www.advancedsportsanalytics.com/).

    This is a great website that provides raw MLB game data for every game. It is quite messy and requires quite a bit of cleaning, but the data is worth it! Batting, pitching, and play-by-play data were exported into CSV files for the 2017-2020 seasons. An R script is provided.

    Columns

    Key Column information:

    Batting Order = Where the player batted in the lineup for that given day
    Position = The position they played for that game
    Pit = Total amount of pitches they saw over the course of the game
    Str = Total amount of strikes they saw over the course of the game
    Team.R = Total runs scored by the batter's team in the game
    Team.H = Total hits by the batter's team in the game
    Opponent.R = Total runs scored by the opposing team in the game
    Opponent.H = Total hits by the opposing team in the game
    X1b.Ump = First base umpire for the game
    X2b.Ump = Second base umpire for the game
    X3b.Ump = Third base umpire for the game
    HP.Ump = Home Plate umpire for the game
    Date = Date of the game
    Game.Time = Game time
    H.A = Home or Away
    Precipitation = yes/no
    Sky = Whether it was sunny, cloudy, overcast, rain, drizzle, night, or in dome
    Stadium = Stadium played in
    Temperature = Temperature at game time
    Weather = Character combining temperature, wind speed, wind direction, and stadium/sky
    Wind.Direction = Direction of the wind
    Wind.Speed = Wind speed in mph
    Starting.Pitcher = Starting pitcher
    Over.Under = Over/Under of the game
    Moneyline = The moneyline for the batter's team
    Wagers = Amount of wagers placed on the game

    UPDATE

    Unfortunately, it seems they no longer have this raw data available on their website, so I will be uploading the raw data along with the cleaned files so that others can manipulate the data any way they like!

  10. DLR Playing Pitches - Dataset - data.smartdublin.ie

    • data.smartdublin.ie
    + more versions
    Cite
    DLR Playing Pitches - Dataset - data.smartdublin.ie [Dataset]. https://data.smartdublin.ie/dataset/dlr_pitches
    Explore at:
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The CSV file contains the playing pitches under the control of DLRCoCo. It includes grass and synthetic surfaces. Sports included are GAA, Soccer, Rugby and others. Each entry includes a number, size and location (Lat/Long). A link to the DLRCoCo pitch playability notice is also provided.

  11. Scrabble data from woogles.io

    • opendatabay.com
    • kaggle.com
    Updated Jun 23, 2025
    Cite
    Datasimple (2025). Scrabble data from woogles.io [Dataset]. https://www.opendatabay.com/data/ai-ml/218b5feb-90aa-46ac-b12a-4aa6cf2dba30
    Explore at:
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Entertainment & Media Consumption
    Description

    Context

    This is a dataset of Scrabble games played by the BasicBot on the website woogles.io. It was created using two notebooks:

    Fetch the data: https://www.kaggle.com/mrisdal/fetch-scrabble-data-from-woogles-io
    Process the data: https://www.kaggle.com/mrisdal/process-scrabble-data-from-woogles-io

    Content

    There are four CSV files:

    games.csv contains metadata about individual games, time control, how the game ended, the winner, timestamps, etc.
    scores.csv contains final score and rating data for games and players
    turns.csv contains data about individual plays in each turn of every scrabble game in the dataset
    games_raw.csv is the file before some processing done in this notebook: https://www.kaggle.com/mrisdal/process-scrabble-data-from-woogles-io

    Acknowledgements

    Thank you to woogles.io for providing this platform for playing Scrabble!

    License

    CC0

    Original Data Source: Scrabble data from woogles.io

  12. Additional file 2: of Effect of play-based family-centered...

    • springernature.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Teklu Abessa; Berhanu Worku; Mekitie Wondafrash; Tsinuel Girma; Johan Valy; Johan Lemmens; Liesbeth Bruckers; Patrick Kolsteren; Marita Granitzer (2023). Additional file 2: of Effect of play-based family-centered psychomotor/psychosocial stimulation on the development of severely acutely malnourished children under six in a low-income setting: a randomized controlled trial [Dataset]. http://doi.org/10.6084/m9.figshare.9838511.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Teklu Abessa; Berhanu Worku; Mekitie Wondafrash; Tsinuel Girma; Johan Valy; Johan Lemmens; Liesbeth Bruckers; Patrick Kolsteren; Marita Granitzer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset file. (CSV 466 kb)

  13. Utica Play Boundary

    • koordinates.com
    csv, dwg, geodatabase +6
    Updated Aug 27, 2016
    Cite
    US Energy Information Administration (2016). Utica Play Boundary [Dataset]. https://koordinates.com/layer/13310-utica-play-boundary/
    Explore at:
    Available download formats: dwg, mapinfo tab, pdf, geodatabase, geopackage / sqlite, shapefile, csv, kml, mapinfo mif
    Dataset updated
    Aug 27, 2016
    Dataset authored and provided by
    US Energy Information Administration
    Area covered
    Description

    Geospatial data about Utica Play Boundary. Export to CAD, GIS, PDF, CSV and access via API.

  14. stats_euroleague_players

    • explore.openaire.eu
    • zenodo.org
    Updated Oct 28, 2020
    Cite
    Alvaro Diaz; David Leiva (2020). stats_euroleague_players [Dataset]. http://doi.org/10.5281/zenodo.4147075
    Explore at:
    Dataset updated
    Oct 28, 2020
    Authors
    Alvaro Diaz; David Leiva
    Description

    Web-scraped Euroleague player data for the 2020-2021 season.

    euroleaguePlayers_average.csv: Provides mean stats for each player playing in the 2020-2021 season.
    euroleaguePlayers_season.csv: Provides statistics for each season for each player playing in the 2020-21 season. These stats cover this season and previous ones.

    The datasets contain basic basketball stats and personal data for each player, such as birth date, height, and the team he played for.

  15. Data_Sheet_2_Virtual play and real connections: unpacking the impact of rice...

    • frontiersin.figshare.com
    csv
    Updated Sep 9, 2024
    + more versions
    Cite
    Takeshi Nishimura; Junko Taguchi; Terukazu Kumazawa; Kengo Hayashi (2024). Data_Sheet_2_Virtual play and real connections: unpacking the impact of rice farming simulation video games.csv [Dataset]. http://doi.org/10.3389/fcomp.2024.1392862.s002
    Explore at:
    Available download formats: csv
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    Frontiers
    Authors
    Takeshi Nishimura; Junko Taguchi; Terukazu Kumazawa; Kengo Hayashi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study investigates how the rice farming simulation video game Sakuna: Of Rice and Ruin affects interest in real-world agriculture and the inclination to start farming amidst Japan’s declining farming population. We surveyed 428 Japanese residents, including not only game players but also those who watch the game live or are merely aware of its existence. We also interviewed an individual who started rice farming after playing the game. The findings indicate that the game successfully stimulates greater interest in agriculture and somewhat motivates players to consider farming, more than just viewers or those who are aware of it. Moreover, individuals with real-life connections to agriculture, such as farming experience or professional connections, were optimistic about the transition from game to reality. The study suggests that rice farming simulation games can foster expectations of developing an interest in agriculture and potentially embarking on farming careers, demonstrating the game’s significant impact beyond entertainment.

  16. Cross-language corpora of privacy policies

    • zenodo.org
    • explore.openaire.eu
    • +1more
    csv, zip
    Updated Jun 17, 2023
    Cite
    Francesco Ciclosi; Silvia Vidor; Fabio Massacci (2023). Cross-language corpora of privacy policies [Dataset]. http://doi.org/10.5281/zenodo.7729546
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Francesco Ciclosi; Silvia Vidor; Fabio Massacci
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of three different privacy policy corpora (in English and Italian) composed of 81 unique privacy policy texts spanning the period 2018-2021. This dataset makes available an example of three corpora of privacy policies. The first corpus is the English-language corpus, the original used in the study by Tang et al. [2]. The other two are cross-language corpora built (one, the source corpus, in English, and the other, the replication corpus, in Italian, which is the language of a potential replication study) from the first corpus.

    The policies were collected from:

    1. the Alexa top 10 Italy and U.S. websites rank;
    2. the Play Store apps rank in the "most profitable games" category of the Play Store for Italy and the U.S.

    We manually analyzed the Alexa top 10 Italy websites as of November 2021. Analogously, we analyzed selected apps that, in the same period, had ranked better in the "most profitable games" category of the Play Store for Italy.

    All the privacy policies are ANSI-encoded text files and have been manually read and verified.
    The dataset is helpful as a starting point for building comparable cross-language privacy policies corpora. The availability of these comparable cross-language privacy policies corpora helps replicate studies in different languages.
    Details on the methodology can be found in the accompanying paper.

    The available files are as follows:

    • policies-texts.zip --> contains a directory of text files with the policy texts. File names are the SHA1 hashes of the policy text.
    • policy-metadata.csv --> Contains a CSV file with the metadata for each privacy policy.
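
    The file-naming rule for policies-texts.zip can be illustrated with a short sketch. It assumes the SHA1 hash is computed over the raw bytes of each policy file; the dataset description does not say whether the bytes or the decoded text were hashed, nor which file extension is used, and the function name is hypothetical.

```python
import hashlib


def policy_file_stem(policy_bytes: bytes) -> str:
    """Return the 40-character hexadecimal SHA1 digest used as a file-name stem.

    Assumption: the hash covers the raw file bytes, not a re-encoded string.
    """
    return hashlib.sha1(policy_bytes).hexdigest()


# Hypothetical policy content, for illustration only.
stem = policy_file_stem(b"Example privacy policy text.")
```

    Comparing such a digest with the actual file name is one way to check that a policy text has not been altered, provided the hashing assumption holds.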

    This dataset is the original dataset used in the publication [1]. The original English U.S. corpus is described in the publication [2].

    [1] F. Ciclosi, S. Vidor and F. Massacci. "Building cross-language corpora for human understanding of privacy policies." Workshop on Digital Sovereignty in Cyber Security: New Challenges in Future Vision. Communications in Computer and Information Science. Springer International Publishing, 2023, In press.

    [2] J. Tang, H. Shoemaker, A. Lerner, and E. Birrell. Defining Privacy: How Users Interpret Technical Terms in Privacy Policies. Proceedings on Privacy Enhancing Technologies, 3:70–94, 2021.

  17. The Con Espressione Game Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 5, 2020
    Cite
    Chowdhury, Shreyan (2020). The Con Espressione Game Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3968827
    Explore at:
    Dataset updated
    Nov 5, 2020
    Dataset provided by
    Cancino-Chacón, Carlos Eduardo
    Chowdhury, Shreyan
    Widmer, Gerhard
    Peter, Silvan
    Aljanaki, Anna
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Con Espressione Game Dataset

    A piece of music can be expressively performed, or interpreted, in a variety of ways. With the help of an online questionnaire, the Con Espressione Game, we collected some 1,500 descriptions of expressive character relating to 45 performances of 9 excerpts from classical piano pieces, played by different famous pianists. More specifically, listeners were asked to describe, using freely chosen words (preferably: adjectives), how they perceive the expressive character of the different performances. The aim of this research is to find the dimensions of musical expression (in Western classical piano music) that can be attributed to a performance, as perceived and described in natural language by listeners.

    The Con Espressione Game was launched on the 3rd of April 2018.

    Dataset structure

    Listeners’ Descriptions of Expressive performance

    piece_performer_data.csv: A comma separated file (CSV) containing information about the pieces in the dataset. Strings are delimited with ". The columns in this file are:

    music_id: An integer ID for each performance in the dataset.

    performer_name: (Last) name of the performer.

    piece_name: (Short) name of the piece.

    performance_name: Name of the performance. All files in different modalities (alignments, MIDI, loudness features, etc) corresponding to a single performance will have the same name (but possibly different extensions).

    composer: Name of the composer of the piece.

    piece: Full name of the piece.

    album: Name of the album.

    performer_name_full: Full name of the performer.

    year_of_CD_issue: Year of the issue of the CD.

    track_number: Number of the track in the CD.

    length_of_excerpt_seconds: Length of the excerpt in seconds.

    start_of_excerpt_seconds: Start of the excerpt in its corresponding track (in seconds).

    end_of_excerpt_seconds: End of the excerpt in its corresponding track (in seconds).

    con_espressione_game_answers.csv: This is the main file of the dataset; it contains the listeners’ descriptions of expressive character. This CSV file contains the following columns:

    answer_id: An integer representing the ID of the answer. Each answer gets a unique ID.

    participant_id: An integer representing the ID of a participant. Answers with the same ID come from the same participant.

    music_id: An integer representing the ID of the performance. This is the same as the music_id in piece_performer_data.csv described above.

    answer: (cleaned/formatted) participant description. All answers were converted to lower case, typos were corrected, spaces were replaced by underscores (_), and individual terms are separated by commas. See cleanup_rules.txt for a more detailed description of how the answers were formatted.

    original_answer: Raw answers provided by the participants.

    timestamp: Timestamp of the answer.

    favorite: A boolean (0 or 1) indicating if this performance of the piece is the participant’s favorite.

    translated_to_english: Raw translation (from German, Russian, Spanish and Italian).

    performer: (Last) name of the performer. See piece_performer_data.csv described above.

    piece_name: (Short) name of the piece. See piece_performer_data.csv described above.

    performance_name: Name of the performance. See piece_performer_data.csv described above.
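
    As a minimal sketch of how the cleaned answer column could be consumed, the snippet below splits each answer into its comma-separated terms, turns underscores back into spaces, and groups the terms by performance via music_id. The sample rows, the placeholder values performer_a and piece_x, and the helper names split_terms, load_rows and terms_by_performance are made up for illustration and are not part of the dataset; only the column names come from the description above.

```python
import csv
import io

# Made-up sample rows for illustration only; they are NOT taken from the dataset.
SAMPLE_ANSWERS = '''answer_id,participant_id,music_id,answer
1,10,3,"calm,very_soft"
2,11,3,"dreamy"
'''

SAMPLE_PIECES = '''music_id,performer_name,piece_name
3,"performer_a","piece_x"
'''


def split_terms(answer):
    """Split a cleaned answer into terms, restoring spaces from underscores."""
    terms = []
    for raw in answer.split(","):
        term = raw.strip()
        if term:
            terms.append(term.replace("_", " "))
    return terms


def load_rows(text):
    """Parse CSV text whose strings are delimited with double quotes."""
    return list(csv.DictReader(io.StringIO(text)))


def terms_by_performance(answers_text, pieces_text):
    """Group all description terms by (performer_name, piece_name)."""
    pieces = {row["music_id"]: row for row in load_rows(pieces_text)}
    grouped = {}
    for row in load_rows(answers_text):
        piece = pieces.get(row["music_id"])
        # Answers whose music_id has no metadata row are grouped under (None, None).
        key = (piece["performer_name"], piece["piece_name"]) if piece else (None, None)
        grouped.setdefault(key, []).extend(split_terms(row["answer"]))
    return grouped


if __name__ == "__main__":
    print(terms_by_performance(SAMPLE_ANSWERS, SAMPLE_PIECES))
```

    To run this against the real files, the two sample strings would be replaced by the contents of con_espressione_game_answers.csv and piece_performer_data.csv.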

    participant_profiles.csv: A CSV file containing musical background information of the participants. Empty cells mean that the participant did not provide an answer. This file contains the following columns:

    participant_id: An integer representing the ID of a participant.

    music_education_years: (Self-reported) number of years of musical education of the participant.

    listening_to_classical_music: Answers to the question “How often do you listen to classical music?”. The possible answers are:

    1: Never

    2: Very rarely

    3: Rarely

    4: Occasionally

    5: Frequently

    6: Very frequently

    registration_date: Date and time of registration of the participant.

    playing_piano: Answer to the question “Do you play the piano?”. The possible answers are:

    1: No

    2: A little bit

    3: Quite well

    4: Very well
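
    The two coded columns above can be decoded with simple lookup tables. The sketch below assumes the cells are stored as integer strings (for example "4") and that an empty cell means no answer was given, as stated for participant_profiles.csv; the names LISTENING_LABELS, PIANO_LABELS and decode are hypothetical helpers, not part of the dataset.

```python
# Label tables copied from the column descriptions above.
LISTENING_LABELS = {
    1: "Never",
    2: "Very rarely",
    3: "Rarely",
    4: "Occasionally",
    5: "Frequently",
    6: "Very frequently",
}

PIANO_LABELS = {
    1: "No",
    2: "A little bit",
    3: "Quite well",
    4: "Very well",
}


def decode(cell, labels):
    """Map a raw CSV cell to its label; empty cells (no answer) become None.

    Assumes the cell holds a plain integer string such as "4".
    """
    cell = cell.strip()
    if not cell:
        return None
    return labels[int(cell)]


if __name__ == "__main__":
    print(decode("4", LISTENING_LABELS))
    print(decode("", PIANO_LABELS))
```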

    cleanup_rules.txt: Rules for cleaning/formatting the terms in the participant’s answers.

    translations_GERMAN.txt: How the translations from German to English were made.

    Metadata

    Related metadata is stored in the MetaData folder.

    Alignments. This folder contains the manually corrected score-to-performance alignments for each of the pieces in the dataset. Each of these alignments is a text file.

    ApproximateMIDI. This folder contains reconstructed MIDI performances created from the alignments and the loudness curves. The onset time and offset times of the notes were determined from the alignment times and the MIDI velocity was computed from the loudness curves.

    Match. This folder contains score-to-performance alignments in Matchfile format.

    Scores_MuseScore. Manually encoded sheet music in MuseScore format (.mscz).

    Scores_MusicXML. Sheet music in MusicXML format.

    Scores_pdf. Images of the sheet music in PDF format.

    Audio Features

    Audio features computed from the audio files. These features are located in the AudioFeatures folder.

    Loudness: Text files containing loudness curves in dB of the audio files. These curves were computed using code provided by Olivier Lartillot. Each of these files contains the following columns:

    performance_time_(seconds): Performance time in seconds.

    loudness_(db): Loudness curve in dB.

    smooth_loudness_(db): Smoothed loudness curve.

    Spectrograms: NumPy files (.npy) containing magnitude spectrograms (as NumPy arrays). The shape of each array is (149 frequency bands, number of frames of the performance). The spectrograms were computed from the audio files with the following parameters:

    Sample rate (sr): 22050 samples per second

    Window length: 2048

    Frames per Second (fps): 31.3 fps

    Hop size: sample_rate // fps = 704

    Filterbank: log scaled filterbank with 24 bands per octave and min frequency 20 Hz
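
    The hop size follows from the sample rate and frame rate listed above, and it can be used to convert a spectrogram frame index into performance time. This is a small sketch: the assumption that frame i starts at sample i * hop size (no centering or padding) is ours and is not stated in the dataset description, and frame_to_seconds and effective_fps are hypothetical helper names.

```python
SAMPLE_RATE = 22050  # samples per second, from the parameter list
FPS = 31.3           # nominal frames per second, from the parameter list

# Hop size as given in the description: sample_rate // fps = 704 samples.
HOP_SIZE = int(SAMPLE_RATE // FPS)


def frame_to_seconds(frame_index):
    """Start time of a frame, assuming frame i begins at sample i * HOP_SIZE."""
    return frame_index * HOP_SIZE / SAMPLE_RATE


def effective_fps():
    """Frame rate implied by the integer hop size (close to the nominal 31.3)."""
    return SAMPLE_RATE / HOP_SIZE


if __name__ == "__main__":
    print(HOP_SIZE)
    print(frame_to_seconds(100))
    print(effective_fps())
```

    Because the hop size is rounded down to an integer, the frame rate implied by it differs slightly from the nominal 31.3 fps, which matters when aligning frames with the loudness curves over long excerpts.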

    MIDI Performances

    Since the dataset consists of commercial recordings, we cannot include the audio files in the dataset. We can, however, share the 2 synthesized MIDI performances used in the Con Espressione game (for Bach’s Prelude in C and the second movement of Mozart’s Sonata in C K 545) in mp3 format. These performances can be found in the MIDIPerformances folder.

  18. f

    0. penaltis_clean.csv

    • figshare.com
    • portalcientifico.universidadeuropea.com
    txt
    Updated Feb 28, 2024
    Cite
    Rubén Maneiro Dios; Iyán Iván-Baragaño; jose luis losada; antonio arda (2024). 0. penaltis_clean.csv [Dataset]. http://doi.org/10.6084/m9.figshare.25306198.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    Feb 28, 2024
    Dataset provided by
    figshare
    Authors
    Rubén Maneiro Dios; Iyán Iván-Baragaño; jose luis losada; antonio arda
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database of penalty kicks in high-performance professional football.

  19. Data

    • figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Anonymous Anonymous (2023). Data [Dataset]. http://doi.org/10.6084/m9.figshare.23279870.v1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Anonymous Anonymous
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    data
    ├── reviews
    │   ├── com.reddit.frontpage_cleaned.csv
    │   ├── com.snapchat.android_cleaned.csv
    │   ├── com.soundcloud.android_cleaned.csv
    │   └── com.twitter.android_cleaned.csv
    └── training
        └── truthset.csv

  20. g

    Territory: Localization of Play Areas | gimi9.com

    • gimi9.com
    Updated Dec 22, 2024
    Cite
    (2024). Territory: Localization of Play Areas | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_ds724/
    Explore at:
    Dataset updated
    Dec 22, 2024
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset lists public spaces that contain play areas, that is, areas dedicated to play activities and fitted with usable play equipment. It was issued by the Municipality of Milan. The downloadable CSV resource contains columns of georeferenced points, each corresponding to the centroid of the polygonal geometry used in the GeoJSON resource.
