4 datasets found
  1. Sportradar Baseball dataset

    • kaggle.com
    zip
    Updated Aug 30, 2019
    Cite
    Sportradar (2019). Sportradar Baseball dataset [Dataset]. https://www.kaggle.com/datasets/sportradar/baseball
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset authored and provided by
    Sportradar (http://sportradar.com/)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This public data includes pitch-by-pitch data for Major League Baseball (MLB) games in 2016. With this data you can effectively replay a game and rebuild basic statistics for players and teams.

    Content

    games_wide - Every pitch, steal, or lineup event for each at bat in the 2016 regular season.

    games_post_wide - Every pitch, steal, or lineup event for each at-bat in the 2016 post season.

    schedules - The schedule for every team in the regular season.

    *The schemas for the games_wide and games_post_wide tables are identical.

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that the methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.baseball.[TABLENAME]. Fork this kernel to get started and learn how to safely analyze large BigQuery datasets.
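    As a rough illustration of the querying step, the sketch below builds a row-count query for one of the three tables named above and, optionally, runs it with the google-cloud-bigquery client. It assumes the tables live under the public dataset bigquery-public-data.baseball; the helper names build_row_count_query and run_query are hypothetical, and running the query needs credentials that are not shown here.

```python
# Minimal sketch, not an official example. Assumption: the tables are
# published under the public BigQuery dataset `bigquery-public-data.baseball`.
KNOWN_TABLES = {"games_wide", "games_post_wide", "schedules"}


def build_row_count_query(table_name: str) -> str:
    """Return a standard-SQL query that counts the rows of one table."""
    if table_name not in KNOWN_TABLES:
        raise ValueError(f"unknown table: {table_name!r}")
    return (
        "SELECT COUNT(*) AS n_rows "
        f"FROM `bigquery-public-data.baseball.{table_name}`"
    )


def run_query(sql: str):
    """Run the query and return its rows.

    Requires the google-cloud-bigquery package and valid credentials.
    """
    # Imported lazily so the query builder above also works offline.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(sql).result())


if __name__ == "__main__":
    print(build_row_count_query("games_wide"))
```

    Counting rows first is a cheap way to check that the table path is right before writing queries that scan many columns.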

    Acknowledgements

    Dataset Source: Sportradar LLC

    Use: Copyright Sportradar LLC. Access to data is intended solely for internal research and testing purposes, and is not to be used for any business or commercial purpose. Data are not to be exploited in any manner without express approval from Sportradar. Display of data must include the phrase, “Data provided by Sportradar LLC,” and be hyperlinked to www.sportradar.com.

  2. The History of Baseball

    • kaggle.com
    zip
    Updated Nov 14, 2019
    Cite
    SeanLahman (2019). The History of Baseball [Dataset]. https://www.kaggle.com/seanlahman/the-history-of-baseball
    Available download formats: zip (21463012 bytes)
    Dataset updated
    Nov 14, 2019
    Authors
    SeanLahman
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Baffled why your team traded for that 34-year-old pitcher? Convinced you can create a new and improved version of WAR? Wondering what made the 1907 Cubs great and if they can do it again?

    The History of Baseball is a reformatted version of the famous Lahman’s Baseball Database. It contains Major League Baseball’s complete batting and pitching statistics from 1871 to 2015, plus fielding statistics, standings, team stats, park stats, player demographics, managerial records, awards, post-season data, and more.

    Scripts, Kaggle’s free, in-browser analytics tool, makes it easy to share detailed sabermetrics, predict the next hall of fame inductee, illustrate how speed scores runs, or publish a definitive analysis on why the Los Angeles Dodgers will never win another World Series.

    We have more ideas for analysis than games in a season, but here are a few we’d really love to see:

    • Is there a most error-prone position?
    • When do players at different positions peak?
    • Are the best performers selected for the All-Star Game?
    • How many walks does it take for a starting pitcher to get pulled?
    • Do players who often ground into double plays (GIDP) have a lower batting average?
    • Which players are the most likely to choke during the post-season?
    • Why should or shouldn’t the National League adopt the designated hitter rule?
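
    To give a feel for how one of these questions might be approached, here is a small sketch for the error-prone-position question. It does not read the real database: it builds an in-memory SQLite table with made-up rows, and the table name fielding and the columns pos, g (games), and e (errors) are assumptions that should be checked against the actual schema.

```python
import sqlite3

# Toy stand-in for the real database. The table and column names
# (fielding, pos, g, e) are assumptions, and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fielding (player_id TEXT, pos TEXT, g INTEGER, e INTEGER)"
)
conn.executemany(
    "INSERT INTO fielding VALUES (?, ?, ?, ?)",
    [
        ("a01", "SS", 150, 18),
        ("b02", "SS", 140, 12),
        ("c03", "1B", 155, 6),
        ("d04", "CF", 148, 4),
        ("e05", "3B", 120, 15),
    ],
)


def errors_per_game_by_position(connection):
    """Rank positions by total errors divided by total games played."""
    return connection.execute(
        """
        SELECT pos, 1.0 * SUM(e) / SUM(g) AS errors_per_game
        FROM fielding
        GROUP BY pos
        ORDER BY errors_per_game DESC
        """
    ).fetchall()


ranking = errors_per_game_by_position(conn)
for pos, rate in ranking:
    print(f"{pos}: {rate:.3f} errors per game")
```

    Dividing by games rather than comparing raw error totals keeps positions with more playing time from looking worse than they are.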

    See the full SQLite schema.

  3. FiveThirtyEight MLB Elo Dataset

    • kaggle.com
    Updated Apr 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight MLB Elo Dataset [Dataset]. https://www.kaggle.com/datasets/fivethirtyeight/fivethirtyeight-mlb-elo-dataset/versions/113
    Dataset updated
    Apr 26, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FiveThirtyEight
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    Files:

    • https://projects.fivethirtyeight.com/mlb-api/mlb_elo.csv
    • https://projects.fivethirtyeight.com/mlb-api/mlb_elo_latest.csv

    MLB Elo

    This file contains links to the data behind The Complete History Of MLB and our MLB Predictions.

    mlb_elo.csv contains game-by-game Elo ratings and forecasts back to 1871. mlb_elo_latest.csv contains game-by-game Elo ratings and forecasts for only the latest season.

    The data contains two separate systems for rating teams; the simpler Elo ratings, used for The Complete History Of MLB, and the more involved — and confusingly named — "ratings" that are used in our MLB Predictions. The main difference is that Elo ratings are reverted to the mean between seasons, while the more involved ratings use preseason team projections from several projection systems and account for starting pitchers. More information can be found in this article.
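
    The two ideas in that paragraph, a win probability derived from a rating gap and reversion to the mean between seasons, can be sketched with the textbook Elo formulas. This is not FiveThirtyEight's exact model, which may apply further adjustments; the league mean of 1500 and the reversion fraction of one third used as defaults below are illustrative assumptions.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expectation that side A beats side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def revert_to_mean(rating: float, mean: float = 1500.0,
                   fraction: float = 1.0 / 3.0) -> float:
    """Move a rating part of the way back toward the league mean.

    `mean` and `fraction` are assumed defaults for illustration only.
    """
    return rating + fraction * (mean - rating)


if __name__ == "__main__":
    home, away = 1560.0, 1500.0
    print(f"home win probability: {elo_win_probability(home, away):.3f}")
    print(f"home rating next season: {revert_to_mean(home):.1f}")
```

    Because the probability depends only on the rating difference, evenly matched teams come out at exactly 0.5 under this formula.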

    Column: Definition

    • date: Date of game
    • season: Year of season
    • neutral: Whether game was on a neutral site
    • playoff: Whether game was in playoffs, and the playoff round if so
    • team1: Abbreviation for home team
    • team2: Abbreviation for away team
    • elo1_pre: Home team's Elo rating before the game
    • elo2_pre: Away team's Elo rating before the game
    • elo_prob1: Home team's probability of winning according to Elo ratings
    • elo_prob2: Away team's probability of winning according to Elo ratings
    • elo1_post: Home team's Elo rating after the game
    • elo2_post: Away team's Elo rating after the game
    • rating1_pre: Home team's rating before the game
    • rating2_pre: Away team's rating before the game
    • pitcher1: Name of home starting pitcher
    • pitcher2: Name of away starting pitcher
    • pitcher1_rgs: Home starting pitcher's rolling game score before the game
    • pitcher2_rgs: Away starting pitcher's rolling game score before the game
    • pitcher1_adj: Home starting pitcher's adjustment to their team's rating
    • pitcher2_adj: Away starting pitcher's adjustment to their team's rating
    • rating_prob1: Home team's probability of winning according to team ratings and starting pitchers
    • rating_prob2: Away team's probability of winning according to team ratings and starting pitchers
    • rating1_post: Home team's rating after the game
    • rating2_post: Away team's rating after the game
    • score1: Home team's score
    • score2: Away team's score
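
    One way these columns might be used is to score the forecasts against the results. The sketch below computes a Brier score (mean squared error of the home-win probability) from elo_prob1, score1, and score2. The three sample rows and their team codes are made up so the example runs offline; a real run would read mlb_elo.csv instead.

```python
import csv
import io

# Made-up rows that use a subset of the documented column names.
SAMPLE_CSV = """date,team1,team2,elo_prob1,score1,score2
2016-04-03,AAA,BBB,0.60,4,2
2016-04-04,CCC,DDD,0.55,1,3
2016-04-05,EEE,FFF,0.50,5,0
"""


def brier_score(rows, prob_column="elo_prob1"):
    """Mean squared error between the home-win forecast and the outcome."""
    total = 0.0
    count = 0
    for row in rows:
        # Skip games without a final score, such as games not yet played.
        if not row["score1"] or not row["score2"]:
            continue
        home_won = 1.0 if int(row["score1"]) > int(row["score2"]) else 0.0
        total += (float(row[prob_column]) - home_won) ** 2
        count += 1
    if count == 0:
        raise ValueError("no completed games to score")
    return total / count


games = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
score = brier_score(games)
print(f"Brier score over {len(games)} games: {score:.4f}")
```

    Passing prob_column="rating_prob1" on rows that include that column would score the second rating system the same way, which makes the two directly comparable.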

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

  4. U.S. Major League Soccer Salaries

    • kaggle.com
    Updated Jul 13, 2017
    Cite
    Chris Crawford (2017). U.S. Major League Soccer Salaries [Dataset]. https://www.kaggle.com/crawford/us-major-league-soccer-salaries/discussion
    Dataset updated
    Jul 13, 2017
    Dataset provided by
    Kaggle
    Authors
    Chris Crawford
    Area covered
    United States
    Description

    Context

    The Major League Soccer Players Union releases the salaries of every MLS player each year. This is a collection of salaries from 2007 to 2017.

    Content

    Each file contains the following fields:

    • club: Team abbreviation
    • last_name: Player last name
    • first_name: Player first name
    • position: Position abbreviation
    • base_salary: Base salary
    • guaranteed_compensation: Guaranteed compensation
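
    A short sketch of how a file with these fields could be summarized follows. The rows are invented placeholders that only mirror the field names listed above, and the helper names top_earner and payroll_by_club are hypothetical; the real files would be read from disk rather than from an inline string.

```python
import csv
import io

# Invented rows that mirror the documented field names.
SAMPLE_CSV = """club,last_name,first_name,position,base_salary,guaranteed_compensation
AAA,Doe,Jane,F,100000.00,125000.00
AAA,Roe,Rick,M,65000.00,65000.00
BBB,Poe,Pat,D,250000.00,300000.00
"""


def top_earner(rows):
    """Return the row with the highest guaranteed compensation."""
    return max(rows, key=lambda r: float(r["guaranteed_compensation"]))


def payroll_by_club(rows):
    """Sum guaranteed compensation per club abbreviation."""
    totals = {}
    for r in rows:
        club = r["club"]
        totals[club] = totals.get(club, 0.0) + float(r["guaranteed_compensation"])
    return totals


players = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
best = top_earner(players)
payroll = payroll_by_club(players)
print(best["first_name"], best["last_name"], best["guaranteed_compensation"])
print(payroll)
```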

    Acknowledgements

    Jeremy Singer-Vine over at Data is Plural scraped the PDFs released by the MLS Players Union and put the data in a nice little package of CSV files for everyone.

    I downloaded this dataset from https://github.com/data-is-plural/mls-salaries (MIT License).

    Inspiration

    Who in the MLS makes the most money? Are they worth it? I make about $900 bazillion each year, can I afford a soccer team?


Sportradar Baseball dataset

Play-by-play data for every Baseball game in 2016
