100+ datasets found
  1. Game by Game MLB Batter Data (2017-2020)

    • kaggle.com
    Updated Aug 5, 2022
    Cite
    John Adamek (2022). Game by Game MLB Batter Data (2017-2020) [Dataset]. https://www.kaggle.com/datasets/johnadamek/game-by-game-mlb-batter-data-20172020
    Dataset updated
    Aug 5, 2022
    Dataset provided by
    Kaggle
    Authors
    John Adamek
    Description

    Content

    This dataset utilized raw data from Advanced Sports Analytics (https://www.advancedsportsanalytics.com/).

    This is a great website that provides raw MLB game data for every game. The data is quite messy and requires quite a bit of cleaning, but it is worth it! Batting, pitching, and play-by-play data were exported into CSV files for the 2017-2020 seasons. An R script is provided.

    Columns

    Key Column information:

    • Batting Order = Where the player batted in the lineup for that given day
    • Position = The position they played for that game
    • Pit = Total number of pitches they saw over the course of the game
    • Str = Total number of strikes they saw over the course of the game
    • Team.R = Total runs scored by the batter's team in the game
    • Team.H = Total hits by the batter's team in the game
    • Opponent.R = Total runs scored by the opposing team in the game
    • Opponent.H = Total hits by the opposing team in the game
    • X1b.Ump = First base umpire for the game
    • X2b.Ump = Second base umpire for the game
    • X3b.Ump = Third base umpire for the game
    • HP.Ump = Home plate umpire for the game
    • Date = Date of the game
    • Game.Time = Game time
    • H.A = Home or Away
    • Precipitation = yes/no
    • Sky = Whether it was sunny, cloudy, overcast, rain, drizzle, night, or in dome
    • Stadium = Stadium played in
    • Temperature = Temperature at game time
    • Weather = Character string combining temperature, wind speed, wind direction, and stadium/sky
    • Wind.Direction = Direction of the wind
    • Wind.Speed = Wind speed in mph
    • Starting.Pitcher = Starting pitcher
    • Over.Under = Over/Under of the game
    • Moneyline = The moneyline for the batter's team
    • Wagers = Amount of wagers placed on the game

    UPDATE

    Unfortunately, it seems they no longer make this raw data available on their website, so I will be uploading the raw data along with the cleaned files so that others can manipulate the data any way they like!

  2. mlb-play-by-plays-v1

    • huggingface.co
    Cite
    Finn Eyles, mlb-play-by-plays-v1 [Dataset]. https://huggingface.co/datasets/finnnnnnnnnnnn/mlb-play-by-plays-v1
    Authors
    Finn Eyles
    Description

    finnnnnnnnnnnn/mlb-play-by-plays-v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. MLB all-time games played leaders December 2024

    • statista.com
    Updated Dec 11, 2024
    Cite
    Statista (2024). MLB all-time games played leaders December 2024 [Dataset]. https://www.statista.com/statistics/856664/all-time-mlb-games-played-leaders/
    Dataset updated
    Dec 11, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 2024
    Area covered
    North America, Canada, United States
    Description

    Pete Rose has played the most games in Major League Baseball history, taking to the field in 3,562 games between 1963 and 1986. Second in the ranking is Carl Yastrzemski, who played in 3,308 MLB games.

  4. Sportradar Baseball dataset

    • kaggle.com
    zip
    Updated Aug 30, 2019
    Cite
    Sportradar (2019). Sportradar Baseball dataset [Dataset]. https://www.kaggle.com/datasets/sportradar/baseball
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset authored and provided by
    Sportradar (http://sportradar.com/)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This public data includes pitch-by-pitch data for Major League Baseball (MLB) games in 2016. With this data you can effectively replay a game and rebuild basic statistics for players and teams.

    Content

    games_wide - Every pitch, steal, or lineup event for each at-bat in the 2016 regular season.

    games_post_wide - Every pitch, steal, or lineup event for each at-bat in the 2016 postseason.

    schedules - The schedule for every team in the regular season.

    *The schemas for the games_wide and games_post_wide tables are identical.

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.baseball.[TABLENAME]. Fork this kernel to get started and to learn how to safely analyze large BigQuery datasets.
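As a rough illustration of querying these tables with the BigQuery Python client, the sketch below builds a small preview query and, separately, runs it. It assumes the tables live in the public `bigquery-public-data.baseball` dataset and that the `google-cloud-bigquery` package and credentials are available; the helper names `build_preview_query` and `run_preview` are hypothetical and not part of the dataset's documentation.

```python
# Table names taken from the Content section above.
ALLOWED_TABLES = {"games_wide", "games_post_wide", "schedules"}


def build_preview_query(table: str, limit: int = 10) -> str:
    """Return a SQL string that previews `limit` rows of one table."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Assumption: the dataset path is bigquery-public-data.baseball.
    return f"SELECT * FROM `bigquery-public-data.baseball.{table}` LIMIT {int(limit)}"


def run_preview(table: str, limit: int = 10) -> list:
    """Run the preview query and return the rows as plain dicts."""
    # Imported lazily so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job = client.query(build_preview_query(table, limit))
    return [dict(row.items()) for row in job.result()]


print(build_preview_query("schedules", 5))
```

Note that in BigQuery a `LIMIT` clause caps the rows returned but does not by itself reduce the bytes scanned, so selecting only the columns you need is the usual way to keep large queries cheap.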

    Acknowledgements

    Dataset Source: Sportradar LLC

    Use: Copyright Sportradar LLC. Access to data is intended solely for internal research and testing purposes, and is not to be used for any business or commercial purpose. Data are not to be exploited in any manner without express approval from Sportradar. Display of data must include the phrase, “Data provided by Sportradar LLC,” and be hyperlinked to www.sportradar.com.

  5. Play Sustainaball: An environmental footprint for an MLB team season

    • zenodo.org
    • datadryad.org
    bin, txt
    Updated Jun 10, 2022
    Cite
    Hannah Brady; Gabrielle Barsotti; Jordan Davis; Carly Norris; Eric Shaphran (2022). Play Sustainaball: An environmental footprint for an MLB team season [Dataset]. http://doi.org/10.25349/d9rg87
    Available download formats: txt, bin
    Dataset updated
    Jun 10, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hannah Brady; Gabrielle Barsotti; Jordan Davis; Carly Norris; Eric Shaphran
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In recent years, there has been increased attention and focus from the public on the environmental impact of professional sports organizations. Significant opportunities exist for Major League Baseball (MLB) teams to both reduce their own environmental footprint, and that of their fans, through sustainability initiatives. Despite stadiums using upwards of ten million gallons of water per year and having the same energy needs as a small city, no MLB team has completed a public-facing quantification of their total environmental footprint. This project calculated the carbon footprint and water consumption of the Tampa Bay Rays for the 2019 regular season. We analyzed Scope 1, 2, and 3 GHG emissions to identify hotspots within the Rays' operations, supply chains, and transportation. Fan transportation was found to be the largest source of GHGs, followed by food production for concessions. The cooling tower and restrooms were identified as the largest sources of onsite water usage. We created a repository of best practices as a resource for stadium managers that includes strategies to reduce GHGs and water use coupled with scenario analyses estimating potential reductions. The following recommendations are highlighted as the largest reduction opportunities: (1) prioritizing fan engagement to switch to more sustainable modes of transportation, and (2) offering and highlighting more vegetarian options at concessions. To further reduce emissions and water usage, MLB teams should prioritize sub-metering electricity and water lines and installing more efficient equipment.

  6. MLB average game length 2000-2024

    • statista.com
    Updated Jun 24, 2025
    Cite
    Statista (2025). MLB average game length 2000-2024 [Dataset]. https://www.statista.com/statistics/1310998/mlb-game-length/
    Dataset updated
    Jun 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    North America
    Description

    Ahead of the 2023 Major League Baseball season, a pitch clock was introduced to speed up the pace of the game. As a result, an average game during the 2024 MLB season lasted * hours and ** minutes. This was more than ** minutes shorter than an average game during the 2022 season, when the pitch clock had not yet been introduced.

  7. Comparative Analysis of MLB Players wOBA

    • kaggle.com
    Updated Jan 15, 2023
    Cite
    The Devastator (2023). Comparative Analysis of MLB Players wOBA [Dataset]. https://www.kaggle.com/datasets/thedevastator/comparative-analysis-of-mlb-players-2014-woba-st
    Dataset updated
    Jan 15, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Comparative Analysis of MLB Players wOBA Statistics

    Analyzing Skillsets in The Old Ballgame

    By Devi Ramanan [source]

    About this dataset

    This dataset offers a comprehensive look at the performance of 311 professional Major League Baseball players. It comprises key statistics including name, team, age, plate appearances (PA), batting average (AVG), on-base percentage minus batting average (OBP-AVG), isolated power (ISO), stolen bases (SB), and ultimate zone rating per 150 games (UZR/150). The dataset also contains more detailed metrics for each player, such as weighted values for singles (1Bw), doubles (2Bw), triples (3Bw), home runs (HRw), unintentional walks (uBBw), hit by pitches (HBPw), stolen bases attempted/successful (SBW/CSW), and weighted on-base average (wOBA). Together these data points give an insightful and objective measure of offensive performance. Jeff Long's Spira Award winning article analyzed this same data to compare MLB players whose skill sets are more similar than would otherwise be expected.


    How to use the dataset

    This dataset can be used to analyze the wOBA stats of MLB players with at least 250 plate appearances (PA). This dataset has data on 31 baseball players. The data includes the player's name, team, age, PA, batting average (AVG), on-base percentage minus batting average (OBP-AVG), isolated power (ISO), stolen bases (SB), Ultimate Zone Rating per 150 games (UZR/150), weighted value of singles (1Bw), weighted value of doubles (2Bw), weighted value of triples (3Bw), weighted value of home runs (HRw), weighted value of unintentional walks (uBBw), weighted value of hit by pitches (HBPw), and stolen base attempt success rate (CSW). Using this dataset, you can compare different MLB players' stats in the same year.

    Research Ideas

    • Analyzing and predicting batting performance. With this dataset, researchers could build models to examine correlations between batting metrics (such as strikeouts, walks, home runs, and stolen bases) and players' overall wOBA scores. This could show which batting factors contribute most to a team's success.
    • Comparing players from different teams in terms of their batting performance. By comparing two players with similar stats (for example, two offensive power hitters) across different teams, it would be possible to analyze whether certain teams consistently have better offensive players or simply have more players at particular positions.
    • Creating a predictive model for MLB draft prospects or free-agent signings. A model based on their stats and year-over-year changes in OBP-AVG or UZR/150 could indicate which emerging talents are likely to improve substantially over their careers, compared with aging stars who may gradually decline due to age-related factors such as injury and fatigue.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: Batting Key Stats2.csv

    | Column name | Description                                       |
    |:------------|:--------------------------------------------------|
    | Name        | Name of the player. (String)                      |
    | Team        | Team the player is on. (String)                   |
    | Age         | Age of the player. (Integer)                      |
    | PA          | Plate Appearances. (Integer)                      |
    | AVG         | Batting Average. (Float)                          |
    | OBP-AVG     | On-Base Percentage minus Batting Average. (Float) |
    | ISO         | Isolated Power. (Float)                           |
    | SB          | Stolen Bases. (Integer)                           |
    | UZR/150     | Ultimate Zone Rating per 150 games. (Float)       |

    File: 2014 wOBA Stats 3.csv

    | Column name | Description                        |
    |:------------|:-----------------------------------|
    | Name        | Name of the player. (String)       |
    | Team        | Team the player is on. (String)    |
    | PA          | Plate Appearances. (Integer)       |
    | 1Bw         | Weighted value of singles. (Float) |
    | 2Bw         | Weighted value of doubles. (Float) |

    ...

  8. MLB World Series most games by players 2024

    • statista.com
    Updated Nov 1, 2024
    Cite
    Statista (2024). MLB World Series most games by players 2024 [Dataset]. https://www.statista.com/statistics/1369849/mlb-world-series-most-games-players/
    Dataset updated
    Nov 1, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    North America
    Description

    Yogi Berra played in a record 75 MLB World Series games in a career spanning from 1946 to 1965. Berra spent his whole career in New York, first playing for the Yankees, before playing a single season for the Mets in 1965. The catcher won the World Series 10 times as a player, before claiming three more rings as a coach and manager.

  9. Major-League Baseball Player Salaries by Year, 1880-1919

    • openicpsr.org
    stata
    Updated Jan 3, 2017
    Cite
    John Charles Bradbury (2017). Major-League Baseball Player Salaries by Year, 1880-1919 [Dataset]. http://doi.org/10.3886/E100390V1
    Available download formats: stata
    Dataset updated
    Jan 3, 2017
    Dataset provided by
    Kennesaw State University
    Authors
    John Charles Bradbury
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1880 - Dec 31, 1919
    Description

    During the early days of professional baseball, the dominant major leagues imposed a “reserve clause” designed to limit player wages by restricting competition for labor. Entry into the market by rival leagues challenged the incumbent monopsony cartel’s ability to restrict compensation. Using a sample of player salaries from the first 40 years of the reserve clause (1880-1919), this study examines the impact of inter-league competition on player wages. This study finds a positive salary effect associated with rival league entry that is consistent with monopsony wage suppression, but the effect is stronger during the 20th century than the 19th century. Changes in levels of market saturation and minor-league competition may explain differences in the effects between the two eras.

  10. U.S. console owners interested in playing MLB The Show 2021, by platform

    • statista.com
    Updated Jun 14, 2022
    Cite
    Statista (2022). U.S. console owners interested in playing MLB The Show 2021, by platform [Dataset]. https://www.statista.com/statistics/1237534/share-us-console-owners-play-mlb-the-show/
    Dataset updated
    Jun 14, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    As of the first quarter of 2021, 10.4 percent of Xbox console owners in the United States said that they would consider playing the baseball game MLB The Show, in comparison to the 5.6 percent of PlayStation console owners in the same time period. The sports game was released in April 2021 and available on Xbox for the first time since 2006.

  11. Major League Baseball Stadiums

    • hub.arcgis.com
    • gateway-kids-nysdos.hub.arcgis.com
    • +1more
    Updated Feb 9, 2019
    Cite
    Esri (2019). Major League Baseball Stadiums [Dataset]. https://hub.arcgis.com/datasets/f60004d3037e42ad93cb03b9590cafec
    Dataset updated
    Feb 9, 2019
    Dataset authored and provided by
    Esri (http://esri.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This layer shows the locations of Major League Baseball (MLB) stadiums in the United States and Canada. The layer includes a popup with info on the stadium, including the city where it is located, the name of the home team, and its seating capacity. This layer was originally sourced from the Major Sports Venues layer from the Homeland Infrastructure Foundation-Level Data (HIFLD) database (https://gii.dhs.gov/HIFLD). This layer includes a subset of Major Sports Venues, which has been updated with some additional info on the MLB teams that play in these stadiums and re-published from the Esri organization in ArcGIS Online to provide access. Minor updates have been made to the data to add new stadiums and update existing stadium names.

  12. Average age of players in Major League Baseball by club 2023

    • statista.com
    Updated Sep 12, 2023
    Cite
    Statista (2023). Average age of players in Major League Baseball by club 2023 [Dataset]. https://www.statista.com/statistics/236223/major-league-baseball-clubs-by-average-age-of-players/
    Dataset updated
    Sep 12, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2023, the average age of players on each MLB team was between roughly 26 and 30 years. This is considered to be the prime of a player's career, as players are typically at their peak physical and athletic ability at this age.

    Who is the oldest player in the MLB?

    In 2023, the average age of the players on the New York Yankees' roster was 28.3 years. Out of all the teams in MLB, the Los Angeles Dodgers had the highest average player age. In the same year, the Toronto Blue Jays' average player age was 29.6 years.

    What is Major League Baseball?

    Major League Baseball (MLB) is the highest level of professional baseball in the United States and Canada. It comprises 30 teams, 29 of which are located in the United States and one in Canada. The teams are divided into two leagues: the American League (AL) and the National League (NL), and each league is further divided into three divisions: East, Central, and West. The teams play a 162-game regular season schedule, with the goal of earning a spot in the postseason, which consists of the AL and NL Championship Series, and the World Series. The team that wins the World Series is declared the champion of the MLB.

    Fans watch at home and live in the stadiums

    There are many ways to enjoy MLB games, whether you are a die-hard fan, a casual viewer, or a player yourself. You can watch games on TV, or stream them live online. In 2022, the average TV viewership of MLB World Series games stood at 11.8 million. Additionally, many teams have their own websites, social media accounts, and mobile apps that allow fans to stay up-to-date with the latest news, scores, and player stats. It is also possible to purchase tickets to games and watch the action live at the stadium. In 2022, the average attendance at the games in the MLB was 26,808.

  13. MLB - Unpacked json to csv

    • kaggle.com
    Updated Jul 26, 2021
    Cite
    Quvotha (2021). MLB - Unpacked json to csv [Dataset]. https://www.kaggle.com/tomokikmogura/mlb-unpacked-json-to-csv/code
    Dataset updated
    Jul 26, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Quvotha
    Description

    Context

    Competition page: https://www.kaggle.com/c/mlb-player-digital-engagement-forecasting/data

    Content

    "train.csv"'s json-like columns are unpacked by json.loads then stored in csv format. All csv files have date_ column, which indicates "train.csv"'s date column.
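The unpacking step described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the author's actual script; the sample column name `games` and the keys `gamePk` and `homeId` are made up for the example, and only the `date_` output column comes from the description.

```python
import csv
import io
import json


def unpack_json_column(rows, column):
    """Expand a JSON-array column into flat records tagged with the source date."""
    records = []
    for row in rows:
        raw = row.get(column)
        if not raw:  # skip rows where the json-like cell is empty
            continue
        for record in json.loads(raw):
            # date_ carries the original row's date, as in the description above
            records.append({"date_": row["date"], **record})
    return records


def to_csv(records):
    """Serialize the flat records as CSV text, keeping first-seen key order."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample rows standing in for train.csv
sample = [
    {"date": "20210401", "games": '[{"gamePk": 1, "homeId": 10}, {"gamePk": 2, "homeId": 11}]'},
    {"date": "20210402", "games": ""},
]
print(to_csv(unpack_json_column(sample, "games")))
```

Each JSON object becomes one CSV row, so a single date in the input can produce several rows in the output.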

  14. ECIN Replication Package for "Temporary Employment and the Protection of...

    • openicpsr.org
    delimited
    Updated Jun 11, 2025
    Cite
    Richard J. Paulsen (2025). ECIN Replication Package for "Temporary Employment and the Protection of Investments in Human Capital: Examining the Major League Baseball Player Market" [Dataset]. http://doi.org/10.3886/E232561V1
    Available download formats: delimited
    Dataset updated
    Jun 11, 2025
    Dataset provided by
    University of Michigan
    Authors
    Richard J. Paulsen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2009 - 2017
    Description

    The data in this replication package include Major League Baseball player performance and contract data. The study looks at how the temporary or permanent employment status of MLB players affects injury management. The study's abstract is as follows: When employees are employed in a temporary capacity, employers should be less willing to invest in their human capital relative to permanent employees. This study uses the context of injury management by Major League Baseball teams to test for differential investment in the protection of player human capital. Injury management is inherently uncertain as medical professionals can give differing opinions, so teams may be able to influence recovery times. Using a panel dataset and estimating player fixed-effects regressions, players are found to miss significantly fewer games to injury when employed on a temporary basis.

  15. mlb-schedule-formatted-for-digital-engagement

    • kaggle.com
    Updated Jul 26, 2021
    Cite
    mlandry (2021). mlb-schedule-formatted-for-digital-engagement [Dataset]. https://www.kaggle.com/mlandry/mlbscheduleformattedfordigitalengagement/code
    Dataset updated
    Jul 26, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    mlandry
    Description

    Context

    MLB 2021 schedule, formatted to fit the format of the MLB Player Digital Engagement competition. The schedule is available from numerous sources, but this data set was created using: https://www.baseball-reference.com/leagues/MLB-schedule.shtml It was obtained on June 18th, and has not been updated. There surely have been changes to the schedule since June 18th, and these will not be reflected.

    Content

    Each team's schedule is available twice: once with the team as the primary teamId, and again with the team as the opponentId, listed from the opposite frame of reference. Double-headers may not be accounted for properly, both forward/future and backward/history (as of June 18th).
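The "listed twice" layout can be illustrated with a short sketch. It assumes a hypothetical input of one record per game with `homeId` and `awayId` fields; only `teamId` and `opponentId` are named in the description, and the `date` and `isHome` fields are illustrative additions.

```python
def mirror_schedule(games):
    """Emit two rows per game, one from each team's frame of reference."""
    rows = []
    for game in games:
        # Home team's view of the game
        rows.append({
            "date": game["date"],
            "teamId": game["homeId"],
            "opponentId": game["awayId"],
            "isHome": True,
        })
        # Same game seen from the away team's side
        rows.append({
            "date": game["date"],
            "teamId": game["awayId"],
            "opponentId": game["homeId"],
            "isHome": False,
        })
    return rows


# Hypothetical team ids, for illustration only
games = [{"date": "2021-06-18", "homeId": 147, "awayId": 111}]
for row in mirror_schedule(games):
    print(row)
```

With this shape, filtering on `teamId` alone returns a team's full schedule, at the cost of every game appearing in two rows.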

    Acknowledgements

    baseball-reference.com

  16. Major League Baseball Stadiums (with Placekey)

    • disasters-geoplatform.hub.arcgis.com
    • share-open-data-njtpa.hub.arcgis.com
    • +1more
    Updated Oct 3, 2020
    Cite
    Esri (2020). Major League Baseball Stadiums (with Placekey) [Dataset]. https://disasters-geoplatform.hub.arcgis.com/datasets/esri::major-league-baseball-stadiums-with-placekey
    Dataset updated
    Oct 3, 2020
    Dataset authored and provided by
    Esri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This layer shows the locations of Major League Baseball (MLB) stadiums in the United States and Canada, with an added Placekey to enable joining with other datasets. The layer includes a popup with info on the stadium, including the city where it is located, the name of the home team, and its seating capacity. This layer was originally sourced from the Major Sports Venues layer from the Homeland Infrastructure Foundation-Level Data (HIFLD) database (https://gii.dhs.gov/HIFLD). This layer includes a subset of Major Sports Venues, which has been updated with some additional info on the MLB teams that play in these stadiums and re-published from the Esri organization in ArcGIS Online to provide access. Minor updates have been made to the data to add new stadiums and update existing stadium names.

  17. SLD: Sports Leagues Dataset

    • explore.openaire.eu
    • data.niaid.nih.gov
    Updated Jun 25, 2019
    Cite
    André A. Bastos; Matheus O. Salim; Wladmir C. Brandão (2019). SLD: Sports Leagues Dataset [Dataset]. http://doi.org/10.5281/zenodo.3256432
    Dataset updated
    Jun 25, 2019
    Authors
    André A. Bastos; Matheus O. Salim; Wladmir C. Brandão
    Description

    The Sports Leagues Dataset (SLD) contains statistical data on the major professional sports leagues in the United States: NFL (National Football League), NBA (National Basketball Association), NHL (National Hockey League) and MLB (Major League Baseball). It covers five topics (Player Expenses, Player Salaries, Players Performance, Team Salaries, Team Valuation) across two dimensions (Finance and Performance) for different seasons (2000-2007), collected from three data sources (Forbes, Spotrac and Sports Reference). Please consider citing https://doi.org/10.5281/zenodo.3256432 if you find this dataset useful: [1] André Albino Bastos, Matheus de Oliveira Salim, Wladmir Cardoso Brandão. (2019). SLD: The Sports Leagues Dataset (Version 1.0) [Data set]. Zenodo.

  18. Average player salary in MLB by team 2019

    • statista.com
    Updated Jun 13, 2022
    Cite
    Statista (2022). Average player salary in MLB by team 2019 [Dataset]. https://www.statista.com/statistics/675254/average-mlb-salary-by-team/
    Dataset updated
    Jun 13, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2019
    Area covered
    Canada, United States
    Description

    The statistic shows the average player salary of the teams in Major League Baseball in 2019. The New York Yankees had an average player salary of 7.69 million U.S. dollars for the 2019 season.

  19. MLB 2017 Regular Season Top Hitters

    • kaggle.com
    zip
    Updated Dec 4, 2017
    Cite
    bitroy (2017). MLB 2017 Regular Season Top Hitters [Dataset]. https://www.kaggle.com/bitroy/mlb-2017-regular-season-top-hitters
    Available download formats: zip (5846 bytes)
    Dataset updated
    Dec 4, 2017
    Authors
    bitroy
    Description

    Context

    The top ranked 144 hitters in MLB during 2017's regular season.

    Content

    The data for each player includes:
    • Team - MLB Team
    • Pos - Field Position
    • G - Games Played
    • AB - At Bats
    • R - Runs Scored
    • H - Hits
    • 2B - Doubles
    • 3B - Triples
    • HR - Home Runs
    • RBI - Runs Batted In
    • BB - Walks
    • SO - Strike Outs
    • SB - Stolen Bases
    • CS - Caught Stealing (times thrown out while trying to steal)
    • AVG - Batting Average (H/AB)
    • OBP - On Base Percentage (H+BB+HBP)/(AB+BB+HBP+SF)
    • SLG - Slugging Percentage (TB/AB), total bases divided by at bats
    • OPS - On base percentage plus slugging (OBP + SLG)
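The rate statistics defined above can be recomputed from the counting stats. The sketch below is a minimal illustration; the function name `batting_rates` is hypothetical, and HBP (hit by pitch) and SF (sacrifice flies) are passed in as extra inputs because the OBP formula needs them even though they are not among the listed columns.

```python
def batting_rates(h, ab, bb, hbp, sf, doubles, triples, hr):
    """Compute AVG, OBP, SLG and OPS from counting stats, per the formulas above."""
    if ab <= 0:
        raise ValueError("at bats must be positive")
    singles = h - doubles - triples - hr
    # Total bases: 1 per single, 2 per double, 3 per triple, 4 per home run
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return {"AVG": avg, "OBP": obp, "SLG": slg, "OPS": obp + slg}


# Made-up season line, for illustration only
rates = batting_rates(h=150, ab=500, bb=50, hbp=5, sf=5, doubles=30, triples=5, hr=20)
print({name: round(value, 3) for name, value in rates.items()})
```

Comparing these recomputed values against the dataset's own AVG, OBP, SLG, and OPS columns is a quick sanity check on the data.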

    Acknowledgements

    Major League Baseball makes a lot of statistics available for you at: http://mlb.mlb.com/stats

    Inspiration

    This is just a fun data set to play with for newbies. Inspired by the fun of baseball.

  20. Performance and social connection data for baseball and basketball from 2001...

    • search.dataone.org
    • datadryad.org
    Updated May 4, 2025
    Cite
    Emily Evans; Benjamin Webb; Rebecca Jones (2025). Performance and social connection data for baseball and basketball from 2001 to 2020 [Dataset]. http://doi.org/10.5061/dryad.g4f4qrfs5
    Dataset updated
    May 4, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Emily Evans; Benjamin Webb; Rebecca Jones
    Time period covered
    Jan 1, 2022
    Description

    We examine whether social data can be used to predict how members of Major League Baseball (MLB) and members of the National Basketball Association (NBA) transition between teams during their careers. We find that incorporating social data into various machine learning algorithms substantially improves the algorithms' ability to correctly determine these transitions in the NBA but only marginally in MLB. We also measure the extent to which player performance and team fitness data can be used to predict transitions between teams. This data, however, only slightly improves our predictions for both basketball and baseball players. We also consider whether social, performance, and team fitness data can be used to infer past transitions. Here we find that social data significantly improves our inference accuracy in both the NBA and MLB, but player performance and team fitness data again do little to improve this score.

Game by Game MLB Batter Data (2017-2020)

Individual Batter data by at-bats
