Major League Baseball is one of the most popular professional sports leagues in North America. A survey of interest in MLB in the United States showed that 36 percent of Hispanic respondents were avid fans of the league.
Major League Baseball (MLB) is a professional sports league in North America made up of 30 teams that compete in the American League and the National League. In 2023, just over ** percent of players within the league were Hispanic or Latino.
A January 2024 survey in the United States revealed that almost 69 percent of MLB fans who attended or watched games were Caucasian. Meanwhile, close to 19 percent of MLB fans were Hispanic.
This dataset was created by Omar Pelcastre
There were a total of 949 players on opening day rosters of Major League Baseball teams ahead of the 2024 season. Of these players, almost 28 percent were from countries and territories outside the United States, with the Dominican Republic being the most represented nation.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The Lahman Baseball Database is a comprehensive, open-source compilation of statistics and player data for Major League Baseball (MLB). It contains relational data from the 19th century through the most recent complete season, including batting, pitching, and fielding statistics, player demographics, awards, team performance, and managerial records.
This dataset is widely used for exploratory data analysis, statistical modeling, predictive analysis, machine learning, and sports performance forecasting.
This dataset is the latest CSV release of the Lahman Baseball Database, downloaded directly from https://sabr.org/lahman-database/. It includes historical MLB data spanning 1871 to 2024, organized across 27 structured tables such as:
- Batting: Player-level batting stats per year
- Pitching: Season-level metrics
- People: Biographical data (birth/death, handedness, debut/finalGame)
- Teams, Managers: Team records
- BattingPost, PitchingPost, FieldingPost: Post-season stats
- AllstarFull: All-Star Game stats
- HallOfFame: Historical awards and recognitions
Items to explore:
- Track league-wide trends in home runs, strikeouts, or batting averages over time
- Compare player performance by era, position, or handedness (righty/lefty)
- Create a timeline showing changes in a team's win-loss record
- Map birthplace distributions of MLB players over time
- Estimate the impact of rule changes on player stats (pitch clock, DH)
- Model factors that influence MVP or Cy Young award wins
- Predict a player's future performance based on historical stats
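As a starting point for the first item, the sketch below totals home runs per season from rows shaped like the Lahman Batting table. The column names `yearID` and `HR` follow the Lahman schema; the sample rows and the helper name `home_runs_by_year` are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the shape of the Lahman Batting table.
# The real file has many more columns; only yearID and HR are used here.
SAMPLE_BATTING = """playerID,yearID,teamID,HR
aaa01,1998,NYA,30
bbb01,1998,BOS,12
ccc01,2014,NYA,8
ddd01,2014,BOS,15
"""


def home_runs_by_year(csv_file):
    """Return {year: total home runs} from a Batting-style CSV file object."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Empty cells are treated as zero.
        totals[int(row["yearID"])] += int(row["HR"] or 0)
    return dict(totals)


league_hr = home_runs_by_year(io.StringIO(SAMPLE_BATTING))
for year in sorted(league_hr):
    print(year, league_hr[year])
```

To run this against the downloaded release, pass an opened file instead of the sample string, for example `open("Batting.csv", newline="")`, assuming the batting table is stored under that name.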
📘 License
This dataset is released under the Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license. Attribution is required. Derivative works must be shared under the same license.
📝 Official source: https://sabr.org/lahman-database/
📥 Direct data page: https://www.seanlahman.com/baseball-archive/statistics/
🖊️ R-Package Documentation: https://cran.r-project.org/web/packages/Lahman/Lahman.pdf
0.1 Copyright Notice & Limited Use License
This database is copyright 1996-2025 by SABR, via generous donation from Sean Lahman. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. For details see: http://creativecommons.org/licenses/by-sa/3.0/
For licensing information or further information, contact Scott Bush at: sbush@sabr.org
0.2 Contact Information
Web site: https://sabr.org/lahman-database/
E-Mail: jpomrenke@sabr.org
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains scraped Major League Baseball (MLB) batting statistics from Baseball Reference for the seasons 2015 through 2024. It was collected using a custom Python scraping script and then cleaned and processed in SQL for use in analytics and machine learning workflows.
The data provides a rich view of offensive player performance across a decade of MLB history. Each row represents a player’s season, with key batting metrics such as Batting Average (BA), On-Base Percentage (OBP), Slugging (SLG), OPS, RBI, and Games Played (G). This dataset is ideal for sports analytics, predictive modeling, and trend analysis.
Data was scraped directly from Baseball Reference using a Python script that:
Columns include:
- Player – Name of the player
- Year – Season year
- Age – Age during the season
- Team – Team code (2TM for multiple teams)
- Lg – League (AL, NL, or 2LG)
- G – Games played
- AB, H, 2B, 3B, HR, RBI – Core batting stats
- BA, OBP, SLG, OPS – Rate statistics
- Pos – Primary fielding position
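The rate statistics can be cross-checked against the counting columns, which is a useful sanity test for scraped data. The sketch below uses the standard definitions: BA is hits divided by at-bats, SLG is total bases divided by at-bats, and OPS is OBP plus SLG. OBP is taken from the row as given, since the walks and hit-by-pitch counts it needs are not among the listed columns. The sample row and function names are made up for illustration.

```python
def batting_average(h, ab):
    """BA = hits / at-bats, rounded to three decimals."""
    return round(h / ab, 3) if ab else None


def slugging(h, doubles, triples, hr, ab):
    """SLG = total bases / at-bats, rounded to three decimals."""
    if not ab:
        return None
    # Every hit is worth one base; doubles, triples, and home runs
    # add one, two, and three extra bases respectively.
    total_bases = h + doubles + 2 * triples + 3 * hr
    return round(total_bases / ab, 3)


# Made-up season line using the dataset's column names.
row = {"AB": 500, "H": 150, "2B": 30, "3B": 5, "HR": 20, "OBP": 0.370}

ba = batting_average(row["H"], row["AB"])
slg = slugging(row["H"], row["2B"], row["3B"], row["HR"], row["AB"])
ops = round(row["OBP"] + slg, 3)
print(ba, slg, ops)
```

Comparing these recomputed values with the scraped BA, SLG, and OPS columns would flag rows where the scrape or the SQL cleaning step went wrong.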
Raw data sourced from Baseball Reference.
Inspired by open baseball datasets and community-driven sports analytics.
A January 2024 survey in the United States revealed that over one quarter of MLB fans who attended or watched games were aged between 50 and 64. Meanwhile, just over four percent of NHL fans were aged between 13 and 17.
https://creativecommons.org/publicdomain/zero/1.0/
Data Collected:
1. team name
2. year
3. wins
4. losses
5. winning percentage
6. games behind
7. wild card games behind
8. record in last 10 games
9. current streak
10. runs scored
11. runs allowed
12. run differential
13. expected win/loss record
14. record at home
15. record when away
16. record against top 50 percent
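Item 13, the expected win/loss record, is commonly derived from runs scored and runs allowed using the Pythagorean expectation. The sketch below assumes that method with Bill James's original exponent of 2; the source does not say which formula or exponent this dataset uses, and some sites use an exponent closer to 1.83, so `exponent` is left as a parameter. The function name and inputs are illustrative.

```python
def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Estimate a (wins, losses) record from run totals.

    Uses win% = RS^x / (RS^x + RA^x), the Pythagorean expectation.
    """
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be non-zero")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)
    wins = round(win_pct * games)
    return wins, games - wins


# Made-up run totals for a full 162-game season.
wins, losses = pythagorean_record(800, 700)
print(wins, losses)
```

A team whose actual record is well above its expected record has typically won more close games than its run differential alone would predict.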
Major League Baseball is one of the most popular professional sports leagues in North America. A survey of interest in MLB in the United States showed that 33 percent of respondents aged 35 to 44 were avid fans of the league.
http://opendatacommons.org/licenses/dbcl/1.0/
In this dataset, I gather statistics of the top 50 batters from both the American League and National League, ranked by WAR from highest to lowest, from 2004 to 2024. I also include the awards earned by the players throughout their careers, with the goal of helping fans, researchers, and commentators correlate certain variables or statistics with a specific award.
Selection of the top 1 to 50 of each league in each year is based on the WAR metric, since it measures the total contribution of a player to his team.
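The selection rule can be sketched as a group-and-rank step: group player seasons by year and league, sort each group by WAR in descending order, and keep the first 50. The field names (`player`, `year`, `league`, `war`) and the sample rows below are hypothetical, not the dataset's actual columns.

```python
from collections import defaultdict


def top_by_war(rows, n=50):
    """Return {(year, league): top-n rows sorted by WAR, highest first}."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["year"], row["league"])].append(row)
    return {
        key: sorted(players, key=lambda r: r["war"], reverse=True)[:n]
        for key, players in groups.items()
    }


# Made-up player seasons for illustration.
sample = [
    {"player": "A", "year": 2024, "league": "AL", "war": 7.1},
    {"player": "B", "year": 2024, "league": "AL", "war": 9.3},
    {"player": "C", "year": 2024, "league": "NL", "war": 5.0},
    {"player": "D", "year": 2024, "league": "AL", "war": 2.2},
]

top2 = top_by_war(sample, n=2)
for key, players in sorted(top2.items()):
    print(key, [p["player"] for p in players])
```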
Find more about WAR on MLB: https://www.mlb.com/glossary/advanced-stats/wins-above-replacement
https://creativecommons.org/publicdomain/zero/1.0/
Offensive statistics on MLB Players between 1947 and 2017 were used to develop a prediction model for MLB Hall of Fame selection.
Baseball-Reference.com - https://stathead.com/tiny/4tEG2
By Andy Kriebel [source]
About this dataset
This dataset contains MLB hitting statistics for the 2013 season. The original source of the data is Lahman’s Baseball Database. The original visualization can be found here.
This dataset is interesting because it allows us to see which players were the most cost effective in terms of salary and production. For example, we can see that Miguel Cabrera was the highest paid player in 2013, but he was also one of the most productive hitters in terms of runs batted in (RBIs). On the other hand, we can see that players like Mike Trout and Clayton Kershaw were among the league leaders in production but they were not among the highest paid players.
There are a number of ways to measure a player's cost effectiveness, but one simple method is to compare their salary to their production (measured by runs created, or RC). Players who create a lot of runs while being paid relatively little are more cost effective than players who are paid more but produce less. By this metric, some of the most cost effective players in 2013 were Delmon Young, Wilson Ramos, and Shane Victorino.
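The comparison can be sketched as salary divided by runs created. The source does not say which runs-created formula was used, so the example below assumes the basic Bill James version, RC = (H + BB) × TB / (AB + BB). The player names, salaries, and stat lines are made up.

```python
def runs_created(h, bb, tb, ab):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    denom = ab + bb
    return (h + bb) * tb / denom if denom else 0.0


def salary_per_run_created(salary, rc):
    """Dollars paid per run created; lower means more cost effective."""
    return salary / rc if rc > 0 else float("inf")


# Made-up players: salary in dollars, plus hits, walks, total bases, at-bats.
players = [
    {"name": "Player Y", "salary": 20_000_000, "H": 180, "BB": 70, "TB": 330, "AB": 580},
    {"name": "Player X", "salary": 2_000_000, "H": 150, "BB": 50, "TB": 250, "AB": 550},
]

for p in players:
    p["RC"] = runs_created(p["H"], p["BB"], p["TB"], p["AB"])
    p["cost_per_rc"] = salary_per_run_created(p["salary"], p["RC"])

# Most cost effective first.
ranked = sorted(players, key=lambda p: p["cost_per_rc"])
for p in ranked:
    print(p["name"], round(p["RC"], 1), round(p["cost_per_rc"]))
```

In this made-up case the cheaper player ranks first even though the more expensive player creates more runs, which is the pattern the paragraph above describes.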
https://www.kaggle.com/andrewmvd/most-cost-effective-players-of-2019
How to Use This Dataset
This dataset consists of Major League Baseball's most cost effective players of 2019, as measured by WAR per dollar of salary (wWAR/$). WAR is a metric that attempts to measure a player's overall contributions to their team, and includes both offense and defense. You can read more about it here. The dataset includes each player's name, position, team, salary, and wWAR/$.
To use this dataset, you may want to consider the following questions:
* Who are the most cost effective players in baseball?
* What positions do these players tend to play?
* Which teams have the most cost effective players?
- finding the most cost-effective baseball players
- comparing different salary structures among teams
- improving player performance through analytics
If you use this dataset in your research, please credit the original authors.
License
License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
File: MLB Stats.csv

| Column name | Description |
|:------------|:------------|
| Player Name | The player's name. (String) |
| weight | The player's weight in pounds. (Numeric) |
| height | The player's height in inches. (Numeric) |
| bats | The player's batting handedness. (String) |
| throws | The player's throwing handedness. (String) |
| Season | The season in which the statistics were accrued. (String) |
| League | The league in which the player played. (String) |
| Team | The team for which the player played. (String) |
| Franchise | The franchise to which the team belongs. (String) |
| G | The number of games the player played. (Numeric) |
| AB | The number of at-bats the player had. (Numeric) |
| R | The number of runs the player scored. (Numeric) |
| H | The number of hits the player had. (Numeric) |
| 2B | The number of doubles the player hit. (Numeric) |

...
Major League Baseball (MLB) is a professional sports league in North America made up of ** teams that compete in the American League and the National League. In 2023, only *** percent of MLB players were African American.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Basic statistics for the MLB and NBA Twitter networks, computed using Mathematica.
Financial overview and grant giving statistics of Major League Baseball Youth Foundation
In this dataset, I gather the key statistics of the top 100 hitters in the league and flag whether each was an MVP in the respective year. I hope this dataset will be helpful for researchers, fans, and anyone interested in baseball.
The data was extracted from MLB Stats and Statcast.
Baseball is a popular American sport played on a diamond-shaped field. Games are 9 innings long and each inning has two halves: the first, in which the visiting team bats, and the second, in which the home team bats. Each half-inning ends after three outs. An out occurs when a player from the hitting team is removed from play for that half of the inning. Batters aim to get on base by hitting a ball pitched to them by the pitcher. Batters can get to first, second, or third base depending on how far they hit the ball and how fast they run. If a batter hits the ball past the outfield fences, they, along with any runners on base, automatically score; this is called a home run. Runners can also score if another player hits the ball and they then reach home. The team with the most runs wins the game.
There are 9 defensive positions in baseball. The layout of these positions is labeled in the diagram below.
https://data.scorenetwork.org/_prep/mlb_umpires_2008-2023/images/images.png
SOURCE: https://en.wikipedia.org/wiki/Baseball_positions
Behind the catcher at home plate is an official known as the home plate umpire. The umpire's role is to enforce the rules and make decisions during a game. Many of these decisions involve calling balls and strikes. Pitches that pass through the zone outlined below are considered strikes. Anything outside of that zone is called a ball. If a batter gets 3 strikes, they are out on a strikeout. If the batter gets 4 balls, they go to first base on what is called a walk.
https://data.scorenetwork.org/_prep/mlb_umpires_2008-2023/images/5bd08351ae57fd50e3c91538_Dimensions-Guide-Sports-Baseball-Strike-Zone-Dimensions.svg
SOURCE: https://www.dimensions.com/element/strike-zone
Major League Baseball (MLB) is a professional baseball league with 30 teams and a 162-game season. MLB has 76 umpires in total, with four umpires in each game. Umpires are stationed at 1st, 2nd, and 3rd base in addition to home plate, but the home plate umpire is the only one who makes calls on pitches.
The mlb_umpires.csv dataset looks at cumulative data from MLB home plate umpires dating as far back as 2008. The boost statistics in the dataset show how individual umpires compare to the “average” Major League Baseball umpire. The dataset provides insight into whether umpires favor defensive or offensive players.
The data set has 954 rows with 11 variables. Each row is an MLB home plate umpire combined with a boost_stat ranking how they compare with the average umpire. There are 159 umpires in the dataset with 6 rows per umpire. The data is cumulative from 2008 until 2024.
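The stated shape (159 umpires × 6 rows each = 954 rows) can be verified after loading the file. The sketch below counts rows per umpire on made-up records; the `umpire` field name and the placeholder `boost_stat` values are assumptions, since the column names and stat labels are not listed in full here.

```python
from collections import Counter


def rows_per_umpire(rows, name_field="umpire"):
    """Count how many rows each umpire contributes."""
    return Counter(row[name_field] for row in rows)


# Two made-up umpires with six placeholder boost_stat rows each.
sample_rows = [
    {"umpire": name, "boost_stat": f"stat_{i}"}
    for name in ("Umpire A", "Umpire B")
    for i in range(1, 7)
]

counts = rows_per_umpire(sample_rows)
all_have_six = all(n == 6 for n in counts.values())
expected_total_rows = 159 * 6  # 954, matching the row count stated above
print(dict(counts), all_have_six, expected_total_rows)
```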
https://www.versussportssimulator.com/terms-of-service
Get the latest Major League Baseball game predictions, power and performance rankings, offensive and defensive rankings, and other useful statistics from VersusSportsSimulator.com.
Financial overview and grant giving statistics of Mlb Foundation