Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
IPL match data from 2008-2023 was downloaded from cricsheet.org in JSON format. I used the pandas Python library to transform this data into ball-by-ball data with a number of relevant and useful columns. After this, the data was saved as a csv file.
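The JSON-to-ball-by-ball step described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: it assumes the Cricsheet JSON layout of `innings` -> `overs` -> `deliveries` with a nested `runs` object, and the function name `match_to_ball_by_ball`, the output column names, and the in-memory `sample_match` are all hypothetical stand-ins for a real downloaded file.

```python
import pandas as pd


def match_to_ball_by_ball(match: dict, match_id: str) -> pd.DataFrame:
    """Flatten one match dict (assumed Cricsheet-style layout) into one row per delivery."""
    rows = []
    for innings_no, innings in enumerate(match.get("innings", []), start=1):
        for over in innings.get("overs", []):
            for ball_no, delivery in enumerate(over.get("deliveries", []), start=1):
                runs = delivery.get("runs", {})
                rows.append(
                    {
                        "match_id": match_id,
                        "innings": innings_no,
                        "batting_team": innings.get("team"),
                        "over": over.get("over"),
                        "ball_in_over": ball_no,  # counts wides and no-balls too
                        "batter": delivery.get("batter"),
                        "bowler": delivery.get("bowler"),
                        "batter_runs": runs.get("batter", 0),
                        "extras": runs.get("extras", 0),
                        "total_runs": runs.get("total", 0),
                        "wicket": int(bool(delivery.get("wickets"))),
                    }
                )
    return pd.DataFrame(rows)


# Hypothetical, hand-written match fragment used only to exercise the function.
sample_match = {
    "innings": [
        {
            "team": "Team A",
            "overs": [
                {
                    "over": 0,
                    "deliveries": [
                        {"batter": "A1", "bowler": "B1",
                         "runs": {"batter": 1, "extras": 0, "total": 1}},
                        {"batter": "A2", "bowler": "B1", "extras": {"wides": 1},
                         "runs": {"batter": 0, "extras": 1, "total": 1}},
                    ],
                },
                {
                    "over": 1,
                    "deliveries": [
                        {"batter": "A1", "bowler": "B2",
                         "runs": {"batter": 4, "extras": 0, "total": 4}},
                        {"batter": "A1", "bowler": "B2",
                         "runs": {"batter": 0, "extras": 0, "total": 0},
                         "wickets": [{"player_out": "A1", "kind": "bowled"}]},
                    ],
                },
            ],
        }
    ]
}

balls = match_to_ball_by_ball(sample_match, match_id="sample")
```

For real files, each match would be read with `json.load` and the per-match frames combined with `pd.concat` before calling `DataFrame.to_csv` to save the result.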
This dataset was created as part of a project where I created metrics to rank players for T20 Internationals and the Indian Premier League (IPL). In the IPL Data Analysis notebook found below, I perform some exploratory data analysis before creating metrics to rank batters, bowlers and all-rounders based on the Runs Added/Reduced Compared to the Average Player. The entire project materials can be found at https://github.com/jamiewelsh25/Cricket_Data_Project/.
Furthermore, if you are interested in how I used this data to train predictive models for second innings chase success and first innings scores, then check out the notebooks below!
Since its introduction in 2003, T20 has become the star attraction within cricket for its high-voltage action and exciting phases of play. On the back of this popularity, the Indian Premier League (IPL) was formed in 2008 and remains one of the largest annual sports events in the world, alongside the Football Premier League and the National Basketball Association (NBA) (Rumsby B., 2018). As demand has grown, teams are spending money on a variety of avenues to gain a competitive edge. One of these avenues is data science, used to analyse player and opposition performance. As a result, the analysis of a league like the IPL is of growing importance. The data were obtained from Kaggle and comprise two datasets covering 2008-2017: match-by-match and ball-by-ball statistics. The datasets are suitable for extracting key data, producing descriptive statistics and visualizations, and applying machine learning techniques using Python. In this report, we aim to answer two key questions: 1. Which classification algorithm most accurately predicts the result of a match between two teams, and which features are most important to it? 2. Which regression algorithm best predicts a bowler's runs conceded?