https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains every player drafted in the NHL Draft from 1963 to 2022.
The data was collected from Sports Reference and then cleaned for data analysis.
Tabular data includes:
- year: Year of draft
- overall_pick: Overall pick at which the player was drafted
- team: Team the player was drafted by
- player: Player drafted
- nationality: Nationality of player drafted
- position: Player position
- age: Player age
- to_year: Last year the draft pick played
- amateur_team: Amateur team the player was drafted from
- games_played: Total games played by player (non-goalie)
- goals: Total goals
- assists: Total assists
- points: Total points
- plus_minus: Plus minus of player
- penalties_minutes: Penalties in minutes
- goalie_games_played: Goalie games played
- goalie_wins
- goalie_losses
- goalie_ties_overtime: Ties plus overtime/shootout losses
- save_percentage
- goals_against_average
- point_shares
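As a rough illustration of how the columns above might be used, the sketch below computes points per game by position. The rows are invented placeholders that only mimic the column layout; they are not values from the dataset, and the blank goalie row is an assumption about how non-skater rows might look.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows that only mimic the column layout described above;
# the players and numbers are invented for illustration.
SAMPLE_CSV = """year,overall_pick,player,position,games_played,points
2001,1,Player A,C,800,720
2001,2,Player B,D,400,100
2002,1,Player C,C,200,80
2002,2,Player D,G,,
"""


def points_per_game_by_position(csv_text):
    """Return {position: total points / total games} for rows with skater stats."""
    games = defaultdict(int)
    points = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # games_played is described as non-goalie, so goalie rows may be blank.
        if not row["games_played"] or not row["points"]:
            continue
        games[row["position"]] += int(row["games_played"])
        points[row["position"]] += int(row["points"])
    return {pos: points[pos] / games[pos] for pos in games if games[pos] > 0}


result = points_per_game_by_position(SAMPLE_CSV)
for position, ppg in sorted(result.items()):
    print(position, round(ppg, 3))
```

Summing points and games before dividing weights each position by games played rather than averaging per-player ratios, which keeps one-game careers from dominating.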
https://creativecommons.org/publicdomain/zero/1.0/
This dataset features the salaries of 874 NHL players for the 2016/2017 season. I have randomly split the players into training (612 players) and test (262 players) populations. There are 151 predictor columns (described in the column legend section; if you're not familiar with hockey, the meaning of some of these may be a bit cryptic!) as well as a leading column with the player's 2016/2017 annual salary. For the test population, the actual salaries have been broken off into a separate .csv file.
The raw Excel sheet was acquired from http://www.hockeyabstract.com/
Can you build a model to predict NHL players' salaries? What are the best predictors of how much a player will make?
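A random split with the sizes described above can be sketched as follows. This is not the procedure actually used to build the published files; the seed and the use of integer player indices are assumptions made only so the example is reproducible.

```python
import random


def split_players(player_ids, n_train, seed=0):
    """Shuffle the ids with a fixed seed and cut them into train and test lists."""
    shuffled = list(player_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 874 players in total, 612 of them assigned to training (sizes from the description).
train_ids, test_ids = split_players(range(874), n_train=612)
print(len(train_ids), len(test_ids))
```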
Acronym - Meaning
%FOT - Percentage of all on-ice faceoffs taken by this player.
+/- - Plus/minus
1G - First goals of a game
A/60 - Events Against per 60 minutes, defaults to Corsi, but can be set to another stat
A1 - First assists, primary assists
A2 - Second assists, secondary assists
BLK% - Percentage of all opposing shot attempts blocked by this player
Born - Birth date
C.Close - A player's shot attempt (Corsi) differential when the game was close
C.Down - A player's shot attempt (Corsi) differential when the team was trailing
C.Tied - A player's shot attempt (Corsi) differential when the game was tied
C.Up - A player's shot attempt (Corsi) differential when the team was in the lead
CA - Shot attempts allowed (Corsi, SAT) while this player was on the ice
Cap Hit - The player's cap hit
CBar - Crossbars hit
CF - The team's shot attempts (Corsi, SAT) while this player was on the ice
CF.QoC - A weighted average of the Corsi percentage of a player's opponents
CF.QoT - A weighted average of the Corsi percentage of a player's linemates
CHIP - Cap Hit of Injured Player is games lost to injury multiplied by cap hit per game
City - City of birth
Cntry - Country of birth
DAP - Disciplined aggression proxy, which is hits and takeaways divided by minor penalties
DFA - Dangerous Fenwick against, which is on-ice unblocked shot attempts weighted by shot quality
DFF - Dangerous Fenwick for, which is on-ice unblocked shot attempts weighted by shot quality
DFF.QoC - Quality of Competition metric based on Dangerous Fenwick, which is unblocked shot attempts weighted for shot quality
DftRd - Round in which the player was drafted
DftYr - Year drafted
Diff - Events for minus events against, defaults to Corsi, but can be set to another stat
Diff/60 - Events for minus events against, per 60 minutes, defaults to Corsi, but can be set to another stat
DPS - Defensive point shares, a catch-all stat that measures a player's defensive contributions in points in the standings
DSA - Dangerous shots allowed while this player was on the ice, which is rebounds plus rush shots
DSF - The team's dangerous shots while this player was on the ice, which is rebounds plus rush shots
DZF - Shifts this player has ended with a defensive zone faceoff
dzFOL - Faceoffs lost in the defensive zone
dzFOW - Faceoffs won in the defensive zone
dzGAPF - Team goals allowed after faceoffs taken in the defensive zone
dzGFPF - Team goals scored after faceoffs taken in the defensive zone
DZS - Shifts this player has started with a defensive zone faceoff
dzSAPF - Team shot attempts allowed after faceoffs taken in the defensive zone
dzSFPF - Team shot attempts taken after faceoffs taken in the defensive zone
E+/- - A player's expected +/-, based on his team and minutes played
ENG - Empty-net goals
Exp dzNGPF - Expected goal differential after faceoffs taken in the defensive zone, based on the number of them
Exp dzNSPF - Expected shot differential after faceoffs taken in the defensive zone, based on the number of them
Exp ozNGPF - Expected goal differential after faceoffs taken in the offensive zone, based on the number of them
Exp ozNSPF - Expected shot differential after faceoffs taken in the offensive zone, based on the number of them
F.Close - A player's unblocked shot attempt (Fenwick) differential when the game was close
F.Down - A player's unblocked shot attempt (Fenwick) differential when the team was trailing
F.Tied - A player's unblocked shot attempt (Fenwick) differential when the game was tied
F.Up - A player's unblocked shot attempt (Fenwick) differential when the team was in the lead. Not the best acronym.
F/60 - Events For per 60 minutes, defaults to Corsi, but can be set to another stat
FA - Unblocked shot attempts allowed (Fenwick, USAT) while this player was on the ice
FF - The team's unblocked shot attempts (Fenwick, USAT) while this player was on the ice
First Name -
FO% - Faceoff winning percentage
FO%vsL - Faceoff winning percentage against lefthanded opponents
FO%vsR - Faceoff winning percentage against righthanded opponents
FOL - The team's faceoff losses...
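Several legend entries are simple arithmetic on other entries. The sketch below is one reading of Diff, the per-60 rates, and DAP, with Diff left at its Corsi default (CF minus CA). The `toi_minutes` argument (time on ice) and all input numbers are hypothetical, since the visible part of the legend does not name an ice-time column.

```python
def corsi_diff(cf, ca):
    """Diff with its Corsi default: shot attempts for minus shot attempts against."""
    return cf - ca


def per_60(events, toi_minutes):
    """Scale an event count to a rate per 60 minutes of ice time."""
    return events / toi_minutes * 60.0


def disciplined_aggression_proxy(hits, takeaways, minor_penalties):
    """DAP read as (hits + takeaways) / minor penalties; None if there are no minors."""
    if minor_penalties == 0:
        return None
    return (hits + takeaways) / minor_penalties


# Invented season totals for a single skater.
cf, ca, toi_minutes = 1200, 1100, 1000
diff = corsi_diff(cf, ca)
diff_per_60 = per_60(diff, toi_minutes)
dap = disciplined_aggression_proxy(hits=90, takeaways=30, minor_penalties=20)
print(diff, diff_per_60, dap)
```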
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
Hockey Player (sample 2398) is a dataset for object detection tasks - it contains Hockey Players annotations for 2,398 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [Public Domain license](https://creativecommons.org/licenses/Public Domain).
For some, statistical analysis increases the enjoyment of sport. My long-term aim is to build an automated, database-driven advanced stats website like many that have come and gone before. That, however, first starts with statistical analysis of the game to assess which individual actions contribute to the outcome of the game, and by how much.
The data represents all the official metrics measured for each game in the NHL in the past 6 years. I intend to update it semi-regularly depending on development progress of my database server.
This is a mostly relational database; please refer to "table_realtionships.jpg" for details on how the tables can be joined. It covers not just the results and player stats of NHL games but also details on individual plays such as shots, goals and stoppages, including date & time and x,y coordinates.
The dataset is incomplete: for some games, no plays information is available on NHL.com. This is rare, and I do not know the reasons.
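To illustrate the kind of join the description refers to, and how games without play data might be spotted, here is a minimal sketch using an in-memory SQLite database. The table names (`game`, `game_plays`), their columns, and the rows are assumptions for illustration; the real keys should be taken from the relationship diagram shipped with the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and invented rows; not the dataset's actual tables.
conn.executescript("""
CREATE TABLE game (game_id INTEGER PRIMARY KEY, home_team TEXT, away_team TEXT);
CREATE TABLE game_plays (
    play_id TEXT PRIMARY KEY, game_id INTEGER, event TEXT, x REAL, y REAL
);
INSERT INTO game VALUES (1, 'AAA', 'BBB'), (2, 'CCC', 'DDD');
INSERT INTO game_plays VALUES
    ('1_1', 1, 'Shot', 60.0, -10.0),
    ('1_2', 1, 'Goal', 80.0, 3.0);
""")

# A LEFT JOIN keeps games that have no rows in game_plays, so they show up
# with a play count of zero instead of disappearing from the result.
rows = conn.execute("""
SELECT g.game_id, COUNT(p.play_id) AS n_plays
FROM game AS g
LEFT JOIN game_plays AS p ON p.game_id = g.game_id
GROUP BY g.game_id
ORDER BY g.game_id
""").fetchall()

games_without_plays = [game_id for game_id, n_plays in rows if n_plays == 0]
print(rows, games_without_plays)
conn.close()
```

An inner join would silently drop the games with missing plays, which matters for any per-game aggregate built on top of the play table.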
Thanks to Kevin Sidwar, who began documenting the still-undocumented NHL stats API that was used to gather this data.
Compared to other sports, advanced statistics in hockey are still in their infancy. It has been suggested that the best models can only predict the winner 62% of the time due to variances in talent and "puck luck".
I would like to believe feature engineering and a suitably trained model can account for some of this variance and beat this seemingly low target.
Otherwise, what metrics can be developed to provide better indications than Corsi & Fenwick?
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains comprehensive player statistics and contract details for the 2024-25 National Hockey League (NHL) season. It merges both on-ice performance metrics and contractual information, making it valuable for fans, analysts, sports journalists, and data scientists interested in exploring hockey performance, salary cap dynamics, and advanced analytics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The term “relative age effect” (RAE) is used to describe a bias in which participation in sports (and other fields) is higher among people who were born at the beginning of the relevant selection period than would be expected from the distribution of births. In sports, RAEs may affect the psychological experience of players as well as their performance. This article presents 2 studies. Study 1 aims to verify the prevalence of RAEs in minor hockey and test its associations with players' physical self-concept and attitudes toward physical activities in general. Study 2 verifies the prevalence of the RAE and analyzes the performance of Canadian junior elite players as a function of their birth quartile. In study 1, the sample is drawn from 404 minor hockey players who have evolved from a recreational to an elite level. Physical self-concept and attitudes toward different kinds of physical activities were assessed via questionnaires. Results showed that the RAE is prevalent in minor hockey at all competition levels. Minor differences in favor of Q1-born players were observed regarding physical self-concept, but not attitudes. In study 2, data analyses were conducted from the 2018–2019 Canadian Hockey League database. Birth quartiles were compared on different components of performance by using quantile regression on each variable. Results revealed that RAEs are prevalent in the CHL, with Q1 players tending to outperform Q4 players in games played and power-play points. No other significant differences were observed regarding anthropometric measures and other performance outcomes. RAEs are still prevalent in Canadian hockey. Building up perceived competence and providing game-time exposure are examples of aspects that need to be addressed when trying to minimize RAEs in ice hockey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this survey, Canadian adults who had attended a youth hockey game in the past two years were asked about witnessing parents’ bad behaviour at games.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains an attribute set, attribute outcomes and cases for a knowledge-based system that might be usable to evaluate the Game Sense skillset of ice hockey players. The dataset is built from the knowledge of 10 high-level experts.
I wanted to learn how to scrape data from web pages into my R sessions to analyze things I otherwise wouldn't be able to analyze. I found an incredibly helpful tutorial on DataCamp.com, but I also decided that, in order to *really* learn it, I needed to pick my own dataset to work with. I am a huge hockey fan and I've wanted to play with some hockey data for a while, but I hadn't quite found what I was looking for here on Kaggle... so I decided to kill two birds with one stone and make this dataset.
Within, there are year-by-year skater stats from 30 leagues across the most recent 38 seasons. There's also a "dim" table for each player where I scraped their height, weight, birthdate, birthplace, and draft position (if available).
All data was gathered from EliteProspects.com using the rvest package in R. Special thanks to EliteProspects for maintaining the most complete world ice hockey database that I've seen online, the creators of rvest, and to Arvid Kingl for the incredibly helpful rvest tutorial that helped me get up and going on this project.
I'm mostly excited to build some cool visuals and models with the data. I want to answer questions like: at what age do NHL players peak? Is it different depending on what round they're drafted in? How well do we expect a player to fare in X league based on how he did in Y league the preceding season?
Hockey-player birthplaces from the "Master" table of the Hockey Databank database (August 2015 update), joined to the "Scoring" and "Goalies" tables (each summarized by playerID, for NHL/WHA players only) and then exported. Subset containing only players who played in the NHL or WHA, and time-enabled on the range of each player's first and last NHL or WHA season (see firstSeason and lastSeason fields).
The Hockey Database is a collection of historical statistics from men's professional hockey teams in North America.
Note that as of v1, this dataset is missing a few files, due to Kaggle restrictions on the number of individual files that can be uploaded. The missing files will be noted in the description below.
The dataset contains the following tables (all are csv):
Descriptions of the individual fields in each file can be found in the file's description.
The Hockey Databank project allows for free usage of its data, including the production of a commercial product based upon the data, subject to the terms outlined below.
1) In exchange for any usage of data, in whole or in part, you agree to display the following statement prominently and in its entirety on your end product:
"The information used herein was obtained free of charge from and is copyrighted by the Hockey Databank project. For more information about the Hockey Databank project please visit http://sports.groups.yahoo.com/group/hockey-databank"
2) Your usage of the data constitutes your acknowledgment, acceptance, and agreement that the Hockey Databank project makes no guarantees regarding the accuracy of the data supplied, and will not be held responsible for any consequences arising from the use of the information presented.
This dataset was downloaded from the hockey database at Open Source Sports. The original acknowledgments are as follows:
A variety of sources were consulted while constructing this database. These are listed below in no particular order.
Books:
Periodicals:
On-line sources:
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The IDMT-ISA-PUCKS dataset (IIPD) was designed to simulate the challenging acoustic analysis conditions consistent with industrial manufacturing settings. The dataset contains audio recordings of multiple games of air hockey played with pucks of different plastic materials. Data collection was performed by equipping the air hockey table with two sE8 microphones, each recording one side of the table while a game is played. Additionally, there are recordings where no game was being played and only background noise was recorded.
We recorded the games played with different pucks at three different noise levels: Level 1 at room volume (vol_000), Level 2 with some background noise (vol_050 = 70 CBR) and Level 3 with loud background noise (vol_100 = 80 CBR). The background noise was played over four speakers placed at equal distances around the table and contains human voices.
The following materials were used for the four pucks:
Puck_A is the original factory puck (material unknown)
Puck_E from the 3D printer (material: ABS, print process: FDM)
Puck_G from the 3D printer (material: PA2200, print process: SLS)
Puck_I from the 3D printer (material: PA12, print process: MJF)
For each noise level and puck material, five three-minute games were played with different pucks of the specified material. Further, each game was played with different sets of players. The recordings were made via two sE8 microphones placed in the middle of the air-hockey table (about 10 cm above the surface).
Dataset total duration: 260 minutes (1 min per file)
Sampling rate: 44.1 kHz
Resolution: 32-bit
Stereo audio
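The figures above allow a back-of-the-envelope storage estimate. The sketch below assumes uncompressed PCM audio and ignores container headers; neither assumption is stated in the description, so the result is only an order-of-magnitude guide.

```python
# Values taken from the dataset description.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 32
CHANNELS = 2             # stereo
SECONDS_PER_FILE = 60    # 1 min per file
TOTAL_MINUTES = 260

# Assumption: uncompressed PCM, no header overhead.
bytes_per_sample = BITS_PER_SAMPLE // 8
bytes_per_file = SAMPLE_RATE_HZ * bytes_per_sample * CHANNELS * SECONDS_PER_FILE
n_files = TOTAL_MINUTES  # one file per minute of audio
total_bytes = bytes_per_file * n_files

print(f"{bytes_per_file / 1e6:.1f} MB per file, {total_bytes / 1e9:.2f} GB in total")
```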
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
Ice hockey is a sport that has gained much attention in recent times, particularly concerning the development of young players. In the domain of youth sport development, one significant factor that must be considered is the perceived competence of players. This variable is closely linked to positive psychological outcomes and sustained practice. However, there is a lack of understanding about how other important developmental factors such as age, early sport specialization, players’ position and relative age affect players’ perceived competence. Therefore, the objective of this study is to explore the relationships between these developmental factors, perceived ice hockey competence and a global measure of perceived sport competence.
Methods
Data was drawn from 971 players (14.78 ± 1.61 mean age), who completed on-line questionnaires, from which we conducted path analyses involving all variables.
Results
Younger players tend to display higher perceived competence scores than older players. Additionally, players who opted to specialize earlier also reported higher perceived competence. Furthermore, forwards and defensemen had differing perceptions of their competence, which was in line with their respective roles on the ice. The study also showed relative age effects, in which players who were born earlier relative to the selection period tend to perceive themselves more advantageously in three components of perceived competence.
Discussion
Based on these findings, several recommendations are proposed for coaches and decision-makers to encourage the positive development of ice hockey players. The study highlights that ice hockey-specific competencies are influenced by various factors, such as early sport specialization, relative age effect, player age, and position.
The Kontinental Hockey League is now past its 13th season. While this is a rather modest number compared to the NHL and many other leagues, it can still provide us with enough data points to try and learn things about the league's players.
The data presented here includes 3 files, each of them containing data on all players in KHL history. Or at least all players that the KHL website has data on.
The first one is player information: how big he is, which way he shoots, and so on.
The second file contains performance statistics for every season in which a player has participated in at least one official match. The data may be divided into several parts: regular season, playoffs and off-season tournaments such as the Nadezhda Cup. There are two reasons behind this design: not all teams participate in playoffs or off-season tournaments every year, and the data is stored that way on the KHL website. Moreover, for each player there are also combined statistics for all his KHL seasons. They follow the same style.
The third file is at the level of individual matches: every official match a player has ever played in, with the season indicated. However, there is a certain quirk in the data. The off-season matches are not considered official matches (which makes sense) and they are not included in the match statistics, yet they are present in the season statistics as a separate line. That creates situations where a few players are only present in the player information and season statistics and not in the match statistics.
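The quirk described above amounts to an anti-join between the season-level and match-level files. A minimal sketch, assuming a shared `player_id` key and using invented records (the real column names may differ):

```python
# Hypothetical records; field names and values are assumptions for illustration.
season_stats = [
    {"player_id": 1, "season": "2019/2020", "stage": "regular season"},
    {"player_id": 2, "season": "2019/2020", "stage": "off-season tournament"},
    {"player_id": 3, "season": "2019/2020", "stage": "playoffs"},
]
match_stats = [
    {"player_id": 1, "season": "2019/2020", "match_id": 101},
    {"player_id": 3, "season": "2019/2020", "match_id": 202},
]


def players_missing_from_matches(season_rows, match_rows):
    """Return ids that appear in the season statistics but never in the match statistics."""
    season_ids = {row["player_id"] for row in season_rows}
    match_ids = {row["player_id"] for row in match_rows}
    return sorted(season_ids - match_ids)


missing = players_missing_from_matches(season_stats, match_stats)
print(missing)
```

Running a check like this before joining the files helps avoid silently dropping players who only appeared in off-season tournaments.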
All data belongs to the Kontinental Hockey League and was taken from their website, https://en.khl.ru/
All code used to collect data as well as process and (attempt to) analyse it is available on https://github.com/Dark-Hobbit/khl
At the moment, I see three main questions which this dataset might attempt to answer.
Despite the traditional use of average values for determining physical demands, the intermittent and fluctuating nature of team sports may lead to underestimation of the most demanding scenarios. All the most demanding scenario-related investigations to date only report one maximal scenario per game, the greatest. However, the latest research on this subject has shown additional scenarios of equal or similar magnitude that most researchers have not considered. This repetition concept started a new way of describing competition and training loads. Accordingly, the study aims were: first, to quantify and assess differences between playing positions in terms of the most demanding scenarios in official matches; and second, to quantify and assess the differences between playing positions in the repetition of different intensity scenarios relative to the most demanding individual scenario. We monitored nine professional rink hockey players (7 exterior and 2 interior players) in 18 competitive matches using an electronic performance tracking system. The interior players are closest to the opponent’s goal, while the exterior players are farthest from it. Peak physical demands variables included total distance (m), distance covered at >18 km·h-1 (m), the number of accelerations (≥2 m∙s-2, count) and decelerations (≤-2 m∙s-2, count) in 30 s. An average from the top three individual most demanding scenarios was used to define a reference value to quantify the distribution of scenario repetition during matches. The results showed that peak demands in rink hockey are position-dependent, with more distance covered by exterior players and more accelerations performed by interior players. In addition, rink hockey matches include multiple scenario exposures that are close to the peak physical demands of a match. Using the results of this study, coaches can prepare tailored training plans for each position, focusing on distances covered or accelerations for exterior players.
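The 30 s peak-demand idea can be sketched as a rolling-window maximum. The code below is not the study's processing pipeline: it assumes one distance sample per second (the tracking system's actual sampling rate is not given here), and all numbers are invented.

```python
def rolling_sums(samples, window):
    """Return the sum of every run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        return []
    total = sum(samples[:window])
    sums = [total]
    for i in range(window, len(samples)):
        # Slide the window by one sample: add the new one, drop the oldest.
        total += samples[i] - samples[i - window]
        sums.append(total)
    return sums


def reference_value(match_peaks, top_n=3):
    """Average of the top_n largest per-match peaks (top three, as in the abstract)."""
    top = sorted(match_peaks, reverse=True)[:top_n]
    return sum(top) / len(top)


# Invented metres-per-second trace: steady skating with one 30 s burst.
distance_per_second = [2.0] * 60 + [5.0] * 30 + [2.0] * 60
window_totals = rolling_sums(distance_per_second, window=30)
peak_30s = max(window_totals)

# Invented per-match peaks (metres in 30 s) for a single player.
reference = reference_value([100, 120, 90, 110])
print(peak_30s, reference)
```

Adjacent windows overlap, so counting how often a player comes close to the reference value would need an extra rule for merging overlapping windows, which this sketch leaves out.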
The National Hockey League (NHL) is the top professional men’s hockey league in the world. The league records every shot players take along with contextual information about the shot such as its location, the player’s distance and angle to the goal when attempting the shot, as well as the outcome (blocked, missed, or goal). Using this information, the hockey analytics community has developed measures of shot quality known as expected goals. With this dataset, you can create your own expected goals model to predict the shot outcome given relevant features.
This dataset contains information about 160,573 shots during the 2021-2022 NHL season.
Build a logistic regression model to predict whether or not the shot will result in a goal based on the shot distance and angle.
Build a classification model to predict the outcome based on the spatial x,y coordinates of the shot.
Create a visualization displaying the joint frequency of shot locations. Do there appear to be any clear modes of frequently taken shots? Create a conditional version of this display by shot outcome. Does the distribution shape vary by shot outcome? (You can also perform a similar analysis by team).
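The first task can be sketched without any modelling library. The code below fits a logistic regression by plain batch gradient descent (a stand-in for a library fit, chosen to keep the example dependency-free) on a handful of invented shots. It assumes coordinates in feet with the attacking net at (89, 0); the dataset may use a different convention, and it already provides distance and angle, so `distance_and_angle` is only needed when starting from raw x,y.

```python
import math

GOAL_X, GOAL_Y = 89.0, 0.0  # assumed net location in feet; check the dataset's convention


def distance_and_angle(x, y):
    """Distance to the net and absolute angle (degrees) off the centre line."""
    dx, dy = GOAL_X - x, GOAL_Y - y
    return math.hypot(dx, dy), math.degrees(math.atan2(abs(dy), dx))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns [bias, w1, w2, ...]."""
    weights = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for row, label in zip(features, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
            error = sigmoid(z) - label
            grads[0] += error
            for j, value in enumerate(row):
                grads[j + 1] += error * value
        weights = [w - lr * g / len(features) for w, g in zip(weights, grads)]
    return weights


def scaled_features(x, y):
    """Rescale distance and angle to roughly [0, 1] so one learning rate suits both."""
    distance, angle = distance_and_angle(x, y)
    return [distance / 100.0, angle / 90.0]


def goal_probability(weights, x, y):
    row = scaled_features(x, y)
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], row)))


# Invented shots as (x, y, is_goal); real rows would come from the dataset.
shots = [(80, 0, 1), (85, 5, 1), (75, -5, 1), (82, 3, 0), (40, 20, 0),
         (30, -30, 0), (50, 35, 0), (60, -25, 0), (35, 10, 0), (70, 30, 0)]
weights = fit_logistic([scaled_features(x, y) for x, y, _ in shots],
                       [goal for _, _, goal in shots])
p_close = goal_probability(weights, 85, 0)
p_far = goal_probability(weights, 30, 0)
print(round(p_close, 3), round(p_far, 3))
```

With ten invented shots the fitted probabilities mean nothing in themselves; the point is only the shape of the pipeline: features, fit, then a per-shot probability.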
Morse D (2023). hockeyR: Collect and Clean Hockey Stats. R package version 1.3.1, https://github.com/danmorse314/hockeyR.
Whole-organism performance capacity is thought to play a key role in sexual selection, through its impacts on both intrasexual competition and intersexual mate choice. Based on data from elite sports, several studies have reported a positive association between facial attractiveness and athletic performance in humans, leading to claims that facial correlates of sporting prowess in men reveal heritable or non-heritable mate quality. However, for most of the sports studied (soccer, ice hockey, American football and cycling) it is not possible to separate individual performance from team performance. Here, using photographs of athletes who compete annually in a multi-event World Cup, we examine the relationship between facial attractiveness and individual career-best performance metrics in the biathlon, a multidisciplinary sport that combines target shooting and cross-country skiing. Unlike all previous studies, which considered only male athletes, we report relationships for both sportsmen and sportswomen. As predicted by evolutionary arguments, we found that male biathletes were judged more attractive if (unknown to the raters) they had achieved a higher peak performance (World Cup points score) in their career, whereas there was no significant relationship for female biathletes. Our findings show that elite male athletes display visible, attractive cues that reliably reflect their athletic performance.
Data for all skaters, goalies, lines/defensive pairings, and teams are available from the current season back to the 2008-2009 season.
The data was last updated at 2023-06-14 05:31 Eastern Time. Data is available summarized on the season level and on a game by game level going back to 2008-2009. Season level data is below.
All historical shot data is available to download. This includes 1,717,746 shots from the 2007-2008 to 2022-2023 seasons. Data for the 2023-2024 season will also be available and updated nightly on this page. Saved shots on goal, missed shots, and goals are included. Blocked shots are not included in these datasets. There are 124 attributes for each shot, including everything from the player and goalie involved in the shot to angles, distances, what happened before the shot, and how long players had been on the ice when the shot was taken. Each shot also has model scores for its probability of being a goal (xGoals) as well as other models such as for the chance there will be a rebound after the shot, the probability the shot will miss the net, and whether the goalie will freeze the puck after the shot. The data has been collected from several sources including the NHL and ESPN. A good amount of data cleaning has also been done. Arena-adjusted shot coordinates and distances are also calculated in the dataset using the strategy War-On-Ice used, based on the method proposed by Schuckers and Curro.
There are two separate files which contain a detailed column description!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographic information of study participants.
http://opendatacommons.org/licenses/dbcl/1.0/
The dataset this week comes from Statistics Canada, the NHL team list endpoint, and the NHL API. The dataset was inspired by the blog Are Birth Dates Still Destiny for Canadian NHL Players? by JLaw (via https://universeodon.com/@jlaw/111522860812359901)!
In the first chapter of Malcolm Gladwell’s Outliers, he discusses how players in Canadian junior hockey are more likely to be born in the first quarter of the year.
Because these kids are older within their year, they make all the important teams at a young age, which gets them better resources for skill development, and so on.
While it seems clear that more players are born in the first few months of the year, what isn’t explored is whether or not this would be expected. Maybe more people in Canada in general are born earlier in the year.
I will explore whether Gladwell’s result is expected as well as whether this is still true in today’s NHL for Canadian-born players.
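One way to frame the comparison is a chi-square goodness-of-fit statistic between players' birth quarters and the shares expected from general births. The sketch below uses invented birth months and a uniform expected share per quarter; in practice the expected shares would come from the Statistics Canada birth data rather than being assumed equal.

```python
from collections import Counter


def birth_quarter(month):
    """Map a calendar month (1-12) to its quarter (1-4)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return (month - 1) // 3 + 1


def chi_square_statistic(observed_counts, expected_shares):
    """Sum of (observed - expected)^2 / expected over all categories."""
    total = sum(observed_counts)
    return sum((obs - total * share) ** 2 / (total * share)
               for obs, share in zip(observed_counts, expected_shares))


# Invented birth months for 100 hypothetical players, skewed toward early months.
birth_months = [1] * 20 + [2] * 20 + [5] * 30 + [8] * 20 + [11] * 10
counts = Counter(birth_quarter(m) for m in birth_months)
observed = [counts[q] for q in (1, 2, 3, 4)]

# Assumption: births in the general population are spread evenly over the quarters.
expected_shares = [0.25, 0.25, 0.25, 0.25]
statistic = chi_square_statistic(observed, expected_shares)
print(observed, statistic)
```

With four quarters there are 3 degrees of freedom, for which the 5% critical value is about 7.815; a statistic above that suggests the players' birth quarters differ from the expected distribution.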