18 datasets found
  1. Approaches and included features.

    • plos.figshare.com
    xls
    Updated Oct 30, 2024
    Cite
    Ishara Bandara; Sergiy Shelyag; Sutharshan Rajasegarar; Dan Dwyer; Eun-jin Kim; Maia Angelova (2024). Approaches and included features. [Dataset]. http://doi.org/10.1371/journal.pone.0312278.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Oct 30, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Ishara Bandara; Sergiy Shelyag; Sutharshan Rajasegarar; Dan Dwyer; Eun-jin Kim; Maia Angelova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In association football, predicting the likelihood and outcome of a shot at a goal is useful but challenging. Expected goal (xG) models can be used in a variety of ways, including evaluating performance and designing offensive strategies. This study proposed a novel framework that uses the events preceding a shot to improve the accuracy of the expected goals (xG) metric. A combination of previously explored and unexplored temporal features is utilized in the proposed framework. The new features include “advancement factor” and “player position column”. A random forest model was used, which performed better than published single-event-based models in the literature. Results further demonstrated a significant improvement in model performance with the inclusion of preceding event information. The proposed framework and model enable the discovery of event sequences that improve xG, which include opportunities built up from the sides of the 18-yard box, shots attempted from in front of the goal within the opposition’s 18-yard box, and shots from successful passes to the far post.

  2. Xg Dataset

    • universe.roboflow.com
    zip
    Updated May 16, 2025
    Cite
    Rostyslav (2025). Xg Dataset [Dataset]. https://universe.roboflow.com/rostyslav-egoyn/xg-ykiip/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 16, 2025
    Dataset authored and provided by
    Rostyslav
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    XG

    ## Overview
    
    XG is a dataset for object detection tasks - it contains Objects annotations for 340 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. Test data results for comparison between expected goals statistic and...

    • plos.figshare.com
    bin
    Updated Jun 2, 2023
    Cite
    James Mead; Anthony O’Hare; Paul McMenemy (2023). Test data results for comparison between expected goals statistic and traditional metrics. [Dataset]. http://doi.org/10.1371/journal.pone.0282295.t004
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    James Mead; Anthony O’Hare; Paul McMenemy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Test data results for comparison between expected goals statistic and traditional metrics.

  4. Model XG BOOST

    • kaggle.com
    Updated Mar 27, 2025
    Cite
    ShyamSUBEDI (2025). Model XG BOOST [Dataset]. https://www.kaggle.com/datasets/shyamsubedi/model-xg-boost
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ShyamSUBEDI
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by ShyamSUBEDI

    Released under MIT

  5. Summary of the results of our model compared to published models.

    • plos.figshare.com
    bin
    Updated May 31, 2023
    Cite
    James Mead; Anthony O’Hare; Paul McMenemy (2023). Summary of the results of our model compared to published models. [Dataset]. http://doi.org/10.1371/journal.pone.0282295.t003
    Explore at:
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    James Mead; Anthony O’Hare; Paul McMenemy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The AUC ROC for the optimal model in this research was computed on test data, and the model used players’ FIFA ratings as a proxy for player ability.

  6. Football Events

    • kaggle.com
    zip
    Updated Jan 25, 2017
    Cite
    Alin Secareanu (2017). Football Events [Dataset]. http://www.kaggle.com/secareanualin/football-events/home
    Explore at:
    Available download formats: zip (22142158 bytes)
    Dataset updated
    Jan 25, 2017
    Authors
    Alin Secareanu
    Description

    Context

    Most publicly available football (soccer) statistics are limited to aggregated data such as Goals, Shots, Fouls, Cards. When assessing performance or building predictive models, this simple aggregation, without any context, can be misleading. For example, a team that produced 10 shots on target from long range has a lower chance of scoring than a club that produced the same number of shots from inside the box. However, metrics derived from this simple count of shots will assess the two teams similarly.

    A football game generates many more events, and it is very important and interesting to take into account the context in which those events were generated. This dataset should keep sports analytics enthusiasts awake for long hours as the number of questions that can be asked is huge.

    Content

    This dataset is the result of a very tiresome effort of web scraping and integrating different data sources. The central element is the text commentary. All the events were derived by reverse engineering the text commentary, using regex. Using this, I was able to derive 11 types of events, as well as the main player and secondary player involved in those events and many other statistics. In case I've missed extracting some useful information, you are welcome to do so and share your findings. The dataset provides a granular view of 9,074 games, totaling 941,009 events from the 5 biggest European football (soccer) leagues: England, Spain, Germany, Italy and France, from the 2011/2012 season to the 2016/2017 season, as of 25.01.2017. There are games that have been played during these seasons for which I could not collect detailed data. Overall, over 90% of the played games during these seasons have event data.

    The dataset is organized in 3 files:

    • events.csv contains event data about each game. Text commentary was scraped from: bbc.com, espn.com and onefootball.com
    • ginf.csv contains metadata and market odds about each game. Odds were collected from oddsportal.com
    • dictionary.txt contains a dictionary with the textual description of each categorical variable coded with integers

    Past Research

    I have used this data to:

    • create predictive models for football games in order to bet on football outcomes.
    • make visualizations about upcoming games
    • build expected goals models and compare players

    Inspiration

    There are tons of interesting questions a sports enthusiast can answer with this dataset. For example:

    • What is the value of a shot? Or what is the probability of a shot being a goal given its location, shooter, league, assist method, gamestate, number of players on the pitch, time - known as expected goals (xG) models
    • When are teams more likely to score?
    • Which teams are the best or sloppiest at holding the lead?
    • Which teams or players make the best use of set pieces?
    • In which leagues is the referee more likely to give a card?
    • How do players compare when they shoot with their weak foot versus strong foot? Or which players are ambidextrous?
    • Identify different styles of plays (shooting from long range vs shooting from the box, crossing the ball vs passing the ball, use of headers)
    • Which teams have a bias for attacking on a particular flank?

    And many many more...
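
    The expected goals question above can be illustrated with a minimal sketch. It estimates xG as the empirical goal rate per shot location and then compares two teams with the same shot count. The shot records and the location labels (`inside_box`, `long_range`) are hypothetical and are not taken from events.csv, whose column names are not listed here.

```python
from collections import defaultdict

# Hypothetical shot records: (location label, 1 if goal else 0).
# These values are made up for illustration only.
shots = [
    ("inside_box", 1), ("inside_box", 0), ("inside_box", 0), ("inside_box", 1),
    ("long_range", 0), ("long_range", 0), ("long_range", 0), ("long_range", 1),
    ("long_range", 0),
]


def empirical_xg(records):
    """Return goals / shots for each location label."""
    totals = defaultdict(lambda: [0, 0])  # label -> [goals, shots]
    for label, is_goal in records:
        totals[label][0] += is_goal
        totals[label][1] += 1
    return {label: goals / n for label, (goals, n) in totals.items()}


xg_by_location = empirical_xg(shots)

# Two teams with the same number of shots but different shot locations.
team_a = ["long_range"] * 10
team_b = ["inside_box"] * 10
team_a_xg = sum(xg_by_location[s] for s in team_a)
team_b_xg = sum(xg_by_location[s] for s in team_b)
```

    A plain shot count treats both teams identically, while the summed xG separates them. A real model would condition on more of the variables listed above (shooter, assist method, game state) instead of a single location label.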

  7. Research Data for the PhD thesis Advanced Electromagnetic Modelling of the...

    • data.4tu.nl
    zip
    Updated May 30, 2024
    Cite
    Riccardo Ozzola; Daniele Cavallo; Andrea Neto (2024). Research Data for the PhD thesis Advanced Electromagnetic Modelling of the Next Generation (XG) Wireless Communication Systems [Dataset]. http://doi.org/10.4121/9d280382-b6ab-4bf7-9308-d48a7326a38a.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Riccardo Ozzola; Daniele Cavallo; Andrea Neto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the measurements (S-parameters and far-field patterns) of the prototype discussed in Chapter 4 of the PhD thesis Advanced Electromagnetic Modelling of the Next Generation (XG) Wireless Communication Systems.

  8. League positions resulting in specific consequences for teams in each...

    • plos.figshare.com
    bin
    Updated Jun 2, 2023
    Cite
    James Mead; Anthony O’Hare; Paul McMenemy (2023). League positions resulting in specific consequences for teams in each league. [Dataset]. http://doi.org/10.1371/journal.pone.0282295.t001
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    James Mead; Anthony O’Hare; Paul McMenemy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    League positions resulting in specific consequences for teams in each league.

  9. La Liga - Players Stats Season - 24/25

    • kaggle.com
    Updated Dec 7, 2024
    Cite
    Eduardo Palmieri (2024). La Liga - Players Stats Season - 24/25 [Dataset]. https://www.kaggle.com/datasets/eduardopalmieri/laliga-players-stats
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 7, 2024
    Dataset provided by
    Kaggle
    Authors
    Eduardo Palmieri
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    La Liga Players Performance Dataset

    This dataset provides a comprehensive overview of player performance in La Liga, capturing a wide array of metrics related to gameplay, scoring, passing, and defensive actions. With records detailing individual player statistics across different teams, this dataset is a valuable resource for analysts, data scientists, and fans who are interested in diving into player performance data from one of the world’s top soccer leagues.

    Each entry represents a single player's profile, featuring data on expected goals (xG), expected assists (xAG), touches, dribbles, tackles, and more. This dataset is ideal for analyzing various aspects of player contribution, both offensively and defensively, and understanding their impact on team performance.

    Dataset Columns

    • Player: Name of the player
    • Team: Team the player belongs to
    • '#': Player's jersey number
    • Nation: Nationality of the player
    • Position: Primary playing position on the field
    • Age: Age of the player
    • Minutes: Total minutes played
    • Goals: Number of goals scored
    • Assists: Number of assists
    • Penalty Shoot on Goal: Penalty shots taken on goal
    • Penalty Shoot: Total penalty shots attempted
    • Total Shoot: Total shots attempted
    • Shoot on Target: Shots successfully on target
    • Yellow Cards: Number of yellow cards received
    • Red Cards: Number of red cards received
    • Touches: Total ball touches
    • Dribbles: Total dribbles attempted
    • Tackles: Total tackles made
    • Blocks: Total blocks
    • Expected Goals (xG): Expected goals, calculated based on shooting positions and likelihood of scoring
    • Non-Penalty xG (npxG): Expected goals excluding penalties
    • Expected Assists (xAG): Expected assists, based on actions leading to an expected goal (xG)
    • Shot-Creating Actions: Actions leading to a shot attempt
    • Goal-Creating Actions: Actions leading to a goal
    • Passes Completed: Successful passes completed
    • Passes Attempted: Total passes attempted
    • Pass Completion %: Pass completion rate, expressed as a percentage (some entries have missing values here)
    • Progressive Passes: Passes advancing the ball significantly toward the opponent’s goal
    • Carries: Total ball carries
    • Progressive Carries: Carries advancing the ball significantly toward the opponent’s goal
    • Dribble Attempts: Total dribbles attempted
    • Successful Dribbles: Total successful dribbles
    • Date: Date of record collection or game date
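
    Because the Pass Completion % column is noted as having missing values, the following is a small sketch of recomputing it from Passes Completed and Passes Attempted. The player rows are invented for illustration. Treating zero attempts as the reason for a missing value is an assumption, since the listing does not say why entries are missing.

```python
def pass_completion_pct(completed, attempted):
    """Return completed / attempted as a percentage, or None when undefined."""
    # Assumption: a missing or zero attempt count leaves the rate undefined.
    if completed is None or attempted is None or attempted == 0:
        return None
    return round(100.0 * completed / attempted, 1)


# Hypothetical rows; names and numbers are made up.
players = [
    {"Player": "Player A", "Passes Completed": 45, "Passes Attempted": 50},
    {"Player": "Player B", "Passes Completed": 0, "Passes Attempted": 0},
]

pcts = [
    pass_completion_pct(p["Passes Completed"], p["Passes Attempted"])
    for p in players
]
```

    Returning None instead of 0 keeps an undefined rate distinct from a genuinely poor one when the column is later averaged or plotted.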

    Potential Use Cases

    Data Visualization: Explore relationships between various performance metrics to identify patterns.

    Player Comparisons: Compare individual players based on goals, assists, xG, xAG, and other metrics.

    Team Analysis: Evaluate contributions of players within the same team to gain insights into team dynamics.

    Predictive Modeling: Use the dataset to build models for predicting game outcomes, goals, or assists based on player performance metrics.

  10. GIS dataset of high-resolution rebound surfaces and ice-free paleotopography...

    • doi.pangaea.de
    html, tsv
    Updated Aug 26, 2022
    Cite
    Etienne Brouard; Pierre-Marc Godbout; Martin Roy (2022). GIS dataset of high-resolution rebound surfaces and ice-free paleotopography of glaciated North America since the LGM based on the ICE-xG (VMy) models' predictions [Dataset]. http://doi.org/10.1594/PANGAEA.947536
    Explore at:
    Available download formats: tsv, html
    Dataset updated
    Aug 26, 2022
    Dataset provided by
    PANGAEA
    Authors
    Etienne Brouard; Pierre-Marc Godbout; Martin Roy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    File content, Binary Object, Binary Object (File Size)
    Description

    Postglacial rebound is a component of the glacial isostatic adjustment which causes the Earth's crust to rebound in regions formerly covered by or adjacent to ice sheets, and subside beneath ocean basins. In North America, the observed postglacial rebound is mainly the result of the Laurentide Ice Sheet deglaciation after it reached its maximum thickness and extent at the Last Glacial Maximum (26.5-19 ka). Global-scale numerical models of glacial isostatic adjustment faithfully reproduce past and current changes in postglacial rebound, but the integration of their predictions in a geographic information system to facilitate high-resolution paleotopographic reconstructions remains challenging. We therefore present high-resolution raster datasets of land-deformation and ice-free paleotopography of glaciated North America for several time slices since the Last Glacial Maximum to support geological, paleoenvironmental and archeological studies.

  11. Data from: Lasi Dataset

    • universe.roboflow.com
    zip
    Updated Jan 10, 2025
    Cite
    MesAI (2025). Lasi Dataset [Dataset]. https://universe.roboflow.com/mesai/lasi/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 10, 2025
    Dataset authored and provided by
    MesAI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    XG Bounding Boxes
    Description

    Lasi

    ## Overview
    
    Lasi is a dataset for object detection tasks - it contains XG annotations for 384 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  12. A comparative analysis of DC, CTGAN-DC, XGBoost, CTGAN-XG, and TVAE-XG...

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Chuan-Sheng Hung; Chun-Hung Richard Lin; Jain-Shing Liu; Shi-Huang Chen; Tsung-Chi Hung; Chih-Min Tsai (2024). A comparative analysis of DC, CTGAN-DC, XGBoost, CTGAN-XG, and TVAE-XG models in Kawasaki Disease experiments. [Dataset]. http://doi.org/10.1371/journal.pone.0314995.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Chuan-Sheng Hung; Chun-Hung Richard Lin; Jain-Shing Liu; Shi-Huang Chen; Tsung-Chi Hung; Chih-Min Tsai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Washington
    Description

    A comparative analysis of DC, CTGAN-DC, XGBoost, CTGAN-XG, and TVAE-XG models in Kawasaki Disease experiments.

  13. Selected input feature variables for ML models.

    • plos.figshare.com
    bin
    Updated Sep 21, 2023
    Cite
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar (2023). Selected input feature variables for ML models. [Dataset]. http://doi.org/10.1371/journal.pone.0286362.t002
    Explore at:
    Available download formats: bin
    Dataset updated
    Sep 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Stock market forecasting is one of the most challenging problems in today’s financial markets. According to the efficient market hypothesis, it is almost impossible to predict the stock market with 100% accuracy. However, Machine Learning (ML) methods can improve stock market predictions to some extent. In this paper, a novel strategy is proposed to improve the prediction efficiency of ML models for financial markets. Nine ML models are used to predict the direction of the stock market. First, these models are trained and validated using the traditional methodology on historical data captured over a 1-day time frame. Then, the models are trained using the proposed methodology. Following the traditional methodology, Logistic Regression achieved the highest accuracy of 85.51%, followed by XG Boost and Random Forest. With the proposed strategy, the Random Forest model achieved the highest accuracy of 91.27%, followed by XG Boost, ADA Boost and ANN. In the later part of the paper, it is shown that a classification report alone is not sufficient to validate the performance of an ML model for stock market prediction. A simulation model of the financial market is used to evaluate the risk, maximum drawdown and returns associated with each ML model. The overall results demonstrated that the proposed strategy not only improves the stock market returns but also reduces the risks associated with each ML model.
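
    The risk measures named above can be made concrete with a short sketch. It uses the common definition of maximum drawdown (the largest peak-to-trough decline as a fraction of the running peak) and a simple total return. The equity curve is invented, and the paper's own simulation model is not described here, so this is an illustration under those assumptions and not the authors' procedure.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst


# Hypothetical account values over time; made up for illustration.
equity_curve = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]

mdd = max_drawdown(equity_curve)
total_return = equity_curve[-1] / equity_curve[0] - 1.0
```

    Two strategies with the same classification accuracy can produce different equity curves, which is why measures like these are evaluated alongside a classification report.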

  14. Github file.

    • plos.figshare.com
    html
    Updated Sep 21, 2023
    Cite
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar (2023). Github file. [Dataset]. http://doi.org/10.1371/journal.pone.0286362.s001
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data and script have been uploaded to GitHub. They can be accessed using the following link: https://github.com/AzazHassankhan/Machine-Learning-based-Trading-Techniques/. (IPYNB)

  15. Thermomechanical Fractional Model of TEMHD Rotational Flow

    • plos.figshare.com
    avi
    Updated Jun 2, 2023
    Cite
    F. Hamza; A. Abd El-Latief; W. Khatan (2023). Thermomechanical Fractional Model of TEMHD Rotational Flow [Dataset]. http://doi.org/10.1371/journal.pone.0168530
    Explore at:
    Available download formats: avi
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    F. Hamza; A. Abd El-Latief; W. Khatan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this work, the fractional mathematical model of an unsteady rotational flow of Xanthan gum (XG) between two cylinders in the presence of a transverse magnetic field has been studied. This model consists of two fractional parameters α and β representing thermomechanical effects. The Laplace transform is used to obtain the numerical solutions. The influence of the fractional parameters on the field distributions (temperature, velocity, stress and electric current) has been discussed graphically. The relationship between the rotation of both cylinders and the fractional parameters has been discussed with respect to the field distributions for small and large values of time.

  16. The influence of the inner region Rin and peak on the velocity at α = β = 1...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    F. Hamza; A. Abd El-Latief; W. Khatan (2023). The influence of the inner region Rin and peak on the velocity at α = β = 1 for different time t. [Dataset]. http://doi.org/10.1371/journal.pone.0168530.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    F. Hamza; A. Abd El-Latief; W. Khatan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The influence of the inner region Rin and peak on the velocity at α = β = 1 for different time t.

  17. The constants of the problem.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    F. Hamza; A. Abd El-Latief; W. Khatan (2023). The constants of the problem. [Dataset]. http://doi.org/10.1371/journal.pone.0168530.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    F. Hamza; A. Abd El-Latief; W. Khatan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The constants of the problem.

  18. Comparative analysis of previous and proposed study.

    • plos.figshare.com
    bin
    Updated Sep 21, 2023
    Cite
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar (2023). Comparative analysis of previous and proposed study. [Dataset]. http://doi.org/10.1371/journal.pone.0286362.t001
    Explore at:
    Available download formats: bin
    Dataset updated
    Sep 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Azaz Hassan Khan; Abdullah Shah; Abbas Ali; Rabia Shahid; Zaka Ullah Zahid; Malik Umar Sharif; Tariqullah Jan; Mohammad Haseeb Zafar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparative analysis of previous and proposed study.
