10 datasets found
  1. NOAA/WDS Paleoclimatology - Heyerdahl fire data from Big Hole - IMPD...

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Dec 1, 2024
    Cite
    NOAA World Data Service for Paleoclimatology (Point of Contact) (2024). NOAA/WDS Paleoclimatology - Heyerdahl fire data from Big Hole - IMPD USBGH001 [Dataset]. https://catalog.data.gov/dataset/noaa-wds-paleoclimatology-heyerdahl-fire-data-from-big-hole-impd-usbgh001
    Dataset updated
    Dec 1, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    This archived Paleoclimatology Study is available from the NOAA National Centers for Environmental Information (NCEI), under the World Data Service (WDS) for Paleoclimatology. The associated NCEI study type is Fire. The data include parameters of fire history with a geographic location of Montana, United States Of America. The time period coverage is from 562 to -53 in calendar years before present (BP). See metadata information for parameter and study location details. Please cite this study when using the data.
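
    The coverage above is stated in calendar years before present (BP), a scale on which 0 BP conventionally corresponds to 1950 CE, so negative values fall after 1950. A minimal sketch of the conversion, assuming that convention applies to this record; the function name `bp_to_ce` is illustrative and not part of any NOAA tool:

```python
# Assumption: "present" in BP means 1950 CE, the usual convention.
BP_REFERENCE_YEAR = 1950


def bp_to_ce(years_bp: int) -> int:
    """Convert calendar years before present (BP) to a CE calendar year."""
    return BP_REFERENCE_YEAR - years_bp


# Coverage stated in the description: 562 BP to -53 BP.
start_ce = bp_to_ce(562)
end_ce = bp_to_ce(-53)
print(start_ce, end_ce)  # 1388 2003
```

    Under that assumption, the record would span roughly 1388 CE to 2003 CE.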

  2. 2022/23 Big 5 Football Leagues Player Stats

    • kaggle.com
    zip
    Updated Jun 7, 2024
    Cite
    EmreGuv (2024). 2022/23 Big 5 Football Leagues Player Stats [Dataset]. https://www.kaggle.com/datasets/emreguv/202223-big-5-football-leagues-player-stats
    Available download formats
    zip (406928 bytes)
    Dataset updated
    Jun 7, 2024
    Authors
    EmreGuv
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    All data taken from https://fbref.com/

    GitHub to my project: https://github.com/emreguvenilir/fifa23-ml-ratingsystem

    There is another statistics dataset here on Kaggle, but its data is incomplete. So I took the time, mainly because of a final school project, to download the raw data in R. I then cleaned the data to fit the specifics of my project. The data contains only players from the big 5 leagues (Premier League, La Liga, Bundesliga, Ligue 1, Serie A).

    Column Description

    squad: The team of a given player

    comp: The league of the team, only includes the “big 5”

    player: player name

    nation: nationality of the player

    pos: position of the player

    age: age of the player

    born: year born

    MP: matches played

    Minutes_Played: minutes played in the season

    Mn_per_MP: minutes per match played

    Mins_Per_90: minutes played divided by 90 (the length of a soccer match)

    Starts: matches started

    PPM_Team.Success: average number of points earned by the team from matches in which the player appeared, with a minimum of 30 minutes played

    OnG_Team.Success: goals scored by the team while the player was on the pitch

    onGA_Team.Success: goals allowed by the team while the player was on the pitch

    plus_per_minus_Team.Success: goals scored minus goals allowed by the team while the player was on the pitch

    Goals: goals scored

    Assists: assists that led to goal

    GoalsAssists: goals + assists

    NonPKG: non penalty kick goals

    PK: penalty kicks made

    PKatt: penalties attempted

    CrdY: yellow cards

    CrdR: red cards

    xG: expected goals based on all shots taken

    xAG: expected assisted goals

    npxG+xAG: non-penalty expected goals + expected assisted goals

    PrgC: progressive carries, i.e. carries in the attacking half of the pitch that moved the ball at least 10 yards

    PrgP: progressive passes, i.e. completed passes that moved the ball at least 10 yards toward the opponent's goal

    Gls_Per90: goals per 90 minutes

    Ast_Per90: assists per 90 minutes

    G+A_Per90: goals + assists per 90

    G_minus_PK_Per: goals excluding penalties per 90

    G+A_minus_PK_Per: goals and assists excluding penalties per 90

    xG_Per: xG per 90

    xAG_Per: xAG per 90

    xG+xAG_Per: xG+xAG per 90

    Shots: shots taken

    Shots_On_Target: shots on goal frame

    SoT_percent: percentage of shots that are on target (SoT / Sh * 100)

    G_per_Sh: goals per shot taken

    G_per_SoT: goals per shot on target

    Avg_Shot_Dist: average shot distance

    FK_Standard: shots from free kicks

    G_minus_xG_expected: goals minus expected goals

    np:G_minus_xG_Expected: non penalty goals minus expected goals

    Passes_Completed: passes completed

    Passes_attempted: passes attempted

    Passes_Cmp_percent: pass completion percentage

    PrgDist_Total: progressive pass total distance

    Passes_Cmp_Short: short passes completed (5 to 15 yds)

    Passes_Att_Short: short passes Attempted (5 to 15 yds)

    Passes_Cmp_Percent_Short: short passes completed percentage (5 to 15 yds)

    Passes_Cmp_Medium: medium passes completed (15 to 30 yds)

    Passes_Att_medium: medium passes Attempted (15 to 30 yds)

    Passes_Cmp_Percent_Medium: medium passes completed percentage (15 to 30 yds)

    Passes_Cmp_long: long passes completed (30+ yds)

    Passes_Att_long: long passes attempted (30+ yds)

    Passes_Cmp_Percent_long: long passes completed percentage (30+ yds)

    A_minus_xAG_expected: assists minus expected assists

    Key_Passes: passes that lead directly to a shot

    Final_third: passes that enter the final third of the field

    PPA: passes into the penalty area

    CrsPA: crosses into penalty area

    TB_pass: through ball passes

    Crs_Pass: number of crosses

    Offside_passes: passes that resulted in an offside

    Blocked_passes: passes blocked by an opponent

    Shot_Creating_Actions: shot creating actions

    SCA_90: shot creating actions per 90

    TakeOnTo_Shot: take-ons that led to a shot

    FoulTo_Shot: fouls drawn that led to a shot

    DefAction_Shot: defensive actions that led to a shot (pressing)

    GoalCreatingAction: goal creating actions

    GCA90: goal creating actions per 90

    TakeOn_Goal: take ons that led to a goal

    Fld_goal: fouls drawn that led to a goal

    DefAction_Goal: defensive actions that led to a goal (pressing)

    Tackles: number of tackles made

    Tackles_won: tackles won

    Def_3rd_Tackles: tackles in the defensive 1/3 of the pitch

    Mid_3rd_Tackles: tackles in the middle 1/3 of the pitch

    Att_3rd_Tackles: tackles in the attacking 1/3 of the pitch

    Tkl_percent_won: % of dribblers tackled

    Lost_challenges: lost challenges, unsuccessful attempts to win the ball

    Blocks: # of times blocking the ball by standing in path

    Sh_blocked: shots blocked

    Passes_blocked: number of passes blocked

    Interceptions: interceptions

    Clearances: clearances

    ErrorsLead_ToShot: errors made leading to a shot

    Att_Take: attacking take ons attempted

    Succ:Take: attacking take ons successful

    Succ_percent_take: percentage of attacking take-ons that were successful

    Tkld_Take: times tackled during a take on

    Tkld_percent_Take: percentage of times tackled during a take on

    TotDist_Carries: total distance carrying the ball in any direction

    PrgDist_carries: progressive carry distance total

    Miscontrolls: # of times a player...
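
    Several of the columns described above are simple ratios of raw counts. The sketch below shows the conventional formulas for the per-90 and percentage columns in plain Python. The dataset's own computation is not documented here, so treat these as the usual definitions rather than the author's code; the sample row and the helper names `per_90` and `percent` are made up for illustration.

```python
def per_90(count: float, minutes_played: float) -> float:
    """Rate of `count` per 90 minutes played (0.0 if no minutes)."""
    nineties = minutes_played / 90  # same quantity as Mins_Per_90
    return count / nineties if nineties else 0.0


def percent(part: float, whole: float) -> float:
    """`part` as a percentage of `whole` (0.0 if `whole` is zero)."""
    return 100 * part / whole if whole else 0.0


# Hypothetical season line; the values are NOT taken from the dataset.
player = {
    "Minutes_Played": 1800,
    "Goals": 10,
    "PK": 2,
    "Shots": 50,
    "Shots_On_Target": 20,
}

derived = {
    "Mins_Per_90": player["Minutes_Played"] / 90,
    "Gls_Per90": per_90(player["Goals"], player["Minutes_Played"]),
    "G_minus_PK_Per": per_90(player["Goals"] - player["PK"],
                             player["Minutes_Played"]),
    "SoT_percent": percent(player["Shots_On_Target"], player["Shots"]),
    "G_per_Sh": player["Goals"] / player["Shots"],
}
print(derived)
```

    Guarding against zero minutes or zero shots matters in practice, since players with no appearances would otherwise raise a division error.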

  3. 5.7M+ Records -Most Comprehensive Football Dataset

    • kaggle.com
    zip
    Updated Sep 15, 2025
    Cite
    salimt (2025). 5.7M+ Records -Most Comprehensive Football Dataset [Dataset]. https://www.kaggle.com/datasets/xfkzujqjvx97n/football-datasets
    Available download formats
    zip (85313220 bytes)
    Dataset updated
    Sep 15, 2025
    Authors
    salimt
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About Dataset – TL;DR

    Comprehensive football (soccer) data lake from Transfermarkt, clean and structured for analysis and machine learning.

    • 93,000+ players worldwide
    • 2,200+ clubs across all major leagues
    • 5.7M+ total records across 10 categories
    • 902,000+ market valuations
    • 1.9M+ player performance stats
    • 1.2M+ player transfer histories
    • 144,000+ injuries & 93,000+ national team appearances
    • 1.3M+ teammate relationships

    Everything in raw CSV format – perfect for EDA, ML, and advanced football analytics.

    The Most Comprehensive Transfermarkt Football Dataset

    A complete football data lake covering players, teams, transfers, performances, market values, injuries, and national team stats. Perfect for analysts, data scientists, researchers, and enthusiasts.

    🗺 Entity-Relationship Overview

    Here’s the high-level schema to help you understand the dataset structure:

    Transfermarkt Dataset ER Diagram: https://i.imgur.com/WXLIx3L.png

    📊 Key Coverage

    • Players: 93,000+ professional players
    • Teams: 2,200+ clubs, 7,700+ club relationships
    • Data Volume: 5.7M+ total records
    • Global Scope: Major leagues and competitions worldwide

    🗂 Data Structure

    Organized into 10 well-structured CSV categories:

    Player Data (7 categories)

    • Player Profiles
    • Performances (matches, goals, assists, cards, minutes)
    • Market Values (historical valuations)
    • Transfer Histories
    • Injury Records
    • National Team Performances
    • Teammate Networks

    Team Data (3 categories)

    • Team Details (club info)
    • Competitions & Seasons
    • Parent/Child Team Relations

    🔗 What’s Inside?

    • 902K+ market value records to track valuation trends
    • 1.1M+ transfer histories with fees and movement
    • 1.9M+ performance stats across seasons and competitions
    • 144K+ injury records with days and matches missed
    • 93K+ national team appearances
    • 1.3M+ teammate relationships for chemistry analysis

    💡 Why This Dataset?

    Most football datasets are pre-processed and restrictive. This one is raw, rich, and flexible:

    • Build custom KPIs and models
    • Perform deep exploratory analysis (EDA)
    • Train machine learning and prediction pipelines
    • Combine with other football data sources

    🚀 Example Use Cases

    • Predictive Modeling – Player ratings, transfer value forecasts, injury risk
    • Data Visualization & Dashboards – Club comparisons, performance analytics
    • Scouting & Recruitment – Discover undervalued talent
    • Network Analysis – Teammate relationships and synergy

    🖥 Technical Details

    • Format: CSV files, UTF-8 encoded
    • Easy to Use: Ready for Python (pandas, numpy), R, SQL, BI tools
    • Scalable: 5.7M+ rows for big-data analysis
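
    Since the files are plain UTF-8 CSV, they can be read with Python's standard `csv` module and joined on a shared key. The sketch below is only an illustration: the two in-memory tables stand in for a player-profiles file and a market-values file, and every column name (`player_id`, `player_name`, `date`, `value`) is hypothetical, since the real schema is not listed here.

```python
import csv
import io

# Stand-ins for two CSV files; contents and column names are hypothetical.
PROFILES_CSV = """player_id,player_name
1,Player A
2,Player B
"""

MARKET_VALUES_CSV = """player_id,date,value
1,2023-01-01,1000000
1,2024-01-01,1500000
2,2024-01-01,400000
"""


def read_rows(text: str) -> list[dict]:
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def latest_values(value_rows: list[dict]) -> dict[str, int]:
    """Keep the most recent valuation per player (ISO dates sort as text)."""
    latest: dict[str, dict] = {}
    for row in value_rows:
        pid = row["player_id"]
        if pid not in latest or row["date"] > latest[pid]["date"]:
            latest[pid] = row
    return {pid: int(row["value"]) for pid, row in latest.items()}


profiles = read_rows(PROFILES_CSV)
values = latest_values(read_rows(MARKET_VALUES_CSV))

# Join profiles to their latest valuation on the shared key.
report = {p["player_name"]: values.get(p["player_id"]) for p in profiles}
print(report)
```

    For files on disk, `open(path, newline="", encoding="utf-8")` can replace `io.StringIO`; with millions of rows, streaming row by row instead of building a list would keep memory use down.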

    💡 Working on a Cool Project?

    I’m always excited to collaborate on innovative football data projects. If you’ve got an idea, let’s make it happen together!

    📬 Contact Me

    • GitHub: @salimt
    • Issues: Feel free to use GitHub Issues if you’ve got dataset-specific questions.

    Support & Visibility

    If this dataset helps you:
    - Upvote on Kaggle
    - Star the GitHub repo
    - Share with others in the football analytics community

    Tags

    football analytics soccer dataset transfermarkt sports analytics machine learning football research player statistics

    🔥 Analyze football like never before. Your next AI or analytics project starts here.

  4. NOAA/WDS Paleoclimatology - Ballard fire data from Big Fish Lake - IMPD...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Mar 1, 2025
    Cite
    NOAA World Data Service for Paleoclimatology (Point of Contact) (2025). NOAA/WDS Paleoclimatology - Ballard fire data from Big Fish Lake - IMPD USBGF001 [Dataset]. https://catalog.data.gov/dataset/noaa-wds-paleoclimatology-ballard-fire-data-from-big-fish-lake-impd-usbgf0011
    Dataset updated
    Mar 1, 2025
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    This archived Paleoclimatology Study is available from the NOAA National Centers for Environmental Information (NCEI), under the World Data Service (WDS) for Paleoclimatology. The associated NCEI study type is Fire. The data include parameters of fire history and paleolimnology with a geographic location of Michigan, United States Of America. The time period coverage, in calendar years before present (BP), is unavailable: no begin or end date is given. See metadata information for parameter and study location details. Please cite this study when using the data.

  5. Data from: USHAP: Big Data Seamless 1 km Ground-level PM2.5 Dataset for the...

    • iro.uiowa.edu
    • data.niaid.nih.gov
    Updated May 1, 2023
    Cite
    Jing Wei; Jun Wang; Zhanqing Li (2023). USHAP: Big Data Seamless 1 km Ground-level PM2.5 Dataset for the United States [Dataset]. https://iro.uiowa.edu/esploro/outputs/dataset/USHAP-Big-Data-Seamless-1-km/9984702835302771
    Dataset updated
    May 1, 2023
    Dataset provided by
    Zenodo
    Authors
    Jing Wei; Jun Wang; Zhanqing Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 1, 2023
    Area covered
    United States
    Description

    USHAP (USHighAirPollutants) is one of the series of long-term, full-coverage, high-resolution, and high-quality datasets of ground-level air pollutants for the United States. It is generated from the big data (e.g., ground-based measurements, satellite remote sensing products, atmospheric reanalysis, and model simulations) using artificial intelligence by considering the spatiotemporal heterogeneity of air pollution. This is the big data-derived seamless (spatial coverage = 100%) daily, monthly, and yearly 1 km (i.e., D1K, M1K, and Y1K) ground-level PM2.5 dataset in the United States from 2000 to 2020. Our daily PM2.5 estimates agree well with ground measurements with an average cross-validation coefficient of determination (CV-R2) of 0.82 and normalized root-mean-square error (NRMSE) of 0.40, respectively. All the data will be made public online once our paper is accepted, and if you want to use the USHighPM2.5 dataset for related scientific research, please contact us (Email: weijing_rs@163.com; weijing@umd.edu). Wei, J., Wang, J., Li, Z., Kondragunta, S., Anenberg, S., Wang, Y., Zhang, H., Diner, D., Hand, J., Lyapustin, A., Kahn, R., Colarco, P., da Silva, A., and Ichoku, C. Long-term mortality burden trends attributed to black carbon and PM2.5 from wildfire emissions across the continental USA from 2000 to 2020: a deep learning modelling study. The Lancet Planetary Health, 2023, 7, e963–e975. https://doi.org/10.1016/S2542-5196(23)00235-8 More air quality datasets of different air pollutants can be found at: https://weijing-rs.github.io/product.html
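
    The two validation scores quoted above can be illustrated with a short sketch. It assumes the coefficient of determination is computed as 1 - SS_res / SS_tot and that RMSE is normalized by the mean of the observations; the description does not state which conventions the authors used, and the sample numbers below are made up.

```python
import math


def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def nrmse(observed: list[float], predicted: list[float]) -> float:
    """RMSE divided by the mean observation (one common normalization)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / (sum(observed) / len(observed))


# Made-up PM2.5 values; NOT taken from the dataset.
station_obs = [10.0, 20.0, 30.0]
model_est = [12.0, 18.0, 30.0]
print(round(r_squared(station_obs, model_est), 3))  # 0.96
print(round(nrmse(station_obs, model_est), 4))
```

    In a cross-validation setting, the predictions would come from models fitted without the held-out stations, and the scores would be averaged over folds.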

  6. nfl-big-data-bowl-2021 Feather files

    • kaggle.com
    zip
    Updated Oct 15, 2020
    Cite
    Mathurin Aché (2020). nfl-big-data-bowl-2021 Feather files [Dataset]. https://www.kaggle.com/mathurinache/nflbigdatabowl2021-feather-files
    Available download formats
    zip (475231363 bytes)
    Dataset updated
    Oct 15, 2020
    Authors
    Mathurin Aché
    Description

    When a quarterback takes a snap and drops back to pass, what happens next may seem like chaos. As offensive players move in various patterns, the defense works together to prevent successful pass completions and then to quickly tackle receivers that do catch the ball. In this year’s Kaggle competition, your goal is to use data science to better understand the schemes and players that make for a successful defense against passing plays.

    In American football, there are a plethora of defensive strategies and outcomes. The National Football League (NFL) has used previous Kaggle competitions to focus on offensive plays, but as the old proverb goes, “defense wins championships.” Though metrics for analyzing quarterbacks, running backs, and wide receivers are consistently a part of public discourse, techniques for analyzing the defensive part of the game trail and lag behind. Identifying player, team, or strategic advantages on the defensive side of the ball would be a significant breakthrough for the game.

    This competition uses NFL’s Next Gen Stats data, which includes the position and speed of every player on the field during each play. You’ll employ player tracking data for all drop-back pass plays from the 2018 regular season. The goal of submissions is to identify unique and impactful approaches to measure defensive performance on these plays. There are several different directions for participants to ‘tackle’ (ha)—which may require levels of football savvy, data aptitude, and creativity. As examples:

    What are coverage schemes (man, zone, etc) that the defense employs? What coverage options tend to be better performing? Which players are the best at closely tracking receivers as they try to get open? Which players are the best at closing on receivers when the ball is in the air? Which players are the best at defending pass plays when the ball arrives? Is there any way to use player tracking data to predict whether or not certain penalties – for example, defensive pass interference – will be called? Who are the NFL’s best players against the pass? How does a defense react to certain types of offensive plays? Is there anything about a player – for example, their height, weight, experience, speed, or position – that can be used to predict their performance on defense? What does data tell us about defending the pass play? You are about to find out.

    Note: Are you a university participant? Students have the option to participate in a college-only Competition, where you’ll work on the identical themes above. Students can opt-in for either the Open or College Competitions, but not both.

  7. Midjourney Statistics 2025: From Growth Data to Big Impact

    • sqmagazine.co.uk
    Updated Nov 1, 2025
    Cite
    SQ Magazine (2025). Midjourney Statistics 2025: From Growth Data to Big Impact [Dataset]. https://sqmagazine.co.uk/midjourney-statistics/
    Dataset updated
    Nov 1, 2025
    Dataset authored and provided by
    SQ Magazine
    License

    https://sqmagazine.co.uk/privacy-policy/

    Time period covered
    Jan 1, 2024 - Dec 31, 2025
    Area covered
    Global
    Description

    The AI image generator Midjourney has rapidly shifted from a niche tool to a mainstream creative engine. Artists and brands alike now use it for concept art, marketing visuals, and rapid prototyping, while design teams employ it to streamline workflows and reduce production time. In this article, you’ll find detailed...

  8. Data from: Uncertainty in the modeling of spatial big data on a pattern of...

    • phys-techsciences.datastations.nl
    txt, zip
    Updated Mar 26, 2018
    Cite
    V. A. Tolpekin; V. A. Tolpekin (2018). Uncertainty in the modeling of spatial big data on a pattern of bushfires holes [Dataset]. http://doi.org/10.17026/DANS-Z2M-44E4
    Available download formats
    txt (47937), zip (13193), txt (32988)
    Dataset updated
    Mar 26, 2018
    Dataset provided by
    DANS Data Station Physical and Technical Sciences
    Authors
    V. A. Tolpekin; V. A. Tolpekin
    License

    https://doi.org/10.17026/fp39-0x58

    Description

    The datasets used are publicly accessible. The database of forest fire footprints is available at https://www.ffm.vic.gov.au/. The Landsat images are also publicly accessible: https://earthexplorer.usgs.gov/. The code for the analysis is attached (Bush_fires_v2_2.R), and the results can be reproduced using it. The figures for the publication were created using the other R code attached to this email.

  9. College Basketball Big 12 Conference (1996 - 2021)

    • kaggle.com
    zip
    Updated Jul 22, 2022
    Cite
    Matt OP (2022). College Basketball Big 12 Conference (1996 - 2021) [Dataset]. https://www.kaggle.com/datasets/mattop/college-basketball-big-12-conference-1996-2021
    Available download formats
    zip (17330 bytes)
    Dataset updated
    Jul 22, 2022
    Authors
    Matt OP
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset contains yearly College Basketball Big 12 Conference data for every active Big 12 season.

    The data was collected from Sports Reference then cleaned for data analysis.

    Tabular data includes:
    - year
    - rank
    - school
    - games: Total games played
    - wins
    - losses
    - win_percentage
    - conference_wins
    - conference_losses
    - home_wins
    - home_losses
    - away_wins
    - away_losses
    - offensive_rating
    - defensive_rating
    - net_rating
    - simple_rating

    Per Game:
    - field_goals
    - field_goal_attempts
    - field_goal_percentage
    - 3_pointers
    - 3_pointer_attempts
    - 3_pointer_percentage
    - effective_field_goal_percentage
    - free_throws
    - free_throw_attempts
    - free_throw_percentage
    - offensive_rebounds
    - total_rebounds
    - assists
    - steals
    - blocks
    - turnovers
    - personal_fouls
    - points
    - opponent_points

  10. Data from: INTERLEAVED TACTICAL TRAINING OF BIG FOOTBALL TEAMS

    • scielo.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Jiarong Wu (2023). INTERLEAVED TACTICAL TRAINING OF BIG FOOTBALL TEAMS [Dataset]. http://doi.org/10.6084/m9.figshare.19915198.v1
    Available download formats
    xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO journals
    Authors
    Jiarong Wu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    ABSTRACT
    Introduction: Tactical football training is significant in teaching great football teams. Analyzing and discussing existing problems and proposals for corresponding countermeasures should be carried out periodically.
    Objective: Investigate and understand the main factors that affect the development of tactical training activities of big football teams.
    Methods: Large-scale soccer match tactics at the 2018 World Cup are evaluated and treated statistically by dividing the defensive behaviors in the game between individual defensive tactics and collective defensive tactics.
    Results: The primary means of launching a fast defensive attack is a medium to long pass across the court. Launching a fast attack requires combining a pass with a sudden attack.
    Conclusion: Attackers often take the initiative in their confrontation tactics. The aggressive style of the players excels in the initiative and midfield advantage.
    Evidence level II; Therapeutic Studies - Investigating the results.

