59 datasets found
  1. Airline Dataset

    • kaggle.com
    Updated Sep 26, 2023
    Cite
    Sourav Banerjee (2023). Airline Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset
    Dataset updated
    Sep 26, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sourav Banerjee
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.

    Content

    This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.

    Dataset Glossary (Column-wise)

    • Passenger ID - Unique identifier for each passenger
    • First Name - First name of the passenger
    • Last Name - Last name of the passenger
    • Gender - Gender of the passenger
    • Age - Age of the passenger
    • Nationality - Nationality of the passenger
    • Airport Name - Name of the airport where the passenger boarded
    • Airport Country Code - Country code of the airport's location
    • Country Name - Name of the country the airport is located in
    • Airport Continent - Continent where the airport is situated
    • Continents - Continents involved in the flight route
    • Departure Date - Date when the flight departed
    • Arrival Airport - Destination airport of the flight
    • Pilot Name - Name of the pilot operating the flight
    • Flight Status - Current status of the flight (e.g., on-time, delayed, canceled)

    Structure of the Dataset

    https://i.imgur.com/cUFuMeU.png

    Acknowledgement

    The dataset provided here is a simulated example and was generated using the online platform found at Mockaroo. This web-based tool offers a service that enables the creation of customizable Synthetic datasets that closely resemble real data. It is primarily intended for use by developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.

    Cover Photo by: Kevin Woblick on Unsplash

    Thumbnail by: Airplane icons created by Freepik - Flaticon

  2. CADO airplane database

    • entrepot.recherche.data.gouv.fr
    Updated Aug 13, 2024
    Cite
    Nicolas MONROLIN; Thierry DRUOT; Nicolas PETEILH; Pascal ROCHES; Yri-Amandine KAMBIRI (2024). CADO airplane database [Dataset]. http://doi.org/10.57745/LLRJO0
    Available download formats: tsv (134740), text/comma-separated-values (63422), txt (25819)
    Dataset updated
    Aug 13, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Nicolas MONROLIN; Thierry DRUOT; Nicolas PETEILH; Pascal ROCHES; Yri-Amandine KAMBIRI
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.3/customlicense?persistentId=doi:10.57745/LLRJO0

    Dataset funded by
    Fédération de recherche ENAC ISAE-SUPAERO ONERA
    Description

    This database contains data on nearly 230 airplanes. Each airplane is described by 31 parameters, such as name, IATA code and category (general, commuter, regional, short-medium, long range), geometry, mass, max speed, typical cruise Mach number, typical range, typical approach speed, take-off field length, landing field length, number of engines, type of engine, typical engine model, bypass ratio, and max thrust or max power. The database relies on various sources such as manufacturer websites, flight manuals (if available), books, and Eurocontrol aircraft performance data. These data are NOT intended to be used in an operational context. They were gathered to provide orders of magnitude and to find trends between various parameters in a preliminary aircraft design context. Please consider citing the following publication if you reuse this dataset.

  3. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • zenodo.org
    • data.niaid.nih.gov
    Updated Dec 16, 2022
    Cite
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. http://doi.org/10.5281/zenodo.7148117
    Available download formats: application/gzip, bin, zip
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Raphael Monstein; Benoit Figuet; Timothé Krauth; Manuel Waltert; Marcel Dettling
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

    Column name            Type       Description
    time                   date time  UTC time of landing or first GA attempt
    icao24                 string     Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsign               string     Aircraft identifier in air-ground communications
    airport                string     ICAO airport code where the aircraft is landing
    runway                 string     Runway designator on which the aircraft landed
    has_ga                 string     "True" if at least one GA was performed, otherwise "False"
    n_approaches           integer    Number of approaches identified for this flight
    n_rwy_approached       integer    Number of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These flights usually have a large number of approaches, so an easy way to exclude them is to drop the rows with n_approaches > 2.
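    The filter on n_approaches described above can be sketched in pandas as follows. This is a minimal illustration on made-up rows; the DataFrame contents and the name MAX_APPROACHES are assumptions for the example, and only the column names and the threshold of 2 come from the description.

```python
import pandas as pd

# Toy rows standing in for go_arounds_minimal.csv.gz (all values are made up).
df = pd.DataFrame(
    {
        "icao24": ["4b1805", "4b1806", "4b1807", "4b1808"],
        "airport": ["LSZH", "LSZH", "EGLC", "EGLC"],
        "has_ga": ["False", "True", "True", "False"],
        "n_approaches": [1, 2, 7, 1],
        "n_rwy_approached": [1, 1, 3, 1],
    }
)

MAX_APPROACHES = 2  # hypothetical name; threshold taken from the description

# Keep regular flights and drop likely training/calibration flights,
# i.e. the rows with n_approaches > 2.
regular = df[df["n_approaches"] <= MAX_APPROACHES].reset_index(drop=True)
print(regular)
```

    A stricter filter could also use n_rwy_approached, but the description does not suggest a threshold for that column.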

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

    Column name            Type       Description
    time                   date time  UTC time of landing or first GA attempt
    icao24                 string     Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsign               string     Aircraft identifier in air-ground communications
    airport                string     ICAO airport code where the aircraft is landing
    runway                 string     Runway designator on which the aircraft landed
    has_ga                 string     "True" if at least one GA was performed, otherwise "False"
    n_approaches           integer    Number of approaches identified for this flight
    n_rwy_approached       integer    Number of unique runways approached by this flight
    registration           string     Aircraft registration
    typecode               string     Aircraft ICAO typecode
    icaoaircrafttype       string     ICAO aircraft type
    wtc                    string     ICAO wake turbulence category
    glide_slope_angle      float      Angle of the ILS glide slope in degrees
    has_intersection       string     Boolean that is true if the runway has another runway intersecting it, otherwise false
    rwy_length             float      Length of the runway in kilometres
    airport_country        string     ISO Alpha-3 country code of the airport
    airport_region         string     Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    operator_country       string     ISO Alpha-3 country code of the operator
    operator_region        string     Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    wind_speed_knts        integer    METAR, surface wind speed in knots
    wind_dir_deg           integer    METAR, surface wind direction in degrees
    wind_gust_knts         integer    METAR, surface wind gust speed in knots
    visibility_m           float      METAR, visibility in m
    temperature_deg        integer    METAR, temperature in degrees Celsius
    press_sea_level_p      float      METAR, sea level pressure in hPa
    press_p                float      METAR, QNH in hPa
    weather_intensity      list       METAR, list of present weather codes: qualifier - intensity
    weather_precipitation  list       METAR, list of present weather codes: weather phenomena - precipitation
    weather_desc           list       METAR, list of present weather codes: qualifier - descriptor
    weather_obscuration    list       METAR, list of present weather codes: weather phenomena - obscuration
    weather_other          list       METAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft-related data is mostly from the OpenSky Network's aircraft data base, the METAR information is from Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

    Column name            Type       Description
    airport                string     ICAO airport code where the aircraft is landing
    runway                 string     Runway designator on which the aircraft landed
    n_landings             integer    Total number of landings observed on this runway in 2019
    ga_rate                float      Go-around rate, per 1000 landings
    glide_slope_angle      float      Angle of the ILS glide slope in degrees
    has_intersection       string     Boolean that is true if the runway has another runway intersecting it, otherwise false
    rwy_length             float      Length of the runway in kilometres
    airport_country        string     ISO Alpha-3 country code of the airport
    airport_region         string     Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

    This aggregated data set is used in the paper for the generalized linear regression model.
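    To make the n_landings and ga_rate columns concrete, the sketch below derives a go-around rate per 1000 landings for each airport-runway pair from rows shaped like the minimal data set. The rows are made up, and whether the authors computed go_arounds_agg.csv.gz in exactly this way is an assumption; the helper column names is_ga and n_ga are hypothetical.

```python
import pandas as pd

# Toy rows shaped like go_arounds_minimal.csv.gz (values are made up).
minimal = pd.DataFrame(
    {
        "airport": ["EGLC", "EGLC", "EGLC", "EGLC", "LSZH", "LSZH"],
        "runway": ["09", "09", "09", "27", "14", "14"],
        "has_ga": ["True", "False", "False", "False", "False", "True"],
    }
)

# has_ga is documented as the strings "True"/"False"; comparing as text also
# works if pandas has already parsed the column into booleans.
is_ga = minimal["has_ga"].astype(str).eq("True")

agg = (
    minimal.assign(is_ga=is_ga)
    .groupby(["airport", "runway"], as_index=False)
    .agg(n_landings=("is_ga", "size"), n_ga=("is_ga", "sum"))
)

# Go-around rate per 1000 landings, as in the ga_rate column description.
agg["ga_rate"] = 1000 * agg["n_ga"] / agg["n_landings"]
print(agg)
```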

    Downloading the trajectories

    Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical data base with a few lines of Python code. For example, suppose you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:

    import datetime
    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    # load minimum data set
    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
    df["time"] = pd.to_datetime(df["time"])

    # select London City Airport, go-arounds, and 2019-01-04
    airport = "EGLC"
    start = datetime.datetime(year=2019, month=1, day=4).replace(
        tzinfo=datetime.timezone.utc
    )
    stop = datetime.datetime(year=2019, month=1, day=5).replace(
        tzinfo=datetime.timezone.utc
    )

    df_selection = df.query("airport==@airport & has_ga

  4. Airlines Flights Data

    • kaggle.com
    Updated Jul 29, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Science Lovers (2025). Airlines Flights Data [Dataset]. https://www.kaggle.com/datasets/rohitgrewal/airlines-flights-data
    Dataset updated
    Jul 29, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Data Science Lovers
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    📹Project Video available on YouTube - https://youtu.be/gu3Ot78j_Gc

    Airlines Flights Dataset for Different Cities

    The flight-booking dataset covers various airlines and was scraped date-wise from a well-known website in a structured format. It contains records of flight travel details between cities in India, with features such as Source & Destination City, Arrival & Departure Time, Duration, and Price of the flight.

    This data is available as a CSV file. We are going to analyze this data set using the Pandas DataFrame.

    This analysis will be helpful for those working in the airline and travel domains.

    Using this dataset, we answered multiple questions with Python in our Project.

    Q.1. What are the airlines in the dataset, accompanied by their frequencies?

    Q.2. Show Bar Graphs representing the Departure Time & Arrival Time.

    Q.3. Show Bar Graphs representing the Source City & Destination City.

    Q.4. Does the price vary between airlines?

    Q.5. Does ticket price change based on the departure time and arrival time?

    Q.6. How does the price change with a change in Source and Destination?

    Q.7. How is the price affected when tickets are bought just 1 or 2 days before departure?

    Q.8. How does the ticket price vary between Economy and Business class?

    Q.9. What is the average price of a Vistara flight from Delhi to Hyderabad in Business class?

    These are the main Features/Columns available in the dataset :

    1) Airline: The name of the airline company is stored in the airline column. It is a categorical feature having 6 different airlines.

    2) Flight: Flight stores information regarding the plane's flight code. It is a categorical feature.

    3) Source City: City from which the flight takes off. It is a categorical feature having 6 unique cities.

    4) Departure Time: This is a derived categorical feature created by grouping time periods into bins. It stores information about the departure time and has 6 unique time labels.

    5) Stops: A categorical feature with 3 distinct values that stores the number of stops between the source and destination cities.

    6) Arrival Time: This is a derived categorical feature created by grouping time intervals into bins. It has six distinct time labels and keeps information about the arrival time.

    7) Destination City: City where the flight will land. It is a categorical feature having 6 unique cities.

    8) Class: A categorical feature that contains information on seat class; it has two distinct values: Business and Economy.

    9) Duration: A continuous feature that displays the overall amount of time it takes to travel between cities in hours.

    10) Days Left: This is a derived feature calculated by subtracting the booking date from the trip date.

    11) Price: The target variable; it stores the ticket price.
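    As an illustration of how a question like Q.9 could be answered with the features listed above, here is a minimal pandas sketch on made-up rows. The snake_case column names (airline, source_city, destination_city, class, price), the airline label "Air_India", and all prices are assumptions for the example; the actual CSV headers and values may differ.

```python
import pandas as pd

# Toy rows; column names and values are assumed, not taken from the real CSV.
flights = pd.DataFrame(
    {
        "airline": ["Vistara", "Vistara", "Vistara", "Air_India", "Vistara"],
        "source_city": ["Delhi", "Delhi", "Delhi", "Delhi", "Mumbai"],
        "destination_city": ["Hyderabad"] * 5,
        "class": ["Business", "Business", "Economy", "Business", "Business"],
        "price": [40000, 50000, 6000, 45000, 52000],
    }
)

# Combine one boolean condition per feature, then average the matching prices.
mask = (
    (flights["airline"] == "Vistara")
    & (flights["source_city"] == "Delhi")
    & (flights["destination_city"] == "Hyderabad")
    & (flights["class"] == "Business")
)
average_price = flights.loc[mask, "price"].mean()
print(average_price)
```

    The column is accessed as flights["class"] rather than flights.class because class is a reserved word in Python.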

  5. AeroSonicDB (YPAD-0523): Labelled audio dataset for acoustic detection and...

    • data.niaid.nih.gov
    Updated Aug 1, 2024
    Cite
    Downward, Blake (2024). AeroSonicDB (YPAD-0523): Labelled audio dataset for acoustic detection and classification of aircraft [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8000468
    Dataset updated
    Aug 1, 2024
    Dataset authored and provided by
    Downward, Blake
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    AeroSonicDB (YPAD-0523): Labelled audio dataset for acoustic detection and classification of aircraft
    Version 1.1.2 (November 2023)

    [UPDATE: June 2024]

    Version 2.0 is currently in beta and can be found at https://zenodo.org/records/12775560. The repository is currently restricted; however, you can gain access by emailing Blake Downward at aerosonicdb@gmail.com, or by submitting the following Google Form.

    Version 2 vastly extends the number of aircraft audio samples to over 3,000 (V1 contains 625 aircraft samples), for more than 38 hours of strongly annotated aircraft audio (V1 contains 8.9 hours of aircraft audio).

    Publication

    When using this data in an academic work, please reference the dataset DOI and version. Please also reference the following paper which describes the methodology for collecting the dataset and presents baseline model results.

    Downward, B., & Nordby, J. (2023). The AeroSonicDB (YPAD-0523) Dataset for Acoustic Detection and Classification of Aircraft. ArXiv, abs/2311.06368.

    Description

    AeroSonicDB:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of environmental noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport's (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead, then human verified and annotated with the first and final moments which the target aircraft is audible.

    A total of 1,895 audio clips are distributed across two top-level classes, "Aircraft" (8.87 hours) and "Silence" (3.52 hours). The aircraft class is then further broken-down into four subclasses, which broadly describe the structure of the aircraft and propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.

    For convenience, the dataset has been split into training (10.04 hours) and testing (2.35 hours) subsets, with the training set further split into 5 distinct folds for cross-validation. These splits are performed to prevent data-leakage between folds and the test set, ensuring samples collected in the same recording session (distinct in time, location and microphone) are assigned to the same fold.

    Researchers may find applications for this dataset in a number of fields; particularly aircraft noise isolation and noise monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.

    Audio data

    ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically trigger, capture and label audio samples. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device (see "Location data" below for specifics). The resulting audio file is labelled with the unique ICAO identifier code for the aircraft, as well as its last reported altitude, date, time, location and microphone. The recording is then human verified and annotated with timestamps for the first and last moments the aircraft is audible. In total, AeroSonicDB contains 625 recordings of low-altitude aircraft - varying in length from 18 to 60 seconds, for a total of 8.87 hours of aircraft audio.

    A collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location-specific environmental noises from aircraft noises. 10-second background noise, or "silence", recordings are triggered only when no aircraft are broadcasting that they are within a specified distance of the recording device (see "Location data" below). These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,270 clips of silence/urban background noise.

    Location data

    Recordings have been collected from three (3) locations. GPS coordinates for each location are provided in the "locations.json" file. In order to protect privacy, coordinates have been provided for a road or public space near the recording device instead of its exact location.

    Location: 0

    Situated in a suburban environment approximately 15.5 km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take off to the north-east; however, not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set to 3 km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.

    "Silence" or ambient background noises at this location include: cars, motorbikes, light trucks, garbage trucks, power tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).

    Location: 1

    Situated approximately 500 m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands) with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and also birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1 km. Due to their low altitude, aircraft are louder but audible for a shorter time compared to "Location 0".

    Location: 2

    As an alternative to "Location 1", this location is situated approximately 950 m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west. This location offers greater wind protection and further distance from airport and highway noises. Ambient background sounds feature close-proximity cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.

    Aircraft metadata

    Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to believe any significant errors exist in the "aircraft_meta" files; however, future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.

    Class/subclass ontology (minutes of recordings)

    1. no aircraft (211)
        0: no aircraft (211)

    2. aircraft (533)
        1: piston-propeller aeroplane (30)
        2: turbine-propeller aeroplane (90)
        3: turbine-fan aeroplane (409)
        4: rotorcraft (4)

    The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.

    Data splits

    Audio recordings have been split into training (81%) and test (19%) sets. The training set has further been split into 5 folds, giving researchers a common split to perform 5-fold cross-validation to ensure reproducibility and comparable results. Data leakage into the test set has been avoided by ensuring test recordings are disjoint from the training set in time and location, meaning samples in the test set for a particular location were recorded after any samples included in the training set for that location.

    Labelled data

    The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording, its meta information, annotations and airframe features.

    Alternatively, these labels can be derived directly from the filename of the sample (see below). The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see "Columns/Labels" below for all features).

    File naming convention

    Audio samples are in WAV format, with some metadata stored in the filename.

    Basic Convention

    "Aircraft ID + Date + Time + Location ID + Microphone ID"

    "XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X"

    Sample with aircraft

    {hex_id} _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}

    7C7CD0_2023-05-09_12-42-55_2_1.wav

    Sample without aircraft

    "Silence" files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for "silence" samples are contained in the audio filename, and again in the accompanying "sample_meta.csv"

    000000 _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}

    000000_2023-05-09_12-30-55_2_1.wav
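    The naming convention above can be parsed with a small helper. This is a minimal sketch: the names SampleName and parse_sample_name are hypothetical, multi-digit location and microphone IDs are accepted as an assumption, and the timestamp is kept timezone-naive because the description does not state a time zone.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class SampleName(NamedTuple):
    hex_id: Optional[str]  # None for "silence" samples (six leading zeros)
    recorded_at: datetime  # naive timestamp; time zone is not documented
    location_id: int
    microphone_id: int
    file_ext: str


# {hex_id}_{date}_{time}_{location_id}_{microphone_id}.{file_ext}
_PATTERN = re.compile(
    r"^(?P<hex_id>[0-9A-Fa-f]{6})_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<time>\d{2}-\d{2}-\d{2})_"
    r"(?P<location_id>\d+)_"
    r"(?P<microphone_id>\d+)\."
    r"(?P<file_ext>\w+)$"
)


def parse_sample_name(filename: str) -> SampleName:
    """Split an audio filename into the fields of the naming convention."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    hex_id = match["hex_id"].upper()
    recorded_at = datetime.strptime(
        f"{match['date']} {match['time']}", "%Y-%m-%d %H-%M-%S"
    )
    return SampleName(
        hex_id=None if hex_id == "000000" else hex_id,
        recorded_at=recorded_at,
        location_id=int(match["location_id"]),
        microphone_id=int(match["microphone_id"]),
        file_ext=match["file_ext"],
    )


aircraft = parse_sample_name("7C7CD0_2023-05-09_12-42-55_2_1.wav")
silence = parse_sample_name("000000_2023-05-09_12-30-55_2_1.wav")
print(aircraft)
print(silence)
```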

    Columns/Labels

    (found in sample_meta.csv, aircraft_meta.csv/json files)

    train-test: Train-test split (train, test)

    fold: Digit from 1 to 5 splitting the training data 5 ways (else test)

    filename: The filename of the audio recording

    date: Date of the recording

    time: Time of the recording

    location: ID for the location of the recording

    mic: ID of the microphone used

    class: Top-level label for the recording (eg. 0 = No aircraft, 1 = Aircraft audible)

    subclass: Subclass label for the recording (eg. 0 = No aircraft, 3 = Turbine-fan aeroplane)

    altitude: Approximate altitude of the aircraft (in feet) at the start of the recording

    hex_id: Unique ICAO 24-bit address for the aircraft recorded

    session: Unique recording
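    The train-test and fold columns can drive the 5-fold cross-validation described under "Data splits". The sketch below uses made-up rows; the helper name iter_folds, the placeholder filenames, and the assumption that fold is stored as the text "1" to "5" (or "test") are illustrative and not confirmed by the source.

```python
import pandas as pd

# Toy stand-in for sample_meta.csv (values are made up; encoding is assumed).
meta = pd.DataFrame(
    {
        "filename": [f"clip_{i}.wav" for i in range(7)],
        "train-test": ["train"] * 5 + ["test"] * 2,
        "fold": ["1", "2", "3", "4", "5", "test", "test"],
        "class": [1, 0, 1, 1, 0, 1, 0],
    }
)

train = meta[meta["train-test"] == "train"]
test = meta[meta["train-test"] == "test"]


def iter_folds(train_df):
    """Yield (fold, fit_rows, validation_rows) for each predefined fold."""
    for fold in sorted(train_df["fold"].unique()):
        val_mask = train_df["fold"] == fold
        yield fold, train_df[~val_mask], train_df[val_mask]


splits = list(iter_folds(train))
for fold, fit_df, val_df in splits:
    print(fold, len(fit_df), len(val_df))
```

    Using the predefined folds, rather than a fresh random split, keeps recordings from the same session together, which is the leakage safeguard the dataset description relies on.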

  6. US Airline Flight Routes and Fares 1993-2024

    • kaggle.com
    Updated Aug 4, 2024
    Cite
    Bhavik Jikadara (2024). US Airline Flight Routes and Fares 1993-2024 [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/us-airline-flight-routes-and-fares-1993-2024
    Dataset updated
    Aug 4, 2024
    Dataset provided by
    Kaggle
    Authors
    Bhavik Jikadara
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset provides detailed information on airline flight routes, fares, and passenger volumes within the United States from 1993 to 2024. The data includes metrics such as the origin and destination cities, distances between airports, the number of passengers, and fare information segmented by different airline carriers. It serves as a comprehensive resource for analyzing trends in air travel, pricing, and carrier competition over a span of three decades.

    Data Features:

    • tbl: Table identifier
    • Year: Year of the data record
    • quarter: Quarter of the year (1-4)
    • citymarketid_1: Origin city market ID
    • citymarketid_2: Destination city market ID
    • city1: Origin city name
    • city2: Destination city name
    • airportid_1: Origin airport ID
    • airportid_2: Destination airport ID
    • airport_1: Origin airport code
    • airport_2: Destination airport code
    • nsmiles: Distance between airports in miles
    • passengers: Number of passengers
    • fare: Average fare
    • carrier_lg: Code for the largest carrier by passengers
    • large_ms: Market share of the largest carrier
    • fare_lg: Average fare of the largest carrier
    • carrier_low: Code for the lowest fare carrier
    • lf_ms: Market share of the lowest fare carrier
    • fare_low: Lowest fare
    • Geocoded_City1: Geocoded coordinates for the origin city
    • Geocoded_City2: Geocoded coordinates for the destination city
    • tbl1apk: Unique identifier for the route
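    As a hedged illustration of how these features combine, the sketch below computes a passenger-weighted average fare per year and a fare per mile per route. The rows are made up, and the helper columns revenue, weighted_fare and fare_per_mile are names introduced for the example, not columns of the dataset.

```python
import pandas as pd

# Toy rows using column names from the feature list (values are made up).
routes = pd.DataFrame(
    {
        "Year": [2023, 2023, 2024, 2024],
        "quarter": [1, 1, 1, 1],
        "airport_1": ["ABE", "ABQ", "ABE", "ABQ"],
        "airport_2": ["PIE", "DFW", "PIE", "DFW"],
        "nsmiles": [1000, 500, 1000, 500],
        "passengers": [100, 300, 200, 200],
        "fare": [200.0, 100.0, 210.0, 110.0],
    }
)

# Weight each route's average fare by its passenger count so that busy
# routes count for more than thin ones.
routes["revenue"] = routes["fare"] * routes["passengers"]
yearly = routes.groupby("Year")[["revenue", "passengers"]].sum()
yearly["weighted_fare"] = yearly["revenue"] / yearly["passengers"]

# Fare per mile makes short and long routes comparable.
routes["fare_per_mile"] = routes["fare"] / routes["nsmiles"]

print(yearly)
print(routes[["airport_1", "airport_2", "fare_per_mile"]])
```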

    Potential Uses:

    • Market Analysis: Assess trends in air travel demand, fare changes, and market share of airlines over time.
    • Price Optimization: Develop models to predict optimal pricing strategies for airlines.
    • Route Planning: Identify profitable routes and underserved markets for new route planning.
    • Economic Studies: Analyze the economic impact of air travel on different cities and regions.
    • Travel Behavior Research: Study changes in passenger preferences and travel behavior over the years.
    • Competitor Analysis: Evaluate the performance of different airlines on various routes.
  7. Airline Delay Data

    • data.mendeley.com
    • narcis.nl
    Updated Dec 10, 2020
    Cite
    David Seymour (2020). Airline Delay Data [Dataset]. http://doi.org/10.17632/j3z5bm7496.1
    Dataset updated
    Dec 10, 2020
    Authors
    David Seymour
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data for examining how market structure affects delays for US domestic flights between 2004 and 2017.

    Data on airline delays come from the Airline On-Time Performance Data (OTPD) from the US Bureau of Transportation Statistics. The data on tail numbers and seat capacity come from the Federal Aviation Administration Aircraft Registry. The data on flight-related weather come from the Local Climatological Data (LCD) provided by the National Centers for Environmental Information.

  8. AeroSonic YPAD-0523: Labelled audio dataset for acoustic detection and...

    • zenodo.org
    csv, json, txt, zip
    Updated Sep 23, 2023
    Cite
    Blake Downward (2023). AeroSonic YPAD-0523: Labelled audio dataset for acoustic detection and classification of aircraft [Dataset]. http://doi.org/10.5281/zenodo.8004081
    Explore at:
    Available download formats: zip, csv, json, txt
    Dataset updated
    Sep 23, 2023
    Dataset provided by
    Zenodo
    Authors
    Blake Downward
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AeroSonic YPAD-0523: Labelled audio dataset for acoustic detection and classification of aircraft
    Version 0.2 (June 2023)

    Publication
    If using this data in an academic work, please reference the DOI and version.

    Description
    AeroSonic:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of aircraft noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport’s (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead. Each recording is then human verified, and trimmed to the best (subjective) 20 seconds of audio in which the target aircraft is audible.

    A total of 1,890 audio clips are balanced across two top-level classes, “Aircraft” (3.57 hours: 642 20-second recordings) and “Silence” (3.37 hours: 1,248 5- and 10-second recordings). The aircraft class is then further broken down into four unbalanced subclasses which broadly describe an aircraft's structure and propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.

    For convenience, the dataset has been split into training (6.28 hours) and testing (0.66 hours) subsets, with the training set further split into 10 folds for cross-validation. Care has been taken to ensure the class distribution for each subset and fold does not significantly deviate from the overall distribution.

    Researchers may find applications for this dataset in a number of fields; particularly aircraft noise isolation and monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.

    Audio data
    ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically capture and label audio recordings. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device. The file is labelled with a unique ICAO identifier code for the aircraft, as well as its last recorded altitude, date and time. The recording is then human verified and trimmed to 20 seconds - with the aircraft audible for the duration of the clip.

    A balanced collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location specific environmental noises from aircraft noises. 10-second background noise, or “silence” recordings are triggered only when there are no aircraft broadcasting that they are within a specified distance of the recording device. These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,180 10-second clips, and 68 5-second clips of silence/ambient background noise.
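
    The stated class durations can be checked from the clip counts and lengths given in the description. The short sketch below does that arithmetic; the variable names are illustrative only.

```python
# Clip counts and lengths as stated in the description.
aircraft_seconds = 642 * 20            # 642 aircraft clips of 20 seconds
silence_seconds = 1180 * 10 + 68 * 5   # 1,180 ten-second and 68 five-second clips

aircraft_hours = aircraft_seconds / 3600
silence_hours = silence_seconds / 3600
total_clips = 642 + 1180 + 68

print(f"aircraft: {aircraft_hours:.2f} h, silence: {silence_hours:.2f} h, clips: {total_clips}")
# prints "aircraft: 3.57 h, silence: 3.37 h, clips: 1890"
```

    The two durations sum to about 6.94 hours, which agrees with the stated 6.28-hour training and 0.66-hour testing subsets.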

    Location information
    Recordings have been collected from three (3) locations. GPS coordinates for each location are provided in the "locations.json" file. In order to protect privacy, coordinates have been provided for a road or public space near the recording device instead of its exact location.

    Location: 0
    Situated in a suburban environment approximately 15.5km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take-off to the north-east, however not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set for 3km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.

    "Silence" or ambient background noises at this location include cars, motorbikes, light trucks, garbage trucks, power tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).

    Location: 1
    Situated approximately 500m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands) with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and also birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1km. Due to their low altitude, aircraft are louder but audible for a shorter time compared to "Location 0".

    Location: 2
    As an alternative to "Location 1", this location is situated approximately 950m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west. This location offers greater wind protection and further distance from airport and highway noises. Ambient background sounds feature close proximity cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.


    Aircraft metadata
    Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to believe any significant errors exist in the "aircraft_meta" files; however, future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.

    Class/subclass ontology (minutes of recordings)

    0. no aircraft (202)
    0: no aircraft (202)

    1. aircraft (214)
    1: piston-propeller aeroplane (12)
    2: turbine-propeller aeroplane (37)
    3: turbine-fan aeroplane (163)
    4: rotorcraft (1.6)

    The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.

    Data splits
    Audio recordings have been split into training (90.5%) and test (9.5%) sets. The training set has further been split into 10 folds, giving researchers a common split to perform 10-fold cross-validation, which supports reproducible and comparable results. Data leakage into the test set has been avoided by ensuring recordings are disjoint from the training set by time and location: samples in the test set for a particular location were recorded after any samples included in the training set for that location.


    Labelled data
    The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording and all the labels and features associated with that recording and aircraft.

    Alternatively, these labels can be derived directly from the filename of the sample (see below), plus a JSON file which accompanies each aircraft sample. The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see below for all 14 airframe features).

    File naming convention
    Audio samples are in WAV format, and metadata for aircraft recordings are stored in JSON files. Both files share the same name, only differing by their file extension.

    Basic Convention

    “Aircraft ID + Date + Time + Location ID + Microphone ID”

    “XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X”

    Sample with aircraft

    {hex_id} _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}

    7C7CD0_2023-05-09_12-42-55_2_1.wav
    7C7CD0_2023-05-09_12-42-55_2_1.json

    Sample without aircraft

    “Silence” files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for “silence” samples are contained in the audio filename, and again in the accompanying “sample_meta.csv”.

    000000 _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}

    000000_2023-05-09_12-30-55_2_1.wav
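
    The naming convention above can be parsed mechanically. The following is a minimal sketch, assuming the hex ID is always six hexadecimal characters and the location and microphone IDs are integers; the function name parse_sample_name and the derived keys recorded_at and is_silence are illustrative, not part of the dataset. The time zone of the timestamp is not stated in the description, so the parsed value is left naive.

```python
import re
from datetime import datetime

# {hex_id}_{date}_{time}_{location_id}_{microphone_id}.{file_ext}
FILENAME_RE = re.compile(
    r"^(?P<hex_id>[0-9A-Fa-f]{6})_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<time>\d{2}-\d{2}-\d{2})_"
    r"(?P<location_id>\d+)_"
    r"(?P<microphone_id>\d+)\."
    r"(?P<file_ext>\w+)$"
)


def parse_sample_name(filename):
    """Split a sample filename into its labelled parts."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    info = match.groupdict()
    # Naive datetime: the description does not state the time zone.
    info["recorded_at"] = datetime.strptime(
        f'{info["date"]}_{info["time"]}', "%Y-%m-%d_%H-%M-%S"
    )
    # Six zeros mark a "silence" sample instead of an aircraft hex code.
    info["is_silence"] = info["hex_id"] == "000000"
    return info


aircraft = parse_sample_name("7C7CD0_2023-05-09_12-42-55_2_1.wav")
silence = parse_sample_name("000000_2023-05-09_12-30-55_2_1.wav")
print(aircraft["hex_id"], aircraft["location_id"], aircraft["is_silence"])
print(silence["hex_id"], silence["recorded_at"], silence["is_silence"])
```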

    Columns/Labels
    (found in sample_meta.csv, aircraft_meta.csv/json and aircraft recording JSON files)

    train-test: Train-test split (train, test)

    fold: Digit from 0 to 9 splitting the training subset 10 ways (else test)

    filename: The filename of the audio recording

    date: Date of the recording

    time: Time of the recording

    duration: Length of the recording (in seconds)

    location_id: ID for the location of the recording

    microphone_id: ID of the microphone used

    hex_id: Unique ICAO 24-bit address for the aircraft
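
    The "train-test" and "fold" columns are enough to reproduce the common 10-fold split. The sketch below shows one way to select a training and validation set for a given fold. The inline rows are invented, the function name cv_split is illustrative, and it is an assumption that the fold value is stored as a digit string for training rows and as something else (here "test") for test rows.

```python
import csv
import io

# Hypothetical excerpt of sample_meta.csv: headers from the column list, values invented.
SAMPLE_META = """train-test,fold,filename
train,0,000000_2023-05-01_10-00-00_0_1.wav
train,1,000000_2023-05-01_10-05-00_0_1.wav
train,0,000000_2023-05-01_10-10-00_1_1.wav
test,test,000000_2023-05-20_10-00-00_0_1.wav
"""


def cv_split(rows, held_out_fold):
    """Return (training filenames, validation filenames) for one fold."""
    fold = str(held_out_fold)
    train_rows = [r for r in rows if r["train-test"] == "train"]
    training = [r["filename"] for r in train_rows if r["fold"] != fold]
    validation = [r["filename"] for r in train_rows if r["fold"] == fold]
    return training, validation


rows = list(csv.DictReader(io.StringIO(SAMPLE_META)))
training, validation = cv_split(rows, held_out_fold=0)
print(len(training), "training clips,", len(validation), "validation clips")
```

    Rows marked "test" are never returned by cv_split, so the held-out test set stays untouched during cross-validation.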

  9. Twin Otter Airplane Flight Level Data

    • data.ucar.edu
    ascii
    Updated Aug 1, 2025
    Cite
    G. David Emmitt; Stephan De Wekker (2025). Twin Otter Airplane Flight Level Data [Dataset]. http://doi.org/10.5065/D60000J3
    Explore at:
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    G. David Emmitt; Stephan De Wekker
    Time period covered
    Oct 6, 2012 - Oct 18, 2012
    Area covered
    Description

    This dataset contains Twin Otter Airplane Flight Level Data collected over Granite Mountain during the Mountain Terrain Atmospheric Modeling and Observations Field Experimental Component (MATERHORN-X) project. All data files are in comma-separated values (CSV) format. Time stamps for all MATERHORN-X instruments are in UTC unless stated otherwise. This dataset is from the University of Virginia (UoV). Please refer to the readme for more information.

  10. Flight Delay Data

    • kaggle.com
    Updated Nov 28, 2023
    Cite
    Sri Harsha Eedala (2023). Flight Delay Data [Dataset]. https://www.kaggle.com/datasets/sriharshaeedala/airline-delay
    Explore at:
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sri Harsha Eedala
    License

    https://www.usa.gov/government-works/

    Description

    This dataset provides detailed information on flight arrivals and delays for U.S. airports, categorized by carriers. The data includes metrics such as the number of arriving flights, delays over 15 minutes, cancellation and diversion counts, and the breakdown of delays attributed to carriers, weather, NAS (National Airspace System), security, and late aircraft arrivals. Explore and analyze the performance of different carriers at various airports during this period. Use this dataset to gain insights into the factors contributing to delays in the aviation industry.

    Purpose: The purpose of this dataset is to offer insights into the performance of U.S. carriers at various airports during August 2013 - August 2023, focusing on flight arrivals and delays. By providing detailed information on key metrics such as the number of arriving flights, delays over 15 minutes, cancellations, and diversions, the dataset aims to facilitate analyses of factors contributing to delays, including those attributed to carriers, weather, the National Airspace System (NAS), security, and late aircraft arrivals. Researchers, data scientists, and aviation enthusiasts can leverage this dataset to explore patterns, identify trends, and draw conclusions that contribute to a better understanding of the aviation industry's operational challenges.

    Structure: The dataset is structured as a tabular format with rows representing unique combinations of year, month, carrier, and airport. Each row contains information on various metrics, including flight counts, delay counts, cancellation and diversion counts, and delay breakdowns by different factors. The columns provide specific details such as carrier codes and names, airport codes and names, and counts of delays attributed to carrier, weather, NAS, security, and late aircraft arrivals. The structured format ensures that users can easily query, analyze, and visualize the data to derive meaningful insights.

    • year: The year of the data.
    • month: The month of the data.
    • carrier: Carrier code.
    • carrier_name: Carrier name.
    • airport: Airport code.
    • airport_name: Airport name.
    • arr_flights: Number of arriving flights.
    • arr_del15: Number of flights delayed by 15 minutes or more.
    • carrier_ct: Carrier count (delay due to the carrier).
    • weather_ct: Weather count (delay due to weather).
    • nas_ct: NAS (National Airspace System) count (delay due to the NAS).
    • security_ct: Security count (delay due to security).
    • late_aircraft_ct: Late aircraft count (delay due to late aircraft arrival).
    • arr_cancelled: Number of flights canceled.
    • arr_diverted: Number of flights diverted.
    • arr_delay: Total arrival delay.
    • carrier_delay: Delay attributed to the carrier.
    • weather_delay: Delay attributed to weather.
    • nas_delay: Delay attributed to the NAS.
    • security_delay: Delay attributed to security.
    • late_aircraft_delay: Delay attributed to late aircraft arrival.
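
    As a rough illustration of how these columns combine, the sketch below computes the share of arrivals delayed by 15 minutes or more and the share of delays attributed to each cause for a single row. The values are invented, the helper name delay_summary is illustrative, and normalising by the sum of the five *_ct columns is an assumption about how the counts relate, not something the description states.

```python
# Hypothetical row: keys are column names from the list above, values are invented.
row = {
    "arr_flights": 100.0,
    "arr_del15": 20.0,
    "carrier_ct": 8.0,
    "weather_ct": 2.0,
    "nas_ct": 4.0,
    "security_ct": 0.0,
    "late_aircraft_ct": 6.0,
}

CAUSES = ["carrier_ct", "weather_ct", "nas_ct", "security_ct", "late_aircraft_ct"]


def delay_summary(row):
    """Return (share of arrivals delayed, share of delays per cause)."""
    flights = row["arr_flights"]
    delayed_share = row["arr_del15"] / flights if flights else 0.0
    total = sum(row[c] for c in CAUSES)
    cause_share = {c: (row[c] / total if total else 0.0) for c in CAUSES}
    return delayed_share, cause_share


delayed_share, cause_share = delay_summary(row)
print(f"delayed: {delayed_share:.1%}")
for cause, share in cause_share.items():
    print(f"{cause}: {share:.1%}")
```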

    Usage: Researchers, analysts, and data enthusiasts can utilize this dataset for a variety of purposes, including but not limited to:

    Performance Analysis: Assess the on-time performance of different carriers at specific airports and identify potential areas for improvement.

    Trend Identification: Analyze temporal trends in delays, cancellations, and diversions to understand whether certain months or periods exhibit higher operational challenges.

    Root Cause Analysis: Investigate the primary contributors to delays, such as carrier-related issues, weather conditions, NAS inefficiencies, security concerns, or late aircraft arrivals.

    Benchmarking: Compare the performance of various carriers across different airports to identify industry leaders and areas requiring attention.

    Predictive Modeling: Use historical data to develop predictive models for flight delays, aiding in the development of strategies to mitigate disruptions.

    Industry Insights: Contribute to a broader understanding of the factors influencing operational efficiency within the U.S. aviation sector.

    As users explore and analyze the dataset, they can gain valuable insights that may inform decision-making processes, improve operational strategies, and contribute to a more efficient and reliable air travel experience.

  11. 100 Worst Plane crashes in History

    • kaggle.com
    Updated Feb 28, 2024
    Cite
    Aryan Shanker Saxena (2024). 100 Worst Plane crashes in History [Dataset]. https://www.kaggle.com/datasets/aryan112345/worst-plane-crashes-in-history
    Explore at:
    Dataset updated
    Feb 28, 2024
    Dataset provided by
    Kaggle
    Authors
    Aryan Shanker Saxena
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    It is the raw data of the worst plane crashes in human history based on the fatalities. May their souls rest in peace.

    BeautifulSoup was used to scrape the raw data. Preprocess accordingly.

  12. US Airline flights dataset (1988-2008)

    • figshare.com
    bin
    Updated Jul 31, 2023
    Cite
    soda-inria (2023). US Airline flights dataset (1988-2008) [Dataset]. http://doi.org/10.6084/m9.figshare.23772366.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Jul 31, 2023
    Dataset provided by
    figshare
    Authors
    soda-inria
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Original data: https://doi.org/10.7910/DVN/HG7NV7. This data has been rearranged and converted to Parquet.

  13. Aircraft Production Data

    • kaggle.com
    Updated Jul 10, 2020
    Cite
    Alvaro (2020). Aircraft Production Data [Dataset]. https://www.kaggle.com/alvaroibrain/aircraft-production-data/tasks
    Explore at:
    Dataset updated
    Jul 10, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alvaro
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    This dataset contains production data about most of the commercial/military aircraft ever produced, namely:

    • Number of units produced per model
    • Start / End dates of production
    • Retirement year (in case of military and some commercial aircraft)

    Acknowledgements

    Data was extracted from DBPedia and exported as a CSV for the ease of use in notebooks.

    Inspiration

    I was taking a look at this kernel and thought about including more data about the aircraft.

  14. Crowdsourced air traffic data from The OpenSky Network 2020

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 11, 2023
    + more versions
    Cite
    Xavier Olive (2023). Crowdsourced air traffic data from The OpenSky Network 2020 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3737101
    Explore at:
    Dataset updated
    May 11, 2023
    Dataset provided by
    Martin Strohmeier
    Xavier Olive
    Jannis Lübbe
    Description

    Motivation

    The data in this dataset is derived and cleaned from the full OpenSky dataset to illustrate the development of air traffic during the COVID-19 pandemic. It spans all flights seen by the network's more than 2500 members since 1 January 2019. More data has been periodically included in the dataset until the end of the COVID-19 pandemic.

    We stopped updating the dataset after December 2022. Previous files have been fixed after a thorough sanity check.

    License

    See LICENSE.txt

    Disclaimer

    The data provided in the files is provided as is. Despite our best efforts at filtering out potential issues, some information could be erroneous.

    Origin and destination airports are computed online based on the ADS-B trajectories on approach/takeoff: no crosschecking with external sources of data has been conducted. Fields origin or destination are empty when no airport could be found.

    Aircraft information comes from the OpenSky aircraft database. Fields typecode and registration are empty when the aircraft is not present in the database.

    Description of the dataset

    One file per month is provided as a csv file with the following features:

    callsign: the identifier of the flight displayed on ATC screens (usually the first three letters are reserved for an airline: AFR for Air France, DLH for Lufthansa, etc.)

    number: the commercial number of the flight, when available (the matching with the callsign comes from public open API); this field may not be very reliable;

    icao24: the transponder unique identification number;

    registration: the aircraft tail number (when available);

    typecode: the aircraft model type (when available);

    origin: a four letter code for the origin airport of the flight (when available);

    destination: a four letter code for the destination airport of the flight (when available);

    firstseen: the UTC timestamp of the first message received by the OpenSky Network;

    lastseen: the UTC timestamp of the last message received by the OpenSky Network;

    day: the UTC day of the last message received by the OpenSky Network;

    latitude_1, longitude_1, altitude_1: the first detected position of the aircraft;

    latitude_2, longitude_2, altitude_2: the last detected position of the aircraft.
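
    A record with these fields can be summarised in a few lines. The sketch below is illustrative only: the record values are invented, the helper name summarize is hypothetical, and the ISO 8601 timestamp format is an assumption, since the description only says that firstseen and lastseen are UTC timestamps.

```python
from datetime import datetime

# Hypothetical record: field names from the list above; values and the
# timestamp format are assumptions made for this example.
flight = {
    "callsign": "AFR123",
    "origin": "LFPG",
    "destination": "",  # empty when no airport could be found
    "firstseen": "2020-03-01 10:00:00+00:00",
    "lastseen": "2020-03-01 11:30:00+00:00",
}


def summarize(flight):
    """Derive the airline prefix, tracked duration and route completeness."""
    first = datetime.fromisoformat(flight["firstseen"])
    last = datetime.fromisoformat(flight["lastseen"])
    return {
        # The first three letters of a callsign usually identify the airline.
        "airline_prefix": flight["callsign"][:3],
        "minutes_tracked": (last - first).total_seconds() / 60,
        "has_route": bool(flight["origin"]) and bool(flight["destination"]),
    }


summary = summarize(flight)
print(summary)
```

    Note that minutes_tracked measures the time between the first and last messages received by the network, which is not necessarily the full flight duration.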

    Examples

    Possible visualisations and a more detailed description of the data are available at the following page:

    Credit

    If you use this dataset, please cite:

    Martin Strohmeier, Xavier Olive, Jannis Lübbe, Matthias Schäfer, and Vincent Lenders "Crowdsourced air traffic data from the OpenSky Network 2019–2020" Earth System Science Data 13(2), 2021 https://doi.org/10.5194/essd-13-357-2021

  15. Geospatial Dataset of GNSS Anomalies and Political Violence Events

    • zenodo.org
    csv
    Updated Jun 14, 2025
    + more versions
    Cite
    Eugene Pik; João S. D. Garcia; Matthew Berra; Timothy Smith; Ibrahim Kocaman (2025). Geospatial Dataset of GNSS Anomalies and Political Violence Events [Dataset]. http://doi.org/10.5281/zenodo.15665065
    Explore at:
    Available download formats: csv
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Zenodo
    Authors
    Eugene Pik; João S. D. Garcia; Matthew Berra; Timothy Smith; Ibrahim Kocaman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 14, 2025
    Description

    Geospatial Dataset of GNSS Anomalies and Political Violence Events

    Overview

    The Geospatial Dataset of GNSS Anomalies and Political Violence Events is a collection of data that integrates aircraft flight information, GNSS (Global Navigation Satellite System) anomalies, and political violence events from the ACLED (Armed Conflict Location & Event Data Project) database.

    Dataset Files

    The dataset consists of three CSV files:

    1. Daily_GNSS_Anomalies_and_ACLED-2023-V1.csv
      • Description: Contains all grids and dates that had aircraft traffic during 2023.
      • Number of Records: 6,777,228
      • Purpose: Provides a complete view of aircraft movements and associated data, including grids without any GNSS anomalies.
    2. Daily_GNSS_Anomalies_and_ACLED-2023-V2.csv
      • Description: A filtered version of V1, including only the grids and dates where GNSS anomalies (jumps or gaps) were reported.
      • Number of Records: 718,237
      • Purpose: Focuses on areas and times with GNSS anomalies for targeted analysis.
    3. Monthly_GNSS_Anomalies_and_ACLED-2023-V9.csv
      • Description: Contains aggregated monthly data for each grid cell, combining GNSS anomalies and ACLED political violence events. Summarizes aircraft traffic, anomaly counts, and conflict activity at a monthly resolution.
      • Number of Records: 25,770
      • Purpose: Enables temporal trend analysis and spatial correlation studies between GNSS interference and political violence, using reduced data volume suitable for modeling and visualization.

    Data Fields: Daily_GNSS_Anomalies_and_ACLED-2023-V1.csv and Daily_GNSS_Anomalies_and_ACLED-2023-V2.csv

    1. grid_id
      • Description: Unique identifier for a grid cell on Earth measuring 0.5 degrees latitude by 0.5 degrees longitude.
      • Format: String combining latitude and longitude (e.g., -10.0_-36.0).
    2. day
      • Description: Date of the recorded data.
      • Format: YYYY-MM-DD (e.g., 2023-03-28).
    3. geometry
      • Description: Polygon coordinates of the grid cell in Well-Known Text (WKT) format.
      • Format: POLYGON((longitude latitude, ...)) (e.g., POLYGON((-36.0 -10.0, -35.5 -10.0, -35.5 -9.5, -36.0 -9.5, -36.0 -10.0))).
    4. flights
      • Description: Number of aircraft flights that passed through the grid on that day.
      • Format: Integer (e.g., 28).
    5. GPS_jumps
      • Description: Number of reported GNSS "jump" anomalies (possible spoofing incidents) in the grid on that day.
      • Format: Integer (e.g., 1).
    6. GPS_gaps
      • Description: Number of reported GNSS "gap" anomalies, indicating gaps in aircraft routes, in the grid on that day.
      • Format: Integer (e.g., 0).
    7. gaps_density
      • Description: Density of GNSS gaps, calculated as the number of gaps divided by the number of flights.
      • Format: Decimal (e.g., 0).
    8. jumps_density
      • Description: Density of GNSS jumps, calculated as the number of jumps divided by the number of flights.
      • Format: Decimal (e.g., 0.035714286).
    9. event_id_cnty
      • Description: ACLED event ID corresponding to political violence events in the grid on that day.
      • Format: String (e.g., BRA69267).
    10. disorder_type
      • Description: Type of disorder as classified by ACLED (e.g., "Political violence").
      • Format: String.
    11. event_type
      • Description: General category of the event according to ACLED (e.g., "Violence against civilians").
      • Format: String.
    12. sub_event_type
      • Description: Specific subtype of the event as per ACLED classification (e.g., "Attack").
      • Format: String.
    13. acled_count
      • Description: Number of ACLED events in the grid on that day.
      • Format: Integer (e.g., 1).
    14. acled_flag
      • Description: Indicator of ACLED event presence in the grid on that day (0 for no events, 1 for one or more events).
      • Format: Integer (0 or 1).
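
    The example values above suggest how grid_id, geometry and the density fields relate. The sketch below rebuilds the WKT polygon from a grid_id and recomputes jumps_density. It assumes, based only on the examples given, that grid_id holds the latitude and longitude of the cell's south-west corner; the function names are illustrative.

```python
def grid_polygon_wkt(grid_id, size=0.5):
    """Build the cell polygon, assuming grid_id is '<lat>_<lon>' of the south-west corner."""
    lat_text, lon_text = grid_id.split("_")
    lat, lon = float(lat_text), float(lon_text)
    # WKT lists longitude first, then latitude; the ring closes on its first point.
    corners = [
        (lon, lat),
        (lon + size, lat),
        (lon + size, lat + size),
        (lon, lat + size),
        (lon, lat),
    ]
    return "POLYGON((" + ", ".join(f"{x} {y}" for x, y in corners) + "))"


def density(anomalies, flights):
    """Anomalies per flight, as described for gaps_density and jumps_density."""
    return anomalies / flights if flights else 0.0


wkt = grid_polygon_wkt("-10.0_-36.0")
jumps_density = round(density(1, 28), 9)
print(wkt)
print(jumps_density)
```

    With the example values from the field list (grid_id -10.0_-36.0, 28 flights, 1 jump) this reproduces the documented geometry string and the jumps_density of 0.035714286.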

    Data Fields: Monthly_GNSS_Anomalies_and_ACLED-2023-V9.csv

    The file contains monthly aggregated GNSS anomaly and ACLED event data per grid cell. The structure and meaning of each field are detailed below:

    1. grid_id
      • Description: Unique identifier for a grid cell on Earth measuring 0.5° latitude by 0.5° longitude.
      • Format: String combining latitude and longitude (e.g., -0.5_-79.0).
    2. year_month
      • Description: Month and year of the aggregated data.
      • Format: String in Mon-YY format (e.g., Jan-23).
    3. geometry
      • Description: Polygon coordinates of the grid cell in Well-Known Text (WKT) format.
      • Format: POLYGON((longitude latitude, ...))
        (e.g., POLYGON((-79.0 -0.5, -78.5 -0.5, -78.5 0.0, -79.0 0.0, -79.0 -0.5))).
    4. flights
      • Description: Total number of aircraft flights that passed through the grid cell during the month.
      • Format: Integer (e.g., 1230).
    5. GPS_jumps
      • Description: Total number of GNSS "jump" anomalies (possible spoofing events) in the grid cell during the month.
      • Format: Integer (e.g., 13).
    6. GPS_gaps
      • Description: Total number of GNSS "gap" anomalies, indicating interruptions in aircraft routes, during the month.
      • Format: Integer (e.g., 0).
    7. event_id_cnty
      • Description: Semicolon-separated list of ACLED event IDs associated with the grid cell during the month.
      • Format: String (e.g., ECU3151;ECU3158;ECU3150).
    8. disorder_type
      • Description: Semicolon-separated list of disorder types (e.g., "Political violence", "Demonstrations") reported by ACLED in that grid cell during the month.
      • Format: String.
    9. event_type
      • Description: Semicolon-separated list of high-level ACLED event types (e.g., "Riots", "Protests").
      • Format: String.
    10. sub_event_type
      • Description: Semicolon-separated list of detailed subtypes of ACLED events (e.g., "Mob violence", "Armed clash").
      • Format: String.
    11. acled_count
      • Description: Total number of ACLED conflict events in the grid cell during the month.
      • Format: Integer (e.g., 2).
    12. acled_flag
      • Description: Conflict presence indicator: 1 if any ACLED event occurred in the grid cell during the month, otherwise 0.
      • Format: Integer (0 or 1).
    13. gaps_density
      • Description: Monthly density of GNSS gaps, calculated as GPS_gaps / flights.
      • Format: Decimal (e.g., 0.0).
    14. jumps_density
      • Description: Monthly density of GNSS jumps, calculated as GPS_jumps / flights.
      • Format: Decimal (e.g., 0.0106).
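
    The monthly file packs several values into single string fields. The sketch below parses the Mon-YY month and splits a semicolon-separated event list, using the example values from the field descriptions. The function name parse_monthly is illustrative, and treating an empty string as "no events" is an assumption about how missing values are stored.

```python
from datetime import datetime


def parse_monthly(row):
    """Parse year_month (Mon-YY) and the semicolon-separated ACLED event IDs."""
    # %b expects English month abbreviations under the default C locale.
    month = datetime.strptime(row["year_month"], "%b-%y")
    event_ids = [e for e in row["event_id_cnty"].split(";") if e]
    return {"year": month.year, "month": month.month, "event_ids": event_ids}


# Example values taken from the field descriptions above.
parsed = parse_monthly({"year_month": "Jan-23", "event_id_cnty": "ECU3151;ECU3158;ECU3150"})
print(parsed)
```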

    Data Sources

    • GNSS Anomalies Data:
      • Calculated from ADS-B (Automatic Dependent Surveillance-Broadcast) messages obtained via the OpenSky Network's Trino database.
      • GNSS anomalies include "jumps" (potential spoofing incidents) and "gaps" (interruptions in aircraft route data).

    • Political Violence Events Data:
      • Sourced from the ACLED database, which provides detailed information on political violence and protest events worldwide.

    Temporal and Spatial Coverage

    • Temporal Coverage:
      • From January 1, 2023, to December 31, 2023.
      • Daily records provide temporal granularity for time-series analysis.
    • Spatial Coverage:
      • Global coverage with grid cells measuring 0.5 degrees latitude by 0.5 degrees longitude.
      • Each grid cell represents an area on Earth's surface, facilitating spatial

  16. ORCAS Flight Plans

    • data.ucar.edu
    • ckanprod.ucar.edu
    archive
    Updated Aug 1, 2025
    Cite
    Britton B. Stephens (2025). ORCAS Flight Plans [Dataset]. http://doi.org/10.5065/D66T0K20
    Explore at:
    Available download formats: archive
    Dataset updated
    Aug 1, 2025
    Authors
    Britton B. Stephens
    Time period covered
    Jan 5, 2016 - Feb 29, 2016
    Area covered
    Description

    This data contains the flight plans for the NSF/NCAR HIAPER Gulfstream V (GV) aircraft flown during the O2/N2 Ratio and CO2 Airborne Southern Ocean (ORCAS) Study. Data covers two test flights and 13 research flights between 5 January and 29 February 2016. The data is comprised of text, csv, kml, png, and html files. It also includes an R-based flight plan tool for viewing the data, with associated instructions and configuration files.

  17. Last words before the plane crash

    • kaggle.com
    Updated May 26, 2024
    Cite
    Michal Bogacz (2024). Last words before the plane crash [Dataset]. https://www.kaggle.com/datasets/michau96/last-words-before-the-plane-crash
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 26, 2024
    Dataset provided by
    Kaggle
    Authors
    Michal Bogacz
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    Air accidents are extremely rare, especially in recent years. Even in these tragic situations, it is often possible to obtain a record of various flight parameters as well as the words spoken in the cockpit just before the disaster. The database presents the last sentence spoken by the pilots before the crash, for 82 disasters for which such records were obtained.

    Content and methodology

    The data was obtained by web scraping, using Python (version 3.10) with the "BeautifulSoup", "requests", "re", and "pandas" packages, along with the "SelectorGadget" add-on, which made working with the site easier. Each row in the database refers to one crash. The data was downloaded from planecrashinfo.com, which aggregates information on air accidents from various sources. The database contains 4 columns: the date of the incident, the airline, the flight number (if available), and a record of the last words.

    Photo by Douglas Bagg on Unsplash

  18. Sawyer Mill Dam Removal Project Vegetation Area Drone Flight Paths

    • figshare.com
    pdf
    Updated May 26, 2021
    Cite
    Alexandra Evans; Kevin Gardner (2021). Sawyer Mill Dam Removal Project Vegetation Area Drone Flight Paths [Dataset]. http://doi.org/10.6084/m9.figshare.14669487.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 26, 2021
    Dataset provided by
    figshare
    Authors
    Alexandra Evans; Kevin Gardner
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the automated drone flight paths for the vegetation area of the Evans et al. Sawyer Mill dam removal study in Dover, New Hampshire, USA. These flight paths are specific to the reservoir response manuscript. A DJI Phantom 3 Professional drone with its original RGB camera equipped with a polarizing filter was used for the study.

    All flight paths are available as csv files that were exported out of the Litchi mission hub. The pdf file contains screenshots of the paths in DJIFlightPlanner and Litchi. For information on the Litchi flight app, please see their website here: https://flylitchi.com/. The same set of flight paths was flown in both 2019 and 2020 to keep imagery collection consistent. Nadir flight paths had 80% side image overlap and 90% forward image overlap set in DJIFlightPlanner. The imagery angle for the angled flight paths was set to 20 degrees off nadir (-70 gimbal pitch in Litchi), and the paths of the angled flights were manually drawn in Litchi to supplement the nadir flight paths that were designed in DJIFlightPlanner.

    For the vegetation area, the altitude of the drone was set to 150 feet above ground level. The angled flight paths had to be traced and executed manually due to an error in Litchi (“waypoint distance too close”). This was not fixed for the 2020 flight date, so the angled flights were manually flown in 2020, as well.

    These materials were made using resources from an NSF EPSCoR funded project “RII Track-2 FEC: Strengthening the scientific basis for decision-making about dams: Multi-scale, coupled-systems research on ecological, social, and economic trade-offs” (a.k.a. "Future of Dams"). Support for this project is provided by the National Science Foundation’s Research Infrastructure Improvement NSF #IIA-1539071. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
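    The relationship between the off-nadir imagery angle and the gimbal pitch quoted above (20 degrees off nadir, -70 gimbal pitch) can be sketched as simple arithmetic. This assumes the convention implied by the description, where -90 is straight down and 0 is level; the function and constant names are illustrative and not part of Litchi or DJIFlightPlanner.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def gimbal_pitch_from_off_nadir(off_nadir_deg: float) -> float:
    """Convert an angle off nadir into a gimbal pitch in degrees.

    Assumed convention: -90 is straight down (nadir), 0 is level
    with the horizon.
    """
    return -(90.0 - off_nadir_deg)


print(gimbal_pitch_from_off_nadir(20))  # angled flights: -70.0
print(round(150 * FEET_TO_METERS, 2))   # 150 ft altitude in meters: 45.72
```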

  19. Flight data from: Acrobatics at the insect-scale: A durable, precise, and...

    • search.dataone.org
    • datadryad.org
    Updated Dec 25, 2024
    Cite
    Suhan Kim; Yi-Hsuan Hsiao; Zhijian Ren; Jiashu Huang; Yufeng Chen (2024). Flight data from: Acrobatics at the insect-scale: A durable, precise, and agile micro-aerial-robot [Dataset]. http://doi.org/10.5061/dryad.0p2ngf28q
    Explore at:
    Dataset updated
    Dec 25, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Suhan Kim; Yi-Hsuan Hsiao; Zhijian Ren; Jiashu Huang; Yufeng Chen
    Description

    Aerial insects are exceptionally agile and precise owing to their small size and fast neuromotor control. They perform impressive acrobatic maneuvers when they evade predators, recover from wind gust, or land on moving objects. Flapping-wing propulsion is advantageous for achieving flight agility because it can generate large changes of instantaneous forces and torques. During flapping-wing flight, the wings, hinges, and tendons of pterygote insects endure large deformation and high stress hundreds of times each second, highlighting the outstanding flexibility and fatigue resistance of biological structures and materials. In comparison, engineered materials and microscale structures in sub-gram micro-aerial-vehicles (MAVs) exhibit substantially shorter lifespan. Consequently, most sub-gram MAVs are limited to hovering for less than 10 seconds or following simple trajectories at slow speeds. Here, we developed a 750-milligram flapping-wing MAV that demonstrated unprecedented lifespan, sp...

    The dataset comprises raw data, including position and Euler angles (using the XYZ convention), collected from a motion-capturing system (Vicon Vantage V5 and Vicon Tracker 3.9). The data was retrieved from Vicon Tracker 3.9 and transmitted in real-time to a target computer (Speedgoat) via asynchronous UDP. All data was saved at 10 kHz on the target computer, with no post-processing applied.

    Flight data from: Acrobatics at the insect-scale: a durable, precise, and agile micro-aerial-robot

    Data format

    The data is saved in Comma-Separated Value (.csv) format. The first column of each .csv file represents the time (in seconds) recorded during the flight. The subsequent columns are organized in groups of six: the first three columns show the x, y, and z positions (in meters), and the next three columns contain the Euler angles in the XYZ convention (in radians). The corresponding flight numbers are also included in the column names to demonstrate repeatability.
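    The column layout described above can be read with a short parser. The following is a minimal sketch using only the Python standard library; the function name, the header names in the sample, and the assumption that every cell is populated are all hypothetical, since the actual files are not shown here.

```python
import csv
import io


def read_flights(csv_text: str):
    """Split a flight CSV into a time vector and per-flight pose columns.

    Column 0 is time in seconds; every following group of six columns is
    one flight: x, y, z (meters), then three XYZ Euler angles (radians).
    Assumes every cell holds a number.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    n_flights = (len(header) - 1) // 6
    times = []
    flights = [{"position": [], "euler": []} for _ in range(n_flights)]
    for row in reader:
        values = [float(v) for v in row]
        times.append(values[0])
        for i, flight in enumerate(flights):
            start = 1 + 6 * i
            flight["position"].append(tuple(values[start:start + 3]))
            flight["euler"].append(tuple(values[start + 3:start + 6]))
    return times, flights


# Hypothetical two-row sample with one flight; header names are made up.
sample = (
    "time,x_1,y_1,z_1,ex_1,ey_1,ez_1\n"
    "0.0000,0.0,0.0,0.10,0.00,0.00,0.00\n"
    "0.0001,0.0,0.0,0.11,0.01,0.00,0.00\n"
)
times, flights = read_flights(sample)
print(len(times), flights[0]["position"][1])
```

    The 0.0001 s step in the sample mirrors the stated 10 kHz logging rate.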

    List of flight data

    The following list shows the filenames and the corresponding flights (in terms of figure numbers) presented in the manuscript:

    • "MIT_letter.csv" - Fig. 1 (D) and Fig. S7
    • "1000s.csv" - Fig. 4
    • "infinity_sign.csv" - Fig. 5 (A-D) and Fig. S4
    • "planar_circle.csv" - Fig. 5 (E-H) and Fig. S5
    • "rotating_infinity_sign.csv" - Fig. 5 (I-M) and Fig. S6
    • "single_flip.csv" - Fig 6 (A-F) and Fi...
  20. Additional file 2 of Measuring the potential of individual airports for...

    • springernature.figshare.com
    • figshare.com
    txt
    Updated Jun 2, 2023
    Cite
    Glenn Lawyer (2023). Additional file 2 of Measuring the potential of individual airports for pandemic spread over the world airline network [Dataset]. http://doi.org/10.6084/m9.figshare.c.3624734_D1.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    figshare
    Authors
    Glenn Lawyer
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    Airport AEF values. This CSV file gives the AEF of each airport as calculated and used in the current study. Airports are indexed by IATA code, and also by city and country. AEF values are normalized to the range 0 to 100. (CSV, 131 kB)

Cite
Sourav Banerjee (2023). Airline Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/airline-dataset

Airline Dataset

Navigating the Skies: Exploring Insights from Synthetic Airline Data

Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 26, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sourav Banerjee
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.

Content

This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.

Dataset Glossary (Column-wise)

  • Passenger ID - Unique identifier for each passenger
  • First Name - First name of the passenger
  • Last Name - Last name of the passenger
  • Gender - Gender of the passenger
  • Age - Age of the passenger
  • Nationality - Nationality of the passenger
  • Airport Name - Name of the airport where the passenger boarded
  • Airport Country Code - Country code of the airport's location
  • Country Name - Name of the country the airport is located in
  • Airport Continent - Continent where the airport is situated
  • Continents - Continents involved in the flight route
  • Departure Date - Date when the flight departed
  • Arrival Airport - Destination airport of the flight
  • Pilot Name - Name of the pilot operating the flight
  • Flight Status - Current status of the flight (e.g., on-time, delayed, canceled)
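
As one way to start exploring these columns, the sketch below counts rows per flight status using only the Python standard library. The sample rows, the exact spelling of the status values, and the function name are assumptions for illustration; only the column names come from the glossary above.

```python
import csv
import io
from collections import Counter


def flight_status_counts(csv_file) -> Counter:
    """Count rows per value of the "Flight Status" column."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Flight Status"] for row in reader)


# Hypothetical rows using a subset of the glossary columns.
sample = io.StringIO(
    "Passenger ID,Age,Nationality,Flight Status\n"
    "P001,34,Japan,On Time\n"
    "P002,58,Brazil,Delayed\n"
    "P003,21,Kenya,On Time\n"
)
counts = flight_status_counts(sample)
print(counts.most_common())  # [('On Time', 2), ('Delayed', 1)]
```

With the downloaded file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.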

Structure of the Dataset

https://i.imgur.com/cUFuMeU.png

Acknowledgement

The dataset provided here is a simulated example generated with the online platform Mockaroo. This web-based tool creates customizable synthetic datasets that closely resemble real data. It is primarily intended for developers, testers, and data experts who need sample data for purposes such as testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.

Cover Photo by: Kevin Woblick on Unsplash

Thumbnail by: Airplane icons created by Freepik - Flaticon
