http://opendatacommons.org/licenses/dbcl/1.0/
This dataset provides a comprehensive overview of domestic airline routes within the United States. It includes valuable information for analyzing passenger travel patterns, market trends, and airline pricing strategies.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Daily data showing UK flight numbers and rolling seven-day average, including flights to, from, and within the UK. These are official statistics in development. Source: EUROCONTROL.
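As a rough illustration of how a rolling seven-day average can be derived from daily flight counts, here is a minimal sketch. It assumes a trailing window (the source does not say whether the published average is trailing or centred), and the function name `rolling_mean` is hypothetical.

```python
from collections import deque


def rolling_mean(values, window=7):
    """Trailing rolling mean; entries are None until `window` values exist."""
    buffer = deque(maxlen=window)  # keeps only the most recent `window` values
    result = []
    for value in values:
        buffer.append(value)
        if len(buffer) == window:
            result.append(sum(buffer) / window)
        else:
            result.append(None)
    return result


# Eight made-up daily flight counts, for illustration only.
daily_flights = [1, 2, 3, 4, 5, 6, 7, 8]
print(rolling_mean(daily_flights))
```

The first six entries are None because a full week of data is not yet available at those positions.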
As new technologies are developed to handle the complexities of the Next Generation Air Transportation System (NextGen), it is increasingly important to address both current and future safety concerns along with the operational, environmental, and efficiency issues within the National Airspace System (NAS). In recent years, the Federal Aviation Administration’s (FAA) safety offices have been researching ways to utilize the many safety databases maintained by the FAA, such as those involving flight recorders, radar tracks, weather, and many other high-volume sensors, in order to monitor this unique and complex system. Although a number of current technologies do monitor the frequency of known safety risks in the NAS, very few methods currently exist that are capable of analyzing large data repositories with the purpose of discovering new and previously unmonitored safety risks. While monitoring the frequency of known events in the NAS enables mitigation of already identified problems, a more proactive approach of finding unidentified issues still needs to be addressed. This is especially important in the proactive identification of new, emergent safety issues that may result from the planned introduction of advanced NextGen air traffic management technologies and procedures. Development of an automated tool that continuously evaluates the NAS to discover both events exhibiting flight characteristics indicative of safety-related concerns as well as operational anomalies will heighten the awareness of such situations in the aviation community and serve to increase the overall safety of the NAS. This paper discusses the extension of previous anomaly detection work to identify operationally significant flights within the highly complex airspace encompassing the New York area of operations, focusing on the major airports of Newark International (EWR), LaGuardia International (LGA), and John F. Kennedy International (JFK).
In addition, flight traffic in the vicinity of Denver International (DEN) airport/airspace is also investigated to evaluate the impact on operations due to variances in seasonal weather and airport elevation. From our previous research, subject matter experts determined that some of the identified anomalies were significant, but could not reach conclusive findings without additional supportive data. To advance this research further, causal examination using domain experts is continued along with the integration of air traffic control (ATC) voice data to shed much needed insight into resolving which flight characteristic(s) may be impacting an aircraft's unusual profile. Once a flight characteristic is identified, it could be included in a list of potential safety precursors. This paper also describes a process that has been developed and implemented to automatically identify and produce daily reports on flights of interest from the previous day.
Airline on-time performance
Have you ever been stuck in an airport because your flight was delayed or canceled and wondered if you could have predicted it if you'd had more data? This is your chance to find out.
The results
We had a total of nine entries, and turnout at the poster session at the JSM was great, with plenty of people stopping by to find out why their flights were delayed.
The data
The data consists of flight arrival and departure details for all commercial flights within the USA, from October 1987 to April 2008. This is a large dataset: there are nearly 120 million records in total, and it takes up 1.6 gigabytes of space when compressed and 12 gigabytes when uncompressed.
The challenge
The aim of the data expo is to provide a graphical summary of important features of the data set. This is intentionally vague in order to allow different entries to focus on different aspects of the data, but here are a few ideas to get you started:
- When is the best time of day/day of week/time of year to fly to minimise delays?
- Do older planes suffer more delays?
- How does the number of people flying between different locations change over time?
- How well does weather predict plane delays?
- Can you detect cascading failures as delays in one airport create delays in others? Are there critical links in the system?
You are also welcome to work with interesting subsets: you might want to compare flight patterns before and after 9/11, or between the pair of cities that you fly between most often, or all flights to and from a major airport like Chicago (ORD). Smaller subsets may also help you to match up the data to other interesting datasets.
Columns
| Name | Description |
| --- | --- |
| year | 1987-2008 |
| month | 1-12 |
| day of month | 1-31 |
| day of week | 1 (Monday) - 7 (Sunday) |
| DepTime | actual departure time (minutes) |
| CRSDepTime | scheduled departure time (minutes) |
| ArrTime | actual arrival time (minutes) |
| CRSArrTime | scheduled arrival time (minutes) |
| UniqueCarrier | unique carrier code |
| FlightNum | flight number |
| TailNum | plane tail number |
| ActualElapsedTime | in minutes |
| CRSElapsedTime | in minutes |
| AirTime | in minutes |
| ArrDelay | arrival delay, in minutes |
| DepDelay | departure delay, in minutes |
| Origin | origin IATA airport code |
| Dest | destination IATA airport code |
| Distance | in miles |
| TaxiIn | taxi in time, in minutes |
| TaxiOut | taxi out time, in minutes |
| Cancelled | was the flight cancelled? |
| CancellationCode | reason for cancellation (A = carrier, B = weather, C = NAS, D = security) |
| Diverted | 1 = yes, 0 = no |
| CarrierDelay | in minutes |
| WeatherDelay | in minutes |
| NASDelay | in minutes |
| SecurityDelay | in minutes |
| LateAircraftDelay | in minutes |
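As one possible starting point for the delay questions, the sketch below averages ArrDelay per UniqueCarrier using only the Python standard library. The two column names come from the column list; the helper name, the in-memory sample, and the handling of missing values (assumed to appear as "NA" or an empty string) are illustrative assumptions, not part of the data expo material.

```python
import csv
import io
from collections import defaultdict


def mean_arrival_delay_by_carrier(rows):
    """Average ArrDelay (minutes) per UniqueCarrier.

    `rows` is any iterable of dicts keyed by column name, for example a
    csv.DictReader. Rows whose ArrDelay is missing or non-numeric are skipped
    (assumption: missing values appear as "NA" or an empty string).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        raw = (row.get("ArrDelay") or "").strip()
        try:
            delay = float(raw)
        except ValueError:
            continue  # skip missing or malformed delays
        carrier = row["UniqueCarrier"]
        totals[carrier] += delay
        counts[carrier] += 1
    return {carrier: totals[carrier] / counts[carrier] for carrier in totals}


# Tiny made-up sample standing in for one of the real files.
sample = io.StringIO(
    "UniqueCarrier,ArrDelay\n"
    "WN,10\n"
    "WN,20\n"
    "AA,NA\n"
    "AA,-5\n"
)
print(mean_arrival_delay_by_carrier(csv.DictReader(sample)))
```

Because the function only iterates over rows, it can be pointed at a csv.DictReader over a real file without loading the whole file into memory.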
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General Aviation (GA) comprises all civil flights except scheduled passenger airline services. More than 90% of the roughly 220,000 civil aircraft registered in the United States (US) are GA aircraft. In contrast with airline service aircraft which operate with two pilots in a structured higher-altitude operational envelope, GA aircraft are often individually piloted in a more unstructured lower-altitude environment. This low altitude environment is also where a bulk of the next generation of Uncrewed Aerial Vehicles (UAVs) are expected to operate. These UAVs are expected to seamlessly interact with other UAVs and manned air traffic operating in this shared airspace. Nowhere is this manned-manned and potentially unmanned-manned interaction more pronounced than in low-altitude terminal airspace around airports. Low altitudes, multi-agent close-proximity interactions, dynamically changing conditions, and rapid decision making are hallmarks of this type of airspace as compared to en-route airspace where agents are typically well-separated.
This dataset contains aircraft trajectories in an untowered terminal airspace collected over 8 months surrounding the Pittsburgh-Butler Regional Airport [ICAO:KBTP], a single runway GA airport, 10 miles North of the city of Pittsburgh, Pennsylvania. The trajectory data is recorded using an on-site setup that includes an ADS-B receiver. The trajectory data provided spans days from 18 Sept 2020 till 23 Apr 2021 and includes a total of 111 days of data discounting downtime, repairs, and bad weather days with no traffic. Data is collected starting at 1:00 AM local time to 11:00 PM local time. The dataset uses an Automatic Dependent Surveillance-Broadcast (ADS-B) receiver placed within the airport premises to capture the trajectory data. The receiver uses both the 1090 MHz and 978 MHz frequencies to listen to these broadcasts.
ADS-B uses satellite navigation to produce an accurate location and timestamp for each target, which is recorded on-site using our custom setup. Weather data during the data collection time period is also included for environmental context. The weather data is obtained post-hoc using the METeorological Aerodrome Reports (METAR) strings generated by the Automated Weather Observing System (AWOS) at KBTP. The raw METAR string is then appended to the raw trajectory data by matching the closest UTC timestamps.
We also provide processed data that filters, interpolates and transforms data from a global frame to an airport-centred inertial frame. The inertial frame is centred at one end of the runway with the x-axis along the runway. Trajectories are filtered to aircraft below 6000 ft MSL and within a 5 km radius of the airport origin. We also remove duplicates and interpolate data every second. The processed files also contain wind data, a crucial factor in decision-making, separated into components along and perpendicular to the runway direction.
More Information and Supplemental Tools
Please visit http://theairlab.org/trajair/ for more information.
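To make the wind decomposition concrete, here is a minimal sketch of how a wind report could be split into components along and perpendicular to a runway-aligned x-axis. This is not the dataset authors' code: the function name, the sign conventions, and the example headings are assumptions chosen for illustration.

```python
import math


def wind_components(wind_speed, wind_dir_deg, runway_heading_deg):
    """Split a wind report into along-runway and cross-runway parts.

    wind_dir_deg is the direction the wind blows FROM (METAR convention) and
    runway_heading_deg is the heading of the runway x-axis, both in degrees
    clockwise from north. Returns (along, cross) in the units of wind_speed.
    Assumed sign convention: along > 0 is a headwind for traffic moving along
    the x-axis, cross > 0 means the wind comes from the right.
    """
    angle = math.radians(wind_dir_deg - runway_heading_deg)
    return wind_speed * math.cos(angle), wind_speed * math.sin(angle)


# Illustrative values only: a 10-knot wind from 120 degrees and an x-axis
# heading of 080 degrees.
along, cross = wind_components(10.0, 120.0, 80.0)
print(f"along-runway: {along:.1f} kt, cross-runway: {cross:.1f} kt")
```

The actual sign convention used in the processed files should be checked against the dataset documentation before comparing values.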
The number of flights performed globally by the airline industry has increased steadily since the early 2000s and reached **** million in 2019. However, due to the coronavirus pandemic, the number of flights dropped to **** million in 2020. The flight volume increased again in the following years and was forecasted to reach ** million in 2025.
https://creativecommons.org/publicdomain/zero/1.0/
The "flights.csv" dataset contains information about the flights of an airport, including departure and arrival times, delays, airline, flight number, origin and destination, flight duration, distance, hour and minute of the flight, and the exact date and time of the flight. The data can support management analysis and strategy by providing information about the performance of flights and of the operating companies. Analysis of this dataset can serve as a basis for the following activities:
- Analysis of time patterns and trends: examining departure and arrival times, and how they change, can reveal patterns and trends in flight behavior.
- Analysis of American companies: information about airlines, such as the number of flights and overall performance, allows the performance of each company to be compared.
- Analysis of delays and service quality: examining delays and arrival times provides information about the quality of service offered by the airport and the airlines.
- Analysis of flight routes: checking the origin and destination of flights, distances, and flight durations can identify popular routes and passenger preferences.
- Analysis of airport performance: observing flight characteristics and airport performance makes it possible to identify the airport's strengths and weaknesses and to suggest improvements.
The dataset lends itself to a range of data analysis and visualization tools and can be used as a basis for managerial decisions in the aviation industry.
WN -- Southwest Airlines Co.
DL -- Delta Air Lines Inc.
AA -- American Airlines Inc.
UA -- United Air Lines Inc.
B6 -- JetBlue Airways
AS -- Alaska Airlines Inc.
NK -- Spirit Air Lines
G4 -- Allegiant Air
F9 -- Frontier Airlines Inc.
HA -- Hawaiian Airlines Inc.
SY -- Sun Country Airlines d/b/a MN Airlines
VX -- Virgin America
Aviation statistics user engagement survey
Thank you very much for all responses to the survey and your interest in DfT Aviation Statistics. All feedback will be taken into consideration when we publish the Aviation Statistics update later this year, alongside which, we will update the background information with details of the feedback and any future development plans.
AVI0101 (TSGB0201): Air traffic at UK airports: 1950 onwards (ODS, 9.93 KB) https://assets.publishing.service.gov.uk/media/6753137f21057d0ed56a0415/avi0101.ods
AVI0102 (TSGB0202): Air traffic by operation type and airport, UK (ODS, 37.6 KB) https://assets.publishing.service.gov.uk/media/6753138a14973821ce2a6d22/avi0102.ods
AVI0103 (TSGB0203): Punctuality at selected UK airports (ODS, 41.1 KB) https://assets.publishing.service.gov.uk/media/67531395dcabf976e5fb0073/avi0103.ods
AVI0105 (TSGB0205): International passenger movements at UK airports by last or next country travelled to (ODS, 20.7 KB) https://assets.publishing.service.gov.uk/media/675313a014973821ce2a6d23/avi0105.ods
AVI0106 (TSGB0206): Proportion of transfer passengers at selected UK airports (ODS, 9.52 KB) https://assets.publishing.service.gov.uk/media/67531f09e40c78cba1fb008d/avi0106.ods
AVI0107 (TSGB0207): Mode of transport to the airport (ODS, 14.3 KB) https://assets.publishing.service.gov.uk/media/67531d7a14973821ce2a6d2d/avi0107.ods
AVI0108 (TSGB0208): Purpose of travel at selected UK airports (ODS, 15.7 KB) https://assets.publishing.service.gov.uk/media/67531f17dcabf976e5fb007f/avi0108.ods
AVI0109 (TSGB0209): Map of UK airports (ODS, 193 KB) https://assets.publishing.service.gov.uk/media/67531f3b20bcf083762a6d3b/avi0109.ods
AVI0201 (TSGB0210): Main outputs for UK airlines by type of service (ODS, 17.7 KB) https://assets.publishing.service.gov.uk/media/67531f527e5323915d6a042f/avi0201.ods
AVI0203 (TSGB0211): Worldwide employment by UK airlines (ODS) https://assets.publishing.service.gov.uk/media/67531f6014973821ce2a6d31/avi0203.ods
Motivation
The data in this dataset is derived and cleaned from the full OpenSky dataset to illustrate the development of air traffic during the COVID-19 pandemic. It spans all flights seen by the network's more than 2500 members since 1 January 2019. More data was added to the dataset periodically until the end of the COVID-19 pandemic.
We stopped updating the dataset after December 2022. Previous files have been fixed after a thorough sanity check.
License
See LICENSE.txt
Disclaimer
The data provided in the files is provided as is. Despite our best efforts at filtering out potential issues, some information could be erroneous.
Origin and destination airports are computed online based on the ADS-B trajectories on approach/takeoff: no crosschecking with external sources of data has been conducted. Fields origin or destination are empty when no airport could be found.
Aircraft information comes from the OpenSky aircraft database. Fields typecode and registration are empty when the aircraft is not present in the database.
Description of the dataset
One file per month is provided as a csv file with the following features:
callsign: the identifier of the flight displayed on ATC screens (usually the first three letters are reserved for an airline: AFR for Air France, DLH for Lufthansa, etc.)
number: the commercial number of the flight, when available (the matching with the callsign comes from public open API); this field may not be very reliable;
icao24: the transponder unique identification number;
registration: the aircraft tail number (when available);
typecode: the aircraft model type (when available);
origin: a four letter code for the origin airport of the flight (when available);
destination: a four letter code for the destination airport of the flight (when available);
firstseen: the UTC timestamp of the first message received by the OpenSky Network;
lastseen: the UTC timestamp of the last message received by the OpenSky Network;
day: the UTC day of the last message received by the OpenSky Network;
latitude_1, longitude_1, altitude_1: the first detected position of the aircraft;
latitude_2, longitude_2, altitude_2: the last detected position of the aircraft.
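As a hedged illustration of working with these fields, the sketch below counts flights per UTC day and measures how often origin or destination is empty. The column names are those listed above; the function names, the sample rows, and the assumption that `day` is an ISO-style string starting with YYYY-MM-DD are illustrative and not taken from the dataset documentation.

```python
import csv
import io
from collections import Counter


def flights_per_day(rows):
    """Count flights per UTC day (assumes `day` starts with YYYY-MM-DD)."""
    return Counter(row["day"][:10] for row in rows)


def missing_airport_share(rows):
    """Fraction of flights with an empty origin or destination field."""
    rows = list(rows)
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not row["origin"] or not row["destination"])
    return missing / len(rows)


# Made-up rows in the shape described above; real monthly files have more columns.
sample = io.StringIO(
    "callsign,origin,destination,day\n"
    "AFR123,LFPG,EDDF,2019-01-01 00:00:00+00:00\n"
    "DLH456,,EDDF,2019-01-01 00:00:00+00:00\n"
    "AFR789,LFPG,,2019-01-02 00:00:00+00:00\n"
)
rows = list(csv.DictReader(sample))
print(flights_per_day(rows))
print(missing_airport_share(rows))
```

Checking the share of empty airport fields first is a cheap way to judge how much of a month is usable for origin-destination analysis.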
Examples
Possible visualisations and a more detailed description of the data are available at the following page:
Credit
If you use this dataset, please cite:
Martin Strohmeier, Xavier Olive, Jannis Lübbe, Matthias Schäfer, and Vincent Lenders "Crowdsourced air traffic data from the OpenSky Network 2019–2020" Earth System Science Data 13(2), 2021 https://doi.org/10.5194/essd-13-357-2021
This dataset contains the records of all the flights in the Northern California TRACON. The data was provided by the aircraft noise abatement office (http://www.flyquietsfo.com/) of San Francisco International Airport. The data cover Jan-Mar 2006. It is organized by day and flight. Each record contains some information about the flight and a sequence of 3D positions and estimated speeds. This data contains thousands of trajectories that can be used for trajectory clustering. The data is used by the Aircraft Noise Abatement Office to analyze the trajectories of aircraft flying in and out of SFO. The objective is to minimize the noise pollution due to aircraft in the San Francisco Bay Area.
The files have the extension "lt6" and are organized as follows, one file per day.
Line number and explanation:
1 TRACK OPNUM (TRACK header word and operation number)
2 eventid (correlation number)
3 trackstart date (in time since 1900, A8 version four year digit)
4 trackstart time HH:MM:SS
5 trackend time HH:MM:SS
6 airportid
7 ACID (FLIGHTNUM/TAILNUMBER)
8 owner name
9 aircrafttype
10 aircraft category
11 beacon
12 adflag
13 waypoint
14 other_port (dest/origin)
15 runwayname
16 min alt
17 max alt
18 min range
19 max range
20 Count of trackpoints (to follow)
21 x,y,z,v,t (all points are in meters relative to MRP; velocity and time are from the start of the track)
Multivariate regression data set from: https://link.springer.com/article/10.1007%2Fs10994-016-5546-z : The Airline Ticket Price dataset concerns the prediction of airline ticket prices. The rows are a sequence of time-ordered observations over several days. Each sample in this dataset represents a set of observations from a specific observation date and departure date pair. The input variables for each sample are values that may be useful for prediction of the airline ticket prices for a specific departure date. The target variables in these datasets are the next day (ATP1D) price or minimum price observed over the next 7 days (ATP7D) for 6 target flight preferences: (1) any airline with any number of stops, (2) any airline non-stop only, (3) Delta Airlines, (4) Continental Airlines, (5) Airtrain Airlines, and (6) United Airlines. The input variables include the following types: the number of days between the observation date and the departure date (1 feature), the boolean variables for day-of-the-week of the observation date (7 features), the complete enumeration of the following 4 values: (1) the minimum price, mean price, and number of quotes from (2) all airlines and from each airline quoting more than 50 % of the observation days (3) for non-stop, one-stop, and two-stop flights, (4) for the current day, previous day, and two days previous. The result is a feature set of 411 variables. For specific details on how these datasets are constructed please consult Groves and Gini (2015). The nature of these datasets is heterogeneous with a mixture of several types of variables including boolean variables, prices, and counts.
Using a combination of OAG flight schedule and ch-aviation fleet data, Capacities - Scheduled provides an overview of future flights scheduled per calendar day with a breakdown of seat capacity for five cabin classes (Economy, Economy Plus/Comfort, Premium Economy, Business, First) by operator and route (Continent, Country, Subdivision, Metro Group, Airport).
The data set is updated weekly.
The sample data shows capacity figures for Alaska Airlines, Swiss, and Horizon Air for one week.
Contact us to get access to ch-aviation's AWS S3 sample data bucket as well, which allows you to build proofs of concept with all of our sample data.
The direct bucket URL for this data set is: https://eu-central-1.console.aws.amazon.com/s3/buckets/dataservices-standardised-samples?region=eu-central-1&bucketType=general&prefix=capacities_scheduled/&showversions=false
Full Technical Data Dictionary: https://about.ch-aviation.com/capacities-scheduled/
https://dataful.in/terms-and-conditions
This dataset provides the scheduled international flight operations of domestic carriers flying to and from India. It includes flight numbers, aircraft types, origin and destination airports, operating days, scheduled times, and the validity period of the schedule. The 'frequency' column indicates the days on which each flight operates, represented using digits 1–7, where 1 = Monday, 2 = Tuesday, 3 = Wednesday, 4 = Thursday, 5 = Friday, 6 = Saturday, and 7 = Sunday.
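The digit encoding of the 'frequency' column can be decoded with a small lookup, sketched below. The mapping of digits to weekdays comes from the description; the function name `operating_days` and the decision to ignore any other characters (in case the field contains separators or padding) are assumptions.

```python
DAY_NAMES = {
    "1": "Monday",
    "2": "Tuesday",
    "3": "Wednesday",
    "4": "Thursday",
    "5": "Friday",
    "6": "Saturday",
    "7": "Sunday",
}


def operating_days(frequency):
    """Translate a frequency value such as "135" into weekday names.

    The value is converted to a string first, so an integer such as 135 also
    works. Characters other than the digits 1-7 are ignored (assumption).
    """
    return [DAY_NAMES[ch] for ch in str(frequency) if ch in DAY_NAMES]


print(operating_days("135"))
print(operating_days("1234567"))
```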
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.
If you use this data for a scientific publication, please consider citing our paper.
The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:
go_arounds_minimal.csv.gz
Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:
| Column name | Type | Description |
| --- | --- | --- |
| time | date time | UTC time of landing or first GA attempt |
| icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
| callsign | string | Aircraft identifier in air-ground communications |
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| has_ga | string | "True" if at least one GA was performed, otherwise "False" |
| n_approaches | integer | Number of approaches identified for this flight |
| n_rwy_approached | integer | Number of unique runways approached by this flight |
The last two columns, n_approaches and n_rwy_approached, are useful for filtering out training and calibration flights. These flights usually have a large number of approaches, so an easy way to exclude them is to drop the rows with n_approaches > 2.
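A minimal sketch of that filter, using only the standard library, might look as follows. The threshold of 2 comes from the text; the function name and the sample rows are made up for illustration.

```python
import csv
import io


def drop_training_flights(rows, max_approaches=2):
    """Keep only landings with at most `max_approaches` approaches."""
    return [row for row in rows if int(row["n_approaches"]) <= max_approaches]


# Made-up rows with a subset of the columns of go_arounds_minimal.csv.gz.
sample = io.StringIO(
    "callsign,has_ga,n_approaches\n"
    "BAW1,False,1\n"
    "BAW2,True,2\n"
    "TRAIN1,True,7\n"
)
kept = drop_training_flights(csv.DictReader(sample))
print([row["callsign"] for row in kept])
```

For the real file, the same function can be applied to a csv.DictReader over a text-mode gzip.open handle.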
go_arounds_augmented.csv.gz
Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:
| Column name | Type | Description |
| --- | --- | --- |
| time | date time | UTC time of landing or first GA attempt |
| icao24 | string | Unique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned |
| callsign | string | Aircraft identifier in air-ground communications |
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| has_ga | string | "True" if at least one GA was performed, otherwise "False" |
| n_approaches | integer | Number of approaches identified for this flight |
| n_rwy_approached | integer | Number of unique runways approached by this flight |
| registration | string | Aircraft registration |
| typecode | string | Aircraft ICAO typecode |
| icaoaircrafttype | string | ICAO aircraft type |
| wtc | string | ICAO wake turbulence category |
| glide_slope_angle | float | Angle of the ILS glide slope in degrees |
| has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
| rwy_length | float | Length of the runway in kilometres |
| airport_country | string | ISO Alpha-3 country code of the airport |
| airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
| operator_country | string | ISO Alpha-3 country code of the operator |
| operator_region | string | Geographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania) |
| wind_speed_knts | integer | METAR, surface wind speed in knots |
| wind_dir_deg | integer | METAR, surface wind direction in degrees |
| wind_gust_knts | integer | METAR, surface wind gust speed in knots |
| visibility_m | float | METAR, visibility in m |
| temperature_deg | integer | METAR, temperature in degrees Celsius |
| press_sea_level_p | float | METAR, sea level pressure in hPa |
| press_p | float | METAR, QNH in hPa |
| weather_intensity | list | METAR, list of present weather codes: qualifier - intensity |
| weather_precipitation | list | METAR, list of present weather codes: weather phenomena - precipitation |
| weather_desc | list | METAR, list of present weather codes: qualifier - descriptor |
| weather_obscuration | list | METAR, list of present weather codes: weather phenomena - obscuration |
| weather_other | list | METAR, list of present weather codes: weather phenomena - other |
This data set is augmented with data from various public data sources. Aircraft related data is mostly from the OpenSky Network's aircraft data base, the METAR information is from the Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrome Reports and Forecasts handbook.
go_arounds_agg.csv.gz
Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:
| Column name | Type | Description |
| --- | --- | --- |
| airport | string | ICAO airport code where the aircraft is landing |
| runway | string | Runway designator on which the aircraft landed |
| n_landings | integer | Total number of landings observed on this runway in 2019 |
| ga_rate | float | Go-around rate, per 1000 landings |
| glide_slope_angle | float | Angle of the ILS glide slope in degrees |
| has_intersection | string | Boolean that is true if the runway has another runway intersecting it, otherwise false |
| rwy_length | float | Length of the runway in kilometres |
| airport_country | string | ISO Alpha-3 country code of the airport |
| airport_region | string | Geographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania) |
This aggregated data set is used in the paper for the generalized linear regression model.
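To show how the per-runway columns relate to the per-landing files, the sketch below recomputes a landing count and a go-around rate per 1000 landings for each airport-runway pair. It is an illustration only: the function name is hypothetical, and whether the published aggregate excludes training flights or applies other filtering is not stated here, so the numbers need not match go_arounds_agg.csv.gz.

```python
from collections import defaultdict


def aggregate_by_runway(rows):
    """Return {(airport, runway): (n_landings, ga_rate_per_1000)}."""
    landings = defaultdict(int)
    go_arounds = defaultdict(int)
    for row in rows:
        key = (row["airport"], row["runway"])
        landings[key] += 1
        # has_ga is stored as the string "True" or "False".
        if row["has_ga"] == "True":
            go_arounds[key] += 1
    return {key: (n, 1000.0 * go_arounds[key] / n) for key, n in landings.items()}


# Made-up landings: four on one runway (one go-around), one on another.
rows = [
    {"airport": "EGLC", "runway": "09", "has_ga": "False"},
    {"airport": "EGLC", "runway": "09", "has_ga": "True"},
    {"airport": "EGLC", "runway": "09", "has_ga": "False"},
    {"airport": "EGLC", "runway": "09", "has_ga": "False"},
    {"airport": "EGLC", "runway": "27", "has_ga": "False"},
]
print(aggregate_by_runway(rows))
```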
Downloading the trajectories
Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical data base with a few lines of Python code. For example, suppose you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:
import datetime
from tqdm.auto import tqdm
import pandas as pd
from traffic.data import opensky
from traffic.core import Traffic

# load minimum data set
df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False)
df["time"] = pd.to_datetime(df["time"])

# select London City Airport, go-arounds, and 2019-01-04
airport = "EGLC"
start = datetime.datetime(year=2019, month=1, day=4).replace(
    tzinfo=datetime.timezone.utc
)
stop = datetime.datetime(year=2019, month=1, day=5).replace(
    tzinfo=datetime.timezone.utc
)

df_selection = df.query("airport==@airport & has_ga & (@start <= time <= @stop)")

# iterate over flights and pull the data from OpenSky Network
flights = []
delta_time = pd.Timedelta(minutes=10)
for _, row in tqdm(df_selection.iterrows(), total=df_selection.shape[0]):
    # take at most 10 minutes before and 10 minutes after the landing or go-around
    start_time = row["time"] - delta_time
    stop_time = row["time"] + delta_time

    # fetch the data from OpenSky Network
    flights.append(
        opensky.history(
            start=start_time.strftime("%Y-%m-%d %H:%M:%S"),
            stop=stop_time.strftime("%Y-%m-%d %H:%M:%S"),
            callsign=row["callsign"],
            return_flight=True,
        )
    )

# combine the individual flights into a single Traffic object
Traffic.from_flights(flights)
Additional files
Additional files are available to check the quality of the classification into GA/not GA and the selection of the landing runway. These are:
validation_table.xlsx: This Excel sheet was manually completed during the review of the samples for each runway in the data set. It provides an estimate of the false positive and false negative rate of the go-around classification. It also provides an estimate of the runway misclassification rate when the airport has two or more parallel runways. The columns with the headers highlighted in red were filled in manually, the rest is generated automatically.
validation_sample.zip: For each runway, 8 batches of 500 randomly selected trajectories (or as many as available, if fewer than 4000) classified as not having a GA and up to 8 batches of 10 random landings, classified as GA, are plotted. This allows the interested user to visually inspect a random sample of the landings and go-arounds easily.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Summary:
Estimated stand-off distance between ADS-B equipped aircraft and obstacles. Obstacle information was sourced from the FAA Digital Obstacle File and the FHWA National Bridge Inventory. Aircraft tracks were sourced from processed data curated from the OpenSky Network. Results are presented as histograms organized by aircraft type and distance away from runways.
Description:
For many aviation safety studies, aircraft behavior is represented using encounter models, which are statistical models of how aircraft behave during close encounters. They are used to provide a realistic representation of the range of encounter flight dynamics where an aircraft collision avoidance system would be likely to alert. These models are currently, and have historically been, limited to interactions between aircraft; they have not represented the specific interactions between obstacles and transponder-equipped aircraft. In response, we calculated the standoff distance between obstacles and ADS-B equipped manned aircraft.
For robustness, MIT LL calculated the standoff distance using two different datasets of manned aircraft tracks and two datasets of obstacles. This approach aligned with the foundational research used to support the ASTM F3442/F3442M-20 well clear criteria of 2000 feet laterally and 250 feet AGL vertically.
The two aircraft track datasets consisted of processed tracks of ADS-B equipped aircraft curated from the OpenSky Network. It is likely that rotorcraft were underrepresented in these datasets. There were also no considerations for aircraft equipped only with Mode C or not equipped with any transponders. The first dataset was used to train the v1.3 uncorrelated encounter models and is referred to as the “Monday” dataset. The second dataset is referred to as the “aerodrome” dataset and was used to train the v2.0 and v3.x terminal encounter models. The Monday dataset consisted of 104 Mondays across North America. The other dataset was based on observations at least 8 nautical miles within Class B, C, D aerodromes in the United States for the first 14 days of each month from January 2019 through February 2020. Prior to any processing, the datasets required 714 and 847 Gigabytes of storage. For more details on these datasets, please refer to "Correlated Bayesian Model of Aircraft Encounters in the Terminal Area Given a Straight Takeoff or Landing" and “Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling.”
Two different datasets of obstacles were also considered. The first was point obstacles defined by the FAA digital obstacle file (DOF) and consisted of point obstacle structures of antenna, lighthouse, meteorological tower (met), monument, sign, silo, spire (steeple), stack (chimney; industrial smokestack), transmission line tower (t-l tower), tank (water; fuel), tramway, utility pole (telephone pole, or pole of similar height, supporting wires), windmill (wind turbine), and windsock. Each obstacle was represented by a cylinder with the height reported by the DOF and a radius based on the reported horizontal accuracy. We did not consider the actual width and height of the structure itself. Additionally, we only considered obstacles at least 50 feet tall and marked as verified in the DOF.
The other obstacle dataset, termed as “bridges,” was based on the identified bridges in the FAA DOF and additional information provided by the National Bridge Inventory. Due to the potential size and extent of bridges, it would not be appropriate to model them as point obstacles; however, the FAA DOF only provides a point location and no information about the size of the bridge. In response, we correlated the FAA DOF with the National Bridge Inventory, which provides information about the length of many bridges. Instead of sizing the simulated bridge based on horizontal accuracy, like with the point obstacles, the bridges were represented as circles with a radius of the longest, nearest bridge from the NBI. A circle representation was required because neither the FAA DOF or NBI provided sufficient information about orientation to represent bridges as rectangular cuboid. Similar to the point obstacles, the height of the obstacle was based on the height reported by the FAA DOF. Accordingly, the analysis using the bridge dataset should be viewed as risk averse and conservative. It is possible that a manned aircraft was hundreds of feet away from an obstacle in actuality but the estimated standoff distance could be significantly less. Additionally, all obstacles are represented with a fixed height, the potentially flat and low level entrances of the bridge are assumed to have the same height as the tall bridge towers. The attached figure illustrates an example simulated bridge.
It would have been extremely computationally inefficient to calculate the standoff distance for all possible track points. Instead, we defined an encounter between an aircraft and an obstacle as an aircraft flying at 3069 feet AGL or less coming within 3000 feet laterally of any obstacle in a 60 second time interval. If the criteria were satisfied, we calculated the standoff distance to all nearby obstacles for that 60 second track segment. Vertical separation was based on the MSL altitude of the track and the maximum MSL height of an obstacle.
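The encounter filter described above can be illustrated with a minimal sketch. It is not the MIT LL implementation. It assumes positions are already projected into a local planar frame in feet, that lateral distance is measured to the edge of the obstacle cylinder, and that only the closest lateral approach is reported (the text computes the standoff distance to all nearby obstacles). The names `Obstacle`, `TrackPoint`, `is_encounter`, and `standoff` are hypothetical.

```python
import math
from dataclasses import dataclass

MAX_ALT_AGL_FT = 3069.0  # altitude ceiling quoted in the text
MAX_LATERAL_FT = 3000.0  # lateral threshold quoted in the text


@dataclass
class Obstacle:
    x_ft: float        # assumed local planar coordinates, in feet
    y_ft: float
    radius_ft: float   # cylinder radius (e.g. from horizontal accuracy)
    top_msl_ft: float  # maximum MSL height of the obstacle


@dataclass
class TrackPoint:
    x_ft: float
    y_ft: float
    alt_agl_ft: float
    alt_msl_ft: float


def lateral_ft(point, obstacle):
    """Lateral distance to the cylinder edge (assumption), floored at zero."""
    centre = math.hypot(point.x_ft - obstacle.x_ft, point.y_ft - obstacle.y_ft)
    return max(0.0, centre - obstacle.radius_ft)


def is_encounter(segment, obstacles):
    """True if any low-altitude point of the segment is near any obstacle."""
    return any(
        p.alt_agl_ft <= MAX_ALT_AGL_FT and lateral_ft(p, o) <= MAX_LATERAL_FT
        for p in segment
        for o in obstacles
    )


def standoff(segment, obstacles):
    """Return (lateral_ft, vertical_ft) at closest lateral approach, or None."""
    if not is_encounter(segment, obstacles):
        return None
    pairs = (
        (lateral_ft(p, o), p.alt_msl_ft - o.top_msl_ft)
        for p in segment
        for o in obstacles
    )
    return min(pairs, key=lambda pair: pair[0])


# Hypothetical 60 second segment reduced to two points, and one tower.
segment = [TrackPoint(0, 0, 1000, 1200), TrackPoint(1000, 0, 1000, 1200)]
tower = Obstacle(x_ft=3500, y_ft=0, radius_ft=100, top_msl_ft=500)
print(standoff(segment, [tower]))
```

Screening segments with the cheap threshold test first means the more detailed standoff calculation only runs on the small fraction of segments that pass near an obstacle.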
For each combination of aircraft track and obstacle datasets, the results were organized seven different ways. Filtering criteria were based on aircraft type and distance away from runways. Runway data was sourced from the FAA runways of the United States, Puerto Rico, and Virgin Islands open dataset. Aircraft type was identified as part of the em-processing-opensky workflow.
License
This dataset is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
This license requires that reusers give credit to the creator. It allows reusers to copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only. Only noncommercial use of your work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Exceptions are given for the not for profit standards organizations of ASTM International and RTCA.
MIT is releasing this dataset in good faith to promote open and transparent research of the low altitude airspace. Given the limitations of the dataset and a need for more research, a more restrictive license was warranted. Namely, it is based only on observations of ADS-B equipped aircraft, which not all aircraft in the airspace are required to employ, and the observations were sourced from a crowdsourced network whose surveillance coverage has not been robustly characterized.
As more research is conducted and the low altitude airspace is further characterized or regulated, it is expected that a future version of this dataset may have a more permissive license.
Distribution Statement
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.
© 2021 Massachusetts Institute of Technology.
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.
This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.
This document is derived from work done for the FAA (and possibly others); it is not the direct product of work done for the FAA. The information provided herein may include content supplied by third parties. Although the data and information contained herein has been produced or processed from sources believed to be reliable, the Federal Aviation Administration makes no warranty, expressed or implied, regarding the accuracy, adequacy, completeness, legality, reliability or usefulness of any information, conclusions or recommendations provided herein. Distribution of the information contained herein does not constitute an endorsement or warranty of the data or information provided herein by the Federal Aviation Administration or the U.S. Department of Transportation. Neither the Federal Aviation Administration nor the U.S. Department of
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India All Scheduled Airlines: Domestic: Number of Flight data was reported at 102,319.000 Unit in Mar 2025. This records an increase from the previous number of 92,291.000 Unit for Feb 2025. India All Scheduled Airlines: Domestic: Number of Flight data is updated monthly, averaging 48,100.000 Unit from Apr 2001 (Median) to Mar 2025, with 288 observations. The data reached an all-time high of 102,319.000 Unit in Mar 2025 and a record low of 188.000 Unit in Apr 2020. India All Scheduled Airlines: Domestic: Number of Flight data remains active status in CEIC and is reported by Directorate General of Civil Aviation. The data is categorized under India Premium Database’s Transportation, Post and Telecom Sector – Table IN.TA019: Airline Statistics: All Scheduled Airlines.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains detailed information on 10,000 domestic flights within the United States during 2014. It was derived from a larger FAA dataset and includes essential flight attributes such as departure and arrival times, delays, carrier codes, origin and destination airports, and distances.
It’s a great dataset for practicing:
| Column | Description |
|---|---|
| year | Year of the flight (2014) |
| month | Month of the flight (1–12) |
| day | Day of the month |
| dep_time | Actual departure time (HHMM) |
| dep_delay | Departure delay in minutes (negative = early) |
| arr_time | Actual arrival time (HHMM) |
| arr_delay | Arrival delay in minutes (negative = early) |
| carrier | Airline carrier code (e.g., AS, VX, WN) |
| tailnum | Aircraft tail number |
| flight | Flight number |
| origin | Origin airport code (e.g., SEA, PDX) |
| dest | Destination airport code (e.g., LAX, SFO, HNL) |
| air_time | Actual flight time in minutes |
| distance | Flight distance in miles |
| hour | Departure hour (derived from dep_time) |
| minute | Departure minute (derived from dep_time) |
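As a small illustration of how the columns above relate, the sketch below derives `hour` and `minute` from an HHMM `dep_time` and averages `dep_delay` per carrier. The sample rows are invented for the example, not taken from the dataset, and the helper names `split_hhmm` and `mean_dep_delay_by_carrier` are hypothetical.

```python
from collections import defaultdict


def split_hhmm(dep_time):
    """Split an HHMM integer such as 1735 into (hour, minute)."""
    hour, minute = divmod(int(dep_time), 100)
    if not (0 <= hour <= 24 and 0 <= minute < 60):
        raise ValueError(f"not a valid HHMM time: {dep_time}")
    return hour, minute


def mean_dep_delay_by_carrier(rows):
    """Average departure delay in minutes for each carrier code."""
    totals = defaultdict(lambda: [0, 0])  # carrier -> [sum of delays, count]
    for row in rows:
        totals[row["carrier"]][0] += row["dep_delay"]
        totals[row["carrier"]][1] += 1
    return {carrier: s / n for carrier, (s, n) in totals.items()}


# Invented sample rows using the column names from the table.
rows = [
    {"carrier": "AS", "dep_time": 1735, "dep_delay": -5},
    {"carrier": "AS", "dep_time": 905, "dep_delay": 15},
    {"carrier": "WN", "dep_time": 2359, "dep_delay": 30},
]

print(split_hhmm(rows[1]["dep_time"]))
print(mean_dep_delay_by_carrier(rows))
```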
This dataset is a curated sample inspired by the nycflights13 dataset — a well-known dataset used in many Data Science and Machine Learning tutorials.
This dataset is shared for educational and research purposes under the CC BY 4.0 License.
flight delays, aviation, transportation, data analysis, machine learning, EDA, Hadoop, Spark, Big Data
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains air traffic passenger statistics by airline. It includes information on the airlines, airports, and regions that the flights departed from and arrived at. It also includes information on the type of activity, price category, terminal, boarding area, and number of passengers.
Air traffic passenger statistics can be a useful tool for understanding the airline industry and for making travel plans. This dataset from Open Flights contains air traffic passenger statistics by airline for 2017. The data includes the number of passengers, the operating airline, the published airline, the geographic region, the activity type code, the price category code, the terminal, the boarding area, and the year and month of the flight.
License: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
- You are free to:
-- Share - copy and redistribute the material in any medium or format for non-commercial purposes only.
-- Adapt - remix, transform, and build upon the material for non-commercial purposes only.
- You must:
-- Give appropriate credit - Provide a link to the license, and indicate if changes were made.
-- ShareAlike - You must distribute your contributions under the same license as the original.
- You may not:
-- Use the material for commercial purposes.
File: Air_Traffic_Passenger_Statistics.csv
| Column name | Description |
|---|---|
| Activity Period | The date of the activity. (Date) |
| Operating Airline | The airline that operated the flight. (String) |
| Operating Airline IATA Code | The IATA code of the airline that operated the flight. (String) |
| Published Airline | The airline that published the fare for the flight. (String) |
| Published Airline IATA Code | The IATA code of the airline that published the fare for the flight. (String) |
| GEO Summary | A summary of the geographic region. (String) |
| GEO Region | The geographic region. (String) |
| Activity Type Code | The type of activity. (String) |
| Price Category Code | The price category of the fare. (String) |
| Terminal | The terminal of the flight. (String) |
| Boarding Area | The boarding area of the flight. (String) |
| Passenger Count | The number of passengers on the flight. (Integer) |
| Adjusted Activity Type Code | The type of activity, adjusted for missing data. (String) |
| Adjusted Passenger Count | The number of passengers on the flight, adjusted for missing data. (Integer) |
| Year | The year of the activity. (Integer) |
| Month | The month of the activity. (Integer) |
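A minimal sketch of reading this file layout with the Python standard library is shown here. It totals `Passenger Count` per `Operating Airline`. The in-memory sample uses invented airline names and counts and only a subset of the columns. With the real file, the same function could be given an open file handle for `Air_Traffic_Passenger_Statistics.csv`, assuming the header row matches the column names listed above.

```python
import csv
import io
from collections import Counter

# Invented sample with a subset of the documented columns.
SAMPLE = """Operating Airline,GEO Region,Passenger Count
Airline A,US,100
Airline B,Europe,50
Airline A,Asia,25
"""


def passengers_by_airline(fileobj):
    """Sum the Passenger Count column for each Operating Airline."""
    totals = Counter()
    for row in csv.DictReader(fileobj):
        totals[row["Operating Airline"]] += int(row["Passenger Count"])
    return totals


totals = passengers_by_airline(io.StringIO(SAMPLE))
for airline, count in totals.most_common():
    print(airline, count)
```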
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data related to Air Traffic Management hotspots. Hotspots are created in European airspace when the capacity of some piece of airspace is foreseen to be infringed due to weather, congestion, strikes, etc. This anonymised dataset records around 5900 hotspots happening at 22 major European airports. These hotspots are generated through a simulator called Mercury that is fed with real data (in particular, real capacity reductions that happened in Europe over more than a year, schedules, etc.) and simulates a day of operation, randomising events like delays, cancellations, etc. More details on Mercury can be found in [1] and [2].
The data, anonymised in terms of airports and airlines, is a dictionary which is structured as follows:
the top level key is the id of the airport, the value is a list of all regulations available for this airport.
each item of the list is a dictionary, with keys:
-- 'slot_times': list of all slots available to flights for this hotspot/regulation, in minutes since midnight.
-- 'etas': list of initial estimated arrival times of flights involved in the regulation, in minutes since midnight.
-- 'flight_ids': list of flight ids (in the same order as etas)
-- 'cost_vectors': list of cost vectors. Each item is itself a list, of length equal to the slot_times list. Each element of that list is the estimated cost that the airline owning the flight would incur, were the flight to be assigned to this slot, in terms of: maintenance, crew, rebooking fees, market value loss, and curfew infringement, in 2014 euros. This cost is computed within the Mercury model and is based on [3].
-- 'airlines_flights': dictionary whose keys are airline ids and values are lists of ids of flights owned by the airline.
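The structure above can be sketched with a hypothetical record. The airport id, flight ids, airline ids, times, and costs below are invented for illustration. The function assigns flights to slots in order of ETA, a simple baseline chosen here only to show how the keys fit together and not a method described by the dataset, and then sums the resulting cost per airline. It assumes there are at least as many slots as flights.

```python
# Invented example following the documented keys.
hotspot_data = {
    "airport_0": [
        {
            "slot_times": [600, 605, 610],   # minutes since midnight
            "etas": [602, 598, 607],         # minutes since midnight
            "flight_ids": ["f1", "f2", "f3"],
            "cost_vectors": [                # one row per flight, one column per slot
                [0.0, 40.0, 90.0],
                [0.0, 10.0, 25.0],
                [0.0, 0.0, 60.0],
            ],
            "airlines_flights": {"a1": ["f1", "f3"], "a2": ["f2"]},
        }
    ]
}


def eta_order_costs(regulation):
    """Give the i-th slot to the i-th flight by ETA; return cost per airline."""
    etas = regulation["etas"]
    order = sorted(range(len(etas)), key=lambda i: etas[i])
    cost_of_flight = {}
    for slot_index, flight_index in enumerate(order):
        flight_id = regulation["flight_ids"][flight_index]
        cost_of_flight[flight_id] = regulation["cost_vectors"][flight_index][slot_index]
    return {
        airline: sum(cost_of_flight[f] for f in flights)
        for airline, flights in regulation["airlines_flights"].items()
    }


for airport_id, regulations in hotspot_data.items():
    for regulation in regulations:
        print(airport_id, eta_order_costs(regulation))
```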
[1] https://www.sciencedirect.com/science/article/abs/pii/S0968090X21003600
[2] G. Gurtner, L. Delgado, and D. Valput, “An agent-based model for air transportation to capture network effects in assessing delay management mechanisms”, Transportation Research Part C: Emerging Technologies, 2021.
Pre-print available here: https://westminsterresearch.westminster.ac.uk/item/v956w/an-agent-based-model-for-air-transportation-to-capture-network-effects-in-assessing-delay-management-mechanisms
[3] A. J. Cook and G. Tanner, “European airline delay cost reference values - updated and extended values (Version 4.1),” University of Westminster, London, 2015a
For the purposes of this paper, the National Airspace System (NAS) encompasses the operations of all aircraft which are subject to air traffic control procedures. The NAS is a highly complex dynamic system that is sensitive to aeronautical decision-making and risk management skills. In order to ensure a healthy system with safe flights, a systematic approach to anomaly detection is very important when evaluating a given set of circumstances and for determining the best possible course of action. Given that the NAS is a vast and loosely integrated network of systems, it requires improved safety assurance capabilities to maintain an extremely low accident rate under increasingly dense operating conditions. Data mining based tools and techniques are required to support and aid operators’ (such as pilots, management, or policy makers) overall decision-making capacity. Within the NAS, the ability to analyze fleetwide aircraft data autonomously is still considered a significantly challenging task. For our purposes a fleet is defined as a group of aircraft sharing generally compatible parameter lists. In this effort, we aim at developing a system level analysis scheme. In this paper we address the capability for detection of fleetwide anomalies as they occur, which itself is an important initiative toward the safety of real-world flight operations. Flight data recorders archive millions of data points with valuable information on flights every day. The operational parameters consist of both continuous and discrete (binary & categorical) data from several critical subsystems and numerous complex procedures. In this paper, we discuss a system level anomaly detection approach based on the theory of kernel learning to detect potential safety anomalies in a very large database of commercial aircraft.
We also demonstrate that the proposed approach uncovers some operationally significant events due to environmental, mechanical, and human factors issues in high dimensional, multivariate Flight Operations Quality Assurance (FOQA) data. We present the results of our detection algorithms on real FOQA data from a regional carrier.