Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AeroSonic YPAD-0523: Labelled audio dataset for acoustic detection and classification of aircraft
Version 0.2 (June 2023)
Publication
If using this data in an academic work, please reference the DOI and version.
Description
AeroSonic:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of aircraft noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport’s (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead. Each recording is then human verified, and trimmed to the best (subjective) 20 seconds of audio in which the target aircraft is audible.
A total of 1,890 audio clips are balanced across two top-level classes, “Aircraft” (3.57 hours: 642 20-second recordings) and “Silence” (3.37 hours: 1,248 five- and ten-second recordings). The aircraft class is then further broken down into four unbalanced subclasses which broadly describe an aircraft's structure and propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.
For convenience, the dataset has been split into training (6.28 hours) and testing (0.66 hours) subsets, with the training set further split into 10 folds for cross-validation. Care has been taken to ensure the class distribution for each subset and fold does not significantly deviate from the overall distribution.
Researchers may find applications for this dataset in a number of fields, particularly aircraft noise isolation and monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.
Audio data
ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically capture and label audio recordings. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device. The file is labelled with a unique ICAO identifier code for the aircraft, as well as its last recorded altitude, date and time. The recording is then human verified and trimmed to 20 seconds - with the aircraft audible for the duration of the clip.
A balanced collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location specific environmental noises from aircraft noises. 10-second background noise, or “silence” recordings are triggered only when there are no aircraft broadcasting that they are within a specified distance of the recording device. These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,180 10-second clips, and 68 5-second clips of silence/ambient background noise.
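For illustration, below is a minimal sketch (Python) of the distance-based trigger logic described above. The actual collection scripts are not part of this dataset, and the message fields, helper names and receiver coordinates here are assumptions.

import math

TRIGGER_DISTANCE_KM = 3.0   # per-location threshold (3 km at Location 0, 1 km at Locations 1 and 2)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_trigger(adsb_msg, receiver_lat, receiver_lon):
    """Return True when a decoded ADS-B position report falls inside the trigger radius.
    `adsb_msg` is assumed to be a dict with 'lat' and 'lon' keys from a decoded message."""
    d = haversine_km(adsb_msg["lat"], adsb_msg["lon"], receiver_lat, receiver_lon)
    return d <= TRIGGER_DISTANCE_KM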
Location information
Recordings have been collected from three (3) locations. GPS coordinates for each location are provided in the "locations.json" file. In order to protect privacy, coordinates have been provided for a road or public space near the recording device instead of its exact location.
Location: 0
Situated in a suburban environment approximately 15.5km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take off to the north-east; however, not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set to 3km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.
"Silence" or ambient background noises at this location include; cars, motorbikes, light-trucks, garbage trucks, power-tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).
Location: 1
Situated approximately 500m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands), with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1km. Due to their low altitude, aircraft are louder, but audible for a shorter time compared to "Location 0".
Location: 2
As an alternative to "Location 1", this location is situated approximately 950m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west. It offers greater wind protection and further distance from airport and highway noises. Ambient background sounds feature nearby cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.
Aircraft metadata
Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to beleive any significant errors exist in the "aircraft_meta" files, however future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.
Class/subclass ontology (minutes of recordings)
0. no aircraft (202)
    0: no aircraft (202)
1. aircraft (214)
    1: piston-propeller aeroplane (12)
    2: turbine-propeller aeroplane (37)
    3: turbine-fan aeroplane (163)
    4: rotorcraft (1.6)
The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.
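Because the subclasses simply refine the two top-level classes, a fine-grained label can be collapsed to the binary aircraft/no-aircraft label with a small lookup. A minimal Python sketch, using the integer codes listed above:

# Map the fine-grained subclass codes to the binary top-level class.
SUBCLASS_TO_CLASS = {
    0: 0,  # no aircraft        -> no aircraft
    1: 1,  # piston-propeller   -> aircraft
    2: 1,  # turbine-propeller  -> aircraft
    3: 1,  # turbine-fan        -> aircraft
    4: 1,  # rotorcraft         -> aircraft
}

def to_binary_label(subclass: int) -> int:
    return SUBCLASS_TO_CLASS[subclass]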
Data splits
Audio recordings have been split into training (90.5%) and test (9.5%) sets. The training set has further been split into 10 folds, giving researchers a common split to perform 10-fold cross-validation, ensuring reproducibility and comparable results. Data leakage into the test set has been avoided by ensuring recordings are disjoint from the training set by time and location - meaning samples in the test set for a particular location were recorded after any samples included in the training set for that particular location.
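The predefined folds can be used directly for cross-validation without re-splitting the data. A minimal Python/pandas sketch, assuming the "train-test", "fold" and "filename" columns documented under "Columns/Labels" below:

import pandas as pd

meta = pd.read_csv("sample_meta.csv")
train = meta[meta["train-test"] == "train"].copy()
# Test rows carry a non-numeric fold marker, so cast after filtering to the training subset.
train["fold"] = train["fold"].astype(int)

# Hold one of the 10 predefined folds out for validation each round.
for k in range(10):
    val_files = train.loc[train["fold"] == k, "filename"].tolist()
    fit_files = train.loc[train["fold"] != k, "filename"].tolist()
    # ... load the audio for fit_files / val_files and train a model here ...
    print(f"fold {k}: {len(fit_files)} training clips, {len(val_files)} validation clips")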
Labelled data
The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording and all the labels and features associated with that recording and aircraft.
Alternatively, these labels can be derived directly from the filename of the sample (see below), plus a JSON file which accompanies each aircraft sample. The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see below for all 14 airframe features).
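The per-recording labels can be combined with the airframe features in a couple of lines. A minimal Python/pandas sketch; the join key is the "hex_id" column, and the exact airframe column names should be taken from "aircraft_meta.csv" itself:

import pandas as pd

samples = pd.read_csv("sample_meta.csv")
airframes = pd.read_csv("aircraft_meta.csv")

# Attach the airframe features (manufacturer, engine type, ICAO type designator, ...)
# to each recording via the aircraft's 24-bit ICAO address.
merged = samples.merge(airframes, on="hex_id", how="left")
print(merged.columns.tolist())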
File naming convention
Audio samples are in WAV format, and metadata for aircraft recordings are stored in JSON files. Both files share the same name, only differing by their file extension.
Basic Convention
“Aircraft ID + Date + Time + Location ID + Microphone ID”
“XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X”
Sample with aircraft
{hex_id} _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
7C7CD0_2023-05-09_12-42-55_2_1.wav
7C7CD0_2023-05-09_12-42-55_2_1.json
Sample without aircraft
“Silence” files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for “silence” samples is contained in the audio filename, and again in the accompanying “sample_meta.csv”.
000000 _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
000000_2023-05-09_12-30-55_2_1.wav
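A small Python sketch for recovering these fields from a filename, following the convention above and treating "000000" as the silence marker just described:

from datetime import datetime

def parse_clip_name(filename: str) -> dict:
    """Split 'XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X.wav' into its labelled parts."""
    stem, _, ext = filename.rpartition(".")
    hex_id, date, time, location_id, microphone_id = stem.split("_")
    return {
        "hex_id": None if hex_id == "000000" else hex_id,   # 000000 marks a "silence" clip
        "timestamp": datetime.strptime(f"{date} {time}", "%Y-%m-%d %H-%M-%S"),
        "location_id": int(location_id),
        "microphone_id": int(microphone_id),
        "file_ext": ext,
    }

print(parse_clip_name("7C7CD0_2023-05-09_12-42-55_2_1.wav"))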
Columns/Labels
(found in sample_meta.csv, aircraft_meta.csv/json and aircraft recording JSON files)
train-test: Train-test split (train, test)
fold: Digit from 0 to 9 splitting the training subset 10 ways (else test)
filename: The filename of the audio recording
date: Date of the recording
time: Time of the recording
duration: Length of the recording (in seconds)
location_id: ID for the location of the recording
microphone_id: ID of the microphone used
hex_id: Unique ICAO 24-bit address for the aircraft
Description:
This dataset has been meticulously curated to facilitate the development and training of machine learning models designed for suspicious activity detection, with a primary focus on shoplifting. The dataset is organized into two distinct categories: 'Suspicious' and 'Normal' activities. These classifications are intended to help models differentiate between typical behaviors and actions that may warrant further investigation in a retail setting.
Structure and Organization
The dataset is structured into three main directories (train, test and validation), each containing a balanced distribution of images from both categories. This structured approach ensures that the model is trained effectively, evaluated comprehensively, and validated on a diverse set of scenarios (see the loading sketch after the folder descriptions below).
Train Folder: Contains a substantial number of images representing both suspicious and normal activities. This folder serves as the primary dataset for training the model, allowing it to learn and generalize patterns from a wide variety of scenarios.
Test Folder: Designed for evaluating the model's performance post-training, this folder contains a separate set of labeled images. The test data allows for unbiased performance evaluation, ensuring that the model can generalize well to unseen situations.
Validation Folder: This additional split is used during the model training process to tune hyperparameters and prevent overfitting by testing the model's accuracy on a smaller, separate dataset before final testing.
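As a starting point, a directory layout like this can be loaded with standard tooling. A minimal Python (PyTorch/torchvision) sketch, assuming the split folders are named "train", "validation" and "test" and each contains "Normal" and "Suspicious" subfolders; the paths and image size are assumptions to adjust to your setup:

import torch
from torchvision import datasets, transforms

# Basic preprocessing; the 224x224 size is an assumption, adjust to the model in use.
tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Assumes class-named subfolders (e.g. Normal/ and Suspicious/) inside each split directory.
train_ds = datasets.ImageFolder("dataset/train", transform=tfm)
val_ds = datasets.ImageFolder("dataset/validation", transform=tfm)
test_ds = datasets.ImageFolder("dataset/test", transform=tfm)

train_loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)
print(train_ds.classes)  # e.g. ['Normal', 'Suspicious']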
Labels and Annotations
Each image is accompanied by a corresponding label that indicates whether the activity is 'Suspicious' or 'Normal.' The dataset is fully labeled, making it ideal for supervised learning tasks. Additionally, the labels provide contextual information such as the type of activity or the environment in which it occurred, further enriching the dataset for nuanced model training.
Use Cases and Applications
This dataset is particularly valuable for AI applications in the retail industry, where detecting potential shoplifting or suspicious behaviors is crucial for loss prevention. The dataset can be used to train models for:
Real-Time Surveillance Systems: Integrate AI-driven models into surveillance cameras to detect and alert security personnel to potential threats.
Retail Analytics: Use the dataset to identify patterns in customer behavior, helping retailers optimize their store layouts or refine security measures.
Anomaly Detection: Extend the dataset's application beyond shoplifting to other suspicious activities, such as unauthorized access or vandalism in different environments.
Key Features
High-Quality Image Data: Each image is captured in various retail environments, providing a broad spectrum of lighting conditions, angles, and occlusions to challenge model performance.
Detailed Annotations: Beyond simple categorization, each image includes metadata that offers deeper insights, such as activity type, timestamp, and environmental conditions.
Scalable and Versatile: The dataset's comprehensive structure and annotations make it versatile for use in not only retail but also other security-critical environments like airports or stadiums.
Conclusion
This dataset offers a robust foundation for developing advanced machine learning models tailored for real-time activity detection, providing critical tools for retail security, surveillance systems, and anomaly detection applications. With its rich variety of labelled data and organized structure, the Suspicious Activity Detection Dataset serves as a valuable resource for any AI project focused on enhancing safety and security through visual recognition.
This dataset is sourced from Kaggle.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
AeroSonicDB (YPAD-0523): Labelled audio dataset for acoustic detection and classification of aircraft
Version 1.1.2 (November 2023)
[UPDATE: June 2024]
Version 2.0 is currently in beta and can be found at https://zenodo.org/records/12775560. The repository is currently restricted; however, you can gain access by emailing Blake Downward at aerosonicdb@gmail.com, or by submitting the following Google Form.
Version 2 vastly extends the number of aircraft audio samples to over 3,000 (V1 contains 625 aircraft samples), for more than 38 hours of strongly annotated aircraft audio (V1 contains 8.9 hours of aircraft audio).
Publication
When using this data in an academic work, please reference the dataset DOI and version. Please also reference the following paper which describes the methodology for collecting the dataset and presents baseline model results.
Downward, B., & Nordby, J. (2023). The AeroSonicDB (YPAD-0523) Dataset for Acoustic Detection and Classification of Aircraft. ArXiv, abs/2311.06368.
Description
AeroSonicDB:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of environmental noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport's (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead, then human verified and annotated with the first and final moments which the target aircraft is audible.
A total of 1,895 audio clips are distributed across two top-level classes, "Aircraft" (8.87 hours) and "Silence" (3.52 hours). The aircraft class is then further broken down into four subclasses, which broadly describe the aircraft's structure and propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.
For convenience, the dataset has been split into training (10.04 hours) and testing (2.35 hours) subsets, with the training set further split into 5 distinct folds for cross-validation. These splits are performed to prevent data-leakage between folds and the test set, ensuring samples collected in the same recording session (distinct in time, location and microphone) are assigned to the same fold.
Researchers may find applications for this dataset in a number of fields, particularly aircraft noise isolation and noise monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.
Audio data
ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically trigger, capture and label audio samples. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device (see "Location data" below for specifics). The resulting audio file is labelled with the unique ICAO identifier code for the aircraft, as well as its last reported altitude, date, time, location and microphone. The recording is then human verified and annotated with timestamps for the first and last moments the aircraft is audible. In total, AeroSonicDB contains 625 recordings of low-altitude aircraft - varying in length from 18 to 60 seconds, for a total of 8.87 hours of aircraft audio.
A collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location specific environmental noises from aircraft noises. 10-second background noise, or "silence" recordings are triggered only when there are no aircraft broadcasting they are within a specified distance of the recording device (see "Location data" below). These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,270 clips of silence/urban background noise.
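The audible-aircraft annotations can be used to trim each recording to its annotated segment. A minimal Python sketch using the soundfile library; the onset/offset parameter names are placeholders - take the actual annotation column names from "sample_meta.csv":

import soundfile as sf

def extract_audible_segment(wav_path, onset_s, offset_s):
    """Return only the portion of a recording where the aircraft is audible.
    onset_s / offset_s are the annotated first/last audible moments in seconds
    (placeholder names; read the actual values from sample_meta.csv)."""
    audio, sr = sf.read(wav_path)
    return audio[int(onset_s * sr):int(offset_s * sr)], sr

segment, sr = extract_audible_segment("7C7CD0_2023-05-09_12-42-55_2_1.wav", 4.0, 32.5)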
Location data
Recordings have been collected from three (3) locations. GPS coordinates for each location are provided in the "locations.json" file. In order to protect privacy, coordinates have been provided for a road or public space near the recording device instead of its exact location.
Location: 0
Situated in a suburban environment approximately 15.5km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take off to the north-east; however, not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set to 3km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.
"Silence" or ambient background noises at this location include; cars, motorbikes, light-trucks, garbage trucks, power-tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).
Location: 1
Situated approximately 500m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands), with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1km. Due to their low altitude, aircraft are louder, but audible for a shorter time compared to "Location 0".
Location: 2
As an alternative to "Location 1", this location is situated approximately 950m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west. It offers greater wind protection and further distance from airport and highway noises. Ambient background sounds feature nearby cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.
Aircraft metadata
Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to beleive any significant errors exist in the "aircraft_meta" files, however future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.
Class/subclass ontology (minutes of recordings)
0. no aircraft (211)
    0: no aircraft (211)
1. aircraft (533)
    1: piston-propeller aeroplane (30)
    2: turbine-propeller aeroplane (90)
    3: turbine-fan aeroplane (409)
    4: rotorcraft (4)
The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.
Data splits
Audio recordings have been split into training (81%) and test (19%) sets. The training set has further been split into 5 folds, giving researchers a common split to perform 5-fold cross-validation to ensure reproducibility and comparable results. Data leakage into the test set has been avoided by ensuring recordings are disjoint from the training set by time and location - meaning samples in the test set for a particular location were recorded after any samples included in the training set for that particular location.
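A quick way to confirm the grouping property described above is to check that no recording session is split across folds or shared with the test set. A minimal Python/pandas sketch, assuming the "train-test", "fold" and "session" columns documented under "Columns/Labels" below:

import pandas as pd

meta = pd.read_csv("sample_meta.csv")
train = meta[meta["train-test"] == "train"]

# Every recording session should map to exactly one fold;
# a session spanning several folds would leak data between them.
folds_per_session = train.groupby("session")["fold"].nunique()
assert (folds_per_session == 1).all(), "a session is split across folds"

# Sessions must also not be shared between the training and test sets.
shared = set(train["session"]) & set(meta.loc[meta["train-test"] == "test", "session"])
assert not shared, f"sessions shared with the test set: {shared}"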
Labelled data
The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording, its meta information, annotations and airframe features.
Alternatively, these labels can be derived directly from the filename of the sample (see below). The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see "Columns/Labels" below for all features).
File naming convention
Audio samples are in WAV format, with some metadata stored in the filename.
Basic Convention
"Aircraft ID + Date + Time + Location ID + Microphone ID"
"XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X"
Sample with aircraft
{hex_id} _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
7C7CD0_2023-05-09_12-42-55_2_1.wav
Sample without aircraft
"Silence" files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for "silence" samples are contained in the audio filename, and again in the accompanying "sample_meta.csv"
000000 _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
000000_2023-05-09_12-30-55_2_1.wav
Columns/Labels
(found in sample_meta.csv, aircraft_meta.csv/json files)
train-test: Train-test split (train, test)
fold: Digit from 1 to 5 splitting the training data 5 ways (else test)
filename: The filename of the audio recording
date: Date of the recording
time: Time of the recording
location: ID for the location of the recording
mic: ID of the microphone used
class: Top-level label for the recording (e.g. 0 = No aircraft, 1 = Aircraft audible)
subclass: Subclass label for the recording (e.g. 0 = No aircraft, 3 = Turbine-fan aeroplane)
altitude: Approximate altitude of the aircraft (in feet) at the start of the recording
hex_id: Unique ICAO 24-bit address for the aircraft recorded
session: ID of the recording session (each session is distinct in time, location and microphone)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AeroSonic YPAD-0523: Labelled audio dataset for acoustic detection and classification of aircraft
Version 0.1 (June 2023)
Publication
If using this data in an academic work, please reference the DOI and version.
Description
AeroSonic:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of aircraft noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport’s (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead. Each recording is then human verified, and trimmed to the best (subjective) 20 seconds of audio in which the target aircraft is audible.
A total of 1,890 audio clips are balanced across two top-level classes, “Aircraft” (3.57 hours: 642 20-second recordings) and “Silence” (3.37 hours: 1,248 five- and ten-second recordings). The aircraft class is then further broken down into four unbalanced subclasses which broadly describe an aircraft's structure and propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.
For convenience, the dataset has been split into training (6.28 hours) and testing (0.66 hours) subsets, with the training set further split into 10 folds for cross-validation. Care has been taken to ensure the class distribution for each subset and fold does not significantly deviate from the overall distribution.
Researchers may find applications for this dataset in a number of fields, particularly aircraft noise isolation and monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.
Audio data
ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically capture and label audio recordings. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device. The file is labelled with a unique ICAO identifier code for the aircraft, as well as its last recorded altitude, date and time. The recording is then human verified and trimmed to 20 seconds - with the aircraft audible for the duration of the clip.
A balanced collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location specific environmental noises from aircraft noises. 10-second background noise, or “silence” recordings are triggered only when there are no aircraft broadcasting that they are within a specified distance of the recording device. These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,180 10-second clips, and 68 5-second clips of silence/ambient background noise.
Location information
Location: 0
Situated in a suburban environment approximately 15.5km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take off to the north-east; however, not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set to 3km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.
"Silence" or ambient background noises at this location include; cars, motorbikes, light-trucks, garbage trucks, power-tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).
Location: 1
Situated approximately 500m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands), with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1km. Due to their low altitude, aircraft are louder, but audible for a shorter time compared to "Location 0".
Location: 2
As an alternative to "Location 1", this location is situated approximately 950m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west. It offers greater wind protection and further distance from airport and highway noises. Ambient background sounds feature nearby cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.
Aircraft metadata
Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to beleive any significant errors exist in the "aircraft_meta" files, however future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.
Class/subclass ontology (minutes of recordings)
0. no aircraft (202)
    0: no aircraft (202)
1. aircraft (214)
    1: piston-propeller aeroplane (12)
    2: turbine-propeller aeroplane (37)
    3: turbine-fan aeroplane (163)
    4: rotorcraft (1.6)
The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.
Data splits
Audio recordings have been split into training (90.5%) and test (9.5%) sets. The training set has further been split into 10 folds, giving researchers a common split to perform 10-fold cross-validation, ensuring reproducibility and comparable results. Data leakage into the test set has been avoided by ensuring recordings are disjoint from the training set by time and location - meaning samples in the test set for a particular location were recorded after any samples included in the training set for that particular location.
Labelled data
The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording and all the labels and features associated with that recording and aircraft.
Alternatively, these labels can be derived directly from the filename of the sample (see below), plus a JSON file which accompanies each aircraft sample. The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see below for all 14 airframe features).
File naming convention
Audio samples are in WAV format, and metadata for aircraft recordings are stored in JSON files. Both files share the same name, only differing by their file extension.
Basic Convention
“Aircraft ID + Date + Time + Location ID + Microphone ID”
“XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X”
Sample with aircraft
{hex_id} _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
7C7CD0_2023-05-09_12-42-55_2_1.wav
7C7CD0_2023-05-09_12-42-55_2_1.json
Sample without aircraft
“Silence” files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for “silence” samples is contained in the audio filename, and again in the accompanying “sample_meta.csv”.
000000 _ {date} _ {time} _ {location_id} _ {microphone_id} . {file_ext}
000000_2023-05-09_12-30-55_2_1.wav
Columns/Labels
(found in sample_meta.csv, aircraft_meta.csv/json and aircraft recording JSON files)
train-test: Train-test split (train, test)
fold: Digit from 0 to 9 splitting the training subset 10 ways (else test)
filename: The filename of the audio recording
date: Date of the recording
time: Time of the recording
duration: Length of the recording (in seconds)
location_id: ID for the location of the recording
microphone_id: ID of the microphone used
hex_id: Unique ICAO 24-bit address for the aircraft recorded
altitude: Approximate altitude of the aircraft (in feet) at the start of the recording
class: Top-level label for the recording (e.g. 0 = No aircraft, 1 = Aircraft audible)
subclass: Subclass label for the recording (e.g. 0 = No aircraft, 3 = Turbine-fan aeroplane)
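The minutes-per-subclass figures in the ontology section above can be reproduced from "sample_meta.csv" with the "duration" and "subclass" columns just listed. A minimal Python/pandas sketch:

import pandas as pd

meta = pd.read_csv("sample_meta.csv")

# Minutes of audio per subclass, as reported in the class/subclass ontology above.
minutes = meta.groupby("subclass")["duration"].sum() / 60.0
print(minutes.round(1))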
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Two newly compiled datasets for training neural networks to automatically identify insect species, used to compare adaptive, waveform-based frontends with conventional mel-spectrogram frontends for audio feature extraction. This work was published in PLOS Computational Biology and the machine learning implementations were published on GitHub.
These datasets expand on the previously published InsectSet32 by including recently published collections of insect recordings by citizen scientists from around the world. Recordings from BioAcoustica, xeno-canto and iNaturalist, as well as private collections by Baudewijn Odé were downloaded and manually inspected. Files with strong noise interference or intense filtering, as well as files containing sounds of multiple species were removed to compile these datasets. The files were standardised to 44.1 kHz mono WAV files ranging in length from less than one second to several minutes. Files containing long periods without insect sounds were edited into multiple smaller files with silent periods no longer than 5 seconds. These files are marked as edits in the annotation file and should be assigned together into train/validation/test sets to prevent data leakage. The annotation files contain information for each recording, including the file name, species name and identifier, as well as the data subset they were included in for training the neural network (training, test, validation).
InsectSet47 expands on InsectSet32 with recordings from xeno-canto and contains 1006 original recordings from 47 species, with at least ten files per species. The total length of InsectSet47 is 22 hours. InsectSet66 further expands on InsectSet47 by adding research-grade audio observations from iNaturalist, with a total of 1554 recordings from 66 species, a total length of over 24 hours and a minimum of ten files per species.
The datasets were split into the training, validation and test sets while ensuring a roughly equal distribution of audio files and audio material for every species in all three subsets. This resulted in a 60/20/20 split (train/validation/test) by file number and a 64/19.5/16.5 split by file length.
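If a custom split is needed instead of the provided one, segments edited from the same original recording should stay in the same subset, as noted above. A minimal Python sketch using scikit-learn's GroupShuffleSplit; the annotation filename and the "parent_recording" and "file_name" column names are placeholders, since the exact annotation schema is not reproduced here:

import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

ann = pd.read_csv("annotations.csv")  # placeholder name for the annotation file

# Keep all segments cut from the same original recording in one subset
# ('parent_recording' stands in for however the edits are marked in the annotations).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(ann, groups=ann["parent_recording"]))
train_files = ann.iloc[train_idx]["file_name"].tolist()
test_files = ann.iloc[test_idx]["file_name"].tolist()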
NOTE: Due to the inclusion of InsectSet47 and InsectSet66 in a coding challenge, the files included in the test datasets will, for now, be held back until the end of September 2023. The full datasets, including the test sets, will then be published as a new version of these datasets under the same identifier.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Post-marketing reports of suspected adverse drug reactions are important for establishing the safety profile of a medicinal product. However, a high influx of reports poses a challenge for regulatory authorities as a delay in identification of previously unknown adverse drug reactions can potentially be harmful to patients. In this study, we use natural language processing (NLP) to predict whether a report is of serious nature based solely on the free-text fields and adverse event terms in the report, potentially allowing reports mislabelled at time of reporting to be detected and prioritized for assessment. We consider four different NLP models at various levels of complexity, bootstrap their train-validation data split to eliminate random effects in the performance estimates and conduct prospective testing to avoid the risk of data leakage. Using a Swedish BERT based language model, continued language pre-training and final classification training, we achieve close to human-level performance in this task. Model architectures based on less complex technical foundation such as bag-of-words approaches and LSTM neural networks trained with random initiation of weights appear to perform less well, likely due to the lack of robustness that a base of general language training provides.
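A minimal sketch of the final classification stage (Python, Hugging Face Transformers): fine-tuning a pre-trained Swedish BERT to predict report seriousness from free text. The checkpoint name and example input are illustrative assumptions, and the study's bootstrapped splits, prospective testing and continued language pre-training are not reproduced here.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# A publicly available Swedish BERT checkpoint (assumed; the study's exact model is not specified here).
model_name = "KB/bert-base-swedish-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

reports = ["free-text description of the suspected adverse reaction"]  # illustrative input
labels = torch.tensor([1])                                              # 1 = serious, 0 = non-serious

batch = tokenizer(reports, padding=True, truncation=True, max_length=512, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()   # an optimiser step would follow in a real training loop
print(outputs.logits.softmax(dim=-1))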