100+ datasets found

Air Quality Index Data
kaggle.com
zip
Updated Sep 25, 2022
Cite
Shrijayan (2022). Air Quality Index Data [Dataset]. https://www.kaggle.com/datasets/cpluzshrijayan/air-quality-prediction-harbor
Explore at:
Available download formats: zip (78585 bytes)
Dataset updated
Sep 25, 2022
Authors
Shrijayan
Description
This dataset is entirely imaginary and NOT real data; it contains only values that we created.

Content

This dataset deals with pollution in the harbors of Chennai, Kolkata, and Visakhapatnam. Such pollution has been recorded, but it is a pain to collect all the data and arrange it in a format that interests data scientists. Hence I gathered four major pollutants and placed them neatly in a CSV file.

Content

There is a total of 29 fields. The four pollutants (NO2, O3, SO2, and O3) each have 5 specific columns. Observations totaled. This kernel provides a good introduction to this dataset!

For observations on specific columns visit the Column Metadata on the Data tab.

Inspiration

I did a related project and decided to open-source our dataset so that data scientists don't need to re-scrape historical pollution data from scratch.
Air Quality Monitoring Data: PM10, PM2.5, O3, CO, Temp, Rel. Humidity - Jan...
data.mendeley.com
Updated Nov 16, 2023
Cite
daniel perez (2023). Air Quality Monitoring Data: PM10, PM2.5, O3, CO, Temp, Rel. Humidity - Jan 2022 to May 2023, Universidad Iberoamericana, Mexico City [Dataset]. http://doi.org/10.17632/gjvrn32zbm.1
Explore at:
Unique identifier
https://doi.org/10.17632/gjvrn32zbm.1
Dataset updated
Nov 16, 2023
Authors
daniel perez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Mexico City
Description
This comprehensive dataset, provided in CSV format, captures detailed air quality monitoring data recorded hourly from January 2022 to May 2023 at Universidad Iberoamericana in Mexico City (Geo-location: 19.372292, -99.263679). It includes a wide range of environmental parameters such as particulate matter (PM10 and PM2.5), ozone (O3), carbon monoxide (CO), temperature, and relative humidity.

The CSV file contains multiple columns representing each parameter, along with corresponding timestamps for each hour of recording. This extensive dataset offers valuable insights into air quality trends and variations over a significant period, making it a rich resource for environmental research and analysis.
Data from "Air pollution control strategies directly limiting national...
catalog.data.gov
gimi9.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Data from "Air pollution control strategies directly limiting national health damages in the US", by Ou et al. [Dataset]. https://catalog.data.gov/dataset/data-from-air-pollution-control-strategies-directly-limiting-national-health-damages-in-th
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Area covered
United States
Description
This file describes the dataset used in Ou et al., "Air pollution control strategies directly limiting national health damages in the US." This work used the Global Change Assessment Model (GCAM) with state-level representation of the U.S. energy system (GCAM-USA). GCAM and GCAM-USA are developed and released by the University of Maryland/Pacific Northwest National Laboratory Joint Global Change Research Center (JGCRI). For further details, see the GCAM documentation: jgcri.github.io/gcam-doc. The model source code is available at github.com/JGCRI/gcam-core. A modified version of GCAMv4.3 was used for this analysis. Source code and input data specific to this paper are available upon request. This dataset contains Excel spreadsheets and an R script that link to comma-separated values (CSV) files that were extracted from the model output. The spreadsheets and scripts show the data and reproduce each of the figures in the paper.

This dataset is associated with the following publication: Ou, Y., J. West, S. Smith, C. Nolte, and D. Loughlin. Air pollution control strategies directly limiting national health damages in the US. Nature Communications. Nature Publishing Group, London, UK, 11: 957, (2020).
Sofia air quality dataset
kaggle.com
zip
Updated Sep 14, 2019
Cite
Hristo Mavrodiev (2019). Sofia air quality dataset [Dataset]. https://www.kaggle.com/datasets/hmavrodiev/sofia-air-quality-dataset
Explore at:
Available download formats: zip (3015567883 bytes)
Dataset updated
Sep 14, 2019
Authors
Hristo Mavrodiev
Description
Content

Archive of luftdaten.info / api.dusti.xyz

about

This archive contains CSV dumps of all outdoor sensors that seem to have delivered valid data to the API. The data is organised in directories for each day, containing csv files for each sensor. There may be multiple sensors in one measurement station (e.g. PM and temperature/humidity), their data will be in different files.

License

The archive.luftdaten.info data is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/

Problems

Please report problems in our opendata-stuttgart issuetracker: https://github.com/opendata-stuttgart/meta/issues

Future improvements

description of position of sensor on location

image of sensor at location

Inspiration

Use the past data to generate air quality statistics and air quality prediction models
Data from: Indoor air quality in California homes with code-required...
datadryad.org
data.niaid.nih.gov
+1more
zip
Updated Apr 22, 2020
Cite
Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer (2020). Indoor air quality in California homes with code-required mechanical ventilation [Dataset]. http://doi.org/10.7941/D1ZS7X
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.7941/D1ZS7X
Dataset updated
Apr 22, 2020
Dataset provided by
Dryad
Authors
Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer
Time period covered
Feb 7, 2020
Area covered
California
Description
Time Series Data Handling and Quality Assurance Review

Most instruments had internal logging and special software to download data from the field instruments as binary files or ascii/csv files. The instruments for which files downloaded as binary provide software to view the data or export the data to csv files.

One-minute resolution time-series data files were created for each house using an R script that pulled data from the csv files, aligned data by time, executed unit conversions, and translated from instruments with longer or different data intervals (e.g. 30 min formaldehyde data and 1.5 min for anemometer data). Visual review was conducted on the compiled files (and primary csv or binary files were consulted as needed) to check for translation or writing errors (especially from terminal emulator), indications of instrument malfunction, mislabeled units or unit conversion errors, mislabeled location, and time stamp errors.

The draft final set of time-series data...
Britain Breathing 2016-2019 Air Quality and Meteorological Regional...
data.niaid.nih.gov
Updated Feb 16, 2022
Cite
Gledson, Ann; Lowe, Douglas; Reani, Manuele; Jay, Caroline; Topping, David (2022). Britain Breathing 2016-2019 Air Quality and Meteorological Regional Estimates Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4439642
Explore at:
Dataset updated
Feb 16, 2022
Dataset provided by
Chinese University of Hong Kong
University of Manchester
Authors
Gledson, Ann; Lowe, Douglas; Reani, Manuele; Jay, Caroline; Topping, David
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United Kingdom
Description
This data set is a collection of estimated daily mean and maximum values for a range of air quality and meteorological measurements and model forecasts for the UK and crown dependencies postcode districts (e.g. 'AB') for the years 2016-2019, inclusive.

The paper describing this dataset is available here: https://www.nature.com/articles/s41597-022-01135-6

The data uses a 'concentric regions' method to estimate the measurement for all regions, as follows. If measurements exist within the region, the mean of those measurements is used; if not, a ring of neighbouring postcode regions is selected, and the mean of their measurement values is used. If no measurement sites/data are found in the first ring, the process continues, taking the next ring of postcode district regions, working outwards until one or more sensors are found in a ring. As well as the measurement estimations, the number of rings required to find site data and make the estimations is also published. As a result, please note that estimations with higher ring counts ('rings') are likely to be calculated from more distant sensors. This distance depends upon the size of the postcode regions surrounding the location being estimated. Please use the ring count ('rings') to limit/filter estimations based on your required level of confidence.
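The ring-by-ring search described above can be sketched in a few lines. This is a minimal illustration and not the authors' published code: the adjacency mapping `neighbours`, the `site_values` mapping, and the function name `estimate_region` are hypothetical, and counting an in-region hit as zero rings is an assumption.

```python
from statistics import mean


def estimate_region(region, neighbours, site_values):
    """Return (estimate, rings) for one region, searching outwards ring by ring.

    neighbours:  dict mapping a region id to adjacent region ids (hypothetical input).
    site_values: dict mapping a region id to the measurements taken inside it.
    rings == 0 means the region's own sites were used (an assumption about
    how the published ring count is defined).
    """
    visited = {region}
    ring = [region]
    rings = 0
    while ring:
        # Pool every measurement found in the current ring.
        values = [v for r in ring for v in site_values.get(r, [])]
        if values:
            return mean(values), rings
        # Step outwards: unvisited neighbours of the current ring form the next ring.
        next_ring = sorted(
            {n for r in ring for n in neighbours.get(r, ()) if n not in visited}
        )
        visited.update(next_ring)
        ring = next_ring
        rings += 1
    return None, None  # no sites reachable from this region


# Toy chain of three districts with sensors only in "C".
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
site_values = {"C": [10.0, 20.0]}
print(estimate_region("A", neighbours, site_values))
```

Filtering on the returned ring count mirrors the advice above: a caller who only trusts nearby sensors can discard estimates whose `rings` value exceeds a chosen threshold.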

The meteorological, pollen and air quality measurement data used to make the regional estimations can be found at this Zenodo archive. The data there contains Temperature, Relative Humidity, and Pressure data, downloaded from the Met Office MIDAS archives via the MEDMI server (https://www.data-mashup.org.uk/). Also downloaded from the MEDMI server are daily pollen measurements for the UK. The archive also contains PM10, PM2.5, NO2, NOx (as NO2), O3, and SO2 measurements from the DEFRA AURN network, and model forecasts of the same made using the EMEP model.

The code used to make the estimations is available at this Zenodo archive.

The postcode data in postcode_district_data.csv are collated from several sources:

https://www.doogal.co.uk/UKPostcodes.php (population figures for the UK (UK Census 2011))

https://www.freemaptools.com/download-uk-postcode-outcode-boundaries.htm (postcode boundary polygons for UK and crown dependencies)

https://www.gov.gg/population (Guernsey (GY) population data for end June 2020)

https://www.gov.je/Government/JerseyInFigures/Population/Pages/Population.aspx (Jersey (JE) population data for end 2019)

https://www.gov.im/media/1369690/isle-of-man-in-numbers-july-2020.pdf (Isle of Man (IM) population data for April 2016)

The data-set is presented in CSV format, as six files:

postcode_district_data.csv: location metadata (region_id, geometry, description, population, country)

regional_site_counts.csv: a table showing the number of sites for each measurement (columns), for each region_id (rows). region_id's match those in the postcode_district_data.csv file.

turing_regional_estimates_aq_daily_met_pollen_pollution_imputed_data.csv: uses imputed site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)

turing_regional_estimates_aq_daily_met_pollen_pollution_original_data.csv: uses original site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)

turing_regional_estimates_aq_loc_type_daily_imputed_data.csv: uses imputed site data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)

turing_regional_estimates_aq_loc_type_daily_original_data.csv: uses original data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)

Air quality site types:

Industrial: comprises 'urban industrial' (9 sites) and 'suburban industrial' (2 sites)

'Rural background' (14 sites)

'Urban background' (48 sites)

'Urban traffic' (47 sites)
Data from: Air Quality Data
kaggle.com
zip
Updated Oct 16, 2025
Cite
Python Developer (2025). Air Quality Data [Dataset]. https://www.kaggle.com/datasets/programmer3/air-quality-data
Explore at:
Available download formats: zip (76469629 bytes)
Dataset updated
Oct 16, 2025
Authors
Python Developer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset contains Air Quality Index (AQI) at hourly and daily level of various stations across multiple cities.

station_day.csv → 108035 rows

city_day.csv → 29531 rows

city_hour.csv → 707875 rows

station_hour.csv → 2589083 rows

stations.csv → 230 rows

Total rows across all files: 3434754

Station Dataset Features

StationId → Unique identifier for each air quality monitoring station. Useful for referencing and merging data.

StationName → Name of the monitoring station. Helps in identifying stations in reports and visualizations.

City → The city where the station is located. Can be used to aggregate data at city level.

State → The state where the station is located. Useful for regional analysis.

Status → Indicates whether the station is active, inactive, or under maintenance. Helps filter reliable data.

City (Air Quality) Dataset Features

City → Name of the city where the measurement was taken.

Datetime → Timestamp of the measurement. Could be hourly or daily. Important for time series analysis.

PM2.5 → Fine particulate matter ≤ 2.5 μm in diameter. Indicates air pollution that penetrates deep into the lungs.

PM10 → Particulate matter ≤ 10 μm. Measures larger airborne particles.

NO → Nitric oxide, a primary pollutant mainly from vehicles and industry.

NO2 → Nitrogen dioxide, harmful to respiratory health. Often a marker of traffic pollution.

NOx → Nitrogen oxides (NO + NO2). Total nitrogen oxide pollution.

NH3 → Ammonia, released from agriculture and waste. Can contribute to secondary particulate matter.

CO → Carbon monoxide, a toxic gas from incomplete combustion.

SO2 → Sulfur dioxide, mainly from burning fossil fuels. Causes acid rain and respiratory issues.

O3 → Ozone at ground level, harmful to lungs. A secondary pollutant formed by reactions of NOx and VOCs.

Benzene, Toluene, Xylene → Volatile organic compounds (VOCs) from industry and vehicles. Can be toxic and form ozone.

AQI → Air Quality Index, a standardized number summarizing pollution level across multiple pollutants.

AQI_Bucket → Category of AQI (Good, Moderate, Poor, etc.), making interpretation easier.
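Since StationId is described as the key for merging and Status as a way to filter reliable data, a join between stations.csv and a station-level readings file might look like the sketch below. It uses only the standard library and tiny inline samples. The sample rows, the "Datetime" column in the readings file, and the exact spelling of the "Active" status value are assumptions for illustration, not values taken from the dataset.

```python
import csv
import io

# Hypothetical two-row samples standing in for stations.csv and station_day.csv.
STATIONS_CSV = """StationId,StationName,City,State,Status
ST001,Example Station A,Example City,Example State,Active
ST002,Example Station B,Example City,Example State,Inactive
"""

READINGS_CSV = """StationId,Datetime,PM2.5,AQI
ST001,2020-01-01,81.4,184
ST002,2020-01-01,78.2,177
"""


def active_station_rows(stations_text, readings_text):
    """Attach City/State to each reading and keep only stations marked Active."""
    stations = {
        row["StationId"]: row
        for row in csv.DictReader(io.StringIO(stations_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(readings_text)):
        meta = stations.get(row["StationId"])
        if meta is not None and meta["Status"] == "Active":
            merged.append({**row, "City": meta["City"], "State": meta["State"]})
    return merged


rows = active_station_rows(STATIONS_CSV, READINGS_CSV)
print(rows)
```

For the full files (millions of rows in station_hour.csv), a dataframe library would probably be a better fit; the dictionary lookup here is only meant to show the join key and the filter.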
Data from: Patterns discovery dataset for particulate matter (pm2.5)...
search.dataone.org
datadryad.org
Updated Dec 12, 2024
Cite
Uday Kiran Rage; Vanitha Kattumuri; Arjun Chakravarthi Pogaku (2024). Patterns discovery dataset for particulate matter (pm2.5) pollution trends in Japan [Dataset]. http://doi.org/10.5061/dryad.hhmgqnkrr
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.hhmgqnkrr
Dataset updated
Dec 12, 2024
Dataset provided by
Dryad Digital Repository
Authors
Uday Kiran Rage; Vanitha Kattumuri; Arjun Chakravarthi Pogaku
Description
Air pollution presents a significant environmental risk, impacting human health, accelerating climate change, and disrupting ecosystems. The main aim of air pollution research is to pinpoint the most harmful pollutants identified in previous studies and to map regions exposed to high pollution levels. This study introduces a large-scale, high-quality dataset to advance the analysis of PM2.5 pollution and reveal hidden patterns through pattern mining techniques. The dataset covers five years of hourly PM2.5 measurements collected from approximately 1,900 sensors across Japan, sourced from the Ministry of the Environment's Soramame platform. This platform offers hourly pollutant records, downloadable as monthly raw data files. The unorganised raw data files are systematically organised and stored in database tables using an Entity-Relationship (ER) schema. The primary objective of this dataset is to aid in developing and validating pattern mining models, enabling the accurate detection of...

The air pollution data was collected from Japan's Soramame platform, which provides hourly updates on pollutant levels nationwide. The data files were collected from January 1, 2018, 01:00:00, to April 25, 2023, 22:00:00, covering records from approximately 1,900 sensors stationed in various locations across Japan. These files are initially unorganised in CSV format and require systematic organisation by year, month, time, sensor, and pollutant type. To maintain data integrity, we structured the dataset using an Entity-Relationship (ER) schema within a PostgreSQL database, comprising two main tables: the Sensor table (storing sensor name, ID, address, and location) and the Observations table (recording pollutant types and their values). A detailed step-by-step process is provided in the README, and this organization created a consolidated CSV file containing PM2.5 levels, timestamps, and sensor details.

AEROS PM2.5 Dataset

Overview

The AEROS PM2.5 Dataset provides a comprehensive collection of hourly PM2.5 measurements recorded over a period of five years from sensors located across Japan. This dataset is a valuable resource for studying air quality trends, pollution patterns, and environmental health impacts.

Dataset Description

File Information

File Name: FINAL_DATASET.csv

Content: Hourly PM2.5 measurements collected from sensors located in Japan over five years.

Structure

The dataset includes the following columns:

Timestamps: The date and time when the measurement was recorded.

Sensor Location IDs: Unique identifiers for the sensor locations.

PM2.5 Values (µg/m³): The recorded PM2.5 concentration at a specific timestamp and location.

Units

PM2.5 Values: Measured in micrograms per cubic meter (µg/m³).

Notes on Data

Empty Cells: Represent instances where no PM2.5 data was recorded by the s...

Global Air Quality Data(15 Days Hourly, 50 Cities)

kaggle.com

zip

Updated Nov 19, 2025

Cite

Smeet Raichura (2025). Global Air Quality Data(15 Days Hourly, 50 Cities) [Dataset]. https://www.kaggle.com/datasets/smeet888/global-air-quality-data15-days-hourly-50-cities

Explore at:

Available download formats: zip (598546 bytes)

Dataset updated

Nov 19, 2025

Authors

Smeet Raichura

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

📘 Overview

This dataset provides hourly air-quality measurements for 50 major global cities over a continuous 15-day period, including pollutant concentrations, meteorological conditions, geographical metadata, and an engineered AQI index.

All values are synthetically generated using historically consistent pollutant patterns and statistical ranges, allowing researchers and ML practitioners to work with realistic air-quality trends without licensing restrictions or data-collection barriers.

This dataset is ideal for time-series modeling, forecasting, environmental analytics, and machine-learning experimentation.

🧭 Cities Included

Covers all major regions:

North America — New York, Los Angeles, Toronto

Europe — London, Paris, Berlin, Zurich

Asia — Delhi, Tokyo, Seoul, Beijing, Singapore

Middle East — Dubai, Riyadh, Doha

Africa — Lagos, Cairo, Nairobi

Oceania — Sydney, Melbourne, Auckland

South America — São Paulo, Buenos Aires

🧱 Dataset Structure

Each hourly record includes:

Air Pollutants

PM2.5 (µg/m³)

PM10 (µg/m³)

NO₂ (ppb)

SO₂ (ppb)

O₃ (ppb)

CO (ppm)

Weather Features

Temperature (°C)

Humidity (%)

Wind Speed (m/s)

Location Metadata

City

Country

Latitude

Longitude

Other

Timestamp (ISO-8601)

AQI (Computed index)

🧹 Data Quality & Formatting

No missing values — 100% complete

Numeric values rounded to 3 decimals

Clean column names (snake_case)

Consistent hourly frequency

Fully ML-ready

📊 Example Use Cases

✔ AQI forecasting (LSTM, GRU, Transformers)

✔ Multivariate time-series modeling

✔ Clustering cities by pollution patterns

✔ Environmental trend visualization

✔ Weather–pollution correlation studies

✔ Anomaly detection (peak pollution events)

Column	Description	Unit	Type
timestamp	Hourly timestamp (UTC)	—	datetime
city	City name	—	string
country	Country name	—	string
latitude	City latitude	°	float
longitude	City longitude	°	float
pm25	Fine particulate matter	µg/m³	float
pm10	Coarse particulate matter	µg/m³	float
no2	Nitrogen dioxide	ppb	float
so2	Sulfur dioxide	ppb	float
o3	Ozone	ppb	float
co	Carbon monoxide	ppm	float
temperature	Ambient temperature	°C	float
humidity	Relative humidity	%	float
wind_speed	Wind speed	m/s	float
aqi	Derived Air Quality Index	—	int

🧪 Data Generation Method (Provenance)

This dataset is synthetically generated using realistic pollutant behavior patterns based on historical studies and open-source environmental datasets.

Modeling steps included:

City-specific pollutant baseline ranges

Randomized variation using Gaussian noise

Temporal patterns using sinusoidal diurnal cycles (morning & evening peaks)

Weather-pollution correlation rules (e.g., low wind → higher PM)

AQI computed using standard US-EPA breakpoints

All numeric values standardized to 3-decimal precision

This ensures that although synthetic, the dataset follows realistic environmental dynamics.
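The modeling steps listed above can be illustrated with a short sketch for a single pollutant. This is not the dataset author's generator: the function names, the 30% diurnal amplitude, the 1.2 low-wind multiplier, and the 2 m/s wind threshold are all assumptions chosen for illustration. The breakpoint table is the US-EPA PM2.5 table as it stood before the 2024 revision, reproduced from memory, so it should be checked against EPA's current technical guidance before any real use.

```python
import math
import random

# PM2.5 breakpoints (µg/m³) mapped to AQI ranges: (c_lo, c_hi, i_lo, i_hi).
# Pre-2024 US-EPA values; verify against current EPA guidance.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration):
    """Linear interpolation between breakpoints; values above the table cap at 500."""
    # Truncate to one decimal place (small epsilon guards against float error).
    c = math.floor(concentration * 10 + 1e-9) / 10
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= c <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    return 500


def synth_pm25(hour, baseline, wind_speed, rng, noise_sd=2.0):
    """One synthetic hourly PM2.5 value (all constants are illustrative)."""
    # Two peaks per day, at 08:00 and 20:00, via a 12-hour cosine cycle.
    diurnal = 1.0 + 0.3 * math.cos(4 * math.pi * (hour - 8) / 24)
    # Simple weather rule: low wind lets particulates accumulate.
    wind_factor = 1.2 if wind_speed < 2.0 else 1.0
    noise = rng.gauss(0.0, noise_sd)
    return max(0.0, baseline * diurnal * wind_factor + noise)


rng = random.Random(0)  # fixed seed so the series is reproducible
series = [round(synth_pm25(h, 40.0, 3.0, rng), 3) for h in range(24)]
aqi = [pm25_to_aqi(v) for v in series]
print(list(zip(series, aqi))[:3])
```

The real US-EPA index takes the maximum sub-index across pollutants; the sketch computes only the PM2.5 sub-index to keep the interpolation step visible.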

📁 File Information

global_air_quality_50_cities.csv

Rows: 18,000+

Columns: 16

Format: UTF-8 CSV

Data for: Leveraging scientific community knowledge for air quality model...
catalog.data.gov
datasets.ai
Updated Mar 18, 2024
+ more versions
Cite
U.S. EPA Office of Research and Development (ORD) (2024). Data for: Leveraging scientific community knowledge for air quality model chemistry parameterizations [Dataset]. https://catalog.data.gov/dataset/data-for-leveraging-scientific-community-knowledge-for-air-quality-model-chemistry-paramet
Explore at:
Dataset updated
Mar 18, 2024
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Files contain values from Figures 1, 2, and 3 of the article by Pye et al., "Leveraging scientific community knowledge for air quality model chemistry parameterizations," scheduled for publication in EM in January 2024. Figures 2 and 3 are available in csv and excel spreadsheet format. Figure 1 is only available in spreadsheet format. Figure 1 shows gas and aerosol-phase chemistry representations in CMAQ since 2010. Figure 2 shows ozone and SOA formation potential (in g/g) for CRACMM species. Figure 3 shows the size (number of species and reactions) for various chemical mechanisms. This dataset is associated with the following publication: Pye, H., R. Schwantes, K. Barsanti, V.F. McNeill, and G. Wolfe. Leveraging scientific community knowledge for air quality model chemistry parameterizations. EM Magazine. Air and Waste Management Association, Pittsburgh, PA, USA, 24-31, (2024).
National Park Air Quality Index Dataset
osti.gov
Updated Jun 3, 2024
Cite
Byler, Eleanor B; Chojnicki, Kirsten N; Svinth, Christian N (2024). National Park Air Quality Index Dataset [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2468701
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
DOE
Pacific Northwest National Laboratory 2
Authors
Byler, Eleanor B; Chojnicki, Kirsten N; Svinth, Christian N
Description
The National Park Air Quality Index dataset (NPS-AQI) consists of webcam images taken from the National Park Service's publicly available air quality web cameras and associated measurements for air pollutants, AQI, and meteorological data obtained via the publicly available NPS Gaseous Pollutant Monitoring Program. The full dataset is a collection of 146,822 images paired with air quality measurements. The specific measurements reported are: ozone ppm, 8-hour running average ozone ppm, SO2 ppm, AQI (derived from ozone), temperature, and humidity. The images are 1500x1000 pixel PNG files arranged into folders by NPS site and named according to the time and date the image was taken. There are three CSV files (representing "training", "validation", and "testing" image splits) containing image names and associated NPS site names, air pollutant measurements, and meteorological data.
Ten-year data tables by province, industry and substance – releases
open.canada.ca
data.urbandatacentre.ca
+1more
csv, html
Updated Jul 15, 2025
+ more versions
Cite
Environment and Climate Change Canada (2025). Ten-year data tables by province, industry and substance – releases [Dataset]. https://open.canada.ca/data/en/dataset/ea0dc8ae-d93c-4e24-9f61-946f1736a26f
Explore at:
Available download formats: html, csv
Dataset updated
Jul 15, 2025
Dataset provided by
Environment and Climate Change Canada
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Jan 1, 2014 - Dec 31, 2023
Description
The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains annual total releases for the past ten years by media (air, water or land), broken down by province, industry or substance. Files are in .CSV format. The results can be further broken down using the pre-defined search available at the bottom of the NPRI Data Search webpage. The results returned by the NPRI search engine may differ from the numbers contained in the downloadable files. The online search engine’s results will display releases, disposals and transfers reported by facilities, but do not distinguish between media type (i.e. air, water, land). It also displays facilities reporting only under Ontario Regulation 127/01 and facilities submitting “did not meet criteria” reports.

Please consult the following resources to enhance your analysis:

- Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
- Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

Supplemental Information

More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

Supporting Projects: National Pollutant Release Inventory (NPRI)
Bulk data files for all years – releases, disposals, transfers and facility...
open.canada.ca
csv, html
Updated Jul 15, 2025
+ more versions
Cite
Environment and Climate Change Canada (2025). Bulk data files for all years – releases, disposals, transfers and facility locations [Dataset]. https://open.canada.ca/data/en/dataset/40e01423-7728-429c-ac9d-2954385ccdfb
Explore at:
Available download formats: csv, html
Dataset updated
Jul 15, 2025
Dataset provided by
Environment and Climate Change Canada (https://www.canada.ca/en/environment-climate-change.html)
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Jan 1, 1993 - Dec 31, 2023
Description
The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains data from 1993 to the latest reporting year. These CSV format datasets are in normalized or ‘list’ format and are optimized for pivot table analyses. Here is a description of each file:

- The RELEASES file contains all substance release quantities.
- The DISPOSALS file contains all on-site and off-site disposal quantities, including tailings and waste rock (TWR).
- The TRANSFERS file contains all quantities transferred for recycling or treatment prior to disposal.
- The COMMENTS file contains all the comments provided by facilities about substances included in their report.
- The GEO LOCATIONS file contains complete geographic information for all facilities that have reported to the NPRI.

Please consult the following resources to enhance your analysis:

- Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
- Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

Supplemental Information

More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

Supporting Projects: National Pollutant Release Inventory (NPRI)
Dissolved Inorganic Carbon and Dissolved Organic Carbon Data for the East...
data.nceas.ucsb.edu
data.ess-dive.lbl.gov
+3more
Updated Aug 19, 2023
+ more versions
Cite
Wenming Dong; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg; Kenneth Williams (2023). Dissolved Inorganic Carbon and Dissolved Organic Carbon Data for the East River Watershed, Colorado (2015-2023) [Dataset]. http://doi.org/10.15485/1660459
Explore at:
Unique identifier
https://doi.org/10.15485/1660459
Dataset updated
Aug 19, 2023
Dataset provided by
ESS-DIVE
Authors
Wenming Dong; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg; Kenneth Williams
Time period covered
Sep 1, 2015 - Jan 5, 2023
Area covered

Description
This data package contains mean values for dissolved organic carbon (DOC) and dissolved inorganic carbon (DIC) for water samples taken from the East River Watershed in Colorado. The East River is part of the Watershed Function Scientific Focus Area (WFSFA) located in the Upper Colorado River Basin, United States. DOC and DIC concentrations in water samples were determined using a TOC-VCPH analyzer (Shimadzu Corporation, Japan). DOC was analyzed as non-purgeable organic carbon (NPOC) by purging HCl-acidified samples with carbon-free air to remove DIC prior to measurement. After the acidified sample has been sparged, it is injected into a combustion tube filled with oxidation catalyst heated to 680 degrees C. The DOC in samples is combusted to CO2 and measured by a non-dispersive infrared (NDIR) detector. The peak area of the analog signal produced by the NDIR detector is proportional to the DOC concentration of the sample. DIC was determined by first acidifying the samples with HCl and then purging with carbon-free air to release CO2 for analysis by the NDIR detector. All files are labeled by location and variable, and the data reported are the mean values of at least three replicate measurements with a relative standard deviation < 3%. All samples were analyzed under a rigorous quality assurance and quality control (QA/QC) process as detailed in the methods.
This data package contains: (1) a zip file (dic_npoc_data_2014-2023.zip) holding a total of 319 files: 318 data files of DIC and NPOC data from across the Lawrence Berkeley National Laboratory (LBNL) Watershed Function Scientific Focus Area (SFA), reported in .csv files per location, plus one locations.csv file with the latitude and longitude of each location; (2) a file-level metadata file (v3_20230808_flmd.csv) that lists each file contained in the dataset with associated metadata; (3) a data dictionary file (v3_20230808_dd.csv) that contains the terms/column headers used throughout the files along with a definition, units, and data type; and (4) PDF and docx files for the determination of Method Detection Limits (MDLs) for DIC and NPOC data. A total of 106 locations contain DIC/NPOC data.
Update on 2020-10-07: Removed times from the timestamps in the data files so that only dates remain. The data values have not changed.
Update on 2021-04-11: Added the Determination of Method Detection Limits (MDLs) for DIC, NPOC and TDN Analyses document, which can be accessed as a PDF or with Microsoft Word.
Update on 2022-06-10: Versioned updates were made to this dataset with these changes: (1) updated dissolved inorganic carbon and dissolved organic carbon data for all locations up to 2021-12-31, (2) removed units from column headers in data files, (3) added a row underneath the headers to contain the units of variables, (4) restructured units to comply with CSV reporting format requirements, (5) added -9999 for empty numerical cells, and (6) added the file-level metadata (flmd.csv) and data dictionary (dd.csv) to comply with the File-Level Metadata Reporting Format.
Update on 2022-09-09: Updated the reporting format specific files (file-level metadata and data dictionary) to correct swapped file names, add details to the metadata descriptions in both files, add a header_row column to enable parsing, and add a version number and date to the file names (v2_20220909_flmd.csv and v2_20220909_dd.csv).
Update on 2023-08-08: Updated both the data files and the reporting format specific files. Newly available anion data were added, up to 2023-01-05. The file-level metadata and data dictionary files were updated to reflect the additional data.
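The layout described above (a header row, a units row directly beneath it, and -9999 for empty numerical cells) can be read with a short routine. The following is a minimal sketch using only the Python standard library; the column names and sample values are hypothetical and are not taken from the actual data files, so the data dictionary should be consulted for the real headers.

```python
import csv
import io

MISSING = -9999.0  # placeholder the description says marks empty numerical cells


def clean(value):
    """Convert a cell to float, mapping the missing-value placeholder to None."""
    try:
        number = float(value)
    except ValueError:
        return value  # non-numeric cell such as a date string
    return None if number == MISSING else number


def read_location_csv(handle):
    """Read one file laid out as: header row, units row, then data rows."""
    reader = csv.reader(handle)
    header = next(reader)
    units = dict(zip(header, next(reader)))  # second row holds the units
    rows = [
        {name: clean(value) for name, value in zip(header, record)}
        for record in reader
    ]
    return units, rows


# Hypothetical sample; real column names and units may differ.
sample = io.StringIO(
    "date,dic_mean,npoc_mean\n"
    "yyyy-mm-dd,mg_per_l,mg_per_l\n"
    "2016-05-01,12.4,-9999\n"
    "2016-05-08,-9999,1.8\n"
)
units, rows = read_location_csv(sample)
```

Mapping the placeholder to None rather than leaving it numeric keeps -9999 from being averaged into later statistics by accident.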
China AQI PM25s Cumulative Hourly Data (2015-09 to 2016-05)
dataverse.harvard.edu
search.dataone.org
Updated Jan 18, 2017
+ more versions
Lex Berman (2017). China AQI PM25s Cumulative Hourly Data (2015-09 to 2016-05) [Dataset]. http://doi.org/10.7910/DVN/3X9NIF
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/3X9NIF
Dataset updated
Jan 18, 2017
Dataset provided by
Harvard Dataverse
Authors
Lex Berman
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
China
Description
The hourly updates of ground monitoring observations in China are collected and stored in .csv files (one for each hour). NOTE: the pm25s data does NOT report locations for the reporting stations in x, y coordinates.
Area source grid air pollution emissions (TWD97)
data.gov.tw
Ministry of Environment, Area source grid air pollution emissions (TWD97) [Dataset]. https://data.gov.tw/en/datasets/152688
Explore at:
Dataset authored and provided by
Ministry of Environment
License
https://data.gov.tw/license
Description
Gridded emission data for non-point source air pollutants. The base year of the estimates is year 108 of the Republic of China calendar (2019), and the grid covers Taiwan's main island and outlying islands. Grid coordinates refer to the lower-left grid point, in TWD97-TM2 with central meridians of 121 and 119, respectively. The air pollutants include TSP, PM10, PM6 (no estimation), PM2.5, SOx, NOx, THC, NMHC, CO, and Pb. Because the number of files is extremely large and exceeds the capacity of the CSV file format, the complete data should be downloaded from the following website: https://air.epa.gov.tw/EnvTopics/AirQuality_6.aspx Go to "Attachment Download" at the bottom of the page and click "Emission Inventory TEDS11.1 Raw Data".
Data from: Air Quality Monitoring at Universidad Iberoamericana, Mexico...
data.mendeley.com
Updated Jun 25, 2025
+ more versions
Daniel Alejandro Perez de la Mora (2025). Air Quality Monitoring at Universidad Iberoamericana, Mexico City: Ozone and Meteorological Data During the Rainy and Ozone Seasons (Mar–Nov 2024) [Dataset]. http://doi.org/10.17632/z42m8c43ny.3
Explore at:
Unique identifier
https://doi.org/10.17632/z42m8c43ny.3
Dataset updated
Jun 25, 2025
Authors
Daniel Alejandro Perez de la Mora
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Mexico, Mexico City
Description
This dataset provides hourly air quality and meteorological measurements recorded from March 1, 2024, to November 30, 2024, at Universidad Iberoamericana, Mexico City (Geo-location: 19.372292, -99.263679). Each value represents the average of measurements taken within that hour. The data spans the rainy season (approximately May to October) and part of the ozone season (approximately February to May, with overlap into March–April), capturing air pollution and weather patterns in an urban, high-altitude environment.

The dataset includes the following parameters:

Ozone (O₃) (ppb) – Hourly average ozone concentration.

Temperature (°C) – Hourly average ambient temperature.

Relative Humidity (%) – Hourly average humidity.

Wind Direction (degrees) – Hourly average wind direction (0° = North).

Wind Speed (m/s) – Hourly average wind speed.

Air Pressure (hPa) – Hourly average atmospheric pressure.

Accumulated Precipitation (mm) – Total precipitation accumulated within the hour.

Solar Radiation (W/m²) – Hourly average solar radiation.

File Format: CSV (comma-separated)

Data Structure: Each row represents an hourly measurement with a timestamp formatted as DD-MM-YY HH:MM (e.g., 01-03-24 14:00), using 24-hour time.
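
The day-first timestamp format described above is easy to misread as month-first, so it helps to parse it with an explicit format string. The following is a minimal sketch using Python's standard library; the function name is illustrative and not part of the dataset.

```python
from datetime import datetime

# DD-MM-YY HH:MM in 24-hour time, as stated in the dataset description.
TIMESTAMP_FORMAT = "%d-%m-%y %H:%M"


def parse_timestamp(text):
    """Parse a day-first timestamp such as '01-03-24 14:00'."""
    return datetime.strptime(text.strip(), TIMESTAMP_FORMAT)


stamp = parse_timestamp("01-03-24 14:00")
print(stamp.isoformat())  # 2024-03-01T14:00:00
```

Here 01-03-24 is read as 1 March 2024, which matches the stated start of the recording period; a month-first parser would silently turn it into 3 January.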

Sensor Details & Data Quality: Measurements were collected using low-cost air quality sensors deployed at the IBERO campus.

Applications: This dataset is suitable for air pollution analysis (e.g., ozone trends during the ozone season), meteorological research (e.g., precipitation and temperature patterns in the rainy season), and environmental modeling in Mexico City.
A Multi-Pollutant Emissions Inventory for Air Pollution Modeling and...
zenodo.org
bin, csv, png, zip
Updated Aug 1, 2024
Sarath Guttikunda; Sarath Guttikunda (2024). A Multi-Pollutant Emissions Inventory for Air Pollution Modeling and Supporting Information for Kampala [Dataset]. http://doi.org/10.5281/zenodo.11560003
Explore at:
bin, csv, png, zip (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.11560003
Dataset updated
Aug 1, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sarath Guttikunda; Sarath Guttikunda
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Kampala
Description
This paper is under review

Abstract:

Kampala, the political and economic capital of Uganda and one of the fastest urbanising cities in sub-Saharan Africa, is experiencing a deteriorating trend in air quality, with emissions from multiple diffused local sources like transportation, domestic and outdoor cooking, and industries, and from sources outside the city airshed like seasonal open fires in the region. PM2.5 (particulate matter under 2.5um size) is the key pollutant of concern in the city, with monthly spatial heterogeneity of 60-100 ug/m3. Outdoor air pollution is distinctly pronounced in global south cities, which lack the necessary capacity and resources to develop integrated air quality management programmes including ambient monitoring, emissions and pollution analysis, source apportionment, and preparation of clean air action plans. This paper presents an integrated assessment of air quality in Kampala drawing from ground measurements (from a hybrid network of stations), satellite observations (from NASA’s MODIS and OMI), global reanalysis fields (from GEOS-chem and CAMS simulations), a high resolution (~1km) multi-pollutant emissions inventory for the airshed, WRF-CAMx based PM2.5 pollution analysis, and a qualitative review of the institutional and policy environment for air quality management in Kampala. The proposed clean air action plans aim for better air quality in the region using a combination of short-, medium-, and long-term emission control measures for all the dominant sources, and aim to institutionalize pollution tracking mechanisms (like emissions and pollution monitoring and reporting) for effective management of air pollution.

This data archive serves as a supplement to the journal article, with a short description of the files below:

File: AQ-Kampala-Analysis-Summary.pptx (Caution: large 60MB)
A composite presentation including the following

Grid summaries

Snapshots of airshed GIS files, emission activities

Summary of meteorology from WRF simulations and historical synoptics

Summaries of ambient monitoring data

CAMS reanalysis summary

Summaries of Emission inventory and WRF-CAMx modelling (annual and monthly)

Summaries of PM2.5 Source apportionment (annual and monthly)

File: grids_kampala.rar
Grid file (KML and ESRI shapefiles format) for the airshed spanning 0.0N to 0.6N and 32.3E to 32.9E with a spatial resolution of 0.01deg (~1km)
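
The stated extent and resolution imply the size of the grid. The following is a minimal sketch of that arithmetic in Python, assuming the extent is divided into whole 0.01-degree cells and that cells are indexed from the south-west corner; the actual cell count and indexing in the grid file are not stated in the description and may differ.

```python
# Airshed extent and resolution as given in the file description.
LAT_MIN, LAT_MAX = 0.0, 0.6
LON_MIN, LON_MAX = 32.3, 32.9
RES = 0.01  # degrees, roughly 1 km near the equator

# round() guards against floating-point error in the division.
n_rows = round((LAT_MAX - LAT_MIN) / RES)
n_cols = round((LON_MAX - LON_MIN) / RES)


def cell_center(row, col):
    """Return (lat, lon) of a cell center; row/col count from the south-west corner (assumed)."""
    return (LAT_MIN + (row + 0.5) * RES, LON_MIN + (col + 0.5) * RES)


print(n_rows, n_cols, n_rows * n_cols)  # 60 60 3600
```

Under these assumptions the airshed is a 60 by 60 grid of 3600 cells.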

File: gis_roads_from_opensteetmaps.rar
ESRI shapefiles of primary roads and all roads, extracted from OpenStreetMap
Raw data archive @ https://download.geofabrik.de/index.html

File: gis-scanned2021image-quarries.kml
KML file of quarries scanned using the imagery on Google Earth platform

File: population_kampala_2000-2022.csv
Gridded population data 2000 to 2022
Raw data archive is from LANDSCAN - https://landscan.ornl.gov

File: Monitoring-Kampala_USEmbassy_2017-2024.xlsx
Summary of monitoring data collected at the US Embassy in Kampala
Raw data archive is @ https://www.airnow.gov/international/us-embassies-and-consulates/#Uganda$Kampala

File: meteo_wrf_stats.xlsm
Summary of the output of WRF simulations for the Kampala region. Macros must be enabled to summarize the results by month and update the charts. The tool can also be used for other cities by changing the input data.

File: meteo_precip-era5-reanalysis.csv
Summary of monthly precipitation data (mm/day) from ERA5 reanalysis fields
Raw data archive is @ https://psl.noaa.gov/data/atmoswrit/timeseries

File: TROPOMI_EastAfrica_NO2_Maps.zip
Images of monthly average TROPOMI NO2 extracts covering East Africa (Uganda and Ethiopia)
Extracted from Google Earth Engine, using 10% cloud fraction

File: TROPOMI_EastAfrica_CSVs.zip
CSV files of gridded monthly average NO2, SO2, HCHO, and Ozone columnar densities
Extracted from Google Earth Engine, using 10% cloud fraction
Read the data descriptions and applicability of the data for analysis before using (for example, negative numbers in the SO2 file).
NO2 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_NO2
SO2 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_SO2
HCHO - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_HCHO
O3 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_O3

File: composite_emisson_factors_gains.xlsx
A composite library of emission factors for reference

File: kampala_gridded_emissions_2018.rar
Gridded emissions inventory for Kampala - PM25, PM10, SO2, NO, NO2, and CO
PM25 is speciated into FPRM, BC, and OC (sum all for PM25)
PM10 is speciated into FPRM, CPRM, BC, OC (sum all for PM10)
All emissions in tons/year/grid
Emissions are segregated by sector and fuel, as indicated in the filenames
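
The speciation rule above (sum the species to recover each PM total) can be written out as follows. This is a minimal sketch in Python; the per-cell dictionary layout and the numbers are made up for illustration and do not reflect the actual file structure or values.

```python
# Species listed in the description for each PM total.
PM25_SPECIES = ("FPRM", "BC", "OC")
PM10_SPECIES = ("FPRM", "CPRM", "BC", "OC")


def total(cell, species):
    """Sum the named species for one grid cell (tons/year/grid)."""
    return sum(cell[name] for name in species)


# Made-up emissions for a single grid cell, in tons/year.
cell = {"FPRM": 1.2, "CPRM": 2.0, "BC": 0.3, "OC": 0.5}
pm25 = total(cell, PM25_SPECIES)
pm10 = total(cell, PM10_SPECIES)
```

With these lists the PM10 total exceeds the PM25 total by exactly the CPRM component.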

File: kampala_gridded_modelled_monthavgp25.csv
Gridded PM2.5 concentrations for 2018, from WRF-CAMx modelling system
Monthly averages in ug/m3
NARSTO EPA Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000)...
data.nasa.gov
Updated Apr 1, 2025
+ more versions
nasa.gov (2025). NARSTO EPA Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000) Texas Natural Resource Conservation Commission (TNRCC) continuous ambient monitoring stations (CAMS) Air Quality Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/narsto-epa-supersite-ss-houston-texas-air-quality-study-2000-texaqs2000-texas-natural-reso-93e18
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Area covered
Houston, Texas
Description
NARSTO_EPA_HOUSTON_TEXAQS2000_CAMS_DATA is the North American Research Strategy for Tropospheric Ozone (NARSTO) Environmental Protection Agency (EPA) Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000) Texas Natural Resource Conservation Commission (TNRCC) continuous ambient monitoring stations (CAMS) Air Quality Data. This data set contains 5-minute air quality measurements collected in Texas during August and September 2000 at 85 CAMS during TEXAQS2000. Measurements include carbon monoxide (CO), sulfur dioxide (SO2), nitrogen oxide (NO), nitrogen dioxide (NO2), oxides of nitrogen (NOx), total reactive nitrogen species (NOy), ozone, particulate matter (PM) 2.5 mass, hydrogen sulfide (H2S), wind speed, wind direction, maximum wind gust, air temperature, dewpoint temperature, humidity, precipitation, surface pressure, radiation, and visibility. CAMS are operated by the Texas Commission on Environmental Quality (TCEQ), local city or county governments, or private monitoring networks. Important monitoring site information: The site information data table in each of the 85 data files may not contain the latest TCEQ site information. A companion file site information spreadsheet (.csv) that lists data for all 85 sites is the latest TCEQ site information. The site information data tables in the 85 data files will not be updated. The 85 site spreadsheet companion document is the official source of site data, and this data is listed in the TEXAQS2000 CAMS guide document.
NARSTO, which has since disbanded, was a public/private partnership, whose membership spanned across government, utilities, industry, and academe throughout Mexico, the United States, and Canada. The primary mission was to coordinate and enhance policy-relevant scientific research and assessment of tropospheric pollution behavior; activities provide input for science-based decision-making and determination of workable, efficient, and effective strategies for local and regional air-pollution management. Data products from local, regional, and international monitoring and research programs are still available.
NC_birth_air_pollution_analysis
catalog.data.gov
Updated Apr 25, 2025
+ more versions
U.S. EPA Office of Research and Development (ORD) (2025). NC_birth_air_pollution_analysis [Dataset]. https://catalog.data.gov/dataset/nc-birth-air-pollution-analysis
Explore at:
Dataset updated
Apr 25, 2025
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Dataset contains information on births in North Carolina during the study period, linked to air pollution concentrations during pregnancy. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Code files and data dictionaries can be requested from authors (rappazzo.kristen@epa.gov). Birth records can be requested through the North Carolina Department of Health and Human Services. Format: Data include csv, SAS, and R files containing information about births in North Carolina. This dataset is associated with the following publication: Krajewski, A., T. Luben, J. Warren, and K. Rappazzo. Associations between weekly gestational exposure of fine particulate matter, ozone, and nitrogen dioxide and preterm birth in a North Carolina Birth Cohort, 2003-2015. Environmental Epidemiology. Wolters Kluwer, Alphen aan den Rijn, NETHERLANDS, 7(6): e278, (2023).

