100+ datasets found
  1. Air Quality Index Data

    • kaggle.com
    zip
    Updated Sep 25, 2022
    Cite
    Shrijayan (2022). Air Quality Index Data [Dataset]. https://www.kaggle.com/datasets/cpluzshrijayan/air-quality-prediction-harbor
    Explore at:
    Available download formats: zip (78585 bytes)
    Dataset updated
    Sep 25, 2022
    Authors
    Shrijayan
    Description

    This dataset is entirely imaginary and is NOT real data; it contains only values that we created.

    Content

    This dataset deals with pollution in the harbors of Chennai, Kolkata, and Visakhapatnam. Pollution there has been recorded, but it is a pain to collect all the data and arrange it in a format that interests data scientists. Hence I gathered four major pollutants and placed them neatly in a CSV file.

    Content

    There is a total of 29 fields. The four pollutants (NO2, O3, SO2, and O3) each have 5 specific columns. Observations totaled. This kernel provides a good introduction to this dataset!

    For observations on specific columns visit the Column Metadata on the Data tab.

    Inspiration

    I did a related project and decided to open-source our dataset so that data scientists don't need to re-scrape historical pollution data from scratch.

  2. Air Quality Monitoring Data: PM10, PM2.5, O3, CO, Temp, Rel. Humidity - Jan...

    • data.mendeley.com
    Updated Nov 16, 2023
    Cite
    daniel perez (2023). Air Quality Monitoring Data: PM10, PM2.5, O3, CO, Temp, Rel. Humidity - Jan 2022 to May 2023, Universidad Iberoamericana, Mexico City [Dataset]. http://doi.org/10.17632/gjvrn32zbm.1
    Explore at:
    Dataset updated
    Nov 16, 2023
    Authors
    daniel perez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Mexico City
    Description

    This comprehensive dataset, provided in CSV format, captures detailed air quality monitoring data recorded hourly from January 2022 to May 2023 at Universidad Iberoamericana in Mexico City (Geo-location: 19.372292, -99.263679). It includes a wide range of environmental parameters such as particulate matter (PM10 and PM2.5), ozone (O3), carbon monoxide (CO), temperature, and relative humidity.

    The CSV file contains multiple columns representing each parameter, along with corresponding timestamps for each hour of recording. This extensive dataset offers valuable insights into air quality trends and variations over a significant period, making it a rich resource for environmental research and analysis.

  3. Data from "Air pollution control strategies directly limiting national...

    • catalog.data.gov
    • gimi9.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Data from "Air pollution control strategies directly limiting national health damages in the US", by Ou et al. [Dataset]. https://catalog.data.gov/dataset/data-from-air-pollution-control-strategies-directly-limiting-national-health-damages-in-th
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    United States
    Description

    This file describes the dataset used in Ou et al., "Air pollution control strategies directly limiting national health damages in the US." This work used the Global Change Assessment Model (GCAM) with state-level representation of the U.S. energy system (GCAM-USA). GCAM and GCAM-USA are developed and released by the University of Maryland/Pacific Northwest National Laboratory Joint Global Change Research Center (JGCRI). For further details, see the GCAM documentation: jgcri.github.io/gcam-doc. The model source code is available at github.com/JGCRI/gcam-core. A modified version of GCAMv4.3 was used for this analysis. Source code and input data specific for this paper are available upon request. This dataset contains Excel spreadsheets and an R script that link to comma-separated values (CSV) files that were extracted from the model output. The spreadsheets and scripts show the data and reproduce each of the figures in the paper. This dataset is associated with the following publication: Ou, Y., J. West, S. Smith, C. Nolte, and D. Loughlin. Air pollution control strategies directly limiting national health damages in the US. Nature Communications. Nature Publishing Group, London, UK, 11: 957, (2020).

  4. Sofia air quality dataset

    • kaggle.com
    zip
    Updated Sep 14, 2019
    Cite
    Hristo Mavrodiev (2019). Sofia air quality dataset [Dataset]. https://www.kaggle.com/datasets/hmavrodiev/sofia-air-quality-dataset
    Explore at:
    Available download formats: zip (3015567883 bytes)
    Dataset updated
    Sep 14, 2019
    Authors
    Hristo Mavrodiev
    Description

    Content

    Archive of luftdaten.info / api.dusti.xyz

    about

    This archive contains CSV dumps of all outdoor sensors that seem to have delivered valid data to the API. The data is organised in directories for each day, containing csv files for each sensor. There may be multiple sensors in one measurement station (e.g. PM and temperature/humidity), their data will be in different files.

    License

    The archive.luftdaten.info data is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/

    Problems

    Please report problems in our opendata-stuttgart issuetracker: https://github.com/opendata-stuttgart/meta/issues

    Future improvements

    • description of position of sensor on location
    • image of sensor at location

    Inspiration

    Use the past data to generate air quality statistics and air quality prediction models

  5. Data from: Indoor air quality in California homes with code-required...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Apr 22, 2020
    Cite
    Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer (2020). Indoor air quality in California homes with code-required mechanical ventilation [Dataset]. http://doi.org/10.7941/D1ZS7X
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 22, 2020
    Dataset provided by
    Dryad
    Authors
    Wanyu Chan; Yang-Seon Kim; William Delp; Iain Walker; Brett Singer
    Time period covered
    Feb 7, 2020
    Area covered
    California
    Description

    Time Series Data Handling and Quality Assurance Review

    Most instruments had internal logging and special software to download data from the field instruments as binary files or ascii/csv files. The instruments for which files downloaded as binary provide software to view the data or export the data to csv files.

    One-minute resolution time-series data files were created for each house using an R script that pulled data from the csv files, aligned data by time, executed unit conversions, and translated from instruments with longer or different data intervals (e.g. 30 min formaldehyde data and 1.5 min for anemometer data). Visual review was conducted on the compiled files (and primary csv or binary files were consulted as needed) to check for translation or writing errors (especially from terminal emulator), indications of instrument malfunction, mislabeled units or unit conversion errors, mislabeled location, and time stamp errors.
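    The alignment step described above can be illustrated with a minimal sketch. The authors used an R script whose details are not given here, so the Python below is only an assumption about one plausible approach: forward-filling each instrument's last reading onto a shared one-minute grid. The function name `to_one_minute` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta

def to_one_minute(samples, start, end):
    """Forward-fill sorted (timestamp, value) samples onto a one-minute grid.

    Grid points that fall before the first sample get None.
    This is an illustrative choice, not the authors' documented method.
    """
    grid = []
    i = 0
    last = None
    t = start
    while t <= end:
        # Consume every sample recorded at or before grid time t.
        while i < len(samples) and samples[i][0] <= t:
            last = samples[i][1]
            i += 1
        grid.append((t, last))
        t += timedelta(minutes=1)
    return grid

# Hypothetical 30-minute formaldehyde readings (values are made up).
t0 = datetime(2020, 1, 1, 0, 0)
formaldehyde = [(t0, 12.0), (t0 + timedelta(minutes=30), 15.0)]
aligned = to_one_minute(formaldehyde, t0, t0 + timedelta(minutes=59))
```

    Forward-filling keeps each slow reading constant until the next one arrives; interpolation would be an equally reasonable alternative for slowly varying quantities.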

    The draft final set of time-series data...

  6. Britain Breathing 2016-2019 Air Quality and Meteorological Regional...

    • data.niaid.nih.gov
    Updated Feb 16, 2022
    Cite
    Gledson, Ann; Lowe, Douglas; Reani, Manuele; Jay, Caroline; Topping, David (2022). Britain Breathing 2016-2019 Air Quality and Meteorological Regional Estimates Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4439642
    Explore at:
    Dataset updated
    Feb 16, 2022
    Dataset provided by
    Chinese University of Hong Kong
    University of Manchester
    Authors
    Gledson, Ann; Lowe, Douglas; Reani, Manuele; Jay, Caroline; Topping, David
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    This data set is a collection of estimated daily mean and maximum values for a range of air quality and meteorological measurements and model forecasts for the UK and crown dependencies postcode districts (e.g. 'AB') for the years 2016-2019, inclusive.

    The paper describing this dataset is available here: https://www.nature.com/articles/s41597-022-01135-6

    The data uses a 'concentric regions' method to estimate the measurement for all regions, as follows. If measurements exist within the region, the mean of those measurements is used, if not, then a ring of neighbouring postcode regions are selected, and the mean of their measurement values used. If no measurement sites/data are found in the first ring, the process continues, taking the next ring of postcode district regions, working outwards until one or more sensors are found in a ring. As well as the measurement estimations, the number of rings required to find site data and make the estimations is also published. As a result, please note that estimations with higher ring counts ('rings') are likely to be calculated from more distant sensors. This distance depends upon the size of the postcode regions surrounding the location being estimated. Please use the ring count ('rings') to limit/filter estimations based on your required level of confidence.
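    The 'concentric regions' procedure above can be sketched as a ring-by-ring search over neighbouring regions. This is a simplified illustration, not the published code: the adjacency map, the function name `estimate_region`, and the choice to pool all site values in a ring before averaging are assumptions.

```python
def estimate_region(region, neighbours, measurements):
    """Return (estimate, rings) for one region.

    neighbours:   dict region_id -> set of adjacent region_ids (hypothetical input)
    measurements: dict region_id -> list of site values inside that region
    rings == 0 means sites were found inside the region itself.
    Returns (None, None) if no reachable region has any measurement.
    """
    visited = {region}
    ring = {region}
    rings = 0
    while ring:
        values = [v for r in ring for v in measurements.get(r, [])]
        if values:
            return sum(values) / len(values), rings
        # Build the next ring from unvisited neighbours of the current ring.
        next_ring = set()
        for r in ring:
            next_ring |= neighbours.get(r, set()) - visited
        visited |= next_ring
        ring = next_ring
        rings += 1
    return None, None

# Toy layout: A touches B and C; B and C both touch D; only D has sensors.
neighbours = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
measurements = {"D": [10.0, 20.0]}
estimate, rings = estimate_region("A", neighbours, measurements)
```

    In this toy layout region A needs two rings to reach a sensor, so its estimate would be the kind of value a reader might filter out with a threshold on `rings`.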

    The meteorological, pollen and air quality measurement data used to make the regional estimations can be found at this Zenodo archive. The data there contains Temperature, Relative Humidity, and Pressure data, downloaded from the Met Office MIDAS archives via the MEDMI server (https://www.data-mashup.org.uk/). Also downloaded from the MEDMI server are daily pollen measurements for the UK. PM10, PM2.5, NO2, NOx (as NO2), O3, and SO2 measurements from the DEFRA AURN network, and also model forecasts of the same made using the EMEP model.

    The code used to make the estimations is available at this Zenodo archive.

    The postcode data in postcode_district_data.csv are collated from several sources:

    https://www.doogal.co.uk/UKPostcodes.php (population figures for the UK (UK Census 2011))

    https://www.freemaptools.com/download-uk-postcode-outcode-boundaries.htm (postcode boundary polygons for UK and crown dependencies)

    https://www.gov.gg/population (Guernsey (GY) population data for end June 2020)

    https://www.gov.je/Government/JerseyInFigures/Population/Pages/Population.aspx (Jersey (JE) population data for end 2019)

    https://www.gov.im/media/1369690/isle-of-man-in-numbers-july-2020.pdf (Isle of Man (IM) population data for April 2016)

    The data-set is presented in CSV format, as six files:

    postcode_district_data.csv: location metadata (region_id, geometry, description, population, country)

    regional_site_counts.csv: a table showing the number of sites for each measurement (columns), for each region_id (rows). region_id's match those in the postcode_district_data.csv file.

    turing_regional_estimates_aq_daily_met_pollen_pollution_imputed_data.csv: uses imputed site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)

    turing_regional_estimates_aq_daily_met_pollen_pollution_original_data.csv: uses original site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)

    turing_regional_estimates_aq_loc_type_daily_imputed_data.csv: uses imputed site data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)

    turing_regional_estimates_aq_loc_type_daily_original_data.csv: uses original data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)

    * Air quality site types:

    Industrial: comprises 'urban industrial' (9 sites) and suburban industrial (2 sites)

    'Rural background' (14 sites)

    'Urban background' (48 sites)

    'Urban traffic' (47 sites)

  7. Data from: Air Quality Data

    • kaggle.com
    zip
    Updated Oct 16, 2025
    Cite
    Python Developer (2025). Air Quality Data [Dataset]. https://www.kaggle.com/datasets/programmer3/air-quality-data
    Explore at:
    Available download formats: zip (76469629 bytes)
    Dataset updated
    Oct 16, 2025
    Authors
    Python Developer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset contains Air Quality Index (AQI) at hourly and daily level of various stations across multiple cities.

    station_day.csv → 108035 rows

    city_day.csv → 29531 rows

    city_hour.csv → 707875 rows

    station_hour.csv → 2589083 rows

    stations.csv → 230 rows

    Total rows across all files: 3434754

    1. Station Dataset Features

    StationId → Unique identifier for each air quality monitoring station. Useful for referencing and merging data.

    StationName → Name of the monitoring station. Helps in identifying stations in reports and visualizations.

    City → The city where the station is located. Can be used to aggregate data at city level.

    State → The state where the station is located. Useful for regional analysis.

    Status → Indicates whether the station is active, inactive, or under maintenance. Helps filter reliable data.

    2. City (Air Quality) Dataset Features

    City → Name of the city where the measurement was taken.

    Datetime → Timestamp of the measurement. Could be hourly or daily. Important for time series analysis.

    PM2.5 → Fine particulate matter ≤ 2.5 μm in diameter. Indicates air pollution that penetrates deep into the lungs.

    PM10 → Particulate matter ≤ 10 μm. Measures larger airborne particles.

    NO → Nitric oxide, a primary pollutant mainly from vehicles and industry.

    NO2 → Nitrogen dioxide, harmful to respiratory health. Often a marker of traffic pollution.

    NOx → Nitrogen oxides (NO + NO2). Total nitrogen oxide pollution.

    NH3 → Ammonia, released from agriculture and waste. Can contribute to secondary particulate matter.

    CO → Carbon monoxide, a toxic gas from incomplete combustion.

    SO2 → Sulfur dioxide, mainly from burning fossil fuels. Causes acid rain and respiratory issues.

    O3 → Ozone at ground level, harmful to lungs. A secondary pollutant formed by reactions of NOx and VOCs.

    Benzene, Toluene, Xylene → Volatile organic compounds (VOCs) from industry and vehicles. Can be toxic and form ozone.

    AQI → Air Quality Index, a standardized number summarizing pollution level across multiple pollutants.

    AQI_Bucket → Category of AQI (Good, Moderate, Poor, etc.), making interpretation easier.

  8. Data from: Patterns discovery dataset for particulate matter (pm2.5)...

    • search.dataone.org
    • datadryad.org
    Updated Dec 12, 2024
    Cite
    Uday Kiran Rage; Vanitha Kattumuri; Arjun Chakravarthi Pogaku (2024). Patterns discovery dataset for particulate matter (pm2.5) pollution trends in Japan [Dataset]. http://doi.org/10.5061/dryad.hhmgqnkrr
    Explore at:
    Dataset updated
    Dec 12, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Uday Kiran Rage; Vanitha Kattumuri; Arjun Chakravarthi Pogaku
    Description

    Air pollution presents a significant environmental risk, impacting human health, accelerating climate change, and disrupting ecosystems. The main aim of air pollution research is to pinpoint the most harmful pollutants identified in previous studies and to map regions exposed to high pollution levels. This study introduces a large-scale, high-quality dataset to advance the analysis of PM2.5 pollution and reveal hidden patterns through pattern mining techniques. The dataset covers five years of hourly PM2.5 measurements collected from approximately 1,900 sensors across Japan, sourced from the Ministry of the Environment's Soramame platform. This platform offers hourly pollutant records, downloadable as monthly raw data files. The unorganised raw data files are systematically organised and stored in database tables using an Entity-Relationship (ER) schema. The primary objective of this dataset is to aid in developing and validating pattern mining models, enabling the accurate detection of...

    The air pollution data was collected from Japan’s Soramame platform, which provides hourly updates on pollutant levels nationwide. The data files were collected from January 1, 2018, 01:00:00, to April 25, 2023, 22:00:00, covering records from approximately 1,900 sensors stationed in various locations across Japan. These files are initially unorganised in CSV format and require systematic organisation by year, month, time, sensor, and pollutant type. To maintain data integrity, we structured the dataset using an Entity-Relationship (ER) schema within a PostgreSQL database, comprising two main tables: the Sensor table (storing sensor name, ID, address, and location) and the Observations table (recording pollutant types and their values). A detailed step-by-step process is provided in the README, and this organization created a consolidated CSV file containing PM2.5 levels, timestamps, and sensor details.

    AEROS PM2.5 Dataset

    Overview

    The AEROS PM2.5 Dataset provides a comprehensive collection of hourly PM2.5 measurements recorded over a period of five years from sensors located across Japan. This dataset is a valuable resource for studying air quality trends, pollution patterns, and environmental health impacts.

    Dataset Description

    File Information

    • File Name: FINAL_DATASET.csv
    • Content: Hourly PM2.5 measurements collected from sensors located in Japan over five years.

    Structure

    The dataset includes the following columns:

    1. Timestamps: The date and time when the measurement was recorded.
    2. Sensor Location IDs: Unique identifiers for the sensor locations.
    3. PM2.5 Values (µg/m³): The recorded PM2.5 concentration at a specific timestamp and location.

    Units

    • PM2.5 Values: Measured in micrograms per cubic meter (µg/m³).

    Notes on Data

    • Empty Cells: Represent instances where no PM2.5 data was recorded by the s...
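
    The two-table layout mentioned in the description (a Sensor table and an Observations table) can be sketched as follows. The authors used PostgreSQL; this sketch swaps in Python's built-in SQLite so it runs without a server, and every table name, column name, and row value below is a hypothetical stand-in rather than the authors' actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sensor (
    sensor_id INTEGER PRIMARY KEY,
    name      TEXT,
    address   TEXT,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE observation (
    sensor_id   INTEGER REFERENCES sensor(sensor_id),
    observed_at TEXT,
    pollutant   TEXT,
    value       REAL   -- NULL stands in for an hour with no recorded value
);
""")

# Made-up rows, for illustration only.
conn.execute(
    "INSERT INTO sensor VALUES (1, 'Example station', 'Example address', 35.0, 139.0)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?, ?)",
    [
        (1, "2018-01-01 01:00:00", "PM2.5", 11.0),
        (1, "2018-01-01 02:00:00", "PM2.5", None),
        (1, "2018-01-01 03:00:00", "PM2.5", 13.0),
    ],
)

# Flatten back to (timestamp, sensor id, PM2.5 value) rows, as in a consolidated CSV.
rows = conn.execute("""
    SELECT o.observed_at, s.sensor_id, o.value
    FROM observation AS o
    JOIN sensor AS s ON o.sensor_id = s.sensor_id
    WHERE o.pollutant = 'PM2.5'
    ORDER BY o.observed_at
""").fetchall()
```

    Keeping sensor metadata in its own table avoids repeating the address and location on every hourly row; the join recreates the flat three-column view when it is needed.
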
  9. Global Air Quality Data(15 Days Hourly, 50 Cities)

    • kaggle.com
    zip
    Updated Nov 19, 2025
    Cite
    Smeet Raichura (2025). Global Air Quality Data(15 Days Hourly, 50 Cities) [Dataset]. https://www.kaggle.com/datasets/smeet888/global-air-quality-data15-days-hourly-50-cities
    Explore at:
    Available download formats: zip (598546 bytes)
    Dataset updated
    Nov 19, 2025
    Authors
    Smeet Raichura
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📘 Overview

    This dataset provides hourly air-quality measurements for 50 major global cities over a continuous 15-day period, including pollutant concentrations, meteorological conditions, geographical metadata, and an engineered AQI index.

    All values are synthetically generated using historically consistent pollutant patterns and statistical ranges, allowing researchers and ML practitioners to work with realistic air-quality trends without licensing restrictions or data-collection barriers.

    This dataset is ideal for time-series modeling, forecasting, environmental analytics, and machine-learning experimentation.

    🧭 Cities Included

    Covers all major regions:

    North America — New York, Los Angeles, Toronto

    Europe — London, Paris, Berlin, Zurich

    Asia — Delhi, Tokyo, Seoul, Beijing, Singapore

    Middle East — Dubai, Riyadh, Doha

    Africa — Lagos, Cairo, Nairobi

    Oceania — Sydney, Melbourne, Auckland

    South America — São Paulo, Buenos Aires

    🧱 Dataset Structure

    Each hourly record includes:

    Air Pollutants

    PM2.5 (µg/m³)

    PM10 (µg/m³)

    NO₂ (ppb)

    SO₂ (ppb)

    O₃ (ppb)

    CO (ppm)

    Weather Features

    Temperature (°C)

    Humidity (%)

    Wind Speed (m/s)

    Location Metadata

    City

    Country

    Latitude

    Longitude

    Other

    Timestamp (ISO-8601)

    AQI (Computed index)

    🧹 Data Quality & Formatting

    No missing values — 100% complete

    Numeric values rounded to 3 decimals

    Clean column names (snake_case)

    Consistent hourly frequency

    Fully ML-ready

    📊 Example Use Cases

    ✔ AQI forecasting (LSTM, GRU, Transformers)

    ✔ Multivariate time-series modeling

    ✔ Clustering cities by pollution patterns

    ✔ Environmental trend visualization

    ✔ Weather–pollution correlation studies

    ✔ Anomaly detection (peak pollution events)

    Column         Description                  Unit     Type
    timestamp      Hourly timestamp (UTC)       -        datetime
    city           City name                    -        string
    country        Country name                 -        string
    latitude       City latitude                °        float
    longitude      City longitude               °        float
    pm25           Fine particulate matter      µg/m³    float
    pm10           Coarse particulate matter    µg/m³    float
    no2            Nitrogen dioxide             ppb      float
    so2            Sulfur dioxide               ppb      float
    o3             Ozone                        ppb      float
    co             Carbon monoxide              ppm      float
    temperature    Ambient temperature          °C       float
    humidity       Relative humidity            %        float
    wind_speed     Wind speed                   m/s      float
    aqi            Derived Air Quality Index    -        int

    🧪 Data Generation Method (Provenance)

    This dataset is synthetically generated using realistic pollutant behavior patterns based on historical studies and open-source environmental datasets.

    Modeling steps included:

    City-specific pollutant baseline ranges

    Randomized variation using Gaussian noise

    Temporal patterns using sinusoidal diurnal cycles (morning & evening peaks)

    Weather-pollution correlation rules (e.g., low wind → higher PM)

    AQI computed using standard US-EPA breakpoints

    All numeric values standardized to 3-decimal precision

    This ensures that although synthetic, the dataset follows realistic environmental dynamics.
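
    As a rough illustration of the generation steps listed above, the sketch below combines a baseline, a sinusoidal diurnal cycle with morning and evening peaks, and Gaussian noise, then rounds to three decimals. It is not the dataset author's generator: the function name, the baseline, amplitude, and noise values, and the 08:00 and 20:00 peak hours are all assumptions chosen for the example.

```python
import math
import random

def synthetic_pm25(hours, baseline=35.0, amplitude=10.0, noise_sd=2.0, seed=0):
    """Generate an hourly PM2.5-like series (all parameters are hypothetical)."""
    rng = random.Random(seed)
    series = []
    for h in range(hours):
        # A 12-hour cosine gives two peaks per day, here at 08:00 and 20:00.
        diurnal = amplitude * math.cos(2 * math.pi * ((h % 24) - 8) / 12)
        value = baseline + diurnal + rng.gauss(0.0, noise_sd)
        # Concentrations cannot be negative; round to 3 decimals.
        series.append(round(max(value, 0.0), 3))
    return series

# 15 days of hourly values for one city: 24 * 15 = 360 records.
values = synthetic_pm25(24 * 15)
```

    With 360 hourly records per city, 50 cities would give 18,000 rows, which is consistent with the row count reported for the file.
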

    📁 File Information

    global_air_quality_50_cities.csv

    Rows: 18,000+

    Columns: 16

    Format: UTF-8 CSV

  10. Data for: Leveraging scientific community knowledge for air quality model...

    • catalog.data.gov
    • datasets.ai
    Updated Mar 18, 2024
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). Data for: Leveraging scientific community knowledge for air quality model chemistry parameterizations [Dataset]. https://catalog.data.gov/dataset/data-for-leveraging-scientific-community-knowledge-for-air-quality-model-chemistry-paramet
    Explore at:
    Dataset updated
    Mar 18, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Files contain values from Figures 1, 2, and 3 of the article by Pye et al., "Leveraging scientific community knowledge for air quality model chemistry parameterizations," scheduled for publication in EM in January 2024. Figures 2 and 3 are available in csv and excel spreadsheet format. Figure 1 is only available in spreadsheet format. Figure 1 shows gas and aerosol-phase chemistry representations in CMAQ since 2010. Figure 2 shows ozone and SOA formation potential (in g/g) for CRACMM species. Figure 3 shows the size (number of species and reactions) for various chemical mechanisms. This dataset is associated with the following publication: Pye, H., R. Schwantes, K. Barsanti, V.F. McNeill, and G. Wolfe. Leveraging scientific community knowledge for air quality model chemistry parameterizations. EM Magazine. Air and Waste Management Association, Pittsburgh, PA, USA, 24-31, (2024).

  11. National Park Air Quality Index Dataset

    • osti.gov
    Updated Jun 3, 2024
    Cite
    Byler, Eleanor B; Chojnicki, Kirsten N; Svinth, Christian N (2024). National Park Air Quality Index Dataset [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2468701
    Explore at:
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    DOE
    Pacific Northwest National Laboratory 2
    Authors
    Byler, Eleanor B; Chojnicki, Kirsten N; Svinth, Christian N
    Description

    The National Park Air Quality Index dataset (NPS-AQI) consists of webcam images taken from the National Park Service's publicly available air quality web cameras and associated measurements for air pollutants, AQI, and meteorological data obtained via the publicly available NPS Gaseous Pollutant Monitoring Program. The full dataset is a collection of 146,822 images paired with air quality measurements. The specific measurements reported are: ozone ppm, 8-hour running average ozone ppm, SO2 ppm, AQI (derived from ozone), temperature, and humidity. The images are 1500x1000 pixel PNG files arranged into folders by NPS site and named according to the time and date the image was taken. There are three CSV files (representing "training", "validation", and "testing" image splits) containing image names and associated NPS site names, air pollutant measurements, and meteorological data.

  12. Ten-year data tables by province, industry and substance – releases

    • open.canada.ca
    • data.urbandatacentre.ca
    • +1more
    csv, html
    Updated Jul 15, 2025
    + more versions
    Cite
    Environment and Climate Change Canada (2025). Ten-year data tables by province, industry and substance – releases [Dataset]. https://open.canada.ca/data/en/dataset/ea0dc8ae-d93c-4e24-9f61-946f1736a26f
    Explore at:
    Available download formats: html, csv
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Environment and Climate Change Canada
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 2014 - Dec 31, 2023
    Description

    The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains annual total releases for the past ten years by media (air, water or land), broken down by province, industry or substance. Files are in .CSV format. The results can be further broken down using the pre-defined search available at the bottom of the NPRI Data Search webpage. The results returned by the NPRI search engine may differ from the numbers contained in the downloadable files. The online search engine’s results display releases, disposals and transfers reported by facilities, but do not distinguish between media types (i.e. air, water, land). They also include facilities reporting only under Ontario Regulation 127/01 and facilities submitting “did not meet criteria” reports.

    Please consult the following resources to enhance your analysis:

    • Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
    • Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

    Supplemental Information

    More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

    Supporting Projects: National Pollutant Release Inventory (NPRI)

  13. Bulk data files for all years – releases, disposals, transfers and facility...

    • open.canada.ca
    csv, html
    Updated Jul 15, 2025
    + more versions
    Cite
    Environment and Climate Change Canada (2025). Bulk data files for all years – releases, disposals, transfers and facility locations [Dataset]. https://open.canada.ca/data/en/dataset/40e01423-7728-429c-ac9d-2954385ccdfb
    Explore at:
    Available download formats: csv, html
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Environment and Climate Change Canada (https://www.canada.ca/en/environment-climate-change.html)
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 1993 - Dec 31, 2023
    Description

    The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains data from 1993 to the latest reporting year. These CSV format datasets are in normalized or ‘list’ format and are optimized for pivot table analyses. Here is a description of each file:

    • The RELEASES file contains all substance release quantities.
    • The DISPOSALS file contains all on-site and off-site disposal quantities, including tailings and waste rock (TWR).
    • The TRANSFERS file contains all quantities transferred for recycling or treatment prior to disposal.
    • The COMMENTS file contains all the comments provided by facilities about substances included in their report.
    • The GEO LOCATIONS file contains complete geographic information for all facilities that have reported to the NPRI.

    Please consult the following resources to enhance your analysis:

    • Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
    • Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

    Supplemental Information

    More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

    Supporting Projects: National Pollutant Release Inventory (NPRI)

  14. Dissolved Inorganic Carbon and Dissolved Organic Carbon Data for the East...

    • data.nceas.ucsb.edu
    • data.ess-dive.lbl.gov
    • +3more
    Updated Aug 19, 2023
    Cite
    Wenming Dong; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg; Kenneth Williams (2023). Dissolved Inorganic Carbon and Dissolved Organic Carbon Data for the East River Watershed, Colorado (2015-2023) [Dataset]. http://doi.org/10.15485/1660459
    Dataset updated
    Aug 19, 2023
    Dataset provided by
    ESS-DIVE
    Authors
    Wenming Dong; Curtis Beutler; Wendy Brown; Alexander Newman; Dylan O'Ryan; Roelof Versteeg; Kenneth Williams
    Time period covered
    Sep 1, 2015 - Jan 5, 2023
    Description

    This data package contains mean values for dissolved organic carbon (DOC) and dissolved inorganic carbon (DIC) for water samples taken from the East River Watershed in Colorado. The East River is part of the Watershed Function Scientific Focus Area (WFSFA) located in the Upper Colorado River Basin, United States. DOC and DIC concentrations in water samples were determined using a TOC-VCPH analyzer (Shimadzu Corporation, Japan). DOC was analyzed as non-purgeable organic carbon (NPOC) by purging HCl-acidified samples with carbon-free air to remove DIC prior to measurement. After the acidified sample has been sparged, it is injected into a combustion tube filled with oxidation catalyst heated to 680 degrees C. The DOC in the sample is combusted to CO2 and measured by a non-dispersive infrared (NDIR) detector. The peak area of the analog signal produced by the NDIR detector is proportional to the DOC concentration of the sample. DIC was determined by first acidifying the samples with HCl and then purging with carbon-free air to release CO2 for analysis by the NDIR detector. All files are labeled by location and variable, and the data reported are the mean values of at least three replicate measurements with a relative standard deviation < 3%. All samples were analyzed under a rigorous quality assurance and quality control (QA/QC) process as detailed in the methods.

    This data package contains (1) a zip file (dic_npoc_data_2014-2023.zip) containing a total of 319 files: 318 data files of DIC and NPOC data from across the Lawrence Berkeley National Laboratory (LBNL) Watershed Function Scientific Focus Area (SFA), reported in .csv files per location, and a locations.csv (1 file) with latitude and longitude for each location; (2) a file-level metadata (v3_20230808_flmd.csv) file that lists each file contained in the dataset with associated metadata; (3) a data dictionary (v3_20230808_dd.csv) file that contains terms/column_headers used throughout the files along with a definition, units, and data type; and (4) PDF and docx files for the determination of Method Detection Limits (MDLs) for DIC and NPOC data. There are a total of 106 locations containing DIC/NPOC data.

    Update on 2020-10-07: Updated the data files to remove times from the timestamps, so that only dates remain. The data values have not changed.

    Update on 2021-04-11: Added the Determination of Method Detection Limits (MDLs) for DIC, NPOC and TDN Analyses document, which can be accessed as a PDF or with Microsoft Word.

    Update on 2022-06-10: Versioned updates to this dataset were made along with these changes: (1) updated dissolved inorganic carbon and dissolved organic carbon data for all locations up to 2021-12-31, (2) removal of units from column headers in data files, (3) added a row underneath the headers to contain the units of variables, (4) restructure of units to comply with CSV reporting format requirements, (5) added -9999 for empty numerical cells, and (6) added the file-level metadata (flmd.csv) and data dictionary (dd.csv) files to comply with the File-Level Metadata Reporting Format.

    Update on 2022-09-09: Updates were made to reporting format specific files (file-level metadata and data dictionary) to correct swapped file names, add additional details on metadata descriptions on both files, add a header_row column to enable parsing, and add version number and date to file names (v2_20220909_flmd.csv and v2_20220909_dd.csv).

    Update on 2023-08-08: Updates were made to both the data files and reporting format specific files. New available anion data was added, up until 2023-01-05. The file-level metadata and data dictionary files were updated to reflect the additional data added.
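    The layout described in the update notes (a header row, a units row directly beneath it, date-only timestamps, and -9999 for empty numerical cells) could be read roughly as sketched below. The column names, unit strings, and values in the sample are invented for illustration, and the sketch assumes the header is the first row of the file; the actual files may differ, so check the data dictionary and file-level metadata first.

```python
import csv
import io

MISSING_VALUE = -9999.0  # sentinel the description gives for empty numerical cells

def read_location_csv(text, date_column="date"):
    """Parse a header row, a units row, then data rows.

    `date_column` is a hypothetical column name; real files may use another.
    Returns (units by column, list of row dicts with missing values as None).
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, units, data = rows[0], rows[1], rows[2:]
    records = []
    for row in data:
        record = {}
        for name, value in zip(header, row):
            if name == date_column:
                record[name] = value  # dates only; times were removed in 2020
            else:
                number = float(value)
                record[name] = None if number == MISSING_VALUE else number
        records.append(record)
    return dict(zip(header, units)), records

# Invented sample in the described layout (not real measurements).
sample = """date,dic,npoc
YYYY-MM-DD,mg_per_l,mg_per_l
2021-06-01,12.5,1.8
2021-06-08,-9999,2.1
"""

units, records = read_location_csv(sample)
print(units)
print(records[1])
```

    Mapping the sentinel to `None` keeps -9999 from being averaged into results as if it were a measured concentration.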

  15. China AQI PM25s Cumulative Hourly Data (2015-09 to 2016-05)

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jan 18, 2017
    Cite
    Lex Berman (2017). China AQI PM25s Cumulative Hourly Data (2015-09 to 2016-05) [Dataset]. http://doi.org/10.7910/DVN/3X9NIF
    Dataset updated
    Jan 18, 2017
    Dataset provided by
    Harvard Dataverse
    Authors
    Lex Berman
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    China
    Description

    The hourly updates of ground monitoring observations in China are collected and stored in .csv files (one for each hour). NOTE: the pm25s data does NOT report locations for the reporting stations in x, y coordinates.

  16. Area source grid air pollution emissions (TWD97)

    • data.gov.tw
    Cite
    Ministry of Environment, Area source grid air pollution emissions (TWD97) [Dataset]. https://data.gov.tw/en/datasets/152688
    Dataset authored and provided by
    Ministry of Environment
    License

    https://data.gov.tw/license

    Description

    Gridded emission data for non-point (area) source air pollutants. The base year of the estimates is year 108 of the Republic of China calendar (2019), and the grid covers Taiwan's main island and outlying islands. Grid cells are located by their lower-left corner coordinates, given in TWD97-TM2 with central meridian 121 and TWD97-TM2 with central meridian 119, respectively. The air pollutants include TSP, PM10, PM6 (no estimation), PM2.5, SOx, NOx, THC, NMHC, CO, and Pb. Because the data volume is extremely large and exceeds the capacity of the CSV file format, the complete data are best downloaded from the following website: https://air.epa.gov.tw/EnvTopics/AirQuality_6.aspx. Go to "Attachment Download" at the bottom of the page and click "Emission Inventory TEDS11.1 Raw Data".

  17. Data from: Air Quality Monitoring at Universidad Iberoamericana, Mexico...

    • data.mendeley.com
    Updated Jun 25, 2025
    Cite
    Daniel Alejandro Perez de la Mora (2025). Air Quality Monitoring at Universidad Iberoamericana, Mexico City: Ozone and Meteorological Data During the Rainy and Ozone Seasons (Mar–Nov 2024) [Dataset]. http://doi.org/10.17632/z42m8c43ny.3
    Dataset updated
    Jun 25, 2025
    Authors
    Daniel Alejandro Perez de la Mora
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Mexico, Mexico City
    Description

    This dataset provides hourly air quality and meteorological measurements recorded from March 1, 2024, to November 30, 2024, at Universidad Iberoamericana, Mexico City (Geo-location: 19.372292, -99.263679). Each value represents the average of measurements taken within that hour. The data spans the rainy season (approximately May to October) and part of the ozone season (approximately February to May, with overlap into March–April), capturing air pollution and weather patterns in an urban, high-altitude environment.

    The dataset includes the following parameters:

    Ozone (O₃) (ppb) – Hourly average ozone concentration.

    Temperature (°C) – Hourly average ambient temperature.

    Relative Humidity (%) – Hourly average humidity.

    Wind Direction (degrees) – Hourly average wind direction (0° = North).

    Wind Speed (m/s) – Hourly average wind speed.

    Air Pressure (hPa) – Hourly average atmospheric pressure.

    Accumulated Precipitation (mm) – Total precipitation accumulated within the hour.

    Solar Radiation (W/m²) – Hourly average solar radiation.

    File Format: CSV (comma-separated)

    Data Structure: Each row represents an hourly measurement with a timestamp formatted as DD-MM-YY HH:MM (e.g., 01-03-24 14:00), using 24-hour time.
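    Since the timestamp is day-first with a two-digit year, it is easy to misread as month-first. A minimal sketch of parsing it with Python's standard library follows; the helper name `parse_timestamp` is made up for illustration, and the description does not state a time zone, so the result is a naive datetime.

```python
from datetime import datetime

# DD-MM-YY HH:MM in 24-hour time, as stated in the dataset description.
TIMESTAMP_FORMAT = "%d-%m-%y %H:%M"

def parse_timestamp(text):
    """Parse a day-first timestamp such as '01-03-24 14:00' (hypothetical helper)."""
    return datetime.strptime(text.strip(), TIMESTAMP_FORMAT)

ts = parse_timestamp("01-03-24 14:00")
print(ts.isoformat())  # 2024-03-01T14:00:00
```

    Passing the format explicitly avoids parsers that guess month-first and would read the example as 3 January instead of 1 March.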

    Sensor Details & Data Quality: Measurements were collected using low-cost air quality sensors deployed at the IBERO campus.

    Applications: This dataset is suitable for air pollution analysis (e.g., ozone trends during the ozone season), meteorological research (e.g., precipitation and temperature patterns in the rainy season), and environmental modeling in Mexico City.

  18. A Multi-Pollutant Emissions Inventory for Air Pollution Modeling and...

    • zenodo.org
    bin, csv, png, zip
    Updated Aug 1, 2024
    Cite
    Sarath Guttikunda; Sarath Guttikunda (2024). A Multi-Pollutant Emissions Inventory for Air Pollution Modeling and Supporting Information for Kampala [Dataset]. http://doi.org/10.5281/zenodo.11560003
    Available download formats: bin, csv, png, zip
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sarath Guttikunda; Sarath Guttikunda
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Kampala
    Description

    This paper is under review

    Abstract:

    Kampala, the political and economic capital of Uganda and one of the fastest urbanising cities in sub-Saharan Africa, is experiencing a deteriorating trend in air quality with emissions from multiple diffused local sources like transportation, domestic and outdoor cooking, and industries, and sources outside the city airshed like seasonal open fires in the region. PM2.5 (particulate matter under 2.5 um in size) is the key pollutant of concern in the city with monthly spatial heterogeneity of 60-100 ug/m3. Outdoor air pollution is distinctly pronounced in the cities of the global south, which lack the necessary capacity and resources to develop integrated air quality management programmes including ambient monitoring, emissions and pollution analysis, source apportionment, and preparation of clean air action plans. This paper presents an integrated assessment of air quality in Kampala drawing from ground measurements (from a hybrid network of stations), satellite observations (from NASA’s MODIS and OMI), global reanalysis fields (from GEOS-Chem and CAMS simulations), a high resolution (~1km) multi-pollutant emissions inventory for the airshed, WRF-CAMx based PM2.5 pollution analysis, and a qualitative review of the institutional and policy environment for air quality management in Kampala. The proposed clean air action plans aim for better air quality in the region using a combination of short-, medium-, and long-term emission control measures for all the dominant sources, and aim to institutionalize pollution tracking mechanisms (like emissions and pollution monitoring and reporting) for effective management of air pollution.

    This data archive serves as a supplement to the journal article; a short description of the files follows:

    • File: AQ-Kampala-Analysis-Summary.pptx (Caution: large 60MB)
      A composite presentation including the following
      • Grid summaries
      • Snapshots of airshed GIS files, emission activities
      • Summary of meteorology from WRF simulations and historical synoptics
      • Summaries of ambient monitoring data
      • CAMS reanalysis summary
      • Summaries of Emission inventory and WRF-CAMx modelling (annual and monthly)
      • Summaries of PM2.5 Source apportionment (annual and monthly)
    • File: grids_kampala.rar
      Grid file (KML and ESRI shapefiles format) for the airshed spanning 0.0N to 0.6N and 32.3E to 32.9E with a spatial resolution of 0.01deg (~1km)
    • File: gis_roads_from_opensteetmaps.rar
      ESRI shapefiles of primary roads and all roads, extracted from OpenStreetMap
      Raw data archive @ https://download.geofabrik.de/index.html
    • File: gis-scanned2021image-quarries.kml
      KML file of quarries scanned using the imagery on Google Earth platform
    • File: population_kampala_2000-2022.csv
      Gridded population data 2000 to 2022
      Raw data archive is from LANDSCAN - https://landscan.ornl.gov
    • File: Monitoring-Kampala_USEmbassy_2017-2024.xlsx
      Summary of monitoring data collected at the US Embassy in Kampala
      Raw data archive is @ https://www.airnow.gov/international/us-embassies-and-consulates/#Uganda$Kampala
    • File: meteo_wrf_stats.xlsm
      Summary of output of WRF simulations for the Kampala region. Macros must be activated to summarize the results by month and update the charts. The tool can also be used for other cities by changing the input data.
    • File: meteo_precip-era5-reanalysis.csv
      Summary of monthly precipitation data (mm/day) from ERA5 reanalysis fields
      Raw data archive is @ https://psl.noaa.gov/data/atmoswrit/timeseries
    • File: TROPOMI_EastAfrica_NO2_Maps.zip
      Images of monthly average TROPOMI NO2 extracts covering East Africa (Uganda and Ethiopia)
      Extracted from Google Earth Engine, using 10% cloud fraction
    • File: TROPOMI_EastAfrica_CSVs.zip
      CSV files of gridded monthly average NO2, SO2, HCHO, and Ozone columnar densities
      Extracted from Google Earth Engine, using 10% cloud fraction
      Read the data descriptions and applicability of the data for analysis before using (for example, negative numbers in the SO2 file).
      NO2 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_NO2
      SO2 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_SO2
      HCHO - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_HCHO
      O3 - https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_OFFL_L3_O3
    • File: composite_emisson_factors_gains.xlsx
      A composite library of emission factors for reference
    • File: kampala_gridded_emissions_2018.rar
      Gridded emissions inventory for Kampala - PM25, PM10, SO2, NO, NO2, and CO
      PM25 is speciated into FPRM, BC, and OC (sum all three for total PM25)
      PM10 is speciated into FPRM, CPRM, BC, and OC (sum all four for total PM10)
      All emissions are in tons/year/grid
      Emissions are segregated into sectors and fuels, which are included in the filenames
    • File: kampala_gridded_modelled_monthavgp25.csv
      Gridded PM2.5 concentrations for 2018, from WRF-CAMx modelling system
      Monthly averages in ug/m3
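
    The speciation note for the gridded emissions file says that total PM25 and PM10 are recovered by summing their species. A minimal sketch of that arithmetic for one grid cell follows; the dictionary layout and the numbers are made up for illustration, and only the species names come from the file description.

```python
# Species lists as given in the file description.
PM25_SPECIES = ("FPRM", "BC", "OC")
PM10_SPECIES = ("FPRM", "CPRM", "BC", "OC")

def total_emissions(cell, species):
    """Sum the speciated emissions (tons/year/grid) for one grid cell."""
    return sum(cell[name] for name in species)

# Hypothetical grid cell; values are illustrative, not from the inventory.
cell = {"FPRM": 1.5, "CPRM": 2.0, "BC": 0.25, "OC": 0.75}

pm25 = total_emissions(cell, PM25_SPECIES)
pm10 = total_emissions(cell, PM10_SPECIES)
print(pm25, pm10)
```

    Under this reading, PM10 exceeds PM25 in a cell by exactly the coarse fraction (CPRM).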
  19. NARSTO EPA Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000)...

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). NARSTO EPA Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000) Texas Natural Resource Conservation Commission (TNRCC) continuous ambient monitoring stations (CAMS) Air Quality Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/narsto-epa-supersite-ss-houston-texas-air-quality-study-2000-texaqs2000-texas-natural-reso-93e18
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Area covered
    Houston, Texas
    Description

    NARSTO_EPA_HOUSTON_TEXAQS2000_CAMS_DATA is the North American Research Strategy for Tropospheric Ozone (NARSTO) Environmental Protection Agency (EPA) Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000) Texas Natural Resource Conservation Commission (TNRCC) continuous ambient monitoring stations (CAMS) Air Quality Data. This data set contains 5-minute air quality measurements collected in Texas during August and September 2000 at 85 CAMS during TEXAQS2000. Measurements include carbon monoxide (CO), sulfur dioxide (SO2), nitrogen oxide (NO), nitrogen dioxide (NO2), oxides of nitrogen (NOx), total reactive nitrogen species (NOy), ozone, particulate matter (PM) 2.5 mass, hydrogen sulfide (H2S), wind speed, wind direction, maximum wind gust, air temperature, dewpoint temperature, humidity, precipitation, surface pressure, radiation, and visibility. CAMS are operated by the Texas Commission on Environmental Quality (TCEQ), local city or county governments, or private monitoring networks.

    Important monitoring site information: The site information data table in each of the 85 data files may not contain the latest TCEQ site information. A companion site information spreadsheet (.csv) that lists data for all 85 sites contains the latest TCEQ site information. The site information data tables in the 85 data files will not be updated. The 85-site companion spreadsheet is the official source of site data, and this data is listed in the TEXAQS2000 CAMS guide document.

    NARSTO, which has since disbanded, was a public/private partnership whose membership spanned government, utilities, industry, and academe throughout Mexico, the United States, and Canada. The primary mission was to coordinate and enhance policy-relevant scientific research and assessment of tropospheric pollution behavior; activities provide input for science-based decision-making and determination of workable, efficient, and effective strategies for local and regional air-pollution management. Data products from local, regional, and international monitoring and research programs are still available.

  20. NC_birth_air_pollution_analysis

    • catalog.data.gov
    Updated Apr 25, 2025
    Cite
    U.S. EPA Office of Research and Development (ORD) (2025). NC_birth_air_pollution_analysis [Dataset]. https://catalog.data.gov/dataset/nc-birth-air-pollution-analysis
    Dataset updated
    Apr 25, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Dataset contains information on births in North Carolina during the study period, linked to air pollution concentrations during pregnancy. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Code files and data dictionaries can be requested from authors (rappazzo.kristen@epa.gov). Birth records can be requested through the North Carolina Department of Health and Human Services. Format: Data include csv, SAS, and R files containing information about births in North Carolina. This dataset is associated with the following publication: Krajewski, A., T. Luben, J. Warren, and K. Rappazzo. Associations between weekly gestational exposure of fine particulate matter, ozone, and nitrogen dioxide and preterm birth in a North Carolina Birth Cohort, 2003-2015. Environmental Epidemiology. Wolters Kluwer, Alphen aan den Rijn, NETHERLANDS, 7(6): e278, (2023).
