This dataset is entirely imaginary and NOT real data; it deals only with values that we created.
This dataset deals with pollution in the harbors of Chennai, Kolkata, and Visakhapatnam. Pollution there has been recorded, but it is a pain to collect all the data and arrange it in a format that interests data scientists. Hence I gathered four major pollutants and placed them neatly in a CSV file.
There is a total of 29 fields. The four pollutants (NO2, O3, SO2, and O3) each have 5 specific columns. Observations totaled. This kernel provides a good introduction to this dataset!
For observations on specific columns visit the Column Metadata on the Data tab.
I did a related project and decided to open-source our dataset so that data scientists don't need to re-scrape historical pollution data from scratch.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This comprehensive dataset, provided in CSV format, captures detailed air quality monitoring data recorded hourly from January 2022 to May 2023 at Universidad Iberoamericana in Mexico City (Geo-location: 19.372292, -99.263679). It includes a wide range of environmental parameters such as particulate matter (PM10 and PM2.5), ozone (O3), carbon monoxide (CO), temperature, and relative humidity.
The CSV file contains multiple columns representing each parameter, along with corresponding timestamps for each hour of recording. This extensive dataset offers valuable insights into air quality trends and variations over a significant period, making it a rich resource for environmental research and analysis.
This file describes the dataset used in Ou et al., "Air pollution control strategies directly limiting national health damages in the US." This work used the Global Change Assessment Model (GCAM) with state-level representation of the U.S. energy system (GCAM-USA). GCAM and GCAM-USA are developed and released by the University of Maryland/Pacific Northwest National Laboratory Joint Global Change Research Center (JGCRI). For further details, see the GCAM documentation: jgcri.github.io/gcam-doc. The model source code is available at github.com/JGCRI/gcam-core. A modified version of GCAMv4.3 was used for this analysis. Source code and input data specific for this paper are available upon request. This dataset contains Excel spreadsheets and an R script that link to comma-separated values (CSV) files that were extracted from the model output. The spreadsheets and scripts show the data and reproduce each of the figures in the paper. This dataset is associated with the following publication: Ou, Y., J. West, S. Smith, C. Nolte, and D. Loughlin. Air pollution control strategies directly limiting national health damages in the US. Nature Communications. Nature Publishing Group, London, UK, 11: 957, (2020).
This archive contains CSV dumps of all outdoor sensors that seem to have delivered valid data to the API. The data is organised in directories for each day, containing CSV files for each sensor. There may be multiple sensors in one measurement station (e.g. PM and temperature/humidity); their data will be in separate files.
The archive.luftdaten.info data is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/
Please report problems in our opendata-stuttgart issuetracker: https://github.com/opendata-stuttgart/meta/issues
Use the historical data to generate air quality statistics and to build air quality prediction models.
Time Series Data Handling and Quality Assurance Review
Most instruments had internal logging and special software to download data from the field instruments as binary files or ASCII/CSV files. Instruments whose files download as binary come with software to view the data or export it to CSV files.
One-minute resolution time-series data files were created for each house using an R script that pulled data from the csv files, aligned data by time, executed unit conversions, and translated from instruments with longer or different data intervals (e.g. 30 min formaldehyde data and 1.5 min for anemometer data). Visual review was conducted on the compiled files (and primary csv or binary files were consulted as needed) to check for translation or writing errors (especially from terminal emulator), indications of instrument malfunction, mislabeled units or unit conversion errors, mislabeled location, and time stamp errors.
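The alignment step described above can be illustrated with a minimal sketch. It is written in Python rather than the R the authors used, and it assumes that a longer-interval reading is simply carried forward until the next reading arrives; the listing does not say how the translation was actually done. The function names, the sample values, and the ppb-to-ppm conversion are all hypothetical.

```python
from datetime import datetime, timedelta


def to_one_minute_grid(readings, start, end):
    """Carry each reading forward onto a one-minute grid from start to end (inclusive).

    readings: list of (timestamp, value) pairs sorted by timestamp.
    Grid minutes that precede the first reading get None.
    """
    grid = []
    idx = -1  # index of the latest reading at or before the current minute
    minute = start
    while minute <= end:
        while idx + 1 < len(readings) and readings[idx + 1][0] <= minute:
            idx += 1
        value = readings[idx][1] if idx >= 0 else None
        grid.append((minute, value))
        minute += timedelta(minutes=1)
    return grid


def ppb_to_ppm(value):
    """Example unit conversion applied while compiling (assumed, not from the source)."""
    return None if value is None else value / 1000.0


start = datetime(2020, 1, 1, 0, 0)
# Hypothetical 30-minute formaldehyde readings in ppb.
formaldehyde = [(start, 20.0), (start + timedelta(minutes=30), 26.0)]
aligned = to_one_minute_grid(formaldehyde, start, start + timedelta(minutes=59))
aligned_ppm = [(t, ppb_to_ppm(v)) for t, v in aligned]
```

Carrying values forward keeps every instrument on the same one-minute index, which makes visual review of the compiled file easier; interpolation would be an equally plausible choice.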
The draft final set of time-series data...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set is a collection of estimated daily mean and maximum values for a range of air quality and meteorological measurements and model forecasts for the UK and crown dependencies postcode districts (e.g. 'AB') for the years 2016-2019, inclusive.
The paper describing this dataset is available here: https://www.nature.com/articles/s41597-022-01135-6
The data uses a 'concentric regions' method to estimate the measurement for all regions, as follows. If measurements exist within the region, the mean of those measurements is used. If not, a ring of neighbouring postcode regions is selected and the mean of their measurement values is used. If no measurement sites/data are found in the first ring, the process continues with the next ring of postcode district regions, working outwards until one or more sensors are found in a ring. As well as the measurement estimations, the number of rings required to find site data and make the estimations is also published. As a result, please note that estimations with higher ring counts ('rings') are likely to be calculated from more distant sensors. This distance depends upon the size of the postcode regions surrounding the location being estimated. Please use the ring count ('rings') to limit/filter estimations based on your required level of confidence.
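The ring search described above can be sketched as a breadth-first walk over neighbouring regions. This is an illustration only, not the published code: the `neighbours` adjacency mapping, the function name, and the sample regions are hypothetical, site values within a ring are pooled before averaging, and a region with its own sites is counted as 0 rings. The dataset may count rings differently.

```python
from statistics import mean


def ring_estimate(region, neighbours, measurements):
    """Estimate a value for `region` by searching outward ring by ring.

    neighbours: dict mapping region id -> iterable of adjacent region ids.
    measurements: dict mapping region id -> list of site values (may be absent).
    Returns (estimate, rings); rings is 0 when the region has its own sites.
    Returns (None, None) when no site is reachable.
    """
    visited = {region}
    current_ring = [region]
    rings = 0
    while current_ring:
        values = [v for r in current_ring for v in measurements.get(r, [])]
        if values:
            return mean(values), rings
        # Build the next ring from unvisited neighbours of the current ring.
        next_ring = []
        for r in current_ring:
            for n in neighbours.get(r, ()):
                if n not in visited:
                    visited.add(n)
                    next_ring.append(n)
        current_ring = next_ring
        rings += 1
    return None, None


# Hypothetical layout: A touches B and C, which both touch D; only D has sites.
neighbours = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
measurements = {"D": [10.0, 14.0]}
estimate_a = ring_estimate("A", neighbours, measurements)
estimate_d = ring_estimate("D", neighbours, measurements)
```

In this toy layout region A needs two rings to reach any site, so its estimate would be the kind a cautious user filters out with a threshold on 'rings'.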
The meteorological, pollen and air quality measurement data used to make the regional estimations can be found at this Zenodo archive. The data there contains Temperature, Relative Humidity, and Pressure data, downloaded from the Met Office MIDAS archives via the MEDMI server (https://www.data-mashup.org.uk/). Also downloaded from the MEDMI server are daily pollen measurements for the UK. The archive also contains PM10, PM2.5, NO2, NOx (as NO2), O3, and SO2 measurements from the DEFRA AURN network, together with model forecasts of the same pollutants made using the EMEP model.
The code used to make the estimations is available at this Zenodo archive.
The postcode data in postcode_district_data.csv are collated from several sources:
https://www.doogal.co.uk/UKPostcodes.php (population figures for the UK (UK Census 2011))
https://www.freemaptools.com/download-uk-postcode-outcode-boundaries.htm (postcode boundary polygons for UK and crown dependencies)
https://www.gov.gg/population (Guernsey (GY) population data for end June 2020)
https://www.gov.je/Government/JerseyInFigures/Population/Pages/Population.aspx (Jersey (JE) population data for end 2019)
https://www.gov.im/media/1369690/isle-of-man-in-numbers-july-2020.pdf (Isle of Man (IM) population data for April 2016)
The data-set is presented in CSV format, as six files:
postcode_district_data.csv: location metadata (region_id, geometry, description, population, country)
regional_site_counts.csv: a table showing the number of sites for each measurement (columns), for each region_id (rows). The region_id values match those in the postcode_district_data.csv file.
turing_regional_estimates_aq_daily_met_pollen_pollution_imputed_data.csv: uses imputed site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)
turing_regional_estimates_aq_daily_met_pollen_pollution_original_data.csv: uses original site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)
turing_regional_estimates_aq_loc_type_daily_imputed_data.csv: uses imputed site data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)
turing_regional_estimates_aq_loc_type_daily_original_data.csv: uses original data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)
* AQ site location types:
Industrial: comprises 'urban industrial' (9 sites) and 'suburban industrial' (2 sites)
'Rural background' (14 sites)
'Urban background' (48 sites)
'Urban traffic' (47 sites)
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains Air Quality Index (AQI) at hourly and daily level of various stations across multiple cities.
station_day.csv → 108035 rows
city_day.csv → 29531 rows
city_hour.csv → 707875 rows
station_hour.csv → 2589083 rows
stations.csv → 230 rows
Total rows across all files: 3434754
StationId → Unique identifier for each air quality monitoring station. Useful for referencing and merging data.
StationName → Name of the monitoring station. Helps in identifying stations in reports and visualizations.
City → The city where the station is located. Can be used to aggregate data at city level.
State → The state where the station is located. Useful for regional analysis.
Status → Indicates whether the station is active, inactive, or under maintenance. Helps filter reliable data.
City → Name of the city where the measurement was taken.
Datetime → Timestamp of the measurement. Could be hourly or daily. Important for time series analysis.
PM2.5 → Fine particulate matter ≤ 2.5 μm in diameter. Indicates air pollution that penetrates deep into the lungs.
PM10 → Particulate matter ≤ 10 μm. Measures larger airborne particles.
NO → Nitric oxide, a primary pollutant mainly from vehicles and industry.
NO2 → Nitrogen dioxide, harmful to respiratory health. Often a marker of traffic pollution.
NOx → Nitrogen oxides (NO + NO2). Total nitrogen oxide pollution.
NH3 → Ammonia, released from agriculture and waste. Can contribute to secondary particulate matter.
CO → Carbon monoxide, a toxic gas from incomplete combustion.
SO2 → Sulfur dioxide, mainly from burning fossil fuels. Causes acid rain and respiratory issues.
O3 → Ozone at ground level, harmful to lungs. A secondary pollutant formed by reactions of NOx and VOCs.
Benzene, Toluene, Xylene → Volatile organic compounds (VOCs) from industry and vehicles. Can be toxic and form ozone.
AQI → Air Quality Index, a standardized number summarizing pollution level across multiple pollutants.
AQI_Bucket → Category of AQI (Good, Moderate, Poor, etc.), making interpretation easier.
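As a small illustration of using StationId for merging and City for aggregation, the sketch below joins station metadata to station-level readings and averages PM2.5 per city. The sample rows, the city labels, and the exact file layout are hypothetical; only the column names StationId, City, Datetime and PM2.5 come from the descriptions above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for stations.csv and station_day.csv.
stations_csv = """StationId,StationName,City,State,Status
S1,Station One,CityA,StateA,Active
S2,Station Two,CityA,StateA,Active
S3,Station Three,CityB,StateB,Active
"""
station_day_csv = """StationId,Datetime,PM2.5
S1,2020-01-01,100.0
S2,2020-01-01,200.0
S3,2020-01-01,50.0
S3,2020-01-02,
"""


def city_mean_pm25(stations_text, readings_text):
    """Join readings to stations on StationId and average PM2.5 per city."""
    city_of = {
        row["StationId"]: row["City"]
        for row in csv.DictReader(io.StringIO(stations_text))
    }
    totals = defaultdict(lambda: [0.0, 0])  # city -> [sum, count]
    for row in csv.DictReader(io.StringIO(readings_text)):
        city = city_of.get(row["StationId"])
        if city is None or row["PM2.5"] == "":
            continue  # skip unknown stations and missing readings
        totals[city][0] += float(row["PM2.5"])
        totals[city][1] += 1
    return {city: total / count for city, (total, count) in totals.items()}


city_means = city_mean_pm25(stations_csv, station_day_csv)
```

Filtering on the Status column before the join would be a natural extension when only active stations should contribute.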
Air pollution presents a significant environmental risk, impacting human health, accelerating climate change, and disrupting ecosystems. The main aim of air pollution research is to pinpoint the most harmful pollutants identified in previous studies and to map regions exposed to high pollution levels. This study introduces a large-scale, high-quality dataset to advance the analysis of PM2.5 pollution and reveal hidden patterns through pattern mining techniques. The dataset covers five years of hourly PM2.5 measurements collected from approximately 1,900 sensors across Japan, sourced from the Ministry of the Environment's Soramame platform. This platform offers hourly pollutant records, downloadable as monthly raw data files. The unorganised raw data files are systematically organised and stored in database tables using an Entity-Relationship (ER) schema. The primary objective of this dataset is to aid in developing and validating pattern mining models, enabling the accurate detection of...
The air pollution data was collected from Japan’s Soramame platform, which provides hourly updates on pollutant levels nationwide. The data files were collected from January 1, 2018, 01:00:00, to April 25, 2023, 22:00:00, covering records from approximately 1,900 sensors stationed in various locations across Japan. These files are initially unorganised in CSV format and require systematic organisation by year, month, time, sensor, and pollutant type. To maintain data integrity, we structured the dataset using an Entity-Relationship (ER) schema within a PostgreSQL database, comprising two main tables: the Sensor table (storing sensor name, ID, address, and location) and the Observations table (recording pollutant types and their values). A detailed step-by-step process is provided in the README, and this organization created a consolidated CSV file containing PM2.5 levels, timestamps, and sensor details.
# AEROS PM2.5 Dataset
The AEROS PM2.5 Dataset provides a comprehensive collection of hourly PM2.5 measurements recorded over a period of five years from sensors located across Japan. This dataset is a valuable resource for studying air quality trends, pollution patterns, and environmental health impacts.
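The two-table layout mentioned in the dataset description (a Sensor table and an Observations table) might look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and every table name, column name, and sample row is an assumption rather than the authors' actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a PostgreSQL database
conn.executescript("""
CREATE TABLE sensor (
    sensor_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    address   TEXT,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE observation (
    sensor_id   INTEGER NOT NULL REFERENCES sensor(sensor_id),
    observed_at TEXT NOT NULL,
    pollutant   TEXT NOT NULL,
    value       REAL,
    PRIMARY KEY (sensor_id, observed_at, pollutant)
);
""")

# Hypothetical rows; real sensor metadata would come from the raw monthly files.
conn.execute(
    "INSERT INTO sensor VALUES (1, 'Sensor A', 'Example address', 35.0, 139.0)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?, ?)",
    [
        (1, "2018-01-01 01:00:00", "PM2.5", 12.0),
        (1, "2018-01-01 02:00:00", "PM2.5", 15.0),
        (1, "2018-01-01 01:00:00", "NO2", 8.0),
    ],
)

# Flatten to the kind of consolidated PM2.5 table the description mentions.
rows = conn.execute("""
    SELECT s.name, o.observed_at, o.value
    FROM observation AS o
    JOIN sensor AS s USING (sensor_id)
    WHERE o.pollutant = 'PM2.5'
    ORDER BY o.observed_at
""").fetchall()
```

Keeping sensor metadata in its own table avoids repeating the address and location on every hourly row, and the final query shows how a single consolidated PM2.5 file could be exported from such a schema.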
FINAL_DATASET.csv
The dataset includes the following columns:
https://creativecommons.org/publicdomain/zero/1.0/
📘 Overview
This dataset provides hourly air-quality measurements for 50 major global cities over a continuous 15-day period, including pollutant concentrations, meteorological conditions, geographical metadata, and an engineered AQI index.
All values are synthetically generated using historically consistent pollutant patterns and statistical ranges, allowing researchers and ML practitioners to work with realistic air-quality trends without licensing restrictions or data-collection barriers.
This dataset is ideal for time-series modeling, forecasting, environmental analytics, and machine-learning experimentation.
🧭 Cities Included
Covers all major regions:
North America — New York, Los Angeles, Toronto
Europe — London, Paris, Berlin, Zurich
Asia — Delhi, Tokyo, Seoul, Beijing, Singapore
Middle East — Dubai, Riyadh, Doha
Africa — Lagos, Cairo, Nairobi
Oceania — Sydney, Melbourne, Auckland
South America — São Paulo, Buenos Aires
🧱 Dataset Structure
Each hourly record includes:
Air Pollutants
PM2.5 (µg/m³)
PM10 (µg/m³)
NO₂ (ppb)
SO₂ (ppb)
O₃ (ppb)
CO (ppm)
Weather Features
Temperature (°C)
Humidity (%)
Wind Speed (m/s)
Location Metadata
City
Country
Latitude
Longitude
Other
Timestamp (ISO-8601)
AQI (Computed index)
🧹 Data Quality & Formatting
No missing values — 100% complete
Numeric values rounded to 3 decimals
Clean column names (snake_case)
Consistent hourly frequency
Fully ML-ready
📊 Example Use Cases
✔ AQI forecasting (LSTM, GRU, Transformers)
✔ Multivariate time-series modeling
✔ Clustering cities by pollution patterns
✔ Environmental trend visualization
✔ Weather–pollution correlation studies
✔ Anomaly detection (peak pollution events)
| Column | Description | Unit | Type |
|---|---|---|---|
| timestamp | Hourly timestamp (UTC) | — | datetime |
| city | City name | — | string |
| country | Country name | — | string |
| latitude | City latitude | ° | float |
| longitude | City longitude | ° | float |
| pm25 | Fine particulate matter | µg/m³ | float |
| pm10 | Coarse particulate matter | µg/m³ | float |
| no2 | Nitrogen dioxide | ppb | float |
| so2 | Sulfur dioxide | ppb | float |
| o3 | Ozone | ppb | float |
| co | Carbon monoxide | ppm | float |
| temperature | Ambient temperature | °C | float |
| humidity | Relative humidity | % | float |
| wind_speed | Wind speed | m/s | float |
| aqi | Derived Air Quality Index | — | int |
🧪 Data Generation Method (Provenance)
This dataset is synthetically generated using realistic pollutant behavior patterns based on historical studies and open-source environmental datasets.
Modeling steps included:
City-specific pollutant baseline ranges
Randomized variation using Gaussian noise
Temporal patterns using sinusoidal diurnal cycles (morning & evening peaks)
Weather-pollution correlation rules (e.g., low wind → higher PM)
AQI computed using standard US-EPA breakpoints
All numeric values standardized to 3-decimal precision
This ensures that although synthetic, the dataset follows realistic environmental dynamics.
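The modeling steps listed above can be sketched as follows. This is not the dataset author's generator: the amplitudes, the wind threshold, the noise level, and the function names are invented for illustration. The PM2.5 breakpoint table is the one commonly cited before the 2024 US-EPA revision and should be checked against the current EPA documentation before any real use.

```python
import math
import random

# (conc_low, conc_high, aqi_low, aqi_high) for PM2.5 in µg/m³.
# Assumed pre-2024 values; verify against the current EPA table.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 500.4, 301, 500),
]


def aqi_from_pm25(conc):
    """Piecewise-linear interpolation of AQI between breakpoints."""
    conc = math.floor(max(0.0, conc) * 10) / 10  # truncate to one decimal place
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= conc <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo)
    return 500  # above the last breakpoint: cap at the top of the scale


def synthetic_pm25(hour, baseline, wind_speed, rng):
    """One hourly PM2.5 value: baseline + diurnal cycle + wind rule + noise."""
    # A 12-hour cosine peaks at 08:00 and 20:00 (morning and evening peaks).
    diurnal = 0.3 * baseline * math.cos(4 * math.pi * (hour - 8) / 24)
    wind_factor = 1.2 if wind_speed < 2.0 else 1.0  # low wind -> higher PM
    noise = rng.gauss(0.0, 0.05 * baseline)  # Gaussian variation
    value = (baseline + diurnal) * wind_factor + noise
    return round(max(0.0, value), 3)  # 3-decimal precision, never negative


rng = random.Random(0)  # fixed seed so the sketch is reproducible
series = [synthetic_pm25(h % 24, 40.0, 3.0, rng) for h in range(48)]
aqi_series = [aqi_from_pm25(v) for v in series]
```

A real generator would repeat this per pollutant and take the maximum sub-index as the overall AQI; the sketch covers PM2.5 only.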
📁 File Information
global_air_quality_50_cities.csv
Rows: 18,000+
Columns: 16
Format: UTF-8 CSV
Files contain values from Figures 1, 2, and 3 of the article by Pye et al., "Leveraging scientific community knowledge for air quality model chemistry parameterizations," scheduled for publication in EM in January 2024. Figures 2 and 3 are available in CSV and Excel spreadsheet format. Figure 1 is only available in spreadsheet format. Figure 1 shows gas and aerosol-phase chemistry representations in CMAQ since 2010. Figure 2 shows ozone and SOA formation potential (in g/g) for CRACMM species. Figure 3 shows the size (number of species and reactions) for various chemical mechanisms. This dataset is associated with the following publication: Pye, H., R. Schwantes, K. Barsanti, V.F. McNeill, and G. Wolfe. Leveraging scientific community knowledge for air quality model chemistry parameterizations. EM Magazine. Air and Waste Management Association, Pittsburgh, PA, USA, 24-31, (2024).
The National Park Air Quality Index dataset (NPS-AQI) consists of webcam images taken from the National Park Service's publicly available air quality web cameras and associated measurements for air pollutants, AQI, and meteorological data obtained via the publicly available NPS Gaseous Pollutant Monitoring Program. The full dataset is a collection of 146,822 images paired with air quality measurements. The specific measurements reported are: ozone ppm, 8-hour running average ozone ppm, SO2 ppm, AQI (derived from ozone), temperature, and humidity. The images are 1500x1000 pixel PNG files arranged into folders by NPS site and named according to the time and date the image was taken. There are three CSV files (representing "training", "validation", and "testing" image splits) containing image names and associated NPS site names, air pollutant measurements, and meteorological data.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains annual total releases for the past ten years by media (air, water or land), broken down by province, industry or substance. Files are in .CSV format. The results can be further broken down using the pre-defined search available at the bottom of the NPRI Data Search webpage. The results returned by the NPRI search engine may differ from the numbers contained in the downloadable files. The online search engine’s results will display releases, disposals and transfers reported by facilities, but do not distinguish between media type (i.e. air, water, land). It also displays facilities reporting only under Ontario Regulation 127/01 and facilities submitting “did not meet criteria” reports.
Please consult the following resources to enhance your analysis:
- Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
- Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html
Supplemental Information
More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html
Supporting Projects: National Pollutant Release Inventory (NPRI)
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains data from 1993 to the latest reporting year. These CSV format datasets are in normalized or ‘list’ format and are optimized for pivot table analyses. Here is a description of each file:
- The RELEASES file contains all substance release quantities.
- The DISPOSALS file contains all on-site and off-site disposal quantities, including tailings and waste rock (TWR).
- The TRANSFERS file contains all quantities transferred for recycling or treatment prior to disposal.
- The COMMENTS file contains all the comments provided by facilities about substances included in their report.
- The GEO LOCATIONS file contains complete geographic information for all facilities that have reported to the NPRI.
Please consult the following resources to enhance your analysis:
- Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
- Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html
Supplemental Information
More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html
Supporting Projects: National Pollutant Release Inventory (NPRI)
This data package contains mean values for dissolved organic carbon (DOC) and dissolved inorganic carbon (DIC) for water samples taken from the East River Watershed in Colorado. The East River is part of the Watershed Function Scientific Focus Area (WFSFA) located in the Upper Colorado River Basin, United States. DOC and DIC concentrations in water samples were determined using a TOC-VCPH analyzer (Shimadzu Corporation, Japan). DOC was analyzed as non-purgeable organic carbon (NPOC) by purging HCl acidified samples with carbon-free air to remove DIC prior to measurement. After the acidified sample has been sparged, it is injected into a combustion tube filled with oxidation catalyst heated to 680 degrees C. The DOC in samples is combusted to CO2 and measured by a non-dispersive infrared (NDIR) detector. The peak area of the analog signal produced by the NDIR detector is proportional to the DOC concentration of the sample. DIC was determined by first acidifying the samples with HCl and then purging with carbon-free air to release CO2 for analysis by the NDIR detector. All files are labeled by location and variable, and the data reported are mean values of at least three replicate measurements with a relative standard deviation < 3%. All samples were analyzed under a rigorous quality assurance and quality control (QA/QC) process as detailed in the methods.
This data package contains (1) a zip file (dic_npoc_data_2014-2023.zip) containing a total of 319 files: 318 data files of DIC and NPOC data from across the Lawrence Berkeley National Laboratory (LBNL) Watershed Function Scientific Focus Area (SFA), reported in .csv files per location, and a locations.csv (1 file) with latitude and longitude for each location; (2) a file-level metadata (v3_20230808_flmd.csv) file that lists each file contained in the dataset with associated metadata; (3) a data dictionary (v3_20230808_dd.csv) file that contains terms/column_headers used throughout the files along with a definition, units, and data type; and (4) PDF and docx files for the determination of Method Detection Limits (MDLs) for DIC and NPOC data. There are a total of 106 locations containing DIC/NPOC data.
Update on 2020-10-07: Updated the data files to remove times from the timestamps, so that only dates remain. The data values have not changed.
Update on 2021-04-11: Added Determination of Method Detection Limits (MDLs) for DIC, NPOC and TDN Analyses document, which can be accessed as a PDF or with Microsoft Word.
Update on 2022-06-10: Versioned updates to this dataset were made along with these changes: (1) updated dissolved inorganic carbon and dissolved organic carbon data for all locations up to 2021-12-31, (2) removed units from column headers in data files, (3) added a row underneath the headers to contain units of variables, (4) restructured units to comply with CSV reporting format requirements, (5) added -9999 for empty numerical cells, and (6) added the file-level metadata (flmd.csv) and data dictionary (dd.csv) to comply with the File-Level Metadata Reporting Format.
Update on 2022-09-09: Updates were made to reporting format specific files (file-level metadata and data dictionary) to correct swapped file names, add additional details on metadata descriptions on both files, add a header_row column to enable parsing, and add version number and date to file names (v2_20220909_flmd.csv and v2_20220909_dd.csv).
Update on 2023-08-08: Updates were made to both the data files and reporting format specific files. New available anion data was added, up until 2023-01-05. The file level metadata and data dictionary files were updated to reflect the additional data added.
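The 2022-06-10 update describes two conventions that a reader has to handle: a units row directly beneath the header and -9999 in empty numerical cells. A minimal sketch of such a reader follows; the column names and sample values are hypothetical, and the real names are listed in the package's data dictionary file.

```python
import csv
import io

MISSING_VALUE = -9999.0  # placeholder for empty numerical cells


def read_with_units_row(text):
    """Parse CSV text whose first row is the header and second row holds units.

    Returns (units, records): units maps column name -> unit string, and
    records is a list of dicts with the missing-value placeholder mapped to None.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    units = dict(zip(header, next(reader)))
    records = []
    for row in reader:
        record = {}
        for name, cell in zip(header, row):
            try:
                number = float(cell)
            except ValueError:
                record[name] = cell  # non-numeric cell, e.g. a date string
            else:
                record[name] = None if number == MISSING_VALUE else number
        records.append(record)
    return units, records


# Hypothetical file content; column names and units are placeholders.
sample = """date,dic_example,npoc_example
YYYY-MM-DD,mg_per_L,mg_per_L
2021-05-01,12.5,-9999
2021-05-02,-9999,3.25
"""
units, records = read_with_units_row(sample)
```

Mapping the placeholder to None early keeps -9999 from leaking into means or plots, which is the usual hazard with sentinel values.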
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The hourly updates of ground monitoring observations in China are collected and stored in .csv files (one for each hour). NOTE: the pm25s data does NOT report locations for the reporting stations in x, y coordinates.
https://data.gov.tw/license
Gridded emission data for non-point-source air pollutants. The estimation base year is year 108 of the Republic of China calendar (2019), and the grid covers Taiwan's main island and outlying islands. Grid cells are referenced by the coordinates of their lower-left corner point in TWD97-TM2, using central meridians of 121 and 119 respectively. The air pollutants include TSP, PM10, PM6 (no estimation), PM2.5, SOx, NOx, THC, NMHC, CO, and Pb. Because the data is extremely large and exceeds the capacity of the CSV file format, a complete download is best obtained from the following website: https://air.epa.gov.tw/EnvTopics/AirQuality_6.aspx. Go to "Attachment Download" at the bottom of the page and click "Emission Inventory TEDS11.1 Raw Data".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides hourly air quality and meteorological measurements recorded from March 1, 2024, to November 30, 2024, at Universidad Iberoamericana, Mexico City (Geo-location: 19.372292, -99.263679). Each value represents the average of measurements taken within that hour. The data spans the rainy season (approximately May to October) and part of the ozone season (approximately February to May, with overlap into March–April), capturing air pollution and weather patterns in an urban, high-altitude environment.
The dataset includes the following parameters:
Ozone (O₃) (ppb) – Hourly average ozone concentration.
Temperature (°C) – Hourly average ambient temperature.
Relative Humidity (%) – Hourly average humidity.
Wind Direction (degrees) – Hourly average wind direction (0° = North).
Wind Speed (m/s) – Hourly average wind speed.
Air Pressure (hPa) – Hourly average atmospheric pressure.
Accumulated Precipitation (mm) – Total precipitation accumulated within the hour.
Solar Radiation (W/m²) – Hourly average solar radiation.
File Format: CSV (comma-separated)
Data Structure: Each row represents an hourly measurement with a timestamp formatted as DD-MM-YY HH:MM (e.g., 01-03-24 14:00), using 24-hour time.
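Because the day comes first and the year has two digits, the timestamp is easy to misread as month-first. A short sketch of parsing it explicitly with Python's standard library is shown here; the function name is hypothetical, and the listing does not state a time zone, so the result is a naive datetime.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%d-%m-%y %H:%M"  # DD-MM-YY HH:MM, 24-hour clock


def parse_timestamp(text):
    """Parse a DD-MM-YY HH:MM timestamp into a naive datetime."""
    return datetime.strptime(text.strip(), TIMESTAMP_FORMAT)


ts = parse_timestamp("01-03-24 14:00")  # 1 March 2024, 14:00
```

Passing the format explicitly avoids the ambiguity a generic date parser would face with 01-03-24, which could otherwise be read as 3 January.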
Sensor Details & Data Quality: Measurements were collected using low-cost air quality sensors deployed at the IBERO campus.
Applications: This dataset is suitable for air pollution analysis (e.g., ozone trends during the ozone season), meteorological research (e.g., precipitation and temperature patterns in the rainy season), and environmental modeling in Mexico City.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper is under review
Abstract:
Kampala, the political and economic capital of Uganda and one of the fastest urbanising cities in sub-Saharan Africa, is experiencing a deteriorating trend in air quality with emissions from multiple diffused local sources like transportation, domestic and outdoor cooking, and industries, and sources outside the city airshed like seasonal open fires in the region. PM2.5 (particulate matter under 2.5 µm in size) is the key pollutant of concern in the city with monthly spatial heterogeneity of 60-100 µg/m³. Outdoor air pollution is distinctly pronounced in cities of the global south, which lack the necessary capacity and resources to develop integrated air quality management programmes including ambient monitoring, emissions and pollution analysis, source apportionment, and preparation of clean air action plans. This paper presents an integrated assessment of air quality in Kampala drawing from ground measurements (from a hybrid network of stations), satellite observations (from NASA’s MODIS and OMI), global reanalysis fields (from GEOS-Chem and CAMS simulations), high resolution (~1km) multi-pollutant emissions inventory for the airshed, WRF-CAMx based PM2.5 pollution analysis, and a qualitative review of institutional and policy environment for air quality management in Kampala. The proposed clean air action plans aim for better air quality in the region using a combination of short-, medium-, and long-term emission control measures for all the dominant sources and institutionalize pollution tracking mechanisms (like emissions and pollution monitoring and reporting) for effective management of air pollution.
This data archive serves as a supplement to the journal article; a short description of the files follows:
NARSTO_EPA_HOUSTON_TEXAQS2000_CAMS_DATA is the North American Research Strategy for Tropospheric Ozone (NARSTO) Environmental Protection Agency (EPA) Supersite (SS) Houston, Texas Air Quality Study 2000 (TexAQS2000) Texas Natural Resource Conservation Commission (TNRCC) continuous ambient monitoring stations (CAMS) Air Quality Data. This data set contains 5-minute air quality measurements collected in Texas during August and September 2000 at 85 CAMS during TEXAQS2000. Measurements include carbon monoxide (CO), sulfur dioxide (SO2), nitrogen oxide (NO), nitrogen dioxide (NO2), oxides of nitrogen (NOx), total reactive nitrogen species (NOy), ozone, particulate matter (PM) 2.5 mass, hydrogen sulfide (H2S), wind speed, wind direction, maximum wind gust, air temperature, dewpoint temperature, humidity, precipitation, surface pressure, radiation, and visibility. CAMS are operated by the Texas Commission on Environmental Quality (TCEQ), local city or county governments, or private monitoring networks. Important monitoring site information: The site information data table in each of the 85 data files may not contain the latest TCEQ site information. A companion file site information spreadsheet (.csv) that lists data for all 85 sites is the latest TCEQ site information. The site information data tables in the 85 data files will not be updated. The 85 site spreadsheet companion document is the official source of site data, and this data is listed in the TEXAQS2000 CAMS guide document.
NARSTO, which has since disbanded, was a public/private partnership, whose membership spanned across government, utilities, industry, and academe throughout Mexico, the United States, and Canada.
The primary mission was to coordinate and enhance policy-relevant scientific research and assessment of tropospheric pollution behavior; activities provide input for science-based decision-making and determination of workable, efficient, and effective strategies for local and regional air-pollution management. Data products from local, regional, and international monitoring and research programs are still available.
Dataset contains information on births in North Carolina during the study period, linked to air pollution concentrations during pregnancy. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Code files and data dictionaries can be requested from authors (rappazzo.kristen@epa.gov). Birth records can be requested through the North Carolina Department of Health and Human Services. Format: Data include csv, SAS, and R files containing information about births in North Carolina. This dataset is associated with the following publication: Krajewski, A., T. Luben, J. Warren, and K. Rappazzo. Associations between weekly gestational exposure of fine particulate matter, ozone, and nitrogen dioxide and preterm birth in a North Carolina Birth Cohort, 2003-2015. Environmental Epidemiology. Wolters Kluwer, Alphen aan den Rijn, NETHERLANDS, 7(6): e278, (2023).