The EPA GitHub repository PAU4Chem, as described in its README.md file, contains Python scripts written to build the PAU dataset modules (technologies, capital and operating costs, and chemical prices) for tracking chemical flow transfers, estimating releases, and identifying potential occupational exposure scenarios in pollution abatement units (PAUs). These PAUs are employed for on-site chemical end-of-life management. The datasets folder contains the outputs for each framework step, and Chemicals_in_categories.csv lists the chemicals in the TRI chemical categories. The EPA GitHub repository PAU_case_study, as described in its readme.md entry, contains the Python scripts to run the manuscript case study for designing the PAUs, the data-driven models, and the decision-making module for chemicals of concern and for tracking flow transfers at the end-of-life stage. The data were obtained through data engineering on several publicly available databases: chemical properties were gathered using the GitHub repository Properties_Scraper, and the PAU dataset was built using the repository PAU4Chem. Finally, the EPA GitHub repository Properties_Scraper contains a Python script to gather exposure limits and physical properties in bulk from several publicly available sources: EPA, NOAA, OSHA, and the Institute for Occupational Safety and Health of the German Social Accident Insurance (IFA). All three repositories document the Python libraries required to run their code, how to use them, the output files produced after running the Python script modules, and the corresponding EPA Disclaimer. This dataset is associated with the following publication: Hernandez-Betancur, J.D., M. Martin, and G.J. Ruiz-Mercado. A data engineering framework for on-site end-of-life industrial operations. Journal of Cleaner Production, Elsevier Science Ltd, New York, NY, USA, 327: 129514 (2021).
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset "World's Air Quality and Water Pollution" was obtained from Jack Jae Hwan Kim's Kaggle page. It comprises five columns: "City", "Region", "Country", "Air Quality", and "Water Pollution". The last two hold values from 0 to 100: "Air Quality" ranges from 0 (worst quality) to 100 (best quality), and "Water Pollution" ranges from 0 (no pollution) to 100 (extreme pollution).
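As a sketch of how the five columns might be loaded and summarized with pandas; the sample rows below are illustrative stand-ins, not actual values from the dataset, and the exact header strings may differ in the Kaggle file:

```python
import io
import pandas as pd

# Illustrative sample in the dataset's five-column layout;
# these rows are made up, not real measurements.
sample = io.StringIO(
    "City,Region,Country,Air Quality,Water Pollution\n"
    "Oslo,Oslo,Norway,90,10\n"
    "Delhi,Delhi,India,30,70\n"
    "Sydney,New South Wales,Australia,80,20\n"
)
df = pd.read_csv(sample)

# Higher "Air Quality" is better (100 = best); higher "Water Pollution" is worse.
cleanest_air = df.loc[df["Air Quality"].idxmax(), "City"]
mean_pollution = df["Water Pollution"].mean()
print(cleanest_air, mean_pollution)
```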
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Eaton Fire in Southern California contaminated homes across Altadena, Pasadena, and Sierra Madre with hazardous ash and soot. Residents feared serious health risks, prompting Eaton Fire Residents United (EFRU), a grassroots coalition of community members, to begin collecting, compiling, and publicly sharing residential contamination testing data produced by professional industrial hygienists.
This dataset contains anonymized, professionally collected contamination test results from over 200 affected homes. Researchers and advocates can leverage this dataset to study contamination patterns and support evidence-based policy improvements related to wildfire recovery and public health.
More information about EFRU as well as a live version of this information is available at www.efru.la
The included Datasheet for Dataset provides comprehensive details about the dataset's contents, methods of collection, data anonymization practices, and suggested use-cases.
The SOP document contains detailed procedures for processing the original resident-provided test reports to anonymize and compile the data.
A basic Python Notebook is included for loading, exploring, and visualizing the dataset. This script should facilitate researchers getting started with the dataset.
The dataset includes contaminant levels measured inside 201 homes, in CSV format; the included Datasheet describes the fields each row provides.
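A minimal sketch of loading and summarizing the results CSV with pandas, in the spirit of the included notebook; the column names below (Home_ID, Contaminant, Result) are hypothetical placeholders, since the actual schema is documented in the Datasheet:

```python
import io
import pandas as pd

# Hypothetical rows in a plausible long format; the real column
# names and units are described in the dataset's Datasheet.
sample = io.StringIO(
    "Home_ID,Contaminant,Result\n"
    "H001,Lead,120.0\n"
    "H001,Arsenic,4.2\n"
    "H002,Lead,85.0\n"
)
df = pd.read_csv(sample)

# Per-contaminant summary across homes.
summary = df.groupby("Contaminant")["Result"].agg(["count", "mean"])
print(summary)
```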
The dataset will be updated periodically as additional residential testing results become available. Further releases as well as minor corrections are expected as this is an ongoing effort by community volunteers.
We express our profound gratitude to community members who have voluntarily shared their reports and helped us compile this dataset, all in pursuit of rebuilding a healthier, safer community. We also acknowledge Jennifer Cotton, Jordan Boye, and Dawn Fanning for their contributions as well as the entire EFRU community.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GlobalHighPM2.5 is part of a series of long-term, seamless, global, high-resolution, and high-quality datasets of air pollutants over land (i.e., GlobalHighAirPollutants, GHAP). It is generated from big data sources (e.g., ground-based measurements, satellite remote sensing products, atmospheric reanalysis, and model simulations) using artificial intelligence, taking into account the spatiotemporal heterogeneity of air pollution.
This dataset contains the input data, analysis code, and generated dataset used for the following article. If you use the GlobalHighPM2.5 dataset in your scientific research, please cite the following reference (Wei et al., NC, 2023):
Wei, J., Li, Z., Lyapustin, A., Wang, J., Dubovik, O., Schwartz, J., Sun, L., Li, C., Liu, S., and Zhu, T. First close insight into global daily gapless 1 km PM2.5 pollution, variability, and health impact. Nature Communications, 2023, 14, 8349. https://doi.org/10.1038/s41467-023-43862-3
Input Data
Raw data for each figure in the manuscript, compiled into a single sheet within an Excel document.
Code
Python scripts for replicating and plotting the analysis results in the manuscript, as well as code for converting data formats.
Generated Dataset
Here is the first big data-derived seamless (spatial coverage = 100%) daily, monthly, and yearly 1 km (i.e., D1K, M1K, and Y1K) global ground-level PM2.5 dataset over land from 2017 to the present. This dataset exhibits high quality, with cross-validation coefficients of determination (CV-R²) of 0.91, 0.97, and 0.98, and root-mean-square errors (RMSEs) of 9.20, 4.15, and 2.77 µg m⁻³ on the daily, monthly, and annual bases, respectively.
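The CV-R² and RMSE figures quoted above are standard validation metrics computed from paired observed/retrieved PM2.5 samples; a minimal NumPy sketch with made-up numbers (not values from the dataset):

```python
import numpy as np

def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def r2(obs, pred):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up station-vs-retrieval PM2.5 values (µg/m³), for illustration only.
obs = [12.0, 35.0, 80.0, 55.0]
pred = [10.0, 38.0, 75.0, 57.0]
print(rmse(obs, pred), r2(obs, pred))
```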
Due to data volume limitations:
all (including daily) data for the year 2022 is accessible at: GlobalHighPM2.5 (2022)
all (including daily) data for the year 2021 is accessible at: GlobalHighPM2.5 (2021)
all (including daily) data for the year 2020 is accessible at: GlobalHighPM2.5 (2020)
all (including daily) data for the year 2019 is accessible at: GlobalHighPM2.5 (2019)
all (including daily) data for the year 2018 is accessible at: GlobalHighPM2.5 (2018)
all (including daily) data for the year 2017 is accessible at: GlobalHighPM2.5 (2017)
continuously updated...
More GHAP datasets for different air pollutants are available at: https://weijing-rs.github.io/product.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dissipative barriers are a class of perturbations of differential operators that can be used to help compute eigenvalues numerically. This dataset comprises numerical computations of the eigenvalues of three examples of differential operators perturbed by dissipative barriers. These were used in the paper ''Spectral inclusion and pollution for a class of dissipative perturbations'' (DOI: 10.1063/5.0028440, freely available at arXiv:2006.10097) as illustrations of theoretical results. Please see that paper, in particular Section 5, for more precise information on the contents of the dataset. The files are NumPy arrays (Python programming language) saved as pickle files and can be opened using the pickle package (see https://docs.python.org/3/library/pickle.html).
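As a sketch of the storage format described above, assuming the files hold pickled NumPy arrays of (complex) eigenvalues; the round trip below uses an in-memory buffer, and the filename in the comment is a placeholder:

```python
import io
import pickle

import numpy as np

# A NumPy array of complex eigenvalues, pickled and unpickled,
# mirroring the format described for the dataset files.
eigs = np.array([1.0 + 0.5j, 2.0 - 0.1j, 3.5 + 0.0j])
buf = io.BytesIO()
pickle.dump(eigs, buf)
buf.seek(0)
loaded = pickle.load(buf)

# For the dataset files themselves, the pattern would be:
#   with open("some_eigenvalues_file.pkl", "rb") as f:  # placeholder name
#       arr = pickle.load(f)
print(loaded.dtype, np.allclose(eigs, loaded))
```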
🌫️ Air Pollution Trends in Indian Cities - Data Analysis Project

🎯 Objective
The primary objective of this project is to analyze air pollution trends across Indian cities by studying various pollutant concentrations, their correlations, and seasonal as well as geographic variations. The analysis is intended to identify high-risk areas and emphasize the need for targeted interventions to improve public health and air quality.

📊 Analytical Insights

1️⃣ Date & Time-Based Analysis
- PM2.5 & PM10 Trends 📅: PM2.5 levels remained relatively stable with a slight rise in 2024 (125.848). PM10 peaked in 2018 (217.195) and hit a low in 2015 (213.597).
- AQI Concentration Across Months 🌤️: October recorded the highest average AQI (56.39), followed by July and April. December had the lowest AQI (54.60).
- NO2 Concentration in the Last 5 Years 🔴: Highest in November 2024 (206.69), with significant peaks in May 2021 and March 2020.
- SO2 Emissions Over the Last 5 Years 🌫️: Highest in December (202.69), lowest in February (198.62).
- Weekday vs. Weekend Pollution Levels 📆: East India saw higher AQI on weekends; North, South, and West India had higher AQI on weekdays.

2️⃣ State & Location-Based Analysis
- Most Polluted State (AQI) 🚨: Uttar Pradesh had the highest average AQI (162.61).
- SO2 Levels - North vs. South 🏭: South India recorded slightly higher SO2 emissions than the North.
- Top 5 Most Polluted Locations (PM10) 🌆: Amritsar (Punjab) - 217.64; Jodhpur (Rajasthan) - 217.41; Surat (Gujarat) - 217.01; Vadodara (Gujarat) - 216.92; New Delhi (Delhi) - 216.61.
- Cleanest Locations (PM2.5 Levels) 🌱: Vadodara (122.47), Udaipur (123.32), Dwarka (123.76).

3️⃣ Pollutant-Specific Insights
- Most Drastic Increase in PM2.5 🚩: Punjab (Industrial Area) recorded 127.10 PM2.5.
- Highest SO2 Levels ☣️: Gujarat (Sensitive Area): 812.39.
- Biggest Contributor to Air Pollution 🏭: Maharashtra topped SO2, CO, NO2, and PM10 emissions.
- Correlation Between Pollutants 🔗: SO2 & NO2 showed a slight negative correlation (-0.0027); PM10 & PM2.5 a near-zero correlation (-0.0013).

4️⃣ Seasonal & Comparative Trends
- Most Polluted Season 🍂: Winter had the highest PM10 levels; the rainy season saw the highest PM2.5 levels.
- CO Pollution by Season 💨: Winter recorded the highest CO concentration (200.49), followed by Autumn (200.40).
- Quarterly Pollution Trends 🏭: Q4 (Oct–Dec) saw the highest pollution levels for SO2, NO2, and PM10.
- Impact of COVID on Pollution 🦠: SO2 levels dropped from 799.05 to 796.50 during lockdown periods; NO2 levels also showed a minor reduction.

🧰 Tech Stack
- Data Processing: Python (Pandas, NumPy)
- Data Visualization: Matplotlib, Seaborn, Power BI
- Query Language: SQL (for pollutant dataset extraction)
- Version Control: Git & GitHub
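The pollutant correlations reported above are plain pairwise Pearson coefficients, which pandas computes directly; a sketch with illustrative numbers (not the project's data):

```python
import pandas as pd

# Illustrative pollutant readings; real values would come from the
# project's SQL extract. NO2 here is constructed to vary inversely
# with SO2, and PM10 to rise roughly with SO2.
df = pd.DataFrame({
    "SO2":  [10.0, 12.0, 9.0, 11.0, 13.0],
    "NO2":  [40.0, 38.0, 41.0, 39.0, 37.0],
    "PM10": [120.0, 150.0, 110.0, 140.0, 160.0],
})

# Pairwise Pearson correlation matrix across all pollutant columns.
corr = df.corr(method="pearson")
print(corr.loc["SO2", "NO2"], corr.loc["SO2", "PM10"])
```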
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains supporting resources generated for the study:
Bioindication of radioactive contamination by honey bees in the Bryansk and Rostov regions: Foraging dynamics of ¹³⁷Cs and ⁴⁰K in the plant–bee–bee product pathway (Sorokin, 2025).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT The disorderly occupation of soil without the necessary conservationist practices leads to impacts on local hydrology and induces pollution of water resources. This pollution may come from more urbanized areas because of the amount of pollutants drained during rains. Even moderate precipitation constitutes one of the main factors defining pollutant runoff on the surface. These rains have recently been called light rains. Light rains have a lower precipitation height and a higher frequency than the classic rains used in drainage projects, making it necessary to define them according to rain-frequency patterns for each region. This study aimed to characterize light rain in the municipality of Piracicaba to establish statistical standards for the frequency of certain precipitation heights. A database provided by the ESALQ/USP automatic weather station, which provides precipitation measurements every 15 minutes, was used. Light rain heights reached 40.3, 41.4, and 42.7 mm for frequencies of 100, 90, and 80%, respectively, which implies return periods of 1.00, 1.11, and 1.25 years, respectively.
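The return periods quoted in the abstract follow directly from the frequencies as T = 1/F (e.g., an 80% frequency gives 1/0.8 = 1.25 years); a one-line check:

```python
# Return period T (years) from frequency F expressed as a fraction: T = 1 / F.
def return_period(frequency: float) -> float:
    return 1.0 / frequency

# Frequencies of 100%, 90%, and 80% from the abstract.
for f in (1.00, 0.90, 0.80):
    print(f, round(return_period(f), 2))
```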
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
OpenAQ is an open-source project to surface live, real-time air quality data from around the world. Their “mission is to enable previously impossible science, impact policy and empower the public to fight air pollution.” The data includes air quality measurements from 5490 locations in 47 countries.
Scientists, researchers, developers, and citizens can use this data to understand the quality of air near them currently. The dataset only includes the most current measurement available for the location (no historical data).
Update Frequency: Weekly
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.openaq.[TABLENAME]. Fork this kernel to get started.
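A sketch of the query pattern described above using the BigQuery Python client; the table name global_air_quality is an assumption (list the tables under bigquery-public-data.openaq to confirm), and the client call is wrapped in an unexecuted function since it requires GCP credentials:

```python
# Query the public OpenAQ dataset in BigQuery.
# NOTE: the table name "global_air_quality" is an assumption; check the
# dataset's table list for the actual name.
QUERY = """
SELECT city, country, pollutant, value, unit
FROM `bigquery-public-data.openaq.global_air_quality`
WHERE pollutant = 'pm25'
LIMIT 10
"""

def run_query():
    # Requires the google-cloud-bigquery package and configured credentials.
    from google.cloud import bigquery
    client = bigquery.Client()
    return client.query(QUERY).to_dataframe()

print(QUERY.strip())
```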
Dataset Source: openaq.org
Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source and is provided "AS IS" without any warranty, express or implied.