Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This map is part of SDGs Today. Please see sdgstoday.org.
Air pollution, both ambient and household, increases the risk of cardiovascular and respiratory disease and leads to some 8.8 million deaths worldwide every year. 90% of these deaths occur in developing countries, highlighting the disproportionate impact of air pollution. Sub-Saharan Africa and most of Asia and Oceania (excluding Australia/New Zealand) have the highest mortality rates associated with air pollution, as a large proportion of the population still relies on polluting fuels and technologies for cooking. OpenAQ, a non-profit organization, collects daily air quality information from stations around the world and provides it as free and open data to help better monitor and manage the air we breathe.
For more information, contact OpenAQ at info@openaq.org.
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
This data makes use of the OpenAQ API (https://api.openaq.org), the Open Elevation API (https://api.open-elevation.com), and the Historical Weather API (https://open-meteo.com/en/docs/). The OpenAQ API is used to get PM2.5 values (µg/m³), their associated time of measurement (datetime), and the longitudes and latitudes of various sensors and low-cost PM monitors. In the database, this data is split into "Site" objects, which contain coordinates and elevations, and "Measurement" objects, which contain values, times, and other climate variables for those times. The Open Elevation API provides the elevation of each site. The Historical Weather API adds wind speed at 10 and 100 meters (km/h), wind direction at 10 and 100 meters (degrees), relative humidity (rh, %), temperature (°C), and rain (mm) to each measurement. These variables were chosen because previously published work indicates they may help in predicting future PM values.
This dataset should support the construction of well-contextualized graph neural networks (GNNs) and recurrent neural networks (RNNs), as well as time series visualizations of US PM. Future work is planned to find the most efficacious adjacency matrix between the various sensors for GNN applications. This application is enabled by the inclusion of low-cost sensors, which make the geographic distances between sensors substantially smaller than in a dataset that uses only US EPA sensors.
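As a rough illustration of the "Site" and "Measurement" split described above, and of one possible starting point for a sensor adjacency matrix, here is a minimal Python sketch. All field names, the `haversine_km` and `distance_adjacency` helpers, and the distance-threshold rule are assumptions made for this example; the description does not give the actual schema, and the adjacency construction is explicitly left as future work.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional


@dataclass
class Site:
    # Hypothetical field names; the real schema is not given in the description.
    site_id: int
    latitude: float
    longitude: float
    elevation_m: Optional[float] = None


@dataclass
class Measurement:
    # One PM2.5 reading plus the weather variables listed in the description.
    site_id: int
    datetime_utc: str
    pm25_ug_m3: float
    wind_speed_10m_kmh: Optional[float] = None
    wind_speed_100m_kmh: Optional[float] = None
    wind_direction_10m_deg: Optional[float] = None
    wind_direction_100m_deg: Optional[float] = None
    relative_humidity_pct: Optional[float] = None
    temperature_c: Optional[float] = None
    rain_mm: Optional[float] = None


def haversine_km(a: Site, b: Site) -> float:
    """Great-circle distance between two sites in kilometres."""
    earth_radius_km = 6371.0
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(h))


def distance_adjacency(sites: List[Site], max_km: float) -> List[List[int]]:
    """Binary adjacency: 1 if two distinct sites lie within max_km of each other."""
    n = len(sites)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if haversine_km(sites[i], sites[j]) <= max_km:
                adj[i][j] = adj[j][i] = 1
    return adj


# Made-up coordinates, for illustration only.
example_sites = [Site(1, 0.0, 0.0), Site(2, 0.0, 0.1), Site(3, 10.0, 10.0)]
example_adjacency = distance_adjacency(example_sites, max_km=20.0)
```

A distance threshold is only one candidate; weighting edges by inverse distance or by wind direction would be equally plausible choices to compare.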
Everyday activities such as driving, burning coal for electricity, wildfires, running factories, even cooking and cleaning, release particles into the air. Besides being an irritant, small particles of 10, 2.5, or 1 micrometers or less (PM10, PM2.5, PM1) are a health hazard since they can get deep into the respiratory system and damage the delicate tissues.
The exposure of populations to high levels of small particles increases the risk of respiratory and cardiovascular illnesses. The World Health Organization (WHO) guidelines provide long-term and short-term exposure limits to PM10 and PM2.5:
Long-term (annual mean): PM10 20 µg/m³ and PM2.5 10 µg/m³
Short-term (24-hour mean): PM10 50 µg/m³ and PM2.5 25 µg/m³
Exposure to PM10 and PM2.5 above these limits may significantly impact human health.
The OpenAQ Recent Conditions in Air Quality layers show the latest mass concentrations and particulate count for PM2.5, PM10, and PM1 of the stations in the OpenAQ data set with at least one value reported in the past 30 days.
Source: The source information is the OpenAQ community, which reports measured concentrations (µg/m³) and particle matter count (particles/cm³) on a global scale by aggregating station data from national air quality networks.
Update Frequency: It is updated every hour using the Aggregated Live Feed (ALF) methodology.
Area Covered: Global
Revisions
Jan 25, 2025: Upgrade to OpenAQ API version 3.
Jan 18, 2024: Update to feed routine that allows stations w/o an identifier.
Jun 23, 2023: Added new fields: Location ID, Station URL, Provider, and Instrument names. The live feed routine was updated to increase reliability and improve the overall update process.
Jul 21, 2022: Added service to Live Feed Status Page for active monitoring!
Feb 8, 2022: Update of live feed routine to use OpenAQ API v2: Addition of PM10 and PM1 layers. Values of particle matter count (particles/cm³) to all layers. Update of field labels. Removal of SourceName field.
Feb 5, 2020: Official release of Feature Service offering.
This layer is provided for informational purposes and is not monitored 24/7 for accuracy and currency.
If you would like to be alerted to potential issues or simply see when this Service will update next, please visit our Live Feed Status Page!
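The WHO limits quoted in this description can be turned into a simple lookup for checking a station's mean concentration. The sketch below is only an illustration: the dictionary keys, the `exceeds_who_limit` name, and the convention that a value exactly at the limit does not count as an exceedance are assumptions of this example, not part of the layer or the OpenAQ API.

```python
# Limits in µg/m³ as quoted in the description above.
WHO_LIMITS_UG_M3 = {
    ("pm10", "annual"): 20.0,
    ("pm25", "annual"): 10.0,
    ("pm10", "24h"): 50.0,
    ("pm25", "24h"): 25.0,
}


def exceeds_who_limit(pollutant: str, averaging: str, mean_ug_m3: float) -> bool:
    """Return True if the mean concentration is strictly above the quoted limit.

    pollutant is "pm10" or "pm25"; averaging is "annual" or "24h"
    (hypothetical labels chosen for this sketch).
    """
    return mean_ug_m3 > WHO_LIMITS_UG_M3[(pollutant, averaging)]


# Example: a made-up 24-hour PM2.5 mean of 30 µg/m³.
is_over = exceeds_who_limit("pm25", "24h", 30.0)
```

No limit is quoted for PM1, so the lookup raises a `KeyError` for that pollutant rather than guessing a threshold.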
https://www.gnu.org/licenses/gpl-3.0-standalone.html
This record is a global open-source passenger air traffic dataset primarily dedicated to the research community. It gives the seating capacity available on each origin-destination route for a given year, 2019, and the associated aircraft and airline when this information is available. Context on the original work is given in the related article (https://journals.open.tudelft.nl/joas/article/download/7201/5683) and on the associated GitHub page (https://github.com/AeroMAPS/AeroSCOPE/).
A simple data exploration interface will be available at www.aeromaps.eu/aeroscope.
The dataset was created by aggregating various available open-source databases with limited geographical coverage. It was then completed using a route database created by parsing Wikipedia and Wikidata, on which the traffic volume was estimated using a machine learning algorithm (XGBoost) trained on traffic and socioeconomic data.
1- DISCLAIMER
The dataset was gathered to allow highly aggregated analyses of air traffic, at the continental or country level. At the route level, the accuracy is limited, as mentioned in the associated article, and improper usage could lead to erroneous analyses. Although all sources used are open to everyone, the Eurocontrol database is only freely available to academic researchers. It is used in this dataset in a very aggregated way and under several levels of abstraction. As a result, it is not distributed in its original format, as specified in the contract of use. As a general rule, we decline any responsibility for any use that is contrary to the terms and conditions of the various sources that are used. In case of commercial use of the database, please contact us in advance.
2- DESCRIPTION
Each data entry represents an (Origin-Destination-Operator-Aircraft type) tuple. Please refer to the support article for more details (see above). The dataset contains the following columns:
"First column": index
airline_iata: IATA code of the operator in nominal cases. An ICAO -> IATA code conversion was performed for some sources, and the ICAO code was kept if no match was found.
acft_icao: ICAO code of the aircraft type
acft_class: Aircraft class identifier, own classification.
    WB: Wide Body
    NB: Narrow Body
    RJ: Regional Jet
    PJ: Private Jet
    TP: Turbo Propeller
    PP: Piston Propeller
    HE: Helicopter
    OTHER
seymour_proxy: Aircraft code for Seymour Surrogate (https://doi.org/10.1016/j.trd.2020.102528), own classification to derive a proxy aircraft when the nominal aircraft type is unavailable in the aircraft performance model.
source: Original data source for the record, before compilation and enrichment.
    ANAC: Brazilian Civil Aviation Authorities
    AUS Stats: Australian Civil Aviation Authorities
    BTS: US Bureau of Transportation Statistics T100
    Estimation: Own model, estimation on Wikipedia-parsed route database
    Eurocontrol: Aggregation and enrichment of R&D database
    OpenSky
    World Bank
seats: Number of seats available for the data entry, AFTER airport residual scaling
n_flights: Number of flights of the data entry, when available
iata_departure, iata_arrival: IATA code of the origin and destination airports. Some BTS in-house identifiers could remain, but this is marginal.
departure_lon, departure_lat, arrival_lon, arrival_lat: Origin and destination coordinates, could be NaN if the IATA identifier is erroneous
departure_country, arrival_country: Origin and destination country ISO2 code. WARNING: disable NA (Namibia) as default NaN at import
departure_continent, arrival_continent: Origin and destination continent code. WARNING: disable NA (North America) as default NaN at import
seats_no_est_scaling: Number of seats available for the data entry, BEFORE airport residual scaling
distance_km: Flight distance (km)
ask: Available Seat Kilometres
rpk: Revenue Passenger Kilometres (simple calculation from ASK using IATA average load factor)
fuel_burn_seymour: Fuel burn per flight (kg) when seymour proxy available
fuel_burn: Total fuel burn of the data entry (kg)
co2: Total CO2 emissions of the data entry (kg)
domestic: Domestic/international boolean (Domestic=1, International=0)
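The two WARNING notes above matter when the file is loaded with a tool that treats the string "NA" as missing by default, as pandas does. The sketch below shows one way to handle this with `pandas.read_csv`, assuming the data is distributed as a CSV file; the `load_routes` helper and the two sample rows are made up for illustration and are not taken from the dataset.

```python
import io

import pandas as pd

# Two made-up rows using a subset of the documented columns.
SAMPLE_CSV = """,departure_country,arrival_country,departure_continent,arrival_continent,departure_lon,seats
0,NA,ZA,AF,AF,17.47,1200
1,US,CA,NA,NA,,5400
"""


def load_routes(source) -> pd.DataFrame:
    """Read the route table without turning the codes 'NA' into NaN.

    keep_default_na=False switches off the default missing-value markers
    (which include "NA"); na_values=[""] keeps treating empty cells as NaN.
    """
    return pd.read_csv(
        source,
        index_col=0,            # the unnamed first column is the index
        keep_default_na=False,
        na_values=[""],
    )


routes = load_routes(io.StringIO(SAMPLE_CSV))
```

With these options, Namibia and North America remain the literal string "NA", while genuinely empty cells such as a missing coordinate are still read as NaN.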
3- Citation Please cite the support paper instead of the dataset itself.
Salgas, A., Sun, J., Delbecq, S., Planès, T., & Lafforgue, G. (2023). Compilation of an open-source traffic and CO2 emissions dataset for commercial aviation. Journal of Open Aviation Science. https://doi.org/10.59490/joas.2023.7201
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains social network measures.
This dataset lists all software in use by NASA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we have collected from the Software Heritage archive—the largest publicly available archive of FOSS source code with accompanying development history—all versions of files whose names are commonly used to convey licensing terms to software users and developers. The dataset consists of 6.5 million unique license files that can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. Additional metadata about shipped license files are also provided, making the dataset ready to use in various contexts; they include: file length measures, detected MIME type, detected SPDX license (using ScanCode), example origin (e.g., GitHub repository), oldest public commit in which the license appeared. The dataset is released as open data as an archive file containing all deduplicated license blobs, plus several portable CSV files for metadata, referencing blobs via cryptographic checksums.
For more details see the included README file and companion paper:
Stefano Zacchiroli. A Large-scale Dataset of (Open Source) License Text Variants. In proceedings of the 2022 Mining Software Repositories Conference (MSR 2022). 23-24 May 2022 Pittsburgh, Pennsylvania, United States. ACM 2022.
If you use this dataset for research purposes, please acknowledge its use by citing the above paper.
The State of Our Environment Report is an annual report on Austin's environment that is delivered to the city manager and Council each April. A PDF version of this report can be viewed at: http://www.austintexas.gov/sites/default/files/files/Watershed/SOE_report_2017.pdf
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Open Grid Emissions Initiative seeks to fill a critical need for high-quality, publicly-accessible, hourly grid emissions data that can be used for GHG accounting, policymaking, academic research, and energy attribute certificate markets. The initiative includes this repository of open-source grid emissions data processing tools that use peer-reviewed, well-documented, and validated methodologies to create the accompanying public dataset of hourly, monthly, and annual U.S. electric grid generation, GHG, and air pollution data.
Data Structure
data/downloads
contains all files that are downloaded by functions in load_data
data/manual
contains all manually-created files, including the egrid static tables
data/outputs
contains intermediate outputs from the data pipeline... any files created by our code that are not final results
data/results
contains all final output files that will be published
Release Notes
The 2019-2021 data archived here was created by v0.2.2 of the Open Grid Emissions Initiative, archived on Zenodo here.
For a summary of what's new in the 2019-2020 v0.2.2 data, see the v0.2.2 release notes.
Using the archived files
The outputs and results files have been saved as separate zip folders for each year.
We did not archive the manual or downloads folder since the manual files are archived as part of the v0.2.1 code release, and downloading these files using v0.2.2 of the code should return versioned results.