This list of potential mission targets should not be interpreted as a complete list of viable NEAs for an actual human exploration mission. As the NEA orbits are updated, the viable mission targets and their mission parameters will change. To select an actual target and mission scenario, additional constraints must be applied, including astronaut health and safety considerations; human space flight architecture elements, their performance, and readiness; the physical nature of the target NEA; and mission schedule constraints.
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
The World Database on Protected Areas (WDPA) is the most comprehensive global database of marine and terrestrial protected areas, updated on a monthly basis, and is one of the key global biodiversity data sets being widely used by scientists, businesses, governments, international secretariats and others to inform planning, policy decisions and management. The WDPA is a joint project between UN Environment and the International Union for Conservation of Nature (IUCN). The compilation and management of the WDPA is carried out by the UN Environment World Conservation Monitoring Centre (UNEP-WCMC), in collaboration with governments, non-governmental organisations, academia and industry. There are monthly updates of the data, which are made available online through the Protected Planet website, where the data is both viewable and downloadable.

Data and information on the world's protected areas compiled in the WDPA are used for reporting to the Convention on Biological Diversity on progress towards reaching the Aichi Biodiversity Targets (particularly Target 11), to the UN to track progress towards the 2030 Sustainable Development Goals, to some of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) core indicators, and to other international assessments and reports including the Global Biodiversity Outlook, as well as for the publication of the United Nations List of Protected Areas. Every two years, UNEP-WCMC releases the Protected Planet Report on the status of the world's protected areas and recommendations on how to meet international goals and targets. Many platforms are incorporating the WDPA to provide integrated information to diverse users, including businesses and governments, in a range of sectors including mining, oil and gas, and finance.
For example, the WDPA is included in the Integrated Biodiversity Assessment Tool, an innovative decision support tool that gives users easy access to up-to-date information that allows them to identify biodiversity risks and opportunities within a project boundary. The reach of the WDPA is further enhanced in services developed by other parties, such as the Global Forest Watch and the Digital Observatory for Protected Areas, which provide decision makers with access to monitoring and alert systems that allow whole landscapes to be managed better. Together, these applications of the WDPA demonstrate the growing value and significance of the Protected Planet initiative.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
ExioML is the first ML-ready benchmark dataset in eco-economic research, designed for global sectoral sustainability analysis. It addresses significant research gaps by leveraging the high-quality, open-source environmentally extended multi-regional input-output (EE-MRIO) dataset ExioBase 3.8.2. ExioML covers 163 sectors across 49 regions from 1995 to 2022, overcoming data inaccessibility issues. The dataset includes both factor accounting in tabular format and footprint networks in graph structure.
We demonstrate a GHG emission regression task using a factor accounting table, comparing the performance of shallow and deep models. The results show a low Mean Squared Error (MSE), quantifying sectoral GHG emissions in terms of value-added, employment, and energy consumption, validating the dataset's usability. The footprint network in ExioML, inherent in the multi-dimensional MRIO framework, enables tracking resource flow between international sectors.
ExioML offers promising research opportunities, such as predicting embodied emissions through international trade, estimating regional sustainability transitions, and analyzing the topological changes in global trading networks over time. It reduces barriers and intensive data pre-processing for ML researchers, facilitates the integration of ML and eco-economic research, and provides new perspectives for sound climate policy and global sustainable development.
ExioML supports graph and tabular structure learning algorithms through the Footprint Network and Factor Accounting table. The dataset includes factors such as value added, employment, energy consumption, and GHG emissions in both PxP (product-by-product) and IxI (industry-by-industry) classifications.
The Factor Accounting table shares common features with the Footprint Network and summarizes the total heterogeneous characteristics of various sectors.
The Footprint Network models the high-dimensional global trading network, capturing its economic, social, and environmental impacts. This network is structured as a directed graph, where directionality represents sectoral input-output relationships, delineating sectors by their roles as sources (exporting) and targets (importing). The basic element in the ExioML Footprint Network is international trade across different sectors with features such as value-added, emission amount, and energy input. The Footprint Network helps identify critical sectors and paths for sustainability management and optimization. The Footprint Network is hosted on Zenodo.
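As a rough illustration of the directed-graph structure described above, the sketch below builds an adjacency map from trade records, where each edge runs from an exporting (source) sector to an importing (target) sector and carries value-added, emission, and energy features. The record field names and the sample numbers are illustrative assumptions, not the dataset's actual schema.

```python
# Sketch: representing a Footprint-Network-style trade flow as a directed graph.
# Field names and values below are illustrative, not ExioML's actual schema.

def build_footprint_graph(trade_records):
    """Build an adjacency map: (region, sector) source -> list of (target, features)."""
    graph = {}
    for rec in trade_records:
        src = (rec["source_region"], rec["source_sector"])
        dst = (rec["target_region"], rec["target_sector"])
        features = {
            "value_added": rec["value_added"],  # economic feature
            "emissions": rec["emissions"],      # environmental feature
            "energy": rec["energy"],            # energy input feature
        }
        graph.setdefault(src, []).append((dst, features))
    return graph

# Hypothetical example: one trade flow from a German sector to a French sector.
records = [{
    "source_region": "DE", "source_sector": "Steel",
    "target_region": "FR", "target_sector": "Construction",
    "value_added": 120.0, "emissions": 3.5, "energy": 40.0,
}]
graph = build_footprint_graph(records)
```

A graph built this way can be handed to any graph-learning toolkit by enumerating its nodes and feature-carrying edges.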
The ExioML development toolkit in Python and the regression model used for validation are available on the GitHub repository: https://github.com/YVNMINC/ExioML. The complete ExioML dataset is hosted by Zenodo: https://zenodo.org/records/10604610.
More details about the dataset are available in our paper: ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability, accepted by the ICLR 2024 Climate Change AI workshop: https://arxiv.org/abs/2406.09046.
@inproceedings{guo2024exioml,
title={ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability},
author={Guo, Yanming and Ma, Jin},
booktitle={ICLR 2024 Workshop on Tackling Climate Change with Machine Learning},
year={2024}
}
Stadler, Konstantin, et al. "EXIOBASE 3." Zenodo (2021). Retrieved March 22, 2023.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
We would like to inform you that the updated GlobPOP dataset (2021-2022) is now available in version 2.0. The GlobPOP dataset (2021-2022) in the current version is not recommended for your work. The GlobPOP dataset (1990-2020) in the current version is identical to version 1.0.
Thank you for your continued support of GlobPOP.
If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.
Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development in areas such as epidemiology, urban planning, and global inequality.
Here, we present GlobPOP, a new continuous global gridded population product with a high-precision spatial resolution of 30 arc-seconds from 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches, and fuses five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan, and WorldPop) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.
With the availability of GlobPOP dataset in both population count and population density formats, researchers and policymakers can leverage our dataset to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.
The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometre in each grid cell).
Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:
GlobPOP_Count_30arc_1990_I32
Field 1: GlobPOP(Global gridded population)
Field 2: Pixel unit is population "Count" or population "Density"
Field 3: Spatial resolution is 30 arc seconds
Field 4: Year "1990"
Field 5: Data type is I32(Int 32) or F32(Float32)
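The five-field naming scheme above can be split mechanically; a small helper following that scheme might look like this:

```python
# Parse a GlobPOP filename into its five underscore-separated fields,
# as described above (product, unit, resolution, year, data type).

def parse_globpop_name(filename):
    # Drop a file extension if present, then split on underscores.
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    product, unit, resolution, year, dtype = stem.split("_")
    return {
        "product": product,        # "GlobPOP"
        "unit": unit,              # "Count" or "Density"
        "resolution": resolution,  # "30arc"
        "year": int(year),         # e.g. 1990
        "dtype": dtype,            # "I32" or "F32"
    }

fields = parse_globpop_name("GlobPOP_Count_30arc_1990_I32")
```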
Please refer to the paper for detailed information:
Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.
The fully reproducible codes are publicly available at GitHub: https://github.com/lulingliu/GlobPOP.
We designed a human study to collect fixation data during visual search. We opted for a task that involved searching for a single image (the target) within a synthesised collage of images (the search set). Each collage is a random permutation of a finite set of images. To explore the impact of the similarity in appearance between target and search set on both fixation behaviour and automatic inference, we created three different search tasks covering a range of similarities. In prior work, colour was found to be a particularly important cue for guiding search to targets and target-similar objects. We therefore selected 78 coloured O'Reilly book covers for the first task to compose the collages. These covers show a woodcut of an animal at the top and the title of the book in a characteristic font underneath. Given that overall cover appearance was very similar, this task allows us to analyse fixation behaviour when colour is the most discriminative feature. For the second task we used a set of 84 book covers from Amazon. In contrast to the first task, the appearance of these covers is more diverse. This makes it possible to analyse fixation behaviour when both structure and colour information could be used by participants to find the target. Finally, for the third task, we used a set of 78 mugshots from a public database of suspects. In contrast to the other tasks, we transformed the mugshots to grey-scale so that they did not contain any colour information. This allows analysis of fixation behaviour when colour information was not available at all. We found faces to be particularly interesting given the relevance of searching for faces in many practical applications.

18 participants (9 male), aged 18-30. Gaze data were recorded with a stationary Tobii TX300 eye tracker. More information about the dataset can be found in the README file. The data is only to be used for non-commercial scientific purposes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The Global Urban Network (GUN) dataset provides pre-computed node and edge attribute features for various cities. Each layer is available in .geojson format and can easily be converted into NetworkX, igraph, PyG, and DGL graph formats.
For node attributes, we adopt a uniform Euclidean approach, as it provides a consistent, straightforward, and extensible basis for integrating heterogeneous data sources across different network locations. Accordingly, we construct 100-metre Euclidean buffers for each network node and compute the spatial intersection with spatial targets (e.g., street view imagery points, points of interest, and building footprints). To ensure spatial consistency and accurate distance computation, we project spatial entities into local coordinate reference systems (CRS). Users can employ the Urbanity package to generate Euclidean buffers of arbitrary distance.
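The node-attribute step reduces to a point-in-circle test once everything is in a projected, metre-based CRS. The sketch below shows that core test in plain Python; coordinates and points are made-up, and in practice the Urbanity package performs the buffering and intersection on the .geojson layers.

```python
import math

# Sketch of the node-attribute step: collect spatial targets (e.g. POIs)
# falling within a 100 m Euclidean buffer of a network node. Assumes
# coordinates are already projected to a local CRS in metres; the sample
# coordinates below are illustrative.

def targets_within_buffer(node_xy, targets_xy, radius_m=100.0):
    nx, ny = node_xy
    return [
        t for t in targets_xy
        if math.hypot(t[0] - nx, t[1] - ny) <= radius_m
    ]

pois = [(30.0, 40.0), (500.0, 0.0)]  # projected metre coordinates
hits = targets_within_buffer((0.0, 0.0), pois)
```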
For edge attributes, we adopt a two-step approach: 1) compute the distance between each spatial point of interest and its proximate edges in the network, and 2) assign each entity to the corresponding edge with the lowest distance. To account for remote edges (e.g., peripheral routes that are not located close to any amenities), we specify a distance threshold of 50 metres. For buildings, we compute the distance between building centroids and their respective network edge. Accordingly, we compute spatial indicators based on the set of elements assigned to each network edge.
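The two-step edge assignment above amounts to a nearest-segment search with a cutoff. A minimal sketch, treating edges as straight segments in projected metre coordinates (an assumption; the real pipeline operates on the .geojson network layers):

```python
import math

# Sketch of the edge-attribute step: assign each point of interest to its
# nearest network edge, skipping edges farther than a 50 m threshold.

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def assign_to_edge(poi, edges, threshold_m=50.0):
    """Return the index of the closest edge within the threshold, else None."""
    best, best_d = None, threshold_m
    for i, (a, b) in enumerate(edges):
        d = point_segment_distance(poi, a, b)
        if d <= best_d:
            best, best_d = i, d
    return best

edges = [((0, 0), (100, 0)), ((0, 200), (100, 200))]
idx = assign_to_edge((50, 10), edges)  # 10 m from the first edge
```

A POI farther than 50 m from every edge stays unassigned, matching the "remote edges" rule above.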
We also release aggregated subzone statistics for each city. Similarly, users can employ the Urbanity package to generate aggregate statistics for any arbitrary geographic boundary.
Urbanity Python package: https://github.com/winstonyym/urbanity.
The United States Geological Survey (USGS) - Science Analytics and Synthesis (SAS) - Gap Analysis Project (GAP) manages the Protected Areas Database of the United States (PAD-US), an Arc10x geodatabase, that includes a full inventory of areas dedicated to the preservation of biological diversity and to other natural, recreation, historic, and cultural uses, managed for these purposes through legal or other effective means (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/protected-areas). The PAD-US is developed in partnership with many organizations, including coordination groups at the [U.S.] Federal level, lead organizations for each State, and a number of national and other non-governmental organizations whose work is closely related to the PAD-US. Learn more about the USGS PAD-US partners program here: www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards. The United Nations Environmental Program - World Conservation Monitoring Centre (UNEP-WCMC) tracks global progress toward biodiversity protection targets enacted by the Convention on Biological Diversity (CBD) through the World Database on Protected Areas (WDPA) and World Database on Other Effective Area-based Conservation Measures (WD-OECM) available at: www.protectedplanet.net. See the Aichi Target 11 dashboard (www.protectedplanet.net/en/thematic-areas/global-partnership-on-aichi-target-11) for official protection statistics recognized globally and developed for the CBD, or here for more information and statistics on the United States of America's protected areas: www.protectedplanet.net/country/USA. 
It is important to note statistics published by the National Oceanic and Atmospheric Administration (NOAA) Marine Protected Areas (MPA) Center (www.marineprotectedareas.noaa.gov/dataanalysis/mpainventory/) and the USGS-GAP (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-statistics-and-reports) differ from statistics published by the UNEP-WCMC as methods to remove overlapping designations differ slightly and U.S. Territories are reported separately by the UNEP-WCMC (e.g. The largest MPA, "Pacific Remote Islands Marine Monument" is attributed to the United States Minor Outlying Islands statistics). At the time of PAD-US 2.1 publication (USGS-GAP, 2020), NOAA reported 26% of U.S. marine waters (including the Great Lakes) as protected in an MPA that meets the International Union for Conservation of Nature (IUCN) definition of biodiversity protection (www.iucn.org/theme/protected-areas/about). USGS-GAP plans to publish PAD-US 2.1 Statistics and Reports in the spring of 2021. The relationship between the USGS, the NOAA, and the UNEP-WCMC is as follows: - USGS manages and publishes the full inventory of U.S. marine and terrestrial protected areas data in the PAD-US representing many values, developed in collaboration with a partnership network in the U.S. and; - USGS is the primary source of U.S. marine and terrestrial protected areas data for the WDPA, developed from a subset of the PAD-US in collaboration with the NOAA, other agencies and non-governmental organizations in the U.S., and the UNEP-WCMC and; - UNEP-WCMC is the authoritative source of global protected area statistics from the WDPA and WD-OECM and; - NOAA is the authoritative source of MPA data in the PAD-US and MPA statistics in the U.S. and; - USGS is the authoritative source of PAD-US statistics (including areas primarily managed for biodiversity, multiple uses including natural resource extraction, and public access). 
The PAD-US 2.1 Combined Marine, Fee, Designation, Easement feature class (GAP Status Code 1 and 2 only) is the source of protected areas data in this WDPA update. Tribal areas and military lands represented in the PAD-US Proclamation feature class as GAP Status Code 4 (no known mandate for biodiversity protection) are not included, as spatial data to represent internal protected areas are not available at this time. The USGS submitted more than 42,900 protected areas from PAD-US 2.1, including all 50 U.S. States and 6 U.S. Territories, to the UNEP-WCMC for inclusion in the May 2021 WDPA, available at www.protectedplanet.net. The NOAA is the sole source of MPAs in PAD-US and the National Conservation Easement Database (NCED, www.conservationeasement.us/) is the source of conservation easements. The USGS aggregates authoritative federal lands data directly from managing agencies for PAD-US (www.communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/), while a network of State data-stewards provides state and local government lands and some land trust preserves. National nongovernmental organizations contribute spatial data directly (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards). The USGS translates the biodiversity-focused subset of PAD-US into the WDPA schema (UNEP-WCMC, 2019) for efficient aggregation by the UNEP-WCMC. The USGS maintains WDPA Site Identifiers (WDPAID, WDPA_PID), a persistent identifier for each protected area, provided by UNEP-WCMC. Agency partners are encouraged to track WDPA Site Identifier values in source datasets to improve the efficiency and accuracy of PAD-US and WDPA updates.

The IUCN protected areas in the U.S. are managed by thousands of agencies and organizations across the country and include over 42,900 designated sites such as National Parks, National Wildlife Refuges, National Monuments, Wilderness Areas, some State Parks, State Wildlife Management Areas, Local Nature Preserves, City Natural Areas, The Nature Conservancy and other Land Trust Preserves, and Conservation Easements. The boundaries of these protected places (some overlap) are represented as polygons in the PAD-US, along with informative descriptions such as Unit Name, Manager Name, and Designation Type. As the WDPA is a global dataset, their data standards (UNEP-WCMC 2019) require simplification to reduce the number of records included, focusing on the protected area site name and management authority as described in the Supplemental Information section in this metadata record. Given the numerous organizations involved, sites may be added or removed from the WDPA between PAD-US updates. These differences may reflect actual change in protected area status; however, they also reflect the dynamic nature of spatial data or Geographic Information Systems (GIS). Many agencies and non-governmental organizations are working to improve the accuracy of protected area boundaries, the consistency of attributes, and inventory completeness between PAD-US updates. In addition, USGS continually seeks partners to review and refine the assignment of conservation measures in the PAD-US.
Locating planets in circumstellar habitable zones (HZs) is a priority for many exoplanet surveys. Space-based and ground-based surveys alike require robust toolsets to aid in target selection and mission planning. We present the Catalog of Earth-Like Exoplanet Survey Targets (CELESTA), a database of HZs around 37,000 nearby stars. We calculated stellar parameters, including effective temperatures, masses, and radii, and we quantified the orbital distances and periods corresponding to the circumstellar HZs. We gauged the accuracy of our predictions by contrasting CELESTA's computed parameters to observational data. We ascertain a potential return on investment by computing the number of HZs probed for a given survey duration. A versatile framework for extending the functionality of CELESTA into the future enables ongoing comparisons to new observations, and recalculations when updates to HZ models, stellar temperatures, or parallax data become available. We expect to upgrade and expand CELESTA using data from the Gaia mission as the data become available.
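The HZ distances and periods described above follow from the standard scaling that orbital distance at a given effective stellar flux goes as the square root of luminosity. The sketch below shows that calculation; the effective-flux boundary values are illustrative round numbers, not the coefficients CELESTA actually uses.

```python
import math

# Sketch of a habitable-zone calculation: the orbital distance receiving
# effective flux S_eff scales as sqrt(L / S_eff) (distances in AU,
# luminosity in solar units). S_eff boundary values here are illustrative
# assumptions, not CELESTA's actual HZ-model coefficients.

def hz_edges_au(luminosity_solar, s_eff_inner=1.1, s_eff_outer=0.35):
    inner = math.sqrt(luminosity_solar / s_eff_inner)
    outer = math.sqrt(luminosity_solar / s_eff_outer)
    return inner, outer

def hz_period_years(distance_au, mass_solar):
    # Kepler's third law in solar units: P^2 = a^3 / M
    return math.sqrt(distance_au ** 3 / mass_solar)

inner, outer = hz_edges_au(1.0)  # a Sun-like star brackets 1 AU
```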
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides an in-depth look into global CO2 emissions at the country level, allowing for a better understanding of how much each country contributes to the global cumulative human impact on climate. It contains information on total emissions as well as emissions from coal, oil, gas, cement production, flaring, and other sources. The data also provides a breakdown of per capita CO2 emissions per country, showing which countries are leading in pollution levels and identifying potential areas where reduction efforts should be concentrated. This dataset is essential for anyone who wants to get informed about their own environmental footprint or conduct research on international development trends.
This dataset provides a country-level survey of global fossil CO2 emissions, including total emissions, emissions from coal, oil, gas, cement, flaring and other sources as well as per capita emissions.
For researchers looking to quantify global CO2 emission levels by country over time and understand the sources of these emissions, this dataset can be a valuable resource.
The data is organized using the following columns: Country (the name of the country), ISO 3166-1 alpha-3 (the three-letter code for the country), Year (the year of the survey data), Total (the total amount of CO2 emitted by the country in that year), Coal (amount of CO2 emitted by coal in that year), Oil (amount emitted by oil), Gas (amount emitted by gas), Cement (amount emitted by cement), Flaring (flaring emission levels), and Other (other forms such as industrial processes). In addition, there is one extra column, Per Capita, which provides insight into how much carbon dioxide is emitted per individual in each country.
To make use of these columns, you can sum the Total column for a specific region, compute the percentage each source contributes to the Total, or build dashboard visualizations to explore which sources are responsible for higher emissions across different countries or clusters of similar countries. For example, examining Flaring (emissions associated with burning off natural gas while drilling) can show how individual countries might improve their overall fossil-fuel carbon emission profiles.
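The per-source percentage breakdown described above is a one-pass computation over the flat CSV. A minimal sketch using only the standard library; the data row is a made-up illustrative example, not real Global Carbon Budget values.

```python
import csv
import io

# Sketch: compute each source's share of a country-year's Total emissions.
# The sample row below is illustrative, not real data.
sample = io.StringIO(
    "Country,ISO 3166-1 alpha-3,Year,Total,Coal,Oil,Gas,Cement,Flaring,Other,Per Capita\n"
    "Exampleland,EXA,2020,100.0,40.0,30.0,20.0,5.0,3.0,2.0,5.5\n"
)

sources = ["Coal", "Oil", "Gas", "Cement", "Flaring", "Other"]
shares = {}
for row in csv.DictReader(sample):
    total = float(row["Total"])
    # Percentage contribution of each source to the Total column.
    shares = {s: 100.0 * float(row[s]) / total for s in sources}
```

The same loop, pointed at the real file, generalizes to regional sums by accumulating Total per country group first.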
The main purpose behind this dataset is to help government bodies, private organizations, universities, NGOs, and research agencies apply analytical techniques to track environmental changes across regions, providing the detailed, comprehensive, and verified information needed to analyze, monitor, and manage emissions efficiently.
With insights gleaned from this dataset, one can begin to identify strategies for pollutant mitigation and combating climate change, and make decisions centered on sustainable development, from continent-wide unified plans to policy implementation, while watching for evidence of regional discrepancies. "Global Fossil Carbon Dioxide Emissions: Country-Level Survey 2002-2022" could be exactly what you need.
- Using the per capita emissions data, develop a reporting system to track countries' progress in meeting carbon emission targets and give policy recommendations for how countries can reach those targets more quickly.
- Analyze the correlation between different fossil fuel sources and CO2 emissions to understand how best to reduce CO2 emissions at a country-level.
- Create an interactive map showing global CO2 levels over time that allows users to visualize trends by country or region across all fossil fuel sources
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: GCB2022v27_MtCO2_flat.csv (columns as described above)
WorldPop produces different types of gridded population count datasets, depending on the methods used and end application.
Please make sure you have read our Mapping Populations overview page before choosing and downloading a dataset.
Bespoke methods used to produce datasets for specific individual countries are available through the WorldPop Open Population Repository (WOPR) link below.
These are 100m resolution gridded population estimates using customized methods ("bottom-up" and/or "top-down") developed for the latest data available from each country.
They can also be visualised and explored through the woprVision App.
The remaining datasets in the links below are produced using the "top-down" method, with either the unconstrained or constrained top-down disaggregation approach.
Please make sure you read the Top-down estimation modelling overview page to decide on which datasets best meet your needs.
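The core idea behind "top-down" disaggregation is simple: a census total for an administrative unit is spread across its grid cells in proportion to a weighting layer, with the constrained variant zeroing the weights of cells outside mapped settlements. The weights below are illustrative only; the production models derive them from detailed geospatial covariates.

```python
# Minimal sketch of top-down population disaggregation. Weights are
# illustrative; real WorldPop weighting layers come from geospatial models.

def disaggregate(census_total, weights):
    """Spread a census total across cells in proportion to their weights."""
    s = sum(weights)
    if s == 0:
        return [0.0] * len(weights)
    return [census_total * w / s for w in weights]

# Unconstrained: every cell carries some weight.
unconstrained = disaggregate(1000.0, [1.0, 3.0, 1.0])
# Constrained: cells outside mapped settlements are masked to zero weight.
constrained = disaggregate(1000.0, [0.0, 3.0, 1.0])
```

Both variants preserve the unit total; they differ only in which cells may receive population.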
Datasets are available to download in Geotiff and ASCII XYZ format at resolutions of 3 and 30 arc-seconds (approximately 100m and 1km at the equator, respectively):
- Unconstrained individual countries 2000-2020 (1km resolution): Consistent 1km resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020.
- Unconstrained individual countries 2000-2020 (100m resolution): Consistent 100m resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020.
- Unconstrained individual countries 2000-2020 UN adjusted (100m resolution): Consistent 100m resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020, adjusted to match United Nations national population estimates (UN 2019).
- Unconstrained individual countries 2000-2020 UN adjusted (1km resolution): Consistent 1km resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020, adjusted to match United Nations national population estimates (UN 2019).
- Unconstrained global mosaics 2000-2020 (1km resolution): Mosaiced 1km resolution versions of the "Unconstrained individual countries 2000-2020" datasets.
- Constrained individual countries 2020 (100m resolution): Consistent 100m resolution population count datasets created using constrained top-down methods for all countries of the world for 2020.
- Constrained individual countries 2020 UN adjusted (100m resolution): Consistent 100m resolution population count datasets created using constrained top-down methods for all countries of the world for 2020, adjusted to match United Nations national population estimates (UN 2019).
Older datasets produced for specific individual countries and continents, using a set of tailored geospatial inputs and differing "top-down" methods and time periods, are still available for download here: Individual countries and Whole Continent.
Data for earlier dates is available directly from WorldPop.
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University (2018). Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00645
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
There are multiple well-recognized and peer-reviewed global datasets that can be used to assess water availability and water pollution. Each of these datasets is based on different inputs, modeling approaches, and assumptions. Therefore, in SBTN Step 1: Assess and Step 2: Interpret & Prioritize, companies are required to consult different global datasets for a robust and comprehensive State of Nature (SoN) assessment for water availability and water pollution.
To streamline this process, WWF, the World Resources Institute (WRI), and SBTN worked together to develop two ready-to-use unified layers of SoN – one for water availability and one for water pollution – in line with the Technical Guidance for Step 1: Assess and Step 2: Interpret & Prioritize. The result is a single file (shapefile) containing the maximum value both for water availability and for water pollution, as well as the datasets' raw values (as references). This data is publicly available for download from this repository.
These unified layers will make it easier for companies to implement a robust approach, and they will lead to more aligned and comparable results between companies. A temporary App is available at https://arcg.is/0z9mOD0 to help companies assess the SoN for water availability and water pollution around their operations and supply chain locations. In the future, these layers will become available both in the WRI’s Aqueduct and in the WWF Risk Filter Suite.
For the SoN for water availability, the following datasets were considered:
Baseline water stress (Hofste et al. 2019), data available here
Water depletion (Brauman et al. 2016), data available here
Blue water scarcity (Mekonnen & Hoekstra 2016), data upon request to the authors
For the SoN for water pollution, the following datasets were considered:
Coastal Eutrophication Potential (Hofste et al. 2019), data available here
Nitrate-Nitrite Concentration (Damania et al. 2019), data available here
Periphyton Growth Potential (McDowell et al. 2020), data available here
In general, the same processing steps were performed for all datasets:
Compute the area-weighted median of each dataset at a common spatial resolution, i.e. HydroSHEDS HydroBasins Level 6 in this case.
Classify each dataset to a common range by reclassifying raw values to a 1-5 scale, where 0 (zero) is used for cells or features with no data. See the documentation for more details.
Identify the maximum value between the classified datasets, separately, for Water Availability and for Water Pollution.
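The classify-then-maximize steps above can be sketched in a few lines. The class breaks and basin values below are illustrative assumptions; the actual thresholds are given in the project documentation.

```python
# Sketch of the reclassification (step 2) and maximum (step 3) above.
# Break points and basin values are illustrative, not the real thresholds.

def classify(value, breaks):
    """Map a raw value to classes 1-5 using ascending break points; 0 = no data."""
    if value is None:
        return 0
    for cls, upper in enumerate(breaks, start=1):
        if value <= upper:
            return cls
    return 5

breaks = [0.1, 0.2, 0.4, 0.8]  # 4 ascending breaks -> 5 classes
basin_raw = {"water_stress": 0.35, "water_depletion": 0.9}
classified = {k: classify(v, breaks) for k, v in basin_raw.items()}
son_value = max(classified.values())  # maximum across the classified datasets
```

Run per HydroBasins Level 6 unit, this yields the unified SoN value stored in the shapefile, alongside the raw inputs kept for reference.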
For transparency and reproducibility, the code is publicly available at https://github.com/rafaexx/sbtn-SoN-water
https://datacatalog.worldbank.org/public-licenses?fragment=cc
The WBG launched the GovTech Maturity Index (GTMI) in 2020 as a composite index that uses 48 key indicators to measure critical aspects of four GovTech focus areas in 198 economies: supporting core government systems, enhancing service delivery, mainstreaming citizen engagement, and fostering GovTech enablers.
The construction of the GTMI is primarily based on the World Bank’s GovTech Dataset. The GTMI report and GovTech Dataset provide opportunities to replicate the study, identify gaps in digital transformation by comparing differences among economies and groups of economies, and track changes over time transparently.
The 2020 GovTech dataset contained data/evidence collected from government websites using remotely measurable indicators (due to the COVID-19 pandemic) mostly reflecting de jure practices. The GTMI Team followed a different approach for the 2022 update of the GTMI and underlying GovTech Dataset.
First, the GTMI indicators were revised and extended to explore the performance of existing platforms and to cover less well-known areas, in consultation with 9 relevant organizations and 10 World Bank practices/groups from November 2021 to January 2022. A Central Government (CG) GTMI online survey was launched in March 2022, and 850+ officials from 164 countries agreed to join this exercise to reflect the latest developments and results of their GovTech initiatives. Additionally, a Subnational Government (SNG) GTMI online survey was launched in parallel as a pilot implementation for interested countries. Finally, a data validation phase was included to benefit from the clarifications and updates of all survey participants while checking the survey responses and calculating the GTMI scores and groups.
The GTMI includes 40 updated/expanded GovTech indicators measuring the maturity of four GovTech focus areas. Additionally, 8 highly relevant external indicators measured by other relevant indexes are used in the calculation of GTMI groups.
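The official weighting and aggregation scheme is defined in the WBG's GTMI report; purely as an illustration of the generic composite-index pattern (normalize indicators, average within a focus area, average the focus areas), with entirely hypothetical indicator values:

```python
# Generic composite-index construction (illustrative only; the official
# GTMI weights and aggregation are those documented by the WBG).

def min_max(values):
    """Normalize a list of raw indicator scores to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def composite(focus_areas):
    """Average indicators within each focus area, then average the areas."""
    area_scores = [sum(ind) / len(ind) for ind in focus_areas]
    return sum(area_scores) / len(area_scores)

# Hypothetical normalized indicator scores for the four focus areas
gtmi = composite([
    [0.8, 0.6, 0.9],   # core government systems
    [0.7, 0.5],        # service delivery
    [0.4, 0.6, 0.5],   # citizen engagement
    [0.9, 0.8],        # GovTech enablers
])
```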
The 2022 GovTech Dataset presents all indicators based on the CG GTMI survey data submitted by 135 countries directly, as well as the remotely collected data from the websites of 63 non-participating economies. Additionally, the dataset includes the SNG GTMI data submitted by 113 subnational government entities (states, municipalities) from 16 countries, which considerably expanded the scope of the GovTech Dataset.
As a part of the 2022 GTMI update, a GTMI Data Dashboard was launched to create a data visualization portal with maps and graphs aimed at helping the end-user digest and explore the findings of the CG GTMI / GovTech Dataset, as well as the GovTech Projects Database (presenting the details of 1450+ digital government/GovTech projects funded by the WBG in 147 countries since 1995).
The GovTech Dataset is a substantially expanded version of the Digital Government Systems and Services (DGSS) global dataset, originally developed in 2014 and updated every two years to support the preparation of several WBG studies and flagship reports (e.g., 2014 FMIS and Open Budget Data Study; WDR 2016: Digital Dividends; 2018 WBG Digital Adoption Index; WDR 2021: Data for Better Lives; and 2020 GovTech Maturity Index). The dataset will be updated every two years to reflect progress in the GovTech domain globally.
Krisztian Buza Budapest University of Technology and Economics buza '@' cs.bme.hu http://www.cs.bme.hu/~buza
You can download a zip file from https://archive.ics.uci.edu/ml/datasets/BlogFeedback
This data originates from blog posts. The raw HTML-documents of the blog posts were crawled and processed.
The prediction task associated with the data is the prediction of the number of comments in the upcoming 24 hours.
In order to simulate this situation, we choose a basetime (in the past) and select the blog posts that were published at most 72 hours before it. We then calculate all the features of the selected blog posts from the information that was available at the basetime, so each instance corresponds to a blog post. The target is the number of comments that the blog post received in the next 24 hours relative to the basetime.
In the train data, the base times were in the years 2010 and 2011. In the test data the base times were in February and March 2012.
This simulates the real-world situation in which training data from the past is available to predict events in the future.
The train data was generated from different basetimes that may temporally overlap. Consequently, if you simply split the train data into disjoint partitions, the underlying time intervals may still overlap. You should therefore use the provided, temporally disjoint train and test splits to ensure that the evaluation is fair.
1...50: Average, standard deviation, min, max and median of Attributes 51...60 for the source of the current blog post. By source we mean the blog on which the post appeared. For example, myblog.blog.org would be the source of the post myblog.blog.org/post_2010_09_10.
51: Total number of comments before basetime
52: Number of comments in the last 24 hours before the basetime
53: Let T1 denote the datetime 48 hours before basetime, and T2 the datetime 24 hours before basetime; this attribute is the number of comments in the time period between T1 and T2
54: Number of comments in the first 24 hours after the publication of the blog post, but before basetime
55: The difference between Attribute 52 and Attribute 53
56...60: The same features as Attributes 51...55, but referring to the number of links (trackbacks) instead of the number of comments
61: The length of time between the publication of the blog post and basetime
62: The length of the blog post
63...262: The 200 bag-of-words features for 200 frequent words in the text of the blog post
263...269: Binary indicator features (0 or 1) for the weekday (Monday...Sunday) of the basetime
270...276: Binary indicator features (0 or 1) for the weekday (Monday...Sunday) of the date of publication of the blog post
277: Number of parent pages: we consider a blog post P a parent of blog post B if B is a reply (trackback) to P
278...280: Minimum, maximum and average number of comments that the parents received
281: The target: the number of comments in the next 24 hours (relative to basetime)
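The flat attribute numbering above is easier to work with as named column groups. A small helper that splits one 281-value CSV row accordingly (note the attribute numbers are 1-based, so attribute k maps to 0-based index k-1):

```python
# 0-based column slices for the 281-column BlogFeedback rows
# (attribute k in the description above is column k-1 here).
COLUMNS = {
    "source_stats":   slice(0, 50),    # attrs 1...50
    "comment_counts": slice(50, 55),   # attrs 51...55
    "link_counts":    slice(55, 60),   # attrs 56...60
    "post_age_len":   slice(60, 62),   # attrs 61...62
    "bag_of_words":   slice(62, 262),  # attrs 63...262
    "weekday_base":   slice(262, 269), # attrs 263...269
    "weekday_pub":    slice(269, 276), # attrs 270...276
    "parents":        slice(276, 280), # attrs 277...280
    "target":         slice(280, 281), # attr 281
}

def split_row(row):
    """Split one 281-value CSV row into named feature groups."""
    values = [float(x) for x in row]
    assert len(values) == 281, "BlogFeedback rows have 281 columns"
    return {name: values[sl] for name, sl in COLUMNS.items()}

# Example with a synthetic all-zero row:
groups = split_row(["0"] * 281)
```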
Buza, K. (2014). Feedback Prediction for Blogs. In Data Analysis, Machine Learning and Knowledge Discovery (pp. 145-152). Springer International Publishing (http://cs.bme.hu/~buza/pdfs/gfkl2012_blogs.pdf).
WorldPop produces different types of gridded population count datasets, depending on the methods used and end application.
Please make sure you have read our Mapping Populations overview page before choosing and downloading a dataset.
Bespoke methods used to produce datasets for specific individual countries are available through the WorldPop Open Population Repository (WOPR) link below.
These are 100m resolution gridded population estimates using customized methods ("bottom-up" and/or "top-down") developed for the latest data available from each country.
They can also be visualised and explored through the woprVision App.
The remaining datasets in the links below are produced using the "top-down" method, with either the unconstrained or the constrained disaggregation variant.
Please make sure you read the Top-down estimation modelling overview page to decide on which datasets best meet your needs.
Datasets are available to download in Geotiff and ASCII XYZ format at resolutions of 3 and 30 arc-seconds (approximately 100m and 1km at the equator, respectively):
- Unconstrained individual countries 2000-2020 (1km resolution): Consistent 1km resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020.
- Unconstrained individual countries 2000-2020 (100m resolution): Consistent 100m resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020.
- Unconstrained individual countries 2000-2020 UN adjusted (100m resolution): Consistent 100m resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020, adjusted to match United Nations national population estimates (UN 2019).
- Unconstrained individual countries 2000-2020 UN adjusted (1km resolution): Consistent 1km resolution population count datasets created using unconstrained top-down methods for all countries of the world for each year 2000-2020, adjusted to match United Nations national population estimates (UN 2019).
- Unconstrained global mosaics 2000-2020 (1km resolution): Mosaiced 1km resolution versions of the "Unconstrained individual countries 2000-2020" datasets.
- Constrained individual countries 2020 (100m resolution): Consistent 100m resolution population count datasets created using constrained top-down methods for all countries of the world for 2020.
- Constrained individual countries 2020 UN adjusted (100m resolution): Consistent 100m resolution population count datasets created using constrained top-down methods for all countries of the world for 2020, adjusted to match United Nations national population estimates (UN 2019).
Older datasets produced for specific individual countries and continents, using a set of tailored geospatial inputs and differing "top-down" methods and time periods, are still available for download here: Individual countries and Whole Continent.
Data for earlier dates is available directly from WorldPop.
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University (2018). Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00645
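The approximate ground distances quoted above for the 3 and 30 arc-second resolutions can be checked directly, assuming the WGS84 equatorial radius:

```python
import math

# Ground distance of one arc-second along the equator (WGS84 radius).
EQ_RADIUS_M = 6378137.0
M_PER_ARCSEC = EQ_RADIUS_M * math.pi / 180 / 3600   # ~30.9 m

res_3arcsec_m  = 3  * M_PER_ARCSEC   # ~93 m  ("approximately 100m")
res_30arcsec_m = 30 * M_PER_ARCSEC   # ~928 m ("approximately 1km")
```

Cell widths shrink with the cosine of latitude, which is why the quoted distances hold only at the equator.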
Environment and Climate Change Canada’s (ECCC) Climate Research Division (CRD) and the Pacific Climate Impacts Consortium (PCIC) previously produced statistically downscaled climate scenarios based on simulations from climate models that participated in the Coupled Model Intercomparison Project phase 5 (CMIP5) in 2015. ECCC and PCIC have now updated the CMIP5-based downscaled scenarios with two new sets of downscaled scenarios based on the next generation of climate projections from the Coupled Model Intercomparison Project phase 6 (CMIP6). The scenarios are named Canadian Downscaled Climate Scenarios–Univariate method from CMIP6 (CanDCS-U6) and Canadian Downscaled Climate Scenarios–Multivariate method from CMIP6 (CanDCS-M6). CMIP6 climate projections are based on both updated global climate models and new emissions scenarios called “Shared Socioeconomic Pathways” (SSPs). Statistically downscaled datasets have been produced from 26 CMIP6 global climate models (GCMs) under three different emission scenarios (i.e., SSP1-2.6, SSP2-4.5, and SSP5-8.5), with PCIC later adding SSP3-7.0 to the CanDCS-M6 dataset. The CanDCS-U6 was downscaled using the Bias Correction/Constructed Analogues with Quantile mapping version 2 (BCCAQv2) procedure, and the CanDCS-M6 was downscaled using the N-dimensional Multivariate Bias Correction (MBCn) method. The CanDCS-U6 dataset was produced using the same downscaling target data (NRCANmet) as the CMIP5-based downscaled scenarios, while the CanDCS-M6 dataset implements a new target dataset (ANUSPLIN and PNWNAmet blended dataset). Statistically downscaled individual model output and ensembles are available for download. Downscaled climate indices are available across Canada at 10km grid spatial resolution for the 1950-2014 historical period and for the 2015-2100 period following each of the three emission scenarios. A total of 31 climate indices have been calculated using the CanDCS-U6 and CanDCS-M6 datasets. 
The climate indices include 27 Climdex indices established by the Expert Team on Climate Change Detection and Indices (ETCCDI) and 4 additional indices that are slightly modified from the Climdex indices. These indices are calculated from daily precipitation and temperature values from the downscaled simulations and are available at annual or monthly temporal resolution, depending on the index. Monthly indices are also available in seasonal and annual versions. Note: projected future changes by statistically downscaled products are not necessarily more credible than those by the underlying climate model outputs. In many cases, especially for absolute threshold-based indices, projections based on downscaled data have a smaller spread because of the removal of model biases. However, this is not the case for all indices. Downscaling from GCM resolution to the fine resolution needed for impacts assessment increases the level of spatial detail and temporal variability to better match observations. Since these adjustments are GCM dependent, the resulting indices could have a wider spread when computed from downscaled data as compared to those directly computed from GCM output. In the latter case, it is not the downscaling procedure that makes future projection more uncertain; rather, it is indicative of higher variability associated with finer spatial scale. Individual model datasets and all related derived products are subject to the terms of use (https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html) of the source organization.
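As an example of the absolute threshold-based Climdex indices mentioned above, the ETCCDI "summer days" index (SU) counts the days in a year with daily maximum temperature above 25 °C; the input series below is synthetic:

```python
def summer_days(daily_tmax_c):
    """Climdex SU index: annual count of days with daily Tmax > 25 degC."""
    return sum(1 for t in daily_tmax_c if t > 25.0)

# Synthetic year of daily maxima: 300 cool days and 65 warm days
tmax = [18.0] * 300 + [27.5] * 65
su = summer_days(tmax)
```

Because bias correction removes systematic model offsets around fixed thresholds like 25 °C, such indices computed from downscaled data often show a narrower inter-model spread than those computed from raw GCM output, as noted above.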
The Gridded Population of the World, Version 4 (GPWv4): Population Density, Revision 11 consists of estimates of human population density (number of persons per square kilometer) based on counts consistent with national censuses and population registers, for the years 2000, 2005, 2010, 2015, and 2020. A proportional allocation gridding algorithm, utilizing approximately 13.5 million national and sub-national administrative units, was used to assign population counts to 30 arc-second grid cells. The population density rasters were created by dividing the population count raster for a given target year by the land area raster. The data files were produced as global rasters at 30 arc-second (~1 km at the equator) resolution. To enable faster global processing, and in support of research communities, the 30 arc-second count data were aggregated to 2.5 arc-minute, 15 arc-minute, 30 arc-minute and 1 degree resolutions to produce density rasters at these resolutions.
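The count-to-density relationship and the coarser aggregations described above follow a simple pattern: counts and land areas are summed into larger cells, and density is their ratio. A minimal sketch with a synthetic grid, aggregating 30 arc-second cells by a factor of 5 to 2.5 arc-minutes:

```python
import numpy as np

def aggregate(grid, factor):
    """Sum `factor` x `factor` blocks of cells into one coarser cell."""
    h, w = grid.shape
    return grid.reshape(h // factor, factor, w // factor, factor).sum(axis=(1, 3))

# Synthetic 10x10 grid of 30 arc-second population counts and land areas
counts = np.full((10, 10), 4.0)      # persons per cell
areas  = np.full((10, 10), 0.86)     # km^2 per cell (illustrative value)

coarse_counts = aggregate(counts, 5) # 2x2 grid of 2.5 arc-minute cells
coarse_areas  = aggregate(areas, 5)
density = coarse_counts / coarse_areas   # persons per km^2
```

Summing counts (not averaging densities) preserves the global population total at every resolution, which is why the aggregation is done on the count rasters.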
The World Database on Protected Areas (WDPA) is the most comprehensive global database of marine and terrestrial protected areas and is one of the key global biodiversity datasets being widely used by scientists, businesses, governments, international secretariats and others to inform planning, policy decisions and management. The WDPA is a joint project between the United Nations Environment Programme (UNEP) and the International Union for Conservation of Nature (IUCN). The compilation and management of the WDPA is carried out by the UNEP World Conservation Monitoring Centre (UNEP-WCMC), in collaboration with governments, non-governmental organisations, academia and industry. There are monthly updates of the data, which are made available online through the Protected Planet website, where the data is both viewable and downloadable. Data and information on the world's protected areas compiled in the WDPA are used for reporting to the Convention on Biological Diversity on progress towards reaching the Aichi Biodiversity Targets (particularly Target 11), to the UN to track progress towards the 2030 Sustainable Development Goals, to some of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) core indicators, and to other international assessments and reports including the Global Biodiversity Outlook, as well as for the publication of the United Nations List of Protected Areas. Every two years, UNEP-WCMC releases the Protected Planet Report on the status of the world's protected areas and recommendations on how to meet international goals and targets. Many platforms are incorporating the WDPA to provide integrated information to diverse users, including businesses and governments, in a range of sectors including mining, oil and gas, and finance.
For example, the WDPA is included in the Integrated Biodiversity Assessment Tool, an innovative decision support tool that gives users easy access to up-to-date information that allows them to identify biodiversity risks and opportunities within a project boundary. The reach of the WDPA is further enhanced in services developed by other parties, such as the Global Forest Watch and the Digital Observatory for Protected Areas, which provide decision makers with access to monitoring and alert systems that allow whole landscapes to be managed better. Together, these applications of the WDPA demonstrate the growing value and significance of the Protected Planet initiative. For more details on the WDPA, please read through the WDPA User Manual.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using the CRISPR-Cas9 system to perform base substitutions at the target site is a typical technique for genome editing, with potential applications in gene therapy and agricultural productivity. When the CRISPR-Cas9 system uses guide RNA to direct the Cas9 endonuclease to the target site, it may misdirect it to a potential off-target site, resulting in unintended genome editing. Although several computational methods have been proposed to predict off-target effects, there is still room for improvement in off-target effect prediction capability. In this paper, we present an effective approach called CRISPR-M with a new encoding scheme and a novel multi-view deep learning model to predict the sgRNA off-target effects for target sites containing indels and mismatches. CRISPR-M takes advantage of convolutional neural networks and bidirectional long short-term memory recurrent neural networks to construct a three-branch network towards multi-views. Compared with existing methods, CRISPR-M demonstrates significant performance advantages on real-world datasets. Furthermore, experimental analysis of CRISPR-M under multiple metrics reveals its capability to extract features and validates its superiority on sgRNA off-target effect predictions.
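CRISPR-M's actual encoding scheme is described in the paper; purely as a generic illustration of how an aligned sgRNA-target pair can be turned into numeric model input, the sketch below one-hot encodes the two sequences and concatenates them position by position (this is a common baseline encoding, not the paper's scheme):

```python
BASES = "ACGT"

def one_hot(seq):
    """One-hot encode a nucleotide sequence as a list of 4-dim vectors."""
    return [[1 if base == b else 0 for b in BASES] for base in seq]

def encode_pair(sgrna, target):
    """Concatenate sgRNA and target one-hot vectors position by position."""
    assert len(sgrna) == len(target), "sequences must be aligned"
    return [g + t for g, t in zip(one_hot(sgrna), one_hot(target))]

pair = encode_pair("ACG", "ACT")
# pair[2] encodes a G (sgRNA) paired with a T (target), i.e. a mismatch
```

A position where the two 4-dim halves differ corresponds to a mismatch, which is exactly the signal an off-target model must learn to weigh.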
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this project, we work on repairing three datasets. In the clinical trials data, multiple records, each identified by a country_protocol_code, conduct the same clinical trial, which is identified by its eudract_number. Each clinical trial has a title that can help find informative details about the design of the trial. The ground truth samples in the dataset were established by aligning information about the trial populations provided by external registries, specifically the CT.gov database and the German Trials database. Additionally, the dataset comprises other unstructured attributes that categorize the inclusion criteria for trial participants, such as inclusion.
In the allergens data, each product is identified by a code; samples with the same code represent the same product but are extracted from a different source. The allergens are indicated by ‘2’ if present, ‘1’ if there are traces of it, and ‘0’ if it is absent in a product. The dataset also includes information on ingredients in the products. Overall, the dataset comprises categorical structured data describing the presence, trace, or absence of specific allergens, and unstructured text describing ingredients.
N.B.: Each '.zip' file contains a set of 5 '.csv' files which are part of the aforementioned datasets:
https://creativecommons.org/publicdomain/zero/1.0/
Relevant indicators drawn from the World Development Indicators, reorganized according to the goals and targets of the Sustainable Development Goals (SDGs). These indicators may help to monitor SDGs, but they are not always the official indicators for SDG monitoring.
This is a dataset hosted by the World Bank. The organization has an open data platform, found here, and updates its information according to the amount of data that is brought in. Explore the World Bank using Kaggle and all of the data sources available through the World Bank organization page!
This dataset is maintained using the World Bank's APIs and Kaggle's API.
Cover photo by NA on Unsplash
Unsplash Images are distributed under a unique Unsplash License.