36 datasets found

Total population worldwide 1950-2100
ai-chatbox.pro
statista.com
Updated Apr 8, 2025
Cite
Statista Research Department (2025). Total population worldwide 1950-2100 [Dataset]. https://www.ai-chatbox.pro/?_=%2Ftopics%2F13342%2Faging-populations%2F%23XgboD02vawLKoDs%2BT%2BQLIV8B6B4Q9itA
Explore at:
Dataset updated
Apr 8, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
World
Description
The world population surpassed eight billion people in 2022, having doubled from its figure less than 50 years previously. Looking forward, it is projected that the world population will reach nine billion in 2038 and 10 billion in 2060, but it will peak around 10.3 billion in the 2080s before going into decline.

Regional variations

The global population has seen rapid growth since the early 1800s, due to advances in areas such as food production, healthcare, water safety, education, and infrastructure. However, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which will rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate).

Changing projections

The United Nations releases its World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections of when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s; an earlier and lower peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolonged development arc in Sub-Saharan Africa.
Climate Change: Earth Surface Temperature Data
kaggle.com
redivis.com
zip
Updated May 1, 2017
Cite
Berkeley Earth (2017). Climate Change: Earth Surface Temperature Data [Dataset]. https://www.kaggle.com/datasets/berkeleyearth/climate-change-earth-surface-temperature-data
Explore at:
Available download formats: zip (88843537 bytes)
Dataset updated
May 1, 2017
Dataset authored and provided by
Berkeley Earth (http://berkeleyearth.org/)
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Area covered
Earth
Description
Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.

Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-time study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.

Given this complexity, a range of organizations collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCRUT.

We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.

In this dataset, we have included several files:

Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):

Date: starts in 1750 for average land temperature and 1850 for max and min land temperatures and global ocean and land temperatures

LandAverageTemperature: global average land temperature in celsius

LandAverageTemperatureUncertainty: the 95% confidence interval around the average

LandMaxTemperature: global average maximum land temperature in celsius

LandMaxTemperatureUncertainty: the 95% confidence interval around the maximum land temperature

LandMinTemperature: global average minimum land temperature in celsius

LandMinTemperatureUncertainty: the 95% confidence interval around the minimum land temperature

LandAndOceanAverageTemperature: global average land and ocean temperature in celsius

LandAndOceanAverageTemperatureUncertainty: the 95% confidence interval around the global average land and ocean temperature

Other files include:

Global Average Land Temperature by Country (GlobalLandTemperaturesByCountry.csv)

Global Average Land Temperature by State (GlobalLandTemperaturesByState.csv)

Global Land Temperatures By Major City (GlobalLandTemperaturesByMajorCity.csv)

Global Land Temperatures By City (GlobalLandTemperaturesByCity.csv)

The raw data comes from the Berkeley Earth data page.
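As a minimal sketch of how the GlobalTemperatures.csv columns described above might be read, the following Python example averages LandAverageTemperature per year using only the standard library. The sample rows and their values are invented placeholders, not figures from the dataset, and the "Date" header and ISO-style date format are assumptions based on the description; the actual file may name or format that column differently.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; not real Berkeley Earth values.
SAMPLE_CSV = """Date,LandAverageTemperature,LandAverageTemperatureUncertainty
1750-01-01,3.0,3.5
1750-02-01,5.0,3.1
1750-03-01,,
1751-01-01,2.0,2.9
"""


def yearly_land_average(csv_file):
    """Return {year: mean LandAverageTemperature}, skipping blank readings."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        value = row["LandAverageTemperature"]
        if not value:  # missing measurement for this month
            continue
        year = int(row["Date"][:4])  # assumes dates start with a 4-digit year
        sums[year] += float(value)
        counts[year] += 1
    return {year: sums[year] / counts[year] for year in sums}


averages = yearly_land_average(io.StringIO(SAMPLE_CSV))
for year, mean in sorted(averages.items()):
    print(year, round(mean, 2))
```

To run this against the real file, the `io.StringIO` object could be swapped for an opened file handle; early years have gaps, which is why blank readings are skipped rather than treated as zero.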
World-population2023
kaggle.com
Updated Jan 29, 2023
Cite
Dinar khan (2023). World-population2023 [Dataset]. https://www.kaggle.com/dinarkhan/worldpopulation2023/discussion
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 29, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dinar khan
Area covered
World
Description
A growing world population is among the most serious problems the world faces right now, and it will become uncontrollable in the future if proper steps are not taken immediately. The world saw its fastest population growth during the 20th century. In the 1950s the world population was 2.7 billion; by the end of this year it will cross 8 billion. This dataset was uploaded so that you can use your data science, machine learning, and predictive analytics skills to answer the following questions:
1. Which countries have the highest growth rate?
2. Which are the most densely populated countries in the world?
3. Taking all the variables into account, which countries should take serious steps to control their population?
Geonames - All Cities with a population > 1000
public.opendatasoft.com
data.smartidf.services
csv, excel, geojson +1
Updated Mar 10, 2024
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
Explore at:
Available download formats: csv, json, geojson, excel
Dataset updated
Mar 10, 2024
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000)

Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
Hotel Dataset: Rates, Reviews & Amenities(6k+)
kaggle.com
Updated Apr 18, 2023
Cite
Joy Shil (2023). Hotel Dataset: Rates, Reviews & Amenities(6k+) [Dataset]. http://doi.org/10.34740/kaggle/dsv/5449910
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/5449910
Dataset updated
Apr 18, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Joy Shil
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This Hotel Dataset: Rates, Reviews & Amenities(6k+) dataset includes hotel rates, guest reviews, and available amenities from two popular travel websites, TripAdvisor and Booking.com. The dataset can be used to analyze trends and insights in the hospitality industry, and inform decisions related to pricing, marketing, and customer service.

Booking.com: Founded in 1996 in Amsterdam, Booking.com has grown from a small Dutch start-up to one of the world’s leading digital travel companies. Part of Booking Holdings Inc. (NASDAQ: BKNG), Booking.com’s mission is to make it easier for everyone to experience the world.

By investing in technology that takes the friction out of travel, Booking.com seamlessly connects millions of travelers to memorable experiences, a variety of transportation options, and incredible places to stay – from homes to hotels, and much more. As one of the world’s largest travel marketplaces for both established brands and entrepreneurs of all sizes, Booking.com enables properties around the world to reach a global audience and grow their businesses.

Booking.com is available in 43 languages and offers more than 28 million reported accommodation listings, including over 6.6 million homes, apartments, and other unique places to stay. Wherever you want to go and whatever you want to do, Booking.com makes it easy and supports you with 24/7 customer support.

Tripadvisor, the world's largest travel guidance platform*, helps hundreds of millions of people each month** become better travelers, from planning to booking to taking a trip. Travelers across the globe use the Tripadvisor site and app to discover where to stay, what to do and where to eat based on guidance from those who have been there before. With more than 1 billion reviews and opinions of nearly 8 million businesses, travelers turn to Tripadvisor to find deals on accommodations, book experiences, reserve tables at delicious restaurants and discover great places nearby. As a travel guidance company available in 43 markets and 22 languages, Tripadvisor makes planning easy no matter the trip type. The subsidiaries of Tripadvisor, Inc. (Nasdaq: TRIP), own and operate a portfolio of travel media brands and businesses, operating under various websites and apps.
India: Soils Harmonized World Soil Database - General
hub.arcgis.com
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated Feb 1, 2022
Cite
GIS Online (2022). India: Soils Harmonized World Soil Database - General [Dataset]. https://hub.arcgis.com/maps/9f9535990648488a92cdd4d3b76dd43e
Explore at:
Dataset updated
Feb 1, 2022
Dataset authored and provided by
GIS Online
Area covered

Description
Soil is a key natural resource that provides the foundation of basic ecosystem services. Soil determines the types of farms and forests that can grow on a landscape. Soil filters water. Soil helps regulate the Earth's climate by storing large amounts of carbon. Activities that degrade soils reduce the value of the ecosystem services that soil provides. For example, since 1850, 35% of human-caused greenhouse gas emissions have been linked to land use change. The Soil Science Society of America is a good source of additional information.

Dataset Summary

This layer provides access to a 30 arc-second (roughly 1 km) cell-sized raster with attributes describing the basic properties of soil derived from the Harmonized World Soil Database v 1.2. The values in this layer are for the dominant soil in each mapping unit (sequence field = 1).

Attributes in this layer include:
Soil Phase 1 and Soil Phase 2 - Phases identify characteristics of soils important for land use or management. Soils may have up to 2 phases with phase 1 being more important than phase 2.
Other Properties - provides additional information important for agriculture.

Additionally, 3 class description fields were added by Esri based on the document Harmonized World Soil Database Version 1.2 for use in web map pop-ups:
Soil Phase 1 Description
Soil Phase 2 Description
Other Properties Description

The layer is symbolized with the Soil Unit Name field. The document Harmonized World Soil Database Version 1.2 provides more detail on the soil properties attributes contained in this layer.

Other attributes contained in this layer include:
Soil Mapping Unit Name - the name of the spatially dominant major soil group
Soil Mapping Unit Symbol - a two letter code for labeling the spatially dominant major soil group in thematic maps
Data Source - the HWSD is an aggregation of datasets. The data sources are the European Soil Database (ESDB), the 1:1 million soil map of China (CHINA), the Soil and Terrain Database Program (SOTWIS), and the Digital Soil Map of the World (DSMW).
Percentage of Mapping Unit covered by dominant component

More information on the Harmonized World Soil Database is available here.

Other layers created from the Harmonized World Soil Database are available on ArcGIS Online:
World Soils Harmonized World Soil Database - Bulk Density
World Soils Harmonized World Soil Database – Chemistry
World Soils Harmonized World Soil Database - Exchange Capacity
World Soils Harmonized World Soil Database – Hydric
World Soils Harmonized World Soil Database – Texture

The authors of this data set request that projects using these data include the following citation:
FAO/IIASA/ISRIC/ISSCAS/JRC, 2012. Harmonized World Soil Database (version 1.2). FAO, Rome, Italy and IIASA, Laxenburg, Austria.

What can you do with this layer?

This layer is suitable for both visualization and analysis. It can be used in ArcGIS Online in web maps and applications and can be used in ArcGIS Desktop. This layer has query, identify, and export image services available. This layer is restricted to a maximum area of 16,000 x 16,000 pixels - an area 4,000 kilometers on a side or an area approximately the size of Europe. The source data for this layer are available here.

This layer is part of a larger collection of landscape layers that you can use to perform a wide variety of mapping and analysis tasks. The Living Atlas of the World provides an easy way to explore the landscape layers and many other beautiful and authoritative maps on hundreds of topics. Geonet is a good resource for learning more about landscape layers and the Living Atlas of the World. To get started follow these links:
Living Atlas Discussion Group
Soil Data Discussion Group

The Esri Insider Blog provides an introduction to the Ecophysiographic Mapping project.
United States GDP
tradingeconomics.com
fa.tradingeconomics.com
csv, excel, json, xml
Cite
TRADING ECONOMICS, United States GDP [Dataset]. https://tradingeconomics.com/united-states/gdp
Explore at:
Available download formats: xml, excel, json, csv
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1960 - Dec 31, 2024
Area covered
United States
Description
The Gross Domestic Product (GDP) of the United States was worth 29184.89 billion US dollars in 2024, according to official data from the World Bank. The GDP value of the United States represents 27.49 percent of the world economy. This dataset provides United States GDP actual values, historical data, forecasts, a chart, statistics, an economic calendar and news.
Open Images
kaggle.com
opendatalab.com
zip
Updated Feb 12, 2019
Cite
Google BigQuery (2019). Open Images [Dataset]. https://www.kaggle.com/bigquery/open-images
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Google (http://google.com/)
Authors
Google BigQuery
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Context

Labeled datasets are useful in machine learning research.

Content

This public dataset contains approximately 9 million URLs and metadata for images that have been annotated with labels spanning more than 6,000 categories.

Tables: 1) annotations_bbox 2) dict 3) images 4) labels

Update Frequency: Quarterly

Querying BigQuery Tables

Fork this kernel to get started.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:open_images

https://cloud.google.com/bigquery/public-data/openimages

APA-style citation: Google Research (2016). The Open Images dataset [Image urls and labels]. Available from github: https://github.com/openimages/dataset.

Use: The annotations are licensed by Google Inc. under CC BY 4.0 license.

The images referenced in the dataset are listed as having a CC BY 2.0 license. Note: while we tried to identify images that are licensed under a Creative Commons Attribution license, we make no representations or warranties regarding the license status of each image and you should verify the license for each image yourself.

Banner Photo by Mattias Diesel from Unsplash.

Inspiration

Which labels are in the dataset? Which labels have "bus" in their display names? How many images of a trolleybus are in the dataset? What are some landing pages of images with a trolleybus? Which images with cherries are in the training set?
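One of the questions above, which labels have "bus" in their display names, can be sketched as a substring query over the `dict` table. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for BigQuery, the column names `label_name` and `label_display_name` are assumed rather than taken from the description, and the three rows are invented placeholders.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; column names are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dict (label_name TEXT, label_display_name TEXT)")
conn.executemany(
    "INSERT INTO dict VALUES (?, ?)",
    [
        # Placeholder rows for illustration, not actual dataset contents.
        ("/m/placeholder1", "Bus"),
        ("/m/placeholder2", "Trolleybus"),
        ("/m/placeholder3", "Cherry"),
    ],
)

# Case-insensitive substring match on the display name.
query = """
    SELECT label_display_name
    FROM dict
    WHERE LOWER(label_display_name) LIKE '%bus%'
    ORDER BY label_display_name
"""
bus_labels = [row[0] for row in conn.execute(query)]
print(bus_labels)
conn.close()
```

The same SELECT shape should carry over to a BigQuery table reference, though the real schema would need to be checked before relying on these column names.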
Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles &...
datarade.ai
Updated Jan 1, 2022
Cite
Success.ai (2022). Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles & 70M Companies – Best Price and Quality Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-full-dataset-enrichment-api-700m-pu-success-ai
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Jan 1, 2022
Dataset provided by
Area covered
Tunisia, Equatorial Guinea, Svalbard and Jan Mayen, Guatemala, Saint Barthélemy, Jordan, United Republic of, Qatar, Greenland, Nicaragua
Description
Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.

API Integration: Our datasets are easily accessible via API, allowing for seamless integration into your existing systems. This ensures that you can automate data retrieval and update processes, maintaining the flow of fresh, accurate information directly into your applications.

Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.

Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.

AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.

Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.

Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.

Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.

Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.

Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

Key Use Cases:

Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.

Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.

Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.

Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.

Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.

CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.

Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understandin...
Data from: Net-zero 1.5 °C sectorial pathways for G20 countries: energy and...
data.niaid.nih.gov
datadryad.org
zip
Updated Sep 1, 2023
Cite
Sven Teske; Jonathan Rispler; Sarah Niklas; Maartje Feenstra; Soheil Mohseni; Simran Talwar; Saori Miyake (2023). Net-zero 1.5 °C sectorial pathways for G20 countries: energy and emissions data to inform science-based decarbonization targets [Dataset]. http://doi.org/10.5061/dryad.cz8w9gj82
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.cz8w9gj82
Dataset updated
Sep 1, 2023
Dataset provided by
University of Technology Sydney
Authors
Sven Teske; Jonathan Rispler; Sarah Niklas; Maartje Feenstra; Soheil Mohseni; Simran Talwar; Saori Miyake
License
https://spdx.org/licenses/CC0-1.0.html
Description
These data cover the global, regional (EU-27), and country-specific (G20 member countries) energy and emission pathways required to achieve a defined carbon budget of under 450 Gt/CO2, developed to limit the mean global temperature rise to 1.5°C with over 50% likelihood. The data were calculated with the 1.5°C sectorial pathways of the One Earth Climate Model, an integrated energy assessment model devised at the University of Technology Sydney (UTS). The data consist of the following six zip-folder datasets (refer to Section 2 for an explanation of the data):
1. Appendix folder: Each file contains one worksheet, which summarizes the overall 1.5°C scenario.
2. Sector folder (XLSX): Each file contains one worksheet, which summarizes the industry sectors analysed.
3. Sector folder (CSV): The data contained are the same as those described in point 2.
4. Sector emissions folder: Each file contains one worksheet, which summarizes the total annual emissions for each industry sector.
5. Scope emissions folder (XLSX): Each file contains one worksheet, which summarizes the total annual emissions for each industry sector, with the additional specificity of emission scope.
6. Scope emissions folder (CSV): The data contained are the same as those described in point 5.

Methods

The data consist of the following six zipped dataset folders, each containing 21 separate files, one for each of the areas assessed.
1. Appendix zip folder: contains 21 XLSX files. Each file contains one worksheet, which summarizes the overall 1.5 °C scenario. This tab is called the ‘Appendix’ and contains: electricity generation (TWh/a), transport final energy (PJ/a), heat supply and air conditioning (PJ/a), installed capacity (GW), final energy demand (PJ/a), energy-related CO2 emissions (million tons/a), and primary energy demand (PJ/a).
2. Sector zip folder (XLSX): contains 21 XLSX files. Each file contains one worksheet, which summarizes the industry sectors analysed. Key industry metrics are provided, such as the energy and carbon intensities of the GICS sectors analysed. Due to industry specificity and the choice of methodology, the units of data vary between the different sectors.
3. Sector zip folder (CSV): contains 21 CSV files. The data contained are the same as those described in point 2. However, the data have been organized in a database layout and saved in the CSV file format, significantly improving data parsing.
4. Sector emission zip folder: contains 21 XLSX files. Each file contains one worksheet, which summarizes the total annual emissions (MtCO2/a) for each industry sector.
5. Scope emissions zip folder (XLSX): contains 21 XLSX files. Each file contains one worksheet, which summarizes the total annual emissions (MtCO2/a) for each industry sector and specifies the emission scopes. This tab also provides an additional breakdown of emissions into the categories of CO2 and total GHG emissions. Two accounting methodologies are presented: (i) the OECM approach, which defines Scope 1 emissions as those related to heat and energy use; and (ii) the production-centric approach, which places the emission burden of other non-energy and Scope 3 emissions on the producer, because they are categorized as Scope 1 emissions.
6. Scope emissions zip folder (CSV): contains 21 CSV files. The data contained are the same as those described in point 5. However, the data have been organized in a database layout and saved in the CSV file format to improve data parsing.

The six datasets are summarized in Table 1, with further information on the data presented in the following sub-sections.

Table 1: Overview of the data files/datasets

Label      Name of data file/dataset  File types  Data repository and identifier (DOI or accession number)
Dataset 1  Appendix                   XLSX        https://doi.org/10.5061/dryad.cz8w9gj82
Dataset 2  Sector_XLSX                XLSX        https://doi.org/10.5061/dryad.cz8w9gj82
Dataset 3  Sector_CSV                 CSV         https://doi.org/10.5061/dryad.cz8w9gj82
Dataset 4  Sector_Emission            XLSX        https://doi.org/10.5061/dryad.cz8w9gj82
Dataset 5  Scope_Emission_XLSX        XLSX        https://doi.org/10.5061/dryad.cz8w9gj82
Dataset 6  Scope_Emission_CSV         CSV         https://doi.org/10.5061/dryad.cz8w9gj82

1.1. Description of data parameters

The datasets contain the following scenario input parameters:
1. Market development: current and assumed development of the demand by sector, such as cement produced, passenger kilometers travelled, or assumed market volume in US$2015 gross domestic product (GDP).
2. Energy intensity (activity based): energy use per unit of service and/or product; for example, in megajoules (MJ) per passenger kilometer travelled (MJ/pkm), MJ per ton of steel (MJ/ton steel), aluminum, or cement.
3. Energy intensity (finance based): energy use per unit of investment in MJ per US$ GDP (MJ/$GDP) contributed by, for example, the forestry or agricultural sector.

The dataset contains the following scenario output parameters:
4. Carbon intensity: current and future carbon intensities per unit of product or service; for example, in tons of CO2 per ton of steel produced (tCO2/ton steel) or grams of carbon dioxide per passenger kilometer (gCO2/pkm).
5. Scope 1, 2, and 3 emissions: datasets for each of the industry sectors and countries analysed. In addition to the emissions data, the deviations of the emissions from those of the year 2019 are provided.
6. Country scenarios: complete country scenario datasets of historical data (2012, 2015–2020) and future projections (2025–2050 in 5-year increments). Energy demand and supply data by technology, fuel, and sector are provided, including the overall energy and carbon emissions balance of the country analysed.

1.2. Geographic resolution: country data provided

The dataset contains data for the following 21 countries and regions:
· Regions: global, EU-27
· Countries (G20 member countries): Canada, USA, Mexico, Brazil, Argentina, Germany, France, Italy, United Kingdom, Türkiye, Russian Federation, Saudi Arabia, South Africa, Indonesia, India, China, Japan, South Korea, and Australia

1.3. Sectorial resolution: industry sector data provided

The dataset contains data for the following industry sectors: Agriculture & food processing, forestry & wood products, chemical industry, aluminum industry, construction and buildings, water utilities, textile & leather industry, steel industry, cement industry, transport sector (aviation: freight & passenger transport; shipping: freight & passenger transport; and road transport: freight & passenger transport).

1.4. Time resolution

The scenario data are provided for the years 2017, 2018, 2019, 2020, 2025, 2030, 2035, 2040, 2045, and 2050.
World GDP
tradingeconomics.com
it.tradingeconomics.com
csv, excel, json, xml
Updated Dec 15, 2024
Cite
TRADING ECONOMICS (2024). World GDP [Dataset]. https://tradingeconomics.com/world/gdp
Explore at:
Available download formats: json, excel, csv, xml
Dataset updated
Dec 15, 2024
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1960 - Dec 31, 2024
Area covered
World
Description
World Gross Domestic Product (GDP) was worth 111,326.37 billion US dollars in 2024, according to official data from the World Bank. This dataset includes a chart with historical data for World GDP.
lmsys-chat-1m
huggingface.co
opendatalab.com
Updated Sep 17, 2023
Cite
Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m
Explore at:
Dataset updated
Sep 17, 2023
Dataset authored and provided by
Large Model Systems Organization
Description
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It was collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.
Data from: A dataset of spatiotemporally sampled MODIS Leaf Area Index with...
agdatacommons.nal.usda.gov
application/csv
Updated May 1, 2025
Cite
Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson (2025). A dataset of spatiotemporally sampled MODIS Leaf Area Index with corresponding Landsat surface reflectance over the contiguous US [Dataset]. http://doi.org/10.15482/USDA.ADC/1521097
Explore at:
Available download formats: application/csv
Unique identifier
https://doi.org/10.15482/USDA.ADC/1521097
Dataset updated
May 1, 2025
Dataset provided by
Ag Data Commons
Authors
Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Contiguous United States, United States
Description
Leaf Area Index (LAI) is a fundamental vegetation structural variable that drives energy and mass exchanges between the plant and the atmosphere. Moderate-resolution (300m – 7km) global LAI data products have been widely applied to track global vegetation changes, drive Earth system models, monitor crop growth and productivity, etc. Yet, cutting-edge applications in climate adaptation, hydrology, and sustainable agriculture require LAI information at higher spatial resolution (< 100m) to model and understand heterogeneous landscapes.

This dataset was built to assist a machine-learning-based approach for mapping LAI from 30m-resolution Landsat images across the contiguous US (CONUS). The data was derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) Version 6 LAI/FPAR, Landsat Collection 1 surface reflectance, and NLCD Land Cover datasets over 2006 – 2018 using Google Earth Engine. Each record/sample/row includes a MODIS LAI value, corresponding Landsat surface reflectance in green, red, NIR, SWIR1 bands, a land cover (biome) type, geographic location, and other auxiliary information. Each sample represents a MODIS LAI pixel (500m) within which a single biome type dominates 90% of the area. The spatial homogeneity of the samples was further controlled by a screening process based on the coefficient of variation of the Landsat surface reflectance. In total, there are approximately 1.6 million samples, stratified by biome, Landsat sensor, and saturation status from the MODIS LAI algorithm. This dataset can be used to train machine learning models and generate LAI maps for Landsat 5, 7, 8 surface reflectance images within CONUS. Detailed information on the sample generation and quality control can be found in the related journal article.

Resources in this dataset:
Resource Title: README.
File Name: LAI_train_samples_CONUS_README.txt
Resource Description: Description and metadata of the main dataset
Resource Software Recommended: Notepad, url: https://www.microsoft.com/en-us/p/windows-notepad/9msmlrh6lzf3?activetab=pivot:overviewtab
Resource Title: LAI_training_samples_CONUS.
File Name: LAI_train_samples_CONUS_v0.1.1.csv
Resource Description: This CSV file consists of the training samples for estimating Leaf Area Index based on Landsat surface reflectance images (Collection 1 Tier 1). Each sample has a MODIS LAI value and corresponding surface reflectance derived from Landsat pixels within the MODIS pixel.
Contact: Yanghui Kang (kangyanghui@gmail.com)
Column description

UID: Unique identifier. Format: LATITUDE_LONGITUDE_SENSOR_PATHROW_DATE
Landsat_ID: Landsat image ID
Date: Landsat image date in "YYYYMMDD"
Latitude: Latitude (WGS84) of the MODIS LAI pixel center
Longitude: Longitude (WGS84) of the MODIS LAI pixel center
MODIS_LAI: MODIS LAI value in "m2/m2"
MODIS_LAI_std: MODIS LAI standard deviation in "m2/m2"
MODIS_LAI_sat: 0 - MODIS Main (RT) method used, no saturation; 1 - MODIS Main (RT) method with saturation
NLCD_class: Majority class code from the National Land Cover Dataset (NLCD)
NLCD_frequency: Percentage of the area covered by the majority class from NLCD
Biome: Biome type code mapped from NLCD (see below for more information)
Blue: Landsat surface reflectance in the blue band
Green: Landsat surface reflectance in the green band
Red: Landsat surface reflectance in the red band
Nir: Landsat surface reflectance in the near infrared band
Swir1: Landsat surface reflectance in the shortwave infrared 1 band
Swir2: Landsat surface reflectance in the shortwave infrared 2 band
Sun_zenith: Solar zenith angle from the Landsat image metadata. This is a scene-level value.
Sun_azimuth: Solar azimuth angle from the Landsat image metadata. This is a scene-level value.
NDVI: Normalized Difference Vegetation Index computed from Landsat surface reflectance
EVI: Enhanced Vegetation Index computed from Landsat surface reflectance
NDWI: Normalized Difference Water Index computed from Landsat surface reflectance
GCI: Green Chlorophyll Index = Nir/Green - 1
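
The index columns can be recomputed from the band columns. The following is a minimal sketch, not the authors' processing code: the GCI formula is the one given in the column list, while the NDVI formula is the standard (NIR - Red) / (NIR + Red) definition, which is assumed here to match what the dataset uses. The function names and the sample reflectance values are hypothetical.

```python
def gci(nir: float, green: float) -> float:
    """Green Chlorophyll Index as given in the column list: Nir/Green - 1."""
    return nir / green - 1.0


def ndvi(nir: float, red: float) -> float:
    """Standard NDVI definition (assumed to match the dataset's NDVI column)."""
    return (nir - red) / (nir + red)


# Hypothetical reflectance values for one sample row (not taken from the dataset).
sample = {"Green": 0.08, "Red": 0.05, "Nir": 0.40}

print("GCI :", round(gci(sample["Nir"], sample["Green"]), 4))
print("NDVI:", round(ndvi(sample["Nir"], sample["Red"]), 4))
```

Both indices are ratios of bands, so they give the same result whether reflectance is stored as a fraction or as a scaled integer, provided every band uses the same scale factor.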

Biome code

1 - Deciduous Forest
2 - Evergreen Forest
3 - Mixed Forest
4 - Shrubland
5 - Grassland/Pasture
6 - Cropland
7 - Woody Wetland
8 - Herbaceous Wetland

Reference Dataset:
All data was accessed through Google Earth Engine
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment.
MODIS Version 6 Leaf Area Index/FPAR 4-day L5 Global 500m
Myneni, R., Y. Knyazikhin, T. Park. MOD15A2H MODIS/Terra Leaf Area Index/FPAR 8-Day L4 Global 500m SIN Grid V006. 2015, distributed by NASA EOSDIS Land Processes DAAC, https://doi.org/10.5067/MODIS/MOD15A2H.006
Landsat 5/7/8 Collection 1 Surface Reflectance
Landsat Level-2 Surface Reflectance Science Product courtesy of the U.S. Geological Survey.
Masek, J.G., Vermote, E.F., Saleous N.E., Wolfe, R., Hall, F.G., Huemmrich, K.F., Gao, F., Kutler, J., and Lim, T-K. (2006). A Landsat surface reflectance dataset for North America, 1990–2000. IEEE Geoscience and Remote Sensing Letters 3(1):68-72. http://dx.doi.org/10.1109/LGRS.2005.857030.
Vermote, E., Justice, C., Claverie, M., & Franch, B. (2016). Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sensing of Environment. http://dx.doi.org/10.1016/j.rse.2016.04.008.
National Land Cover Dataset (NLCD)
Yang, Limin, Jin, Suming, Danielson, Patrick, Homer, Collin G., Gass, L., Bender, S.M., Case, Adam, Costello, C., Dewitz, Jon A., Fry, Joyce A., Funk, M., Granneman, Brian J., Liknes, G.C., Rigge, Matthew B., Xian, George, A new generation of the United States National Land Cover Database: Requirements, research priorities, design, and implementation strategies: ISPRS Journal of Photogrammetry and Remote Sensing, v. 146, p. 108–123, at https://doi.org/10.1016/j.isprsjprs.2018.09.006
Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
EMIT L2B Carbon Dioxide Enhancement Data 60 m V002 - Dataset - NASA Open...
data.nasa.gov
Updated May 11, 2025
Cite
nasa.gov (2025). EMIT L2B Carbon Dioxide Enhancement Data 60 m V002 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/emit-l2b-carbon-dioxide-enhancement-data-60-m-v002
Explore at:
Dataset updated
May 11, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
The Earth Surface Mineral Dust Source Investigation (EMIT) instrument measures surface mineralogy, targeting the Earth’s arid dust source regions. EMIT is installed on the International Space Station. EMIT uses imaging spectroscopy to take measurements of sunlit regions of interest between 52° N latitude and 52° S latitude. An interactive map showing the regions being investigated, current and forecasted data coverage, and additional data resources can be found on the VSWIR Imaging Spectroscopy Interface for Open Science (VISIONS) EMIT Open Data Portal.

In addition to its primary objective described above, EMIT has demonstrated the capacity to characterize methane (CH4) and carbon dioxide (CO2) point-source emissions by measuring gas absorption features in the shortwave infrared bands. The EMIT Level 2B Carbon Dioxide Enhancement Data (EMITL2BCO2ENH) Version 2 data product is a total vertical column enhancement estimate of carbon dioxide in parts per million meter (ppm m) based on an adaptive matched filter approach. EMITL2BCO2ENH provides per-pixel carbon dioxide enhancement data used to identify carbon dioxide plume complexes, per-pixel carbon dioxide uncertainty due to sensor noise, and per-pixel carbon dioxide sensitivity that can be used to remove bias from the enhancement data. The EMITL2BCO2ENH Version 2 data product includes methane enhancement granules for all captured scenes, regardless of carbon dioxide plume complex identification. Each granule contains three Cloud Optimized GeoTIFF (COG) files at a spatial resolution of 60 meters (m): Carbon Dioxide Enhancement (EMIT_L2B_CO2ENH), Carbon Dioxide Uncertainty (EMIT_L2B_CO2UNCERT), and Carbon Dioxide Sensitivity (EMIT_L2B_CO2SENS). The EMITL2BCO2ENH COG files contain carbon dioxide enhancement data based primarily on EMITL1BRAD radiance values. Each granule is approximately 75 kilometers (km) by 75 km, nominal at the equator, with some granules near the end of an orbit segment reaching 150 km in length.

Known Issues
Data acquisition gap: From September 13, 2022, through January 6, 2023, a power issue outside of EMIT caused a pause in operations. Due to this shutdown, no data were acquired during that timeframe.

Improvements/Changes from Previous Versions
Carbon dioxide uncertainty and sensitivity variables have been added. For more details on the uncertainty variable, see Section 6 of the Algorithm Theoretical Basis Document (ATBD) and Section 4.2.2 for details on the sensitivity variable.
Enhancement, uncertainty, and sensitivity data are now included for all granules, including those without plume complexes. Version 1 of this product only included enhancement data for granules where plumes were present.
The matched filter used to produce carbon dioxide enhancement data has been improved by adjusting the channels used to those that fall within 500-1340 nanometer (nm), 1500-1790 nm, or 1950-2450 nm. More details can be found in Section 4.2.3 of the ATBD.
Data from: Analysis of cosmogenic 10Be concentrations of Siwalik sediments...
dataservices.gfz-potsdam.de
Updated 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sanjay Kumar Mandal; Dirk Scherler; Hella Wittmann (2021). Analysis of cosmogenic 10Be concentrations of Siwalik sediments and modern river sands from the north-western Himalaya and the calculated 10Be-derived paleoerosion rates [Dataset]. http://doi.org/10.5880/gfz.3.3.2021.006
Explore at:
Unique identifier
https://doi.org/10.5880/gfz.3.3.2021.006
Dataset updated
2021
Dataset provided by
datacite
GFZ Data Services
Authors
Sanjay Kumar Mandal; Dirk Scherler; Hella Wittmann
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered

Description
These datasets were used to evaluate the main controls on erosion rate variability in the northwestern Himalaya over the last ~6 million years. The Earth’s climate has been cooling during the last ~15 million years and started fluctuating between cold and warm periods ~2-3 million years ago. Many researchers think that these long-term climatic changes were accompanied by changes in continental erosion. However, quantifying erosion rates in the geological past is challenging, and previous studies reached contrasting conclusions. In this study, we quantified erosion rates in the north-western Indian Himalaya over the past 6 million years by measuring in situ-produced cosmogenic 10Be in exhumed older foreland basin sediments. The 10Be is produced by cosmic rays in minerals at the Earth's surface, and its abundance indicates erosion rates. Our reconstructed erosion rates show a quasi-cyclic pattern with a periodicity of ~1 million years and a gradual increase towards the present. We suggest that both patterns (cyclicity and gradual increase) are unrelated to climatic changes. Instead, we propose that the growth of the Himalaya by repeatedly scraping off rocks from the Indian plate (basal accretion) resulted in changes of its topography that were accompanied by changes in erosion rates. In this scenario, basal accretion episodically changes rock-uplift patterns, which brings landscapes out of equilibrium and results in quasi-cyclic variations in erosion rates. We used numerical landscape evolution simulations to demonstrate that this hypothesis is physically plausible.
The datasets provided here include a summary of the location, depositional age, and stratigraphic position of 41 Siwalik sandstone samples collected from the Haripur section in Himachal Pradesh, India (Dataset S1); 10Be analysis results of Siwalik samples (2021-006_Mandal-et-al_Dataset-S1); sample location and 10Be analysis results of modern river sands from the Yamuna River and its tributaries near the Dehradun Basin (2021-006_Mandal-et-al_Dataset-S2); input parameters for the calculation of paleoerosion rates (2021-006_Mandal-et-al_Dataset-S3); and reconstructed 10Be paleoconcentrations and paleoerosion rates (Dataset S4). Moreover, the data include a compilation of published magnetostratigraphy-derived sediment accumulation rates in the late Cenozoic Himalayan foreland basin (2021-006_Mandal-et-al_Dataset-S5). We also include a movie (2021-006_Mandal-et-al_Movie-S1) that is a complete numerical landscape evolution model run with four consecutive accretion cycles of equal magnitude. For more information (e.g., sampling method, analytical procedure, and data processing), please refer to the associated data description file and the main article (Mandal et al., 2021).
CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign...
data.niaid.nih.gov
explore.openaire.eu
+1more
Updated Jun 28, 2020
Cite
Ghassan AlRegib (2020). CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3903065
Explore at:
Dataset updated
Jun 28, 2020
Dataset provided by
Dogancan Temel
Ghassan AlRegib
Mohit Prabhushankar
Gukyeong Kwon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As one of the research directions at OLIVES Lab @ Georgia Tech, we focus on the robustness of data-driven algorithms under diverse challenging conditions where trained models can possibly be deployed. To achieve this goal, we introduced a large-scale (>2M images) traffic sign recognition dataset (CURE-TSR) which is among the most comprehensive datasets with controlled synthetic challenging conditions. Traffic sign images in the CURE-TSR dataset were cropped from the CURE-TSD dataset, which includes around 1.7 million real-world and simulator images with more than 2 million traffic sign instances. Real-world images were obtained from the BelgiumTS video sequences and simulated images were generated with the Unreal Engine 4 game development tool. Sign types include speed limit, goods vehicles, no overtaking, no stopping, no parking, stop, bicycle, hump, no left, no right, priority to, no entry, yield, and parking. Unreal and real sequences were processed with state-of-the-art visual effect software Adobe(c) After Effects to simulate challenging conditions, which include rain, snow, haze, shadow, darkness, brightness, blurriness, dirtiness, colorlessness, sensor and codec errors. Please refer to our GitHub page for code, papers, and more information.

Instructions:

The name format of the provided images is as follows: "sequenceType_signType_challengeType_challengeLevel_Index.bmp"

sequenceType:
01 - Real data
02 - Unreal data

signType:
01 - speed_limit
02 - goods_vehicles
03 - no_overtaking
04 - no_stopping
05 - no_parking
06 - stop
07 - bicycle
08 - hump
09 - no_left
10 - no_right
11 - priority_to
12 - no_entry
13 - yield
14 - parking

challengeType:
00 - No challenge
01 - Decolorization
02 - Lens blur
03 - Codec error
04 - Darkening
05 - Dirty lens
06 - Exposure
07 - Gaussian blur
08 - Noise
09 - Rain
10 - Shadow
11 - Snow
12 - Haze

challengeLevel: A number in the range [01-05], where 01 is the least severe and 05 is the most severe challenge.

Index: A number that distinguishes different instances of traffic signs under the same conditions.
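
The naming scheme above can be decoded with a few lines of code. This is a minimal sketch rather than the authors' tooling: the code tables are copied from the lists above, while the function name, the NamedTuple, and the example filename (including the width of its Index field) are hypothetical.

```python
from typing import NamedTuple

# Code tables copied from the naming instructions above.
SEQUENCE_TYPES = {1: "real", 2: "unreal"}
SIGN_TYPES = {
    1: "speed_limit", 2: "goods_vehicles", 3: "no_overtaking", 4: "no_stopping",
    5: "no_parking", 6: "stop", 7: "bicycle", 8: "hump", 9: "no_left",
    10: "no_right", 11: "priority_to", 12: "no_entry", 13: "yield", 14: "parking",
}
CHALLENGE_TYPES = {
    0: "no_challenge", 1: "decolorization", 2: "lens_blur", 3: "codec_error",
    4: "darkening", 5: "dirty_lens", 6: "exposure", 7: "gaussian_blur",
    8: "noise", 9: "rain", 10: "shadow", 11: "snow", 12: "haze",
}


class CureTsrName(NamedTuple):
    sequence_type: str
    sign_type: str
    challenge_type: str
    challenge_level: int
    index: int


def parse_cure_tsr_name(filename: str) -> CureTsrName:
    """Split 'sequenceType_signType_challengeType_challengeLevel_Index.bmp'."""
    stem = filename.rsplit(".", 1)[0]           # drop the extension
    seq, sign, chal, level, idx = (int(part) for part in stem.split("_"))
    return CureTsrName(
        SEQUENCE_TYPES[seq], SIGN_TYPES[sign], CHALLENGE_TYPES[chal], level, idx
    )


# Hypothetical filename used only for illustration.
print(parse_cure_tsr_name("01_06_09_03_0012.bmp"))
```

An unknown code raises a KeyError and a name with the wrong number of fields raises a ValueError, which may be preferable to silently mislabelling images when building a training set.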
The Climate Change Twitter Dataset
data.mendeley.com
kaggle.com
Updated May 19, 2022
Cite
Dimitrios Effrosynidis (2022). The Climate Change Twitter Dataset [Dataset]. http://doi.org/10.17632/mw8yd7z9wc.2
Explore at:
Unique identifier
https://doi.org/10.17632/mw8yd7z9wc.2
Dataset updated
May 19, 2022
Authors
Dimitrios Effrosynidis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
If you use the dataset, cite the paper: https://doi.org/10.1016/j.eswa.2022.117541

The most comprehensive dataset to date regarding climate change and human opinions via Twitter. It has the heftiest temporal coverage, spanning over 13 years, includes over 15 million tweets spatially distributed across the world, and provides the geolocation of most tweets. Seven dimensions of information are tied to each tweet, namely geolocation, user gender, climate change stance and sentiment, aggressiveness, deviations from historic temperature, and topic modeling, while accompanied by environmental disaster events information. These dimensions were produced by testing and evaluating a plethora of state-of-the-art machine learning algorithms and methods, both supervised and unsupervised, including BERT, RNN, LSTM, CNN, SVM, Naive Bayes, VADER, Textblob, Flair, and LDA.

The following columns are in the dataset:

➡ created_at: The timestamp of the tweet.
➡ id: The unique id of the tweet.
➡ lng: The longitude where the tweet was written.
➡ lat: The latitude where the tweet was written.
➡ topic: Categorization of the tweet into one of ten topics, namely seriousness of gas emissions, importance of human intervention, global stance, significance of pollution awareness events, weather extremes, impact of resource overconsumption, Donald Trump versus science, ideological positions on global warming, politics, and undefined.
➡ sentiment: A score on a continuous scale from -1 to 1, with values closer to 1 indicating positive sentiment, values closer to -1 indicating negative sentiment, and values close to 0 indicating no sentiment or a neutral tone.
➡ stance: Whether the tweet supports the belief in man-made climate change (believer), does not believe in man-made climate change (denier), or neither supports nor rejects the belief in man-made climate change (neutral).
➡ gender: Whether the user that made the tweet is male, female, or undefined.
➡ temperature_avg: The temperature deviation in Celsius relative to the January 1951-December 1980 average at the time and place the tweet was written.
➡ aggressiveness: Whether the tweet contains aggressive language or not.

Since Twitter forbids making public the text of the tweets, in order to retrieve it you need to do a process called hydrating. Tools such as Twarc or Hydrator can be used to hydrate tweets.
The GDELT Project
kaggle.com
zip
Updated Feb 12, 2019
Cite
The GDELT Project (2019). The GDELT Project [Dataset]. https://www.kaggle.com/datasets/gdelt/gdelt
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset authored and provided by
The GDELT Project
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The GDELT Project is the largest, most comprehensive, and highest resolution open database of human society ever created. The 2015 data alone records nearly three quarters of a trillion emotional snapshots and more than 1.5 billion location references, while its total archives span more than 215 years, making it one of the largest open-access spatio-temporal datasets in existence and pushing the boundaries of "big data" study of global human society. Its Global Knowledge Graph connects the world's people, organizations, locations, themes, counts, images and emotions into a single holistic network over the entire planet. How can you query, explore, model, visualize, interact, and even forecast this vast archive of human society?

Content

GDELT 2.0 has a wealth of features in the event database which includes events reported in articles published in 65 live translated languages, measurements of 2,300 emotions and themes, high resolution views of the non-Western world, relevant imagery, videos, and social media embeds, quotes, names, amounts, and more.

You may find these code books helpful:
GDELT Global Knowledge Graph Codebook V2.1 (PDF)
GDELT Event Codebook V2.0 (PDF)

Querying BigQuery tables

You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.github_repos.[TABLENAME]. Fork this kernel to get started and learn how to safely manage analyzing large BigQuery datasets.

Acknowledgements

You may redistribute, rehost, republish, and mirror any of the GDELT datasets in any form. However, any use or redistribution of the data must include a citation to the GDELT Project and a link to the website (https://www.gdeltproject.org/).

GeoDAR: Georeferenced global Dams And Reservoirs dataset for bridging...

zenodo.org
data.niaid.nih.gov

zip

Updated Jan 19, 2024

Cite

Jida Wang; Blake A. Walter; Fangfang Yao; Chunqiao Song; Meng Ding; Abu S. Maroof; Jingying Zhu; Chenyu Fan; Jordan M. McAlister; Md Safat Sikder; Yongwei Sheng; George H. Allen; Jean-François Crétaux; Yoshihide Wada (2024). GeoDAR: Georeferenced global Dams And Reservoirs dataset for bridging attributes and geolocations [Dataset]. http://doi.org/10.5281/zenodo.6163413

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.5281/zenodo.6163413

Dataset updated

Jan 19, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Documented March 19, 2023

!!NEW!!!

GeoDAR reservoirs were registered to the drainage network! Please see the auxiliary data "GeoDAR-TopoCat" at https://zenodo.org/records/7750736. "GeoDAR-TopoCat" contains the drainage topology (reaches and upstream/downstream relationships) and catchment boundary for each reservoir in GeoDAR, based on the algorithm used for Lake-TopoCat (doi:10.5194/essd-15-3483-2023).

Documented April 1, 2022

Citation

Wang, J., Walter, B. A., Yao, F., Song, C., Ding, M., Maroof, A. S., Zhu, J., Fan, C., McAlister, J. M., Sikder, M. S., Sheng, Y., Allen, G. H., Crétaux, J.-F., and Wada, Y.: GeoDAR: georeferenced global dams and reservoirs database for bridging attributes and geolocations. Earth System Science Data, 14, 1869–1899, 2022, https://doi.org/10.5194/essd-14-1869-2022.

Please cite the reference above (which was fully peer-reviewed), NOT the preprint version. Thank you.

Contact

Dr. Jida Wang, jidawang@ksu.edu, gdbruins@ucla.edu

Data description and components

Data folder “GeoDAR_v10_v11” (.zip) contains two consecutive, peer-reviewed versions (v1.0 and v1.1) of the Georeferenced global Dams And Reservoirs (GeoDAR) dataset:

GeoDAR_v10_dams (in both shapefile format and the comma-separated values (csv) format): GeoDAR version 1.0, including 22,560 dam points georeferenced based on the World Register of Dams (WRD), the International Commission on Large Dams (ICOLD; https://www.icold-cigb.org; last access on March 13th, 2019).
GeoDAR_v11_dams (in both shapefile and csv): GeoDAR version 1.1 dam points, including 24,783 dams which harmonized GeoDAR_v10_dams and the Global Reservoir and Dam Database (GRanD) v1.3 (Lehner et al., 2011).
GeoDAR_v11_reservoirs (in shapefile): GeoDAR version 1.1 reservoirs, including 21,515 reservoir polygons retrieved by associating GeoDAR_v11_dams with GRanD v1.3 reservoirs, HydroLAKES v1.0 (Messager et al., 2016), and the UCLA Circa 2015 Lake Inventory (Sheng et al., 2016). The reservoir retrieval follows a one-to-one relationship between dams and reservoirs.

As by-products of GeoDAR harmonization, folder “GeoDAR_v10_v11” also contains:

GRanD_v13_issues.csv: This file contains the original records of all 7,320 dam points in GRanD v1.3, with 94 of them marked by our identified issues and suggested corrections. These 94 records are placed at the beginning of this table. They include 89 records showing possible georeferencing and/or attribute errors, and another 5 records documented as subsumed or replaced. Our added fields start from column BG and include:
- “Issue”: main issue(s) of this record
- “Description”: more detailed explanation of the issue
- “Lat_corrected”: suggested correction for latitude (if any) in decimal degree
- “Lon_corrected”: suggested correction for longitude (if any) in decimal degree
- “Correction_source”: correction source(s)
- “Harmonized”: whether this GRanD dam was harmonized in GeoDAR v1.1 and the reason.
Wada_et_al_2017_harmonized.csv: This csv file contains the original records of all 139 georeferenced large dams/reservoirs in Wada et al. (2017; doi:10.1007/s10712-016-9399-6), with our revised storage capacities and spatial coordinates for data harmonization. Our added fields start from column E and include:
- Revised_capacity_km3: Our revised reservoir storage capacity in cubic kilometers used for harmonization
- Revised_lat: Revised latitude in decimal degree
- Revised_lon: Revised longitude in decimal degree
- Verification_notes: Description of the issues, verification sources, and other information used for harmonization.

Attribute description

Attribute	Description and values
v1.0 dams (file name: GeoDAR_v10_dams; format: comma-separated values (csv) and point shapefile)
id_v10	Dam ID for GeoDAR version 1.0 (type: integer). Note this is not the same as the International Code in ICOLD WRD but is linked to the International Code via encryption.
lat	Latitude of the dam point in decimal degree (type: float) based on datum World Geodetic System (WGS) 1984.
lon	Longitude of the dam point in decimal degree (type: float) on WGS 1984.
geo_mtd	Georeferencing method (type: text). Unique values include “geo-matching CanVec”, “geo-matching LRD”, “geo-matching MARS”, “geo-matching NID”, “geo-matching ODC”, “geo-matching ODM”, “geo-matching RSB”, “geocoding (Google Maps)”, and “Wada et al. (2017)”. Refer to Table 2 in Wang et al. (2022) for abbreviations.
qa_rank	Quality assurance (QA) ranking (type: text). Unique values include “M1”, “M2”, “M3”, “C1”, “C2”, “C3”, “C4”, and “C5”. The QA ranking provides a general measure for our georeferencing quality. Refer to Supplementary Tables S1 and S3 in Wang et al. (2022) for more explanation.
rv_mcm	Reservoir storage capacity in million cubic meters (type: float). Values are only available for large dams in Wada et al. (2017). Capacity values of other WRD records are not released due to ICOLD’s proprietary restriction. Also see Table S4 in Wang et al. (2022).
val_scn	Validation result (type: text). Unique values include “correct”, “register”, “mismatch”, “misplacement”, and “Google Maps”. Refer to Table 4 in Wang et al. (2022) for explanation.
val_src	Primary validation source (type: text). Values include “CanVec”, “Google Maps”, “JDF”, “LRD”, “MARS”, “NID”, “NPCGIS”, “NRLD”, “ODC”, “ODM”, “RSB”, and “Wada et al. (2017)”. Refer to Table 2 in Wang et al. (2022) for abbreviations.
qc	Roles and name initials of co-authors/participants during data quality control (QC) and validation. Name initials are given to each assigned dam or region and are listed generally in chronological order for each role. Collation and harmonization of large dams in Wada et al. (2017) (see Table S4 in Wang et al. (2022)) were performed by JW, and this information is not repeated in the qc attribute for a reduced file size. Although we tried to track the name initials thoroughly, the lists may not be always exhaustive, and other undocumented adjustments and corrections were most likely performed by JW.
v1.1 dams (file name: GeoDAR_v11_dams; format: comma-separated values (csv) and point shapefile)
id_v11	Dam ID for GeoDAR version 1.1 (type: integer). Note this is not the same as the International Code in ICOLD WRD but is linked to the International Code via encryption.
id_v10	v1.0 ID of this dam/reservoir (as in id_v10) if it is also included in v1.0 (type: integer).
id_grd_v13	GRanD ID of this dam if also included in GRanD v1.3 (type: integer).
lat	Latitude of the dam point in decimal degree (type: float) on WGS 1984. Value may be different from that in v1.0.
lon	Longitude of the dam point in decimal degree (type: float) on WGS 1984. Value may be different from that in v1.0.
geo_mtd	Same as the value of geo_mtd in v1.0 if this dam is included in v1.0.
qa_rank	Same as the value of qa_rank in v1.0 if this dam is included in v1.0.
val_scn	Same as the value of val_scn in v1.0 if this dam is included in v1.0.
val_src	Same as the value of val_src in v1.0 if this dam is included in v1.0.
rv_mcm_v10	Same as the value of rv_mcm in v1.0 if this dam is included in v1.0.
rv_mcm_v11	Reservoir storage capacity in million cubic meters (type: float). Due to ICOLD’s proprietary restriction, provided values are limited to dams in Wada et al. (2017) and GRanD v1.3. If a dam is in both Wada et al. (2017) and GRanD v1.3, the value from the latter (if valid) takes precedence.
har_src	Source(s) to harmonize the dam points. Unique values include “GeoDAR v1.0 alone”, “GRanD v1.3 and GeoDAR 1.0”, “GRanD v1.3 and other ICOLD”, and “GRanD v1.3 alone”. Refer to Table 1 in Wang et al. (2022) for more details.
pnt_src	Source(s) of the dam point spatial coordinates. Unique values include “GeoDAR v1.0”, “original

Synthetic Data Video Generator Market Research Report 2033
dataintelo.com
csv, pdf, pptx
Updated Jun 28, 2025
Cite
Dataintelo (2025). Synthetic Data Video Generator Market Research Report 2033 [Dataset]. https://dataintelo.com/report/synthetic-data-video-generator-market
Available download formats: pdf, pptx, csv
Dataset updated
Jun 28, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Synthetic Data Video Generator Market Outlook

According to our latest research, the global synthetic data video generator market size reached USD 1.32 billion in 2024 and is anticipated to grow at a robust CAGR of 38.7% from 2025 to 2033. By the end of 2033, the market is projected to reach USD 18.59 billion, driven by rapid advancements in artificial intelligence, the growing need for high-quality training data for machine learning models, and increasing adoption across industries such as autonomous vehicles, healthcare, and surveillance. The surge in demand for data privacy, coupled with the necessity to overcome data scarcity and bias in real-world datasets, is significantly fueling the synthetic data video generator market's growth trajectory.
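
The growth figures above follow the usual compound annual growth rate (CAGR) arithmetic. The snippet below is a minimal sketch for checking them, not the report's own method. The function names are hypothetical, and the number of compounding periods (eight, counting annual steps from 2025 to 2033) is an assumption, since the description does not state its base-year convention.

```python
def project_value(start: float, rate: float, periods: int) -> float:
    """Compound `start` at an annual `rate` for `periods` years."""
    return start * (1.0 + rate) ** periods


def implied_cagr(start: float, end: float, periods: int) -> float:
    """Annual growth rate that takes `start` to `end` in `periods` years."""
    return (end / start) ** (1.0 / periods) - 1.0


# Figures quoted in the description, in USD billion.
SIZE_2024 = 1.32
SIZE_2033 = 18.59
CAGR = 0.387

# Assumption: 2025 to 2033 is counted as eight annual steps applied to the
# 2024 figure. The report does not state which convention it uses.
PERIODS = 8

projected_2033 = project_value(SIZE_2024, CAGR, PERIODS)
rate_needed = implied_cagr(SIZE_2024, SIZE_2033, PERIODS)

print(f"Projected 2033 size: USD {projected_2033:.2f} billion")
print(f"CAGR implied by the quoted endpoints: {rate_needed:.1%}")
```

Under this eight-period assumption the projection comes to roughly USD 18.1 billion, somewhat below the quoted USD 18.59 billion, while counting nine periods from 2024 would give about USD 25.1 billion. The quoted figures therefore depend on the report's base-year and rounding conventions, which are not stated in the description.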

One of the primary growth factors for the synthetic data video generator market is the escalating demand for high-fidelity, annotated video datasets required to train and validate AI-driven systems. Traditional data collection methods are often hampered by privacy concerns, high costs, and the sheer complexity of obtaining diverse and representative video samples. Synthetic data video generators address these challenges by enabling the creation of large-scale, customizable, and bias-free datasets that closely mimic real-world scenarios. This capability is particularly vital for sectors such as autonomous vehicles and robotics, where the accuracy and safety of AI models depend heavily on the quality and variety of training data. As organizations strive to accelerate innovation and reduce the risks associated with real-world data collection, the adoption of synthetic data video generation technologies is expected to expand rapidly.

Another significant driver for the synthetic data video generator market is the increasing regulatory scrutiny surrounding data privacy and compliance. With stricter regulations such as GDPR and CCPA coming into force, organizations face mounting challenges in using real-world video data that may contain personally identifiable information. Synthetic data offers an effective solution by generating video datasets devoid of any real individuals, thereby ensuring compliance while still enabling advanced analytics and machine learning. Moreover, synthetic data video generators empower businesses to simulate rare or hazardous events that are difficult or unethical to capture in real life, further enhancing model robustness and preparedness. This advantage is particularly pronounced in healthcare, surveillance, and automotive industries, where data privacy and safety are paramount.

Technological advancements and increasing integration with cloud-based platforms are also propelling the synthetic data video generator market forward. The proliferation of cloud computing has made it easier for organizations of all sizes to access scalable synthetic data generation tools without significant upfront investments in hardware or infrastructure. Furthermore, the continuous evolution of generative adversarial networks (GANs) and other deep learning techniques has dramatically improved the realism and utility of synthetic video data. As a result, companies are now able to generate highly realistic, scenario-specific video datasets at scale, reducing both the time and cost required for AI development. This democratization of synthetic data technology is expected to unlock new opportunities across a wide array of applications, from entertainment content production to advanced surveillance systems.

From a regional perspective, North America currently dominates the synthetic data video generator market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The strong presence of leading AI technology providers, robust investment in research and development, and early adoption by automotive and healthcare sectors are key contributors to North America's market leadership. Europe is also witnessing significant growth, driven by stringent data privacy regulations and increased focus on AI-driven innovation. Meanwhile, Asia Pacific is emerging as a high-growth region, fueled by rapid digital transformation, expanding IT infrastructure, and increasing investments in autonomous systems and smart city projects. Latin America and Middle East & Africa, while still nascent, are expected to experience steady uptake as awareness and technological capabilities continue to grow.

Component Analysis

The synthetic data video generator market by comp

Cite

Statista Research Department (2025). Total population worldwide 1950-2100 [Dataset]. https://www.ai-chatbox.pro/?_=%2Ftopics%2F13342%2Faging-populations%2F%23XgboD02vawLKoDs%2BT%2BQLIV8B6B4Q9itA

Total population worldwide 1950-2100

Dataset updated

Apr 8, 2025

Dataset provided by

Statista (http://statista.com/)

Authors

Statista Research Department

Area covered

World

Description

The world population surpassed eight billion people in 2022, having doubled in less than 50 years. It is projected to reach nine billion in 2038 and 10 billion in 2060, then peak at around 10.3 billion in the 2080s before going into decline.

Regional variations

The global population has grown rapidly since the early 1800s due to advances in areas such as food production, healthcare, water safety, education, and infrastructure; however, these changes did not occur at a uniform time or pace across the world. Broadly speaking, the first regions to undergo their demographic transitions were Europe, North America, and Oceania, followed by Latin America and Asia (although Asia's development saw the greatest variation due to its size), while Africa was the last continent to undergo this transformation. Because of these differences, many so-called "advanced" countries are now experiencing population decline, particularly in Europe and East Asia, while the fastest population growth rates are found in Sub-Saharan Africa. In fact, the roughly two billion difference in population between now and the 2080s' peak will be found in Sub-Saharan Africa, which is projected to rise from 1.2 billion to 3.2 billion in this time (although populations in other continents will also fluctuate).

Changing projections

The United Nations releases its World Population Prospects report every 1-2 years, and this is widely considered the foremost demographic dataset in the world. However, recent years have seen a notable decline in projections of when the global population will peak, and at what number. Previous reports in the 2010s had suggested a peak of over 11 billion people, and that population growth would continue into the 2100s; however, an earlier and lower peak is now projected. Reasons for this include a more rapid population decline in East Asia and Europe, particularly China, as well as a prolonged development arc in Sub-Saharan Africa.

