51 datasets found
  1. Project CCHAIN

    • kaggle.com
    zip
    Updated Sep 30, 2024
    Cite
    Thinking Machines Data Science (2024). Project CCHAIN [Dataset]. https://www.kaggle.com/datasets/thinkdatasci/project-cchain
    Explore at:
    Available download formats: zip (396604023 bytes)
    Dataset updated
    Sep 30, 2024
    Authors
    Thinking Machines Data Science
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Project Climate Change, Health, and Artificial Intelligence (Project CCHAIN) dataset is a validated, open-sourced linked dataset containing 20 years (2003-2022) of climate, environmental, socioeconomic, and health dimensions at the barangay (village) level across twelve Philippine cities (Dagupan, Palayan, Navotas, Mandaluyong, Muntinlupa, Legazpi, Iloilo, Mandaue, Tacloban, Zamboanga, Cagayan de Oro, Davao). The full documentation can be accessed here.

    The tables are designed in a way that users can choose variables that are most relevant to their focus city and use case, and link these variables to form a single dataset by merging using standard geography codes and calendar dates. This can be done using the provided linking notebook, or offline using the user's own code.
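
    The linking step described above can be sketched offline in a few lines. This is a minimal, dependency-free illustration and not the provided linking notebook: the column names `geo_code` and `date`, the variable names, and every value in the sample rows are hypothetical placeholders, so check the documentation for the actual key columns.

```python
from datetime import date

# Hypothetical sample rows. "geo_code", "date", "tave" and "case_total"
# are placeholder names, not confirmed column names from the dataset.
climate_rows = [
    {"geo_code": "X001", "date": date(2020, 1, 1), "tave": 27.1},
    {"geo_code": "X001", "date": date(2020, 1, 2), "tave": 26.8},
]
health_rows = [
    {"geo_code": "X001", "date": date(2020, 1, 1), "case_total": 3},
]


def link_tables(left, right, keys=("geo_code", "date")):
    """Left-join two lists of row dicts on the given key columns."""
    # Index the right-hand table by its key tuple for constant-time lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    linked = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        # Rows without a match keep only their own columns.
        linked.append({**match, **row})
    return linked


linked = link_tables(climate_rows, health_rows)
```

    Joining on both the geography code and the date keeps each barangay-day as one row, which is the shape most time-series models expect.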

    Here are some tips on how to make the most of this dataset:
    • Focus on one location. Starting with a detailed analysis of one location allows for a better understanding of the local dynamics, which may differ across locations.
    • Choose one health data source. Pick either a central or a local data source. Using two different health data sources is not advised because it will lead to double counting of disease cases.
    • Do not use all variables at once. Do a literature review first to identify possible key variables. More often than not, using all variables is not necessary and may even yield subpar results.
    • Decide whether to use regular or downscaled climate data. Our downscaled climate data provides nuanced insights on spatial patterns of a few climate variables. Kindly read the documentation before deciding to use this data. If you are uncertain, consider using only the climate_atmosphere table instead.
    • Check data availability for your focus location and make sure it fits the requirements of your study.

    This dataset also includes household survey tables (see schema here and here) from surveys conducted in partner informal settlement communities in the cities of Muntinlupa, Davao, Iloilo, and Mandaue and administered on various dates from 2001 to 2024. Due to the sensitive nature of the surveys and the vulnerability of the subjects involved, requests for access must be submitted for review and approval by the Philippine Action for Community-Led Shelter Initiatives, Inc. (PACSII). To submit a request, please use this form.

    The Project CCHAIN dataset adopts the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This allows anyone to share (copy and redistribute) and adapt (remix, transform, and build upon) a work, as long as they give appropriate credit to the original creator.

    One exception, the tm_open_buildings table, follows the Open Database License (ODbL) as directed by its source, OpenStreetMap. Under the ODbL, users are free to use, modify, and distribute the database, but on top of CC BY 4.0's attribution requirement, this license requires users to share any modifications they make under the same ODbL license.

  2. Real Time Temperature Dataset

    • kaggle.com
    zip
    Updated Mar 17, 2023
    Cite
    Devendra Parihar (2023). Real Time Temperature Dataset [Dataset]. https://www.kaggle.com/datasets/dev523/real-time-temperature-dataset
    Explore at:
    Available download formats: zip (348468 bytes)
    Dataset updated
    Mar 17, 2023
    Authors
    Devendra Parihar
    Description

    Website link to get more datasets: https://power.larc.nasa.gov/

    The NASA POWER Project provides a wealth of data to support various applications related to energy, climate, and agriculture. One of the key datasets provided by the project is temperature data, which offers valuable insights into regional and global temperature patterns and trends. The temperature datasets are generated using advanced satellite remote sensing technologies and cover a wide range of spatial and temporal scales, from daily to monthly, and from local to global.

    The temperature data sets provided by the POWER Project have a number of uses. For example, they can be used to monitor and analyze the impacts of climate change on the planet, and to understand how changes in temperature are affecting ecosystems and the distribution of plant and animal species. They can also be used to inform energy planning and management decisions, such as the design and operation of renewable energy systems and building energy efficiency measures. The temperature data sets are also useful for agricultural planning and management, providing critical information on crop growth, water usage, and other factors that impact food production and food security.

    The temperature datasets from the NASA POWER Project are freely available to researchers, policymakers, and the general public, making them an important resource for anyone interested in the impacts of climate change and the use of renewable energy. Whether you're looking to understand the changing climate of our planet, plan and manage sustainable energy systems, or to ensure food security, the temperature datasets from the POWER Project are a valuable resource that can help you make informed decisions.

  3. Climate Change: Earth Surface Temperature Data

    • redivis.com
    • kaggle.com
    application/jsonl +7
    Updated Feb 17, 2021
    Cite
    Columbia Data Platform Demo (2021). Climate Change: Earth Surface Temperature Data [Dataset]. https://redivis.com/datasets/1e0a-f4931vvyg
    Explore at:
    Available download formats: avro, csv, sas, stata, parquet, spss, arrow, application/jsonl
    Dataset updated
    Feb 17, 2021
    Dataset provided by
    Redivis Inc.
    Authors
    Columbia Data Platform Demo
    Time period covered
    Nov 1, 1743 - Dec 1, 2015
    Area covered
    Earth
    Description

    Abstract

    Historical compilation of Earth surface temperatures. Source: https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data

    Documentation

    Data compiled by the Berkeley Earth project, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.

    In this dataset, we have included several files:

    Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):

    • Date: starts in 1750 for average land temperature and 1850 for max and min land temperatures and global ocean and land temperatures
    • LandAverageTemperature: global average land temperature in Celsius
    • LandAverageTemperatureUncertainty: the 95% confidence interval around the average
    • LandMaxTemperature: global average maximum land temperature in Celsius
    • LandMaxTemperatureUncertainty: the 95% confidence interval around the maximum land temperature
    • LandMinTemperature: global average minimum land temperature in Celsius
    • LandMinTemperatureUncertainty: the 95% confidence interval around the minimum land temperature
    • LandAndOceanAverageTemperature: global average land and ocean temperature in Celsius
    • LandAndOceanAverageTemperatureUncertainty: the 95% confidence interval around the global average land and ocean temperature
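
    The uncertainty columns listed above can be turned into explicit lower and upper bounds. The sketch below assumes the uncertainty value is the half-width of the 95% interval (so the bounds are average ± uncertainty), that the date column is named `dt`, and that missing readings appear as empty fields; all three are assumptions to verify against the file. The sample values are made up for illustration.

```python
import csv
import io

# Made-up sample in the assumed layout of GlobalTemperatures.csv.
# The "dt" header and the numeric values are illustrative assumptions.
SAMPLE = """dt,LandAverageTemperature,LandAverageTemperatureUncertainty
1750-01-01,3.0,3.5
1750-02-01,,
1750-03-01,5.5,3.0
"""


def confidence_bounds(csv_text):
    """Return (date, lower, upper) tuples, skipping rows with missing values."""
    bounds = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        avg = row["LandAverageTemperature"]
        unc = row["LandAverageTemperatureUncertainty"]
        if not avg or not unc:
            continue  # early records may have gaps
        avg, unc = float(avg), float(unc)
        # Assumption: the uncertainty is the half-width of the 95% interval.
        bounds.append((row["dt"], avg - unc, avg + unc))
    return bounds


bounds = confidence_bounds(SAMPLE)
```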

    Other files include:

    • Global Average Land Temperature by Country (GlobalLandTemperaturesByCountry.csv)
    • Global Average Land Temperature by State (GlobalLandTemperaturesByState.csv)
    • Global Land Temperatures By Major City (GlobalLandTemperaturesByMajorCity.csv)
    • Global Land Temperatures By City (GlobalLandTemperaturesByCity.csv)

    The raw data comes from the Berkeley Earth data page.

  4. Global Wildfires & Climate Dataset (1881–2025)

    • kaggle.com
    Updated Aug 16, 2025
    Cite
    Berke Karakanlı (2025). Global Wildfires & Climate Dataset (1881–2025) [Dataset]. https://www.kaggle.com/datasets/berkekarakanl/global-wildfires-and-climate-dataset-18812025
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 16, 2025
    Dataset provided by
    Kaggle
    Authors
    Berke Karakanlı
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides a long-term record of global forest fires between 1881 and 2025. It includes both wildfire information and related climate variables, making it useful for:

    • Exploratory Data Analysis (EDA)
    • Machine Learning projects
    • Time-series analysis of wildfire frequency
    • Studying the relationship between climate change and wildfires

    🌍 Dataset Features:

    • Covers multiple countries and regions worldwide
    • Includes historical and recent fire events
    • Captures environmental factors influencing fire behavior

    Researchers, data scientists, and students can use this dataset to analyze wildfire patterns, predict future risks, and explore the impact of climate change on global fire activity.

  5. The Weather Dataset

    • kaggle.com
    zip
    Updated Sep 3, 2023
    Cite
    Guillem SD (2023). The Weather Dataset [Dataset]. https://www.kaggle.com/datasets/guillemservera/global-daily-climate-data
    Explore at:
    Available download formats: zip (223125687 bytes)
    Dataset updated
    Sep 3, 2023
    Authors
    Guillem SD
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Feel free to FORK THIS NOTEBOOK in order to correctly load the data for your project!

    Overview: This dataset offers a comprehensive collection of daily weather readings from major cities around the world. The first release included only capitals; it now also covers other main cities worldwide, as well as hourly data, for a total of about 1,250 cities. Some locations provide historical data tracing back to January 2, 1833, giving users a deep dive into long-term weather patterns and their evolution.

    Data License and Updates: This dataset is updated every Sunday using data from the Meteostat API, ensuring access to the latest week's data without overburdening the data source.

    Cities DataFrame (cities.csv)

    This dataframe offers details about individual cities and weather stations. Columns:
    • station_id: Unique ID for the weather station.
    • city_name: Name of the city.
    • country: The country where the city is located.
    • state: The state or province within the country.
    • iso2: The two-letter country code.
    • iso3: The three-letter country code.
    • latitude: Latitude coordinate of the city.
    • longitude: Longitude coordinate of the city.

    Countries DataFrame (countries.csv)

    This dataframe contains information about different countries, providing insights into their geographic and demographic characteristics. Columns:
    • iso3: The three-letter code representing the country.
    • country: The English name of the country.
    • native_name: The native name of the country.
    • iso2: The two-letter code representing the country.
    • population: The population of the country.
    • area: The total land area of the country in square kilometers.
    • capital: The name of the capital city.
    • capital_lat: The latitude coordinate of the capital city.
    • capital_lng: The longitude coordinate of the capital city.
    • region: The specific region within the continent where the country is located.
    • continent: The continent to which the country belongs.
    • hemisphere: The hemisphere in which the country is located (e.g., Northern, Southern).

    Daily Weather DataFrame (daily_weather.parquet)

    This dataframe provides weather data on a daily basis. Columns:
    • station_id: Unique ID for the weather station.
    • city_name: Name of the city where the station is located.
    • date: Date of the weather record.
    • season: Season corresponding to the date (e.g., summer, winter).
    • avg_temp_c: Average temperature in Celsius.
    • min_temp_c: Minimum temperature in Celsius.
    • max_temp_c: Maximum temperature in Celsius.
    • precipitation_mm: Precipitation in millimeters.
    • snow_depth_mm: Snow depth in millimeters.
    • avg_wind_dir_deg: Average wind direction in degrees.
    • avg_wind_speed_kmh: Average wind speed in kilometers per hour.
    • peak_wind_gust_kmh: Peak wind gust in kilometers per hour.
    • avg_sea_level_pres_hpa: Average sea-level pressure in hectopascals.
    • sunshine_total_min: Total sunshine duration in minutes.
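
    As one possible use of the daily table, the sketch below aggregates `avg_temp_c` into monthly means per station. It works on rows that have already been loaded into plain dictionaries (reading the parquet file itself would need a library such as pandas, which is not shown here). The station ID, city name, and temperatures are invented for illustration, and treating missing readings as `None` is an assumption.

```python
from collections import defaultdict
from datetime import date

# Invented sample rows using the column names described above.
daily_rows = [
    {"station_id": "S1", "city_name": "Exampleville", "date": date(2023, 1, 1), "avg_temp_c": 2.0},
    {"station_id": "S1", "city_name": "Exampleville", "date": date(2023, 1, 2), "avg_temp_c": 4.0},
    {"station_id": "S1", "city_name": "Exampleville", "date": date(2023, 1, 3), "avg_temp_c": None},
    {"station_id": "S1", "city_name": "Exampleville", "date": date(2023, 2, 1), "avg_temp_c": 6.0},
]


def monthly_mean_temp(rows):
    """Average avg_temp_c per (station_id, year, month), ignoring missing values."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for row in rows:
        temp = row["avg_temp_c"]
        if temp is None:
            continue
        key = (row["station_id"], row["date"].year, row["date"].month)
        sums[key][0] += temp
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


monthly = monthly_mean_temp(daily_rows)
```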

    These dataframes can be utilized for various analyses such as weather trend prediction, climate studies, geographic analysis, demographic insights, and more.

    Dataset Image Source: Photo credits to 越过山丘. View the original image here.

  6. Weather Prediction

    • kaggle.com
    • zenodo.org
    zip
    Updated Mar 10, 2024
    Cite
    The Devastator (2024). Weather Prediction [Dataset]. https://www.kaggle.com/datasets/thedevastator/weather-prediction
    Explore at:
    Available download formats: zip (958204 bytes)
    Dataset updated
    Mar 10, 2024
    Authors
    The Devastator
    Description

    Credit to the original author: The dataset was originally published here

    Weather prediction dataset

    A dataset for teaching machine learning and deep learning

    Hands-on teaching of modern machine learning and deep learning techniques relies heavily on the use of well-suited datasets. The "weather prediction dataset" is a novel tabular dataset that was specifically created for teaching machine learning and deep learning to an academic audience. The dataset contains intuitively accessible weather observations from 18 locations in Europe. It was designed to be suitable for a large variety of training goals, many of which do not easily give way to unrealistically high prediction accuracy. Teachers or instructors can thus choose the difficulty of the training goals and match it to the respective learner audience or lesson objective. The compact size and complexity of the dataset make it possible to quickly train common machine learning and deep learning models on a standard laptop so that they can be used in live hands-on sessions.

    The dataset can be found in the `\dataset` folder and can be downloaded from Zenodo: https://doi.org/10.5281/zenodo.4980359

    References

    If you make use of this dataset, in particular if this is in form of an academic contribution, then please cite the following two references:

    • Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453. Data and metadata available at http://www.ecad.eu
    • Florian Huber, Dafne van Kuppevelt, Peter Steinbach, Colin Sauze, Yang Liu, Berend Weel, "Will the sun shine? – An accessible dataset for teaching machine learning and deep learning", DOI TO BE ADDED!

    Map of the locations of the 18 weather stations from which data was collected


  7. Climate Change Impact Data

    • kaggle.com
    zip
    Updated Oct 21, 2024
    Cite
    AP6621 (2024). Climate Change Impact Data [Dataset]. https://www.kaggle.com/datasets/aloktantrik/agricultural-productivity-and-environmental-impact
    Explore at:
    Available download formats: zip (67543 bytes)
    Dataset updated
    Oct 21, 2024
    Authors
    AP6621
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Climate Change Impact Data

    Overview

    This dataset provides valuable insights into agricultural productivity across various states in India from 1990 to 2024. It includes data on crop yield, fertilizer consumption, annual rainfall, irrigation area, cropping intensity, agricultural credit, maximum temperature, and gross sown area over a span of 31 years. This information is useful for researchers, policymakers, and agricultural analysts aiming to understand factors affecting crop yields and make informed decisions.

    Dataset Information

    • Total Entries: 620
    • Unique Years: 31
    • Unique States: 20

    Column Descriptions

    • Year: The year of the observation (e.g., 1990, 1991).
    • State: The state where the data was collected (e.g., Andhra, Karnataka).
    • Yield_per_hectare: The yield per hectare categorized into ranges (e.g., 503.35 - 1103.52).
    • Fertilizer_consp: The fertilizer consumption categorized into ranges (e.g., 0.60 - 50.00).
    • AnnualRainfall: The annual rainfall categorized into ranges (e.g., 201.51 - 553.25).
    • Gross_irrigated_area: The gross irrigated area categorized into ranges (e.g., 38.00 - 1876.60).
    • Cropping_intensity: The cropping intensity categorized into ranges (e.g., 100.00 - 121.29).
    • Agri_credit: The agricultural credit categorized into ranges (e.g., 0.00 - 198.05).
    • MaxTemp: The maximum temperature categorized into ranges (e.g., -3.30 - -0.15).
    • Gross_sown_area: The gross sown area categorized into ranges (e.g., 187.00 - 3729.80).

    Data Characteristics

    • Missing Values: None
    • Mismatched Entries: None
    • Statistical Summary:
      • Mean Yield_per_hectare: 2.28k
      • Mean Fertilizer Consumption: 107
      • Mean Annual Rainfall: 1.42k
      • Mean Gross Irrigated Area: 3.55k
      • Mean Cropping Intensity: 151
      • Mean Agricultural Credit: 179
      • Mean Maximum Temperature: 22.6
      • Mean Gross Sown Area: 9.6k

    Usage

    This dataset can be utilized for:

    • Agricultural Research: Analyze trends in crop yields and the impact of various factors.
    • Policy Making: Inform policies related to agriculture and sustainability.
    • Data Science Projects: Build predictive models for agricultural productivity.
  8. climate risk mitigation

    • kaggle.com
    zip
    Updated Oct 26, 2024
    Cite
    Legendary (2024). climate risk mitigation [Dataset]. https://www.kaggle.com/datasets/jimohyusuf/climate-risk-mitigationn
    Explore at:
    Available download formats: zip (80261 bytes)
    Dataset updated
    Oct 26, 2024
    Authors
    Legendary
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset is used for research in climate projects, with variables that support the development of machine learning models by highlighting environmental risks and mitigation strategies.

  9. Climate Metrics Dataset

    • kaggle.com
    zip
    Updated Aug 4, 2025
    Cite
    AJ (2025). Climate Metrics Dataset [Dataset]. https://www.kaggle.com/datasets/smayanj/climate-metrics-dataset
    Explore at:
    Available download formats: zip (293056 bytes)
    Dataset updated
    Aug 4, 2025
    Authors
    AJ
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 20,000 dummy daily weather records. Each entry includes the date, temperature in three units (Celsius, Kelvin, and Fahrenheit), precipitation in millimeters, and wind speed in kilometers per hour. The data can be used to study weather trends, analyze temperature conversions, or build predictive models for rainfall or wind conditions. It is suitable for climate analysis, time-series forecasting, and educational projects focused on meteorological data.
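
    Because each record carries the same temperature in three units, one simple exercise is checking that the three values agree. The sketch below uses the standard conversion formulas; the function names and the 0.01 tolerance are illustrative choices, not part of the dataset.

```python
import math


def celsius_to_kelvin(celsius):
    """K = C + 273.15"""
    return celsius + 273.15


def celsius_to_fahrenheit(celsius):
    """F = C * 9/5 + 32"""
    return celsius * 9.0 / 5.0 + 32.0


def is_consistent(celsius, kelvin, fahrenheit, tol=0.01):
    """True when the Kelvin and Fahrenheit values match the Celsius value within tol."""
    return (
        math.isclose(celsius_to_kelvin(celsius), kelvin, abs_tol=tol)
        and math.isclose(celsius_to_fahrenheit(celsius), fahrenheit, abs_tol=tol)
    )
```

    For example, `is_consistent(25.0, 298.15, 77.0)` holds, while a record listing 80.0 °F for the same reading would be flagged.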

  10. Chicago Crime with Climate Data, 2021

    • kaggle.com
    zip
    Updated Dec 24, 2021
    Cite
    Mark Rozenberg (2021). Chicago Crime with Climate Data, 2021 [Dataset]. https://www.kaggle.com/datasets/markrozenberg/chicago-crime-with-climate-data-2021
    Explore at:
    Available download formats: zip (5305421 bytes)
    Dataset updated
    Dec 24, 2021
    Authors
    Mark Rozenberg
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Chicago
    Description

    In this project, I used machine learning and deep learning multiclass classification algorithms to predict the types of crime committed in the city of Chicago in 2021. I also added weather data as features, in the hope that they would enrich the models and improve predictions.

    Project page on GitHub:

    https://github.com/Mark-Rozenberg/Crime-And-Climate

  11. ALL NATURAL DISASTERS 1900-2021 / EOSDIS

    • kaggle.com
    zip
    Updated Oct 10, 2021
    Cite
    Baris Dincer (2021). ALL NATURAL DISASTERS 1900-2021 / EOSDIS [Dataset]. https://www.kaggle.com/brsdincer/all-natural-disasters-19002021-eosdis
    Explore at:
    Available download formats: zip (2427001 bytes)
    Dataset updated
    Oct 10, 2021
    Authors
    Baris Dincer
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    ALL NATURAL DISASTERS 1900-2021 / EOSDIS

    As we know, the impact of the global climate disaster is felt more strongly day by day. Understanding the parameters created by the climate crisis will help in deciding what measures to take against it.

    In this dataset, you will see the natural disasters of all countries.

    EOSDIS SYSTEM

  12. NOAA GSOD

    • kaggle.com
    zip
    Updated Aug 30, 2019
    Cite
    NOAA (2019). NOAA GSOD [Dataset]. https://www.kaggle.com/datasets/noaa/gsod
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset provided by
    National Oceanic and Atmospheric Administration: http://www.noaa.gov/
    Authors
    NOAA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    Global Surface Summary of the Day is derived from The Integrated Surface Hourly (ISH) dataset. The ISH dataset includes global data obtained from the USAF Climatology Center, located in the Federal Climate Complex with NCDC. The latest daily summary data are normally available 1-2 days after the date-time of the observations used in the daily summaries.

    Content

    Over 9000 stations' data are typically available.

    The daily elements included in the dataset (as available from each station) are:
    • Mean temperature (.1 Fahrenheit)
    • Mean dew point (.1 Fahrenheit)
    • Mean sea level pressure (.1 mb)
    • Mean station pressure (.1 mb)
    • Mean visibility (.1 miles)
    • Mean wind speed (.1 knots)
    • Maximum sustained wind speed (.1 knots)
    • Maximum wind gust (.1 knots)
    • Maximum temperature (.1 Fahrenheit)
    • Minimum temperature (.1 Fahrenheit)
    • Precipitation amount (.01 inches)
    • Snow depth (.1 inches)

    Indicator for occurrence of: Fog, Rain or Drizzle, Snow or Ice Pellets, Hail, Thunder, Tornado/Funnel
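
    Since the elements above are reported in imperial and nautical units, a common first step is converting them to metric. The helper below is a small sketch using standard conversion factors; the function names are illustrative, and handling of missing-value codes is deliberately left out because it depends on how the table encodes them.

```python
def fahrenheit_to_celsius(fahrenheit):
    """C = (F - 32) * 5/9"""
    return (fahrenheit - 32.0) * 5.0 / 9.0


def knots_to_kmh(knots):
    """1 knot is exactly 1.852 km/h."""
    return knots * 1.852


def inches_to_mm(inches):
    """1 inch is exactly 25.4 mm."""
    return inches * 25.4


def miles_to_km(miles):
    """1 statute mile is exactly 1.609344 km."""
    return miles * 1.609344
```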

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_gsod.[TABLENAME]. Fork this kernel to get started and learn how to safely manage the analysis of large BigQuery datasets.

    Acknowledgements

    This public dataset was created by the National Oceanic and Atmospheric Administration (NOAA) and includes global data obtained from the USAF Climatology Center. This dataset covers GSOD data between 1929 and present, collected from over 9000 stations. Dataset Source: NOAA

    Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Photo by Allan Nygren on Unsplash

  13. 100 US Continental Cities: Climate & Carfree Index

    • kaggle.com
    zip
    Updated Aug 24, 2024
    Cite
    Idermaji (2024). 100 US Continental Cities: Climate & Carfree Index [Dataset]. https://www.kaggle.com/datasets/idermaji/us-cities-livability-by-environmental-factors
    Explore at:
    Available download formats: zip (11097 bytes)
    Dataset updated
    Aug 24, 2024
    Authors
    Idermaji
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Where should we live in the next 10 years? Where should we settle down without relying on public transport? Which city should we move to without fearing losing our homes?

    As weather patterns become more unpredictable with aggressive changes in temperatures, I collected some data below to see if there would be a city that could help assess our answers to the prior questions. I am curious to see if cities that typically have great infrastructure for walking, biking or public transit will be better prepared than those that are more typically car centric. Whichever you prefer, we can have a sense on where you might be migrating, and to which areas.

    Here's how the data was collected:

    1. Rhodium & ProPublica's combined work on county risk factors against climate change across the continental US (excludes Hawaii, Alaska, Puerto Rico and Guam; Washington, D.C. is excluded as it does not have a county).
    2. The available Walk Score of major cities with a population above 100,000. Cities like Delaware's Wilmington or Maine's Portland are not considered, as they fall under a small-city definition.
    3. Maximum temperatures (for select cities): This dataset was collected from the UC Davis Department of Agricultural and Resource by Prof. Aaron Smith. I selected a monthly temporal unit and a county spatial unit, ranging from 2019 to July 2024. The values are the average of the highest temperatures in each selected county. I did not use the overall daily average, as it can easily overshadow the extremes of temperature fluctuations.

    The columns have different rating systems. The counties have all major climate risks expected in the future, while corresponding cities in each county have walking, transit and biking scores to assess livability without cars.

    Understanding County Climate Risks

    The counties are rated on a 0-10 scale, based on RCP 8.5 levels. The ratings are explained below (0 = lowest, 10 = highest).

    1) Heat: Heat is one of the largest drivers changing the niche of human habitability. Rhodium Group researchers estimate that, between 2040 and 2060, many counties will face extremely high temperatures for half a year. The measure shows how many weeks per year temperatures are expected to rise above 95 degrees (0 = 0 weeks, 10 = 26 weeks).

    2) Wet Bulb: Wet bulb temperatures occur when heat meets excessive humidity. This is commonplace in cities that have an urban heat island effect (dense concentration of pavement, less nature, higher chances of absorbing heat). That combination creates wet bulb temperatures, where 82 degrees can feel like southern Alabama on its hottest day, making it dangerous to work outdoors and for children to play school sports. As wet bulb temperatures rise even higher, so will the risk of heat stroke, and even death. The measure shows how many days per year a county will experience high wet bulb temperatures, from 2040 to 2060 (0 = 0 days, 10 = 70 days).

    3) Farm Crop Yield: With rising temperatures, it will become more difficult to grow food. Corn and soy are the most prevalent crops in the U.S. and the basis for livestock feed and other staple foods, and they have critical economic significance. Because of their broad regional spread, they offer the best proxy for predicting how farming will be affected by rising temperatures and changing water supplies. As corn and soy production becomes more sensitive to heat than to drought, the US will see a huge continental divide, with cooler counties gaining the ability to produce more while currently warmer counties lose all ability to produce basic crops. The measure shows the expected percent decline in yields from 2040 to 2060 (0 = -20.5% decline, 10 = 92% decline).

    4) Sea Level Rise: As sea levels rise, the share of property submerged by high tides increases dramatically, affecting a small sliver of the nation's land but a disproportionate share of its population. The rating measures how much of the property in the county will go below high tide from 2040 to 2060 (0 = 0%, 10 = 25%).

    5) Very Large Fires: With heat and ever more prevalent drought, the likelihood that very large wildfires (ones that burn over 12,000 acres) will affect U.S. regions increases substantially, particularly in the West, Northwest and the Rocky Mountains. The rating shows the average number of very large fires expected per year from 2040 to 2071 (0 = N/A, 10 = 2.45).

    6) Economic Damages: Rising energy costs, lower labor productivity, poor crop yields and increasing cr...
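
    The rating endpoints listed above can be mapped back to approximate physical quantities. The sketch below assumes the ratings scale linearly between the stated 0 and 10 endpoints, which the description does not confirm, and the metric keys are hypothetical names rather than the dataset's column names. The fire rating is omitted because its lower endpoint is given as N/A.

```python
# Endpoints (value at rating 0, value at rating 10) taken from the text above.
# The dictionary keys are hypothetical labels, not actual column names.
SCALE_ENDPOINTS = {
    "heat_weeks": (0.0, 26.0),
    "wet_bulb_days": (0.0, 70.0),
    "crop_yield_decline_pct": (-20.5, 92.0),
    "sea_level_property_pct": (0.0, 25.0),
}


def rating_to_value(metric, rating):
    """Linearly interpolate a 0-10 rating between the metric's endpoints.

    Linearity is an assumption; the source only states the two endpoints.
    """
    if not 0 <= rating <= 10:
        raise ValueError("rating must be between 0 and 10")
    low, high = SCALE_ENDPOINTS[metric]
    return low + (high - low) * rating / 10.0
```

    Under this assumption, a heat rating of 5 would correspond to roughly 13 weeks per year above 95 degrees.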

  14. Evaluation of Chilean projects

    • kaggle.com
    zip
    Updated May 23, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Victor Caquilpan (2021). Evaluation of Chilean projects [Dataset]. https://www.kaggle.com/vcaquilpan/chilean-projects
    Explore at:
    Available download formats: zip (20562822 bytes)
    Dataset updated
    May 23, 2021
    Authors
    Victor Caquilpan
    Description

    Background

    The Environmental Assessment Service (SEA, by its Spanish acronym) is the institution responsible for authorizing the operation of projects in Chile that could have potential impacts on population health or the environment. When a company wants to carry out a project of a relatively large magnitude, it must submit a request to the SEA to evaluate the correct and safe operation of that project. If a project is found to be harmful to the environment or the population, the service can deny the environmental permit and thereby the start-up of the project. Since the SEA began operating in 1997, it has evaluated more than 15 thousand projects, so its database contains a large number of records that can be useful to analyze.

    The data presented on this page was scraped from the SEA website, as it is public information. The script used to get the data collects general information about the projects evaluated by the SEA.

    Content

    The data set consists of a data frame with the projects submitted to the SEA. Its columns are the following fields:

    • name: name of the project
    • type: type of evaluation process. Projects can submit an environmental impact statement (DIA in Spanish) or an environmental impact study (EIA), depending on the magnitude of their potential impacts on the environment or population health. A DIA is a simple evaluation of impacts, while an EIA is a more complex assessment
    • region: region where the project is carried out
    • typology: kind of project, based on its sector
    • typology_descr: a description of the typology
    • investment: investment amount in USD
    • entry_date: date when the project entered the SEA process
    • state: the current state of the project's evaluation
    • qualification_date: date when the final SEA resolution was issued. This resolution can be, for example, approved or denied. Not all projects are qualified, because several are withdrawn earlier
    • id_project: id of each project in the SEA
    • latitude: latitude in degrees using datum WGS84. These coordinates are validated by the SEA
    • longitude: longitude in degrees using datum WGS84. These coordinates are validated by the SEA
    • n_docs: number of documents available for the evaluation process
    • n_addendum: number of addenda made during the evaluation process of a project
    • n_participatory_act: number of participatory activities carried out during the evaluation process of a project
    • description: general description of the project
    • main_url: URL of the evaluation process of a project
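
    As a rough illustration of how the entry_date and qualification_date fields might be combined, the sketch below computes the average evaluation length per evaluation type. The sample rows are hypothetical, and the ISO date format and the empty string for unqualified projects are assumptions; the real file may encode both differently.

```python
from datetime import date
from typing import Optional

# Hypothetical sample rows using the column names listed above;
# the values and the ISO date format are assumptions, not real SEA records.
projects = [
    {"id_project": "1", "type": "DIA", "entry_date": "2015-03-02", "qualification_date": "2015-09-28"},
    {"id_project": "2", "type": "EIA", "entry_date": "2016-01-11", "qualification_date": "2017-06-05"},
    {"id_project": "3", "type": "DIA", "entry_date": "2018-07-23", "qualification_date": ""},  # never qualified
]


def evaluation_days(row: dict) -> Optional[int]:
    """Days between entry and qualification, or None if never qualified."""
    if not row["qualification_date"]:
        return None
    start = date.fromisoformat(row["entry_date"])
    end = date.fromisoformat(row["qualification_date"])
    return (end - start).days


def mean_days_by_type(rows: list) -> dict:
    """Average evaluation length per evaluation type, skipping unqualified projects."""
    durations = {}
    for row in rows:
        days = evaluation_days(row)
        if days is None:
            continue
        durations.setdefault(row["type"], []).append(days)
    return {kind: sum(vals) / len(vals) for kind, vals in durations.items()}


print(mean_days_by_type(projects))
```

    Projects without a qualification date are skipped rather than treated as zero-length evaluations, since the description notes that several projects are withdrawn before being qualified.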

    For the moment, the content of the data frame is in Spanish, but it will be translated to English in the future. The data can contain mistakes due to different factors. Users are encouraged to report any problems they find in the discussion section. Some mistakes have been detected, especially in the description field; however, these mistakes come from the web page itself.

    Scraping was done on 15 March 2021, and the script used is available on GitHub.

    Inspiration

    Different analyses can be done with this data. Based on this information, you can find out which kinds of projects have been approved or denied. Some productive sectors are more present in the country than others. In addition, you can analyze projects by investment amount or by location within the country. You can also observe that some places are saturated by certain productive sectors, which can affect the health of the population or the environment due to the number of projects concentrated in specific regions. For instance, salmon farms are known to have caused several impacts in the south of Chile, while air pollution problems have been documented in Quintero, where repeated peaks of SO2 have affected people's health. In both cases, a core group of industries is associated with these issues.

  15. Sensor-Based Data: Temperature & Humidity

    • kaggle.com
    zip
    Updated Mar 22, 2025
    Cite
    Tejas Gupta (2025). Sensor-Based Data: Temperature & Humidity [Dataset]. https://www.kaggle.com/datasets/tejasgupta7/sensor-based-data-temperature-and-humidity/data
    Explore at:
    zip(41529 bytes)Available download formats
    Dataset updated
    Mar 22, 2025
    Authors
    Tejas Gupta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Description: Environmental Sensor Readings from Mars Rover Prototype

    Research Hypothesis

    A scaled-down Mars Rover prototype can effectively collect temperature and humidity data, demonstrating how real-time environmental monitoring can be used for autonomous navigation, climate analysis, and anomaly detection.

    By analyzing the collected data, we aim to identify trends, evaluate sensor accuracy, and explore potential improvements in robotic exploration. This includes assessing response time, consistency, and anomalies caused by external factors like human interference or sudden environmental changes.

    What the Data Shows

    This dataset contains timestamped temperature and humidity readings collected at regular time intervals by the rover’s onboard DHT22 sensor. The data highlights:
    - Gradual fluctuations in environmental conditions.
    - Notable temperature spikes (~10°C) introduced using a lighter to test sensor response.
    - Stable humidity levels with minor deviations due to air circulation or sensor drift.

    Notable Findings
    - Controlled Temperature Spikes: Short bursts of heat resulted in clear temperature increases (~10°C), demonstrating the sensor's ability to detect and log transient changes.
    - Humidity Stability: Humidity levels remained within a narrow range, confirming minimal impact from applied temperature fluctuations.
    - Gradual Environmental Variations: Small temperature and humidity shifts were observed, likely due to ambient conditions and ventilation effects.

    How the Data Was Gathered
    - Sensor Used: DHT22 (for temperature & humidity).
    - Data Collection Frequency: Logged every few seconds.
    - Controlled Testing: Heat spikes added using a lighter to simulate external interference.
    - Data Transmission: Logged in real-time via wireless communication to a laptop.

    How to Interpret and Use the Data
    - Identify Trends: Observe temperature and humidity variations over time.
    - Detect Anomalies: Locate sharp temperature spikes (~10°C increases) caused by external heating.
    - Compare Sensor Performance: Evaluate how quickly temperature normalizes after a spike.
    - Develop Predictive Models: Train machine learning models to predict environmental changes.
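
    One way the anomaly-detection step could be approached is sketched below: each reading is compared with the median of the readings just before it. The window size, the 5°C threshold, and the sample values are illustrative assumptions, not properties of the dataset.

```python
from statistics import median


def find_spikes(temps, window=5, threshold=5.0):
    """Return indices where a reading exceeds the median of the preceding
    `window` readings by more than `threshold` degrees C."""
    spikes = []
    for i in range(window, len(temps)):
        baseline = median(temps[i - window:i])
        if temps[i] - baseline > threshold:
            spikes.append(i)
    return spikes


# Hypothetical readings in degrees C with an artificial burst of about 10 degrees
readings = [24.1, 24.0, 24.2, 24.1, 24.3, 24.2, 34.5, 33.9, 27.0, 24.6, 24.4]
print(find_spikes(readings))
```

    A median baseline is used here instead of a mean so that one heated reading does not immediately pull the baseline upward; other baselines would be equally reasonable.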

    Potential Applications
    - Autonomous Environment Monitoring: Detecting and responding to environmental anomalies.
    - Sensor Calibration & Validation: Testing DHT22 sensor accuracy under different conditions.
    - Climate Simulation & Research: Indoor climate modeling & environmental trend analysis.
    - Robotics & AI: Training AI for automated responses to climate fluctuations.

    You can find related information on the GitHub repository of the project.

  16. climatewatch

    • kaggle.com
    zip
    Updated Oct 31, 2020
    Cite
    Adam S (2020). climatewatch [Dataset]. https://www.kaggle.com/adams3/climatewatch
    Explore at:
    zip(128881591 bytes)Available download formats
    Dataset updated
    Oct 31, 2020
    Authors
    Adam S
    Description

    Climate Watch: Data for Climate Action

    Climate Watch is an online platform designed to empower policymakers, researchers, media and other stakeholders with the open climate data, visualizations and resources they need to gather insights on national and global progress on climate change.

    Climate Watch is managed by World Resources Institute. It is a contribution to the NDC Partnership.

    Climate Watch encompasses data on:
    - Historical emissions by country, region, industry, and gas by year (1850-2018)
    - Nationally Determined Contributions (NDCs)
    - Linkages between Nationally Determined Contributions (NDCs) and the Sustainable Development Goals (SDGs)
    - Emissions scenario pathways for major emitting countries, derived from a growing library of models

    Acknowledgements

    The CDP Kaggle project prompted me to seek out datasets on climate change, and I stumbled upon this one in my research. Climate Watch is a great platform and its website offers many opportunities to explore and visualize the data.

    I encourage everyone to check out their website: https://www.climatewatchdata.org/

  17. NOAA ICOADS

    • kaggle.com
    zip
    Updated Mar 13, 2018
    Cite
    NOAA (2018). NOAA ICOADS [Dataset]. https://www.kaggle.com/datasets/noaa/noaa-icoads
    Explore at:
    zip(0 bytes)Available download formats
    Dataset updated
    Mar 13, 2018
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Authors
    NOAA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) is a global ocean marine meteorological and surface ocean dataset. It is formed by merging many national and international data sources that contain measurements and visual observations from ships (merchant, navy, research), moored and drifting buoys, coastal stations, and other marine and near-surface ocean platforms. Each marine report contains individual observations of meteorological and oceanographic variables, such as sea surface and air temperatures, wind, pressure, humidity, and cloudiness. The coverage is global and sampling density varies depending on date and geographic position relative to shipping routes and ocean observing systems.

    Content

    The ICOADS dataset contains global marine data from ships (merchant, navy, research) and buoys, each capturing details according to the current weather or ocean conditions (wave height, sea temperature, wind speed, and so on). Each record contains the exact location of the observation, which is great for visualizations. The historical depth of the data is quite comprehensive: there are records going back to 1662!

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_icoads.[TABLENAME]. Fork this kernel to get started and learn how to safely manage the analysis of large BigQuery datasets.
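
    A minimal sketch of one common way to stay safe with large tables is shown below: a dry run asks BigQuery how many bytes a query would scan before anything is executed. It assumes the google-cloud-bigquery package is installed and credentials are already configured; the helper names and the table name are placeholders for illustration, not tables from this dataset.

```python
def build_count_query(table: str) -> str:
    """Build a small aggregate query against a fully qualified table name."""
    return f"SELECT COUNT(*) AS n_reports FROM `{table}`"


def estimate_scan_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would scan, without running it."""
    # Imported lazily so build_count_query works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes credentials are already configured
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


# Placeholder table name (hypothetical); substitute a real table before calling
# estimate_scan_bytes(sql).
sql = build_count_query("my-project.my_dataset.my_table")
print(sql)
```

    Checking the estimate first makes it easier to narrow the selected columns or date range before spending any query quota.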

    Acknowledgements

    Dataset Source: NOAA Category: Meteorological, Climate, Transportation

    Citation: National Centers for Environmental Information/NESDIS/NOAA/U.S. Department of Commerce, Research Data Archive/Computational and Information Systems Laboratory/National Center for Atmospheric Research/University Corporation for Atmospheric Research, Earth System Research Laboratory/NOAA/U.S. Department of Commerce, Cooperative Institute for Research in Environmental Sciences/University of Colorado, National Oceanography Centre/Natural Environment Research Council/United Kingdom, Met Office/Ministry of Defence/United Kingdom, Deutscher Wetterdienst (German Meteorological Service)/Germany, Department of Atmospheric Science/University of Washington, and Center for Ocean-Atmospheric Prediction Studies/Florida State University. 2016, updated monthly. International Comprehensive Ocean-Atmosphere Data Set (ICOADS) Release 3, Individual Observations. Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory: https://doi.org/10.5065/D6ZS2TR3. Accessed 01 04 2017.

    Use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Photo by Gleb Kozenko on Unsplash

  18. Precipitation Prediction in LA

    • kaggle.com
    zip
    Updated Jan 22, 2022
    Cite
    Varun Nagpal Spyz (2022). Precipitation Prediction in LA [Dataset]. https://www.kaggle.com/datasets/varunnagpalspyz/precipitation-prediction-in-la
    Explore at:
    zip(23191 bytes)Available download formats
    Dataset updated
    Jan 22, 2022
    Authors
    Varun Nagpal Spyz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Los Angeles
    Description

    Context

    This dataset is part of a basic DIY Machine Learning project offered by my college, Indian Institute of Technology, Guwahati (IIT G). The main aim of this project was to get familiar with the workflow and various techniques involved in a Machine Learning project.

    Content

    The dataset is fairly simple and contains various features regarding precipitation:
    - PRCP = Precipitation (tenths of mm)
    - TMAX = Maximum temperature (tenths of degrees C)
    - TMIN = Minimum temperature (tenths of degrees C)
    - PGTM = Peak gust time (hours and minutes, i.e., HHMM)
    - AWND = Average daily wind speed (tenths of meters per second)
    - TAVG = Average temperature (tenths of degrees C)
    - WDFx = Direction of fastest x-minute wind (degrees)
    - WSFx = Fastest x-minute wind speed (tenths of meters per second)
    - WT = Weather Type
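
    Because several of the fields above are described as tenths of a unit, a conversion step may be needed before modeling. The sketch below assumes the file really stores values as described and uses a hypothetical record; it is worth checking a few rows against known weather before relying on it.

```python
# Fields the description above reports in tenths of a unit.
TENTHS_FIELDS = ("PRCP", "TMAX", "TMIN", "TAVG", "AWND")


def to_standard_units(record: dict) -> dict:
    """Return a copy with tenths-based fields divided by 10; missing values stay None."""
    converted = dict(record)
    for field in TENTHS_FIELDS:
        value = converted.get(field)
        if value is not None:
            converted[field] = value / 10
    return converted


# Hypothetical daily record, not taken from the dataset
raw = {"PRCP": 25, "TMAX": 211, "TMIN": 94, "TAVG": None, "AWND": 18, "PGTM": 1430}
print(to_standard_units(raw))
```

    PGTM is left untouched because it encodes a clock time (HHMM) rather than a scaled measurement.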

    Acknowledgements

    All credit goes to the Coding Club of Indian Institute of Technology, Guwahati (IIT Guwahati). Instagram: https://www.instagram.com/codingclubiitg/ LinkedIn: https://www.linkedin.com/company/coding-club-iitg/

    Inspiration

    I hope that this dataset and my notebook (https://www.kaggle.com/varunnagpalspyz/precipitation-prediction/notebook) help all beginners like me.

  19. Crop yield Data with Soil And Weather Dataset

    • kaggle.com
    zip
    Updated Aug 31, 2025
    Cite
    Anshu mishra (2025). Crop yield Data with Soil And Weather Dataset [Dataset]. https://www.kaggle.com/datasets/anshumish/crop-yield-data-with-soil-and-weather-dataset
    Explore at:
    zip(468133 bytes)Available download formats
    Dataset updated
    Aug 31, 2025
    Authors
    Anshu mishra
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📊 About the Dataset

    This dataset collection combines crop yields, soil composition, and weather data across Indian states, providing a comprehensive view of agriculture between 1997 and 2020. It includes:

    Crop Yield Data: Crop-wise area, production, fertilizer/pesticide use, and yield trends.

    Soil Data: State-level soil nutrients (N, P, K) and pH values.

    Weather Data: Annual averages of temperature, rainfall, and humidity.

    By integrating these datasets, users can explore how soil health, climatic conditions, and farm inputs interact to influence agricultural productivity.

    🎯 Use Cases

    1. Analyze crop yield trends over time and across states.

    2. Study the impact of soil nutrients and pH on productivity.

    3. Assess climate effects (rainfall, temperature, humidity) on crop yields.

    4. Build machine learning models for yield prediction and crop recommendation.

    5. Support climate-smart agriculture and policy planning.

    This dataset is ideal for data science, machine learning, and academic projects focusing on agriculture, sustainability, and climate change.

  20. Sustainable Energy Projects in Ladakh

    • kaggle.com
    zip
    Updated Jul 22, 2023
    Cite
    sacramento technology (2023). Sustainable Energy Projects in Ladakh [Dataset]. https://www.kaggle.com/datasets/sacramentotechnology/sustainable-energy-projects-in-ladakh/code
    Explore at:
    zip(24964 bytes)Available download formats
    Dataset updated
    Jul 22, 2023
    Authors
    sacramento technology
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Ladakh
    Description

    Overview: This dataset presents the findings of a research study conducted to explore the impact of comprehensive approaches on sustainable energy projects in the Ladakh region of Northern India. The study focuses on the integration of sustainable energy initiatives with climate change adaptation and poverty reduction. Data was collected from various sustainable energy projects implemented in the region, and key outcomes were analyzed to understand the effectiveness and viability of different technology transfer approaches and project management strategies.

    Context: Ladakh, known for its remote and challenging terrain, has been actively pursuing sustainable energy initiatives to combat climate change and uplift local communities out of poverty. This dataset delves into the diverse technology transfer approaches, financial support mechanisms, policy frameworks, community engagement, and primary barriers encountered during the implementation of sustainable energy projects.

    Content: The dataset comprises responses from 200 respondents, including project stakeholders, beneficiaries, government officials, and community members, who were involved in different sustainable energy initiatives. The data covers a wide range of aspects, including respondent demographics (age, gender, education level, and annual income), technology transfer approaches, financial support levels, policy frameworks, community engagement levels, primary barriers faced, the impact of barriers, partnership types, and the effectiveness of projects in achieving climate change adaptation and poverty reduction goals.

    Methodology: The research methodology involved data collection through surveys, interviews, and project documentation analysis. Data analysis techniques, including descriptive statistics, frequency distributions, and thematic analysis, were employed to derive insights from the dataset. The study aimed to highlight the significance of comprehensive approaches to technology transfer and project management in driving sustainable energy projects' success and long-term viability.

    Potential Uses: Researchers, policymakers, and sustainable energy enthusiasts can utilize this dataset to gain valuable insights into the factors influencing the effectiveness of sustainable energy projects in Ladakh. The dataset provides valuable information on the impact of different technology transfer approaches and project management practices on project outcomes and long-term viability. It can be used to identify successful strategies for climate change adaptation, poverty reduction, and community development through sustainable energy initiatives.

    Acknowledgments: The dataset was collected and compiled by a team of researchers dedicated to promoting sustainable energy solutions and climate resilience in Ladakh. We extend our gratitude to all the project stakeholders, participants, and organizations involved in supporting this research.

Thinking Machines Data Science (2024). Project CCHAIN [Dataset]. https://www.kaggle.com/datasets/thinkdatasci/project-cchain

Project CCHAIN

Open validated health, climate, environment, socioeconomic data in 12 PH cities

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
zip(396604023 bytes)Available download formats
Dataset updated
Sep 30, 2024
Authors
Thinking Machines Data Science
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The Project Climate Change, Health, and Artificial Intelligence (Project CCHAIN) dataset is a validated, open-sourced linked dataset containing 20 years (2003-2022) of climate, environmental, socioeconomic, and health dimensions at the barangay (village) level across twelve Philippine cities (Dagupan, Palayan, Navotas, Mandaluyong, Muntinlupa, Legazpi, Iloilo, Mandaue, Tacloban, Zamboanga, Cagayan de Oro, Davao). The full documentation can be accessed here.

The tables are designed in a way that users can choose variables that are most relevant to their focus city and use case, and link these variables to form a single dataset by merging using standard geography codes and calendar dates. This can be done using the provided linking notebook, or offline using the user's own code.
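
For users linking tables with their own code, the merge on geography code and calendar date might look like the sketch below. The column names (geo_code, date, rainfall_mm, cases) and the rows are hypothetical stand-ins; the actual key and variable names should be taken from the documentation.

```python
def link_tables(left, right, keys=("geo_code", "date")):
    """Left-join two lists of row dicts on the shared key columns."""
    index = {tuple(row[k] for k in keys): row for row in right}
    linked = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        # Columns from `left` win if both tables share a non-key column name.
        linked.append({**match, **row})
    return linked


# Hypothetical rows; column names and values are illustrative only.
climate = [
    {"geo_code": "A001", "date": "2020-01-06", "rainfall_mm": 12.5},
    {"geo_code": "A002", "date": "2020-01-06", "rainfall_mm": 3.0},
]
health = [
    {"geo_code": "A001", "date": "2020-01-06", "cases": 4},
]

linked = link_tables(climate, health)
print(linked)
```

A left join keeps every row of the first table even when the second has no matching record, which makes gaps in data availability visible instead of silently dropping them. This sketch also assumes each key appears at most once in the second table.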

Here are some tips on how to make the most of this dataset:
- Focus on one location. Starting with a detailed analysis of one location allows for a better understanding of the local dynamics, which may differ across locations.
- Choose one health data source. Pick either a central or a local data source. Using two different health data sources is not advised because it will lead to double counting or overcounting of disease cases.
- Do not use all variables at once. Do a literature review first to identify possible key variables. More often than not, using all variables is not necessary and may even yield subpar results.
- Decide whether to use regular or downscaled climate data. Our downscaled climate data provides nuanced insights on spatial patterns of a few climate variables. Kindly read the documentation before deciding to use this data. If you are uncertain, consider using only the climate_atmosphere table instead.
- Check data availability for your focus location and make sure it fits the requirements of your study.

This dataset also includes household survey tables (see schema here and here) from surveys conducted in partner informal settlement communities in the cities of Muntinlupa, Davao, Iloilo, and Mandaue, administered on various dates from 2001 to 2024. Due to the sensitive nature of the surveys and the vulnerability of the subjects involved, requests for access must be submitted for review and approval by the Philippine Action for Community-Led Shelter Initiatives, Inc. (PACSII). To submit a request, please use this form.

The Project CCHAIN dataset adopts the Creative Commons Attribution 4.0 International (CC BY 4.0) license. This allows anyone to share (copy and redistribute) and adapt (remix, transform, and build upon) the work, as long as they give appropriate credit to the original creator.

One exception, the tm_open_buildings table, follows the Open Database License (ODbL) as directed by its source, OpenStreetMap. Under the ODbL, users are free to use, modify, and distribute the database, but on top of CC BY 4.0's attribution requirement, this license requires users to share any modifications they make under the same ODbL license.
