100+ datasets found
  1. Weather and Housing in North America

    • kaggle.com
    zip
    Updated Feb 13, 2023
    Cite
    The Devastator (2023). Weather and Housing in North America [Dataset]. https://www.kaggle.com/datasets/thedevastator/weather-and-housing-in-north-america
    Available download formats: zip (512280 bytes)
    Dataset updated
    Feb 13, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    North America
    Description

    Weather and Housing in North America

    Exploring the Relationship between Weather and Housing Conditions in 2012

    About this dataset

    This comprehensive dataset explores the relationship between housing and weather conditions across North America in 2012. Through a range of climate variables such as temperature, wind speed, humidity, pressure and visibility, it provides insight into the weather conditions of numerous regions. Housing parameters such as longitude, latitude, median income, median house value and ocean proximity make it possible to examine how distinct climates relate to local real estate valuations. Analyzing these two data sets together helps show which factors can influence the value and comfort level of residential areas throughout North America.

    How to use the dataset

    This dataset offers plenty of insights into the relationship between weather and housing across North American regions. To explore these relationships, you can perform data analysis on the variables provided.

    First, start by examining descriptive statistics (e.g., mean, median, mode). These show the general trend and distribution of each variable in the dataset. For example, what is the most common temperature in a given region? What is the average wind speed? How does this vary across different regions? By looking at descriptive statistics, you can get an initial idea of how various weather conditions and housing attributes are distributed before relating them to one another.
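
    As a minimal sketch of this first step, the snippet below computes the mean, median and mode with Python's standard `statistics` module. The sample values are hypothetical placeholders; in practice they would be read from columns such as Temp_C and Wind Speed_km/h in Weather.csv.

```python
import statistics

# Hypothetical sample values (not taken from the dataset); real values
# would come from the Temp_C and "Wind Speed_km/h" columns of Weather.csv.
temps_c = [-1.8, -1.8, -1.5, -1.4, -1.5, -1.8]
wind_kmh = [4, 4, 7, 6, 4, 9]


def describe(values):
    """Return the mean, median and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


temp_summary = describe(temps_c)
wind_summary = describe(wind_kmh)
print(temp_summary)
print(wind_summary)
```

    The mode answers the "most common temperature" question directly, while the mean and median together hint at how skewed a variable is.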

    Next, explore correlations between variables. Are certain weather variables correlated with specific housing attributes? Is there a link between wind speeds and median house value? Or between humidity and ocean proximity? Analyzing correlations allows for deeper insights into how different aspects may influence one another for a given region or area. These correlations may also inform broader patterns that are present across multiple North American regions or countries.
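
    The correlation step could be sketched as follows. This is an illustrative Pearson correlation written in plain Python; the paired values are hypothetical, and it assumes the weather and housing records have already been matched by region, which the dataset description does not explain how to do.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-region values: average wind speed (km/h) and
# median house value (thousands of dollars).
wind_speed = [5, 10, 15, 20, 25]
house_value = [200, 180, 170, 150, 140]

r = pearson(wind_speed, house_value)
print(f"r = {r:.3f}")
```

    A coefficient near -1 or 1 suggests a strong linear association, but it says nothing about causation.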

    Finally, use visualizations to further investigate the relationship between climate and housing attributes in North America in 2012. Graphs make trends such as seasonal variations or long-term changes over time easier to see, so they are useful for interpreting large amounts of data quickly and for providing context beyond what numbers alone can tell us about the relationships within this dataset.

    Research Ideas

    • Analyzing the effect of climate change on housing markets across North America. By looking at temperature and weather trends in combination with housing values, researchers can better understand how climate change may be impacting certain regions differently than others.
    • Investigating the relationship between median income, house values and ocean proximity in coastal areas. Understanding how ocean proximity plays into housing prices may help inform real estate investment decisions and urban planning initiatives related to coastal development.
    • Utilizing differences in weather patterns across climates to determine optimal seasonal rental prices for property owners. By analyzing changes in temperature, wind speed, humidity, pressure and visibility from season to season, an investor could gain insight into seasonal market trends and maximize profits from rentals or Airbnb listings over time.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: Weather.csv

    | Column name      | Description                                   |
    |:-----------------|:----------------------------------------------|
    | Date/Time        | Date and time of the observation. (Date/Time) |
    | Temp_C           | Temperature in Celsius. (Numeric)             |
    | Dew Point Temp_C | Dew point temperature in Celsius. (Numeric)   |
    | Rel Hum_%        | Relative humidity in percent. (Numeric)       |
    | Wind Speed_km/h  | Wind speed in kilometers per hour. (Numeric)  |
    | Visibility_km    | Visibilit...                                  |

  2. Highway-Runoff Database (HRDB) Version 1.1.0

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 26, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Highway-Runoff Database (HRDB) Version 1.1.0 [Dataset]. https://catalog.data.gov/dataset/highway-runoff-database-hrdb-version-1-1-0
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The Highway-Runoff Database (HRDB) was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration (FHWA) to provide planning-level information for decision makers, planners, and highway engineers to assess and mitigate possible adverse effects of highway runoff on the Nation’s receiving waters. The HRDB was assembled by using a Microsoft Access database application to facilitate use of the data and to calculate runoff-quality statistics with methods that properly handle censored-concentration data. This data release provides highway-runoff data, including information about monitoring sites, precipitation, runoff, and event-mean concentrations of water-quality constituents. The dataset was compiled from 37 studies as documented in 113 scientific or technical reports. The dataset includes data from 242 highway sites across the country. It includes data from 6,837 storm events with dates ranging from April 1975 to November 2017. Therefore, these data span more than 40 years; vehicle emissions and background sources of highway-runoff constituents have changed markedly during this time. For example, some of the early data is affected by use of leaded gasoline, phosphorus-based detergents, and industrial atmospheric deposition. The dataset includes 106,441 concentration values with data for 414 different water-quality constituents. This dataset was assembled from various sources and the original data was collected and analyzed by using various protocols. Where possible the USGS worked with State departments of transportation and the original researchers to obtain, document, and verify the data that was included in the HRDB. This new version (1.1.0) of the database contains software updates to provide data-quality information within the Graphical User Interface (GUI), calculate statistics for multiple sites in batch mode, and output additional statistics. 
    However, inclusion in this dataset does not constitute endorsement by the USGS or the FHWA. People who use these data are responsible for ensuring that the data are complete and correct and that they are suitable for their intended purposes.

  3. Transport Mode Symbols and Pictograms

    • developer.transport.nsw.gov.au
    • data.nsw.gov.au
    • +3more
    Updated Nov 18, 2018
    + more versions
    Cite
    developer.transport.nsw.gov.au (2018). Transport Mode Symbols and Pictograms [Dataset]. https://developer.transport.nsw.gov.au/data/dataset/transport-mode-symbols-and-pictograms
    Dataset updated
    Nov 18, 2018
    Dataset provided by
    Transport for NSW (http://www.transport.nsw.gov.au/)
    License

    Attribution-NonCommercial 2.0 (CC BY-NC 2.0): https://creativecommons.org/licenses/by-nc/2.0/
    License information was derived automatically

    Description

    Here you can find symbols and pictograms for all transport modes to use in your apps, products and other projects. Symbols and icons are available in various formats, while all can be found as vector files that can be opened directly in software such as Adobe Illustrator.

  4. 2022 - 2024 NTD Annual Data - Track & Roadway (by Mode)

    • data.virginia.gov
    • data.transportation.gov
    csv, json, rdf, xsl
    Updated Oct 17, 2025
    Cite
    U.S Department of Transportation (2025). 2022 - 2024 NTD Annual Data - Track & Roadway (by Mode) [Dataset]. https://data.virginia.gov/dataset/2022-2024-ntd-annual-data-track-roadway-by-mode
    Available download formats: xsl, csv, json, rdf
    Dataset updated
    Oct 17, 2025
    Dataset provided by
    Federal Transit Administration
    Authors
    U.S Department of Transportation
    Description

    This dataset details track and roadway mileage/characteristics for each agency, mode, and type of service, as reported to the National Transit Database in the 2022, 2023, and 2024 report years. These data include the types of track/roadway elements employed in transit operation, as well as the length and/or count of certain elements.

    NTD Data Tables organize and summarize data from the 2022 - 2024 National Transit Database in a manner that is more useful for quick reference and summary analysis. This dataset is based on the 2022 - 2024 Transit Way Mileage database files.

    For the years 2015-2021, you can find this data in the "Track and Roadway" data table on the NTD Program website, at https://transit.dot.gov/ntd/ntd-data.

    In versions of the data tables from before 2015, you can find corresponding data in the files called "Transit Way Mileage - Rail Modes" and "Transit Way Mileage - Non-Rail Modes."

    This dataset's 2024 data comes from the NTD as of September 4, 2025.

    If you have any other questions about this table, please contact the NTD Help Desk at NTDHelp@dot.gov.

  5. NTD Annual Data View - Track & Roadway (by Agency)

    • data.transportation.gov
    csv, xlsx, xml
    Updated Oct 17, 2025
    Cite
    Federal Transit Administration (2025). NTD Annual Data View - Track & Roadway (by Agency) [Dataset]. https://data.transportation.gov/w/pvgq-a73e/m7rw-edbr?cur=u9IHJAJqQp9
    Available download formats: xlsx, xml, csv
    Dataset updated
    Oct 17, 2025
    Dataset authored and provided by
    Federal Transit Administration
    License

    https://www.usa.gov/government-works

    Description

    Provides agency-wide totals for track and roadway components. Data are from the National Transit Database in the 2022 and 2023 report years. These data include the types of track/roadway elements employed in transit operation, as well as the length and/or count of certain elements. This view is based on the "2022 - 2023 NTD Annual Data - Track & Roadway (by Mode)" dataset, which displays the same data at a lower level of aggregation. This view displays the data at a higher level (by agency).

    NTD Data Tables organize and summarize data from the 2022 and 2023 National Transit Database in a manner that is more useful for quick reference and summary analysis. The dataset that this view references is based on the 2022 and 2023 Transit Way Mileage database files.

    For the years 2015-2021, you can find this data in the "Track and Roadway" data table on the NTD Program website, at https://transit.dot.gov/ntd/ntd-data.

    In versions of the data tables from before 2015, you can find corresponding data in the files called "Transit Way Mileage - Rail Modes" and "Transit Way Mileage - Non-Rail Modes."

    If you have any other questions about this table, please contact the NTD Help Desk at NTDHelp@dot.gov.

  6. MISR Level 1B1 Local Mode Radiance Data V002

    • data.nasa.gov
    • cmr.earthdata.nasa.gov
    • +2more
    Updated Apr 1, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). MISR Level 1B1 Local Mode Radiance Data V002 [Dataset]. https://data.nasa.gov/dataset/misr-level-1b1-local-mode-radiance-data-v002-7db9a
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    MIB1LM_002 is the Multi-angle Imaging SpectroRadiometer (MISR) Level 1B1 Local Mode Radiance Data version 2. It contains the data numbers (DNs) radiometrically scaled to radiances with no geometric resampling. The MISR Level 1B1 Radiance data product contains spectral radiances for all MISR channels. Each value represents the incident radiance averaged over the sensor's total band response. Processing includes both radiance scaling and conditioning steps. Radiance scaling converts the Level 1A data from digital counts to radiances, using coefficients derived from the onboard calibrator (OBC) and vicarious calibrations. The OBC contains Spectralon calibration panels, which are deployed monthly and reflect sunlight into the cameras. The OBC detector standards then measure this reflected light to provide the calibration. No out-of-band correction is done for this product, nor are the data geometrically corrected or resampled. Data collection for this product is ongoing.

    The MISR instrument consists of nine push-broom cameras that measure radiance in four spectral bands. Global coverage is achieved in nine days. The cameras are arranged with one camera pointing toward the nadir, four forward, and four aftward. It takes seven minutes for all nine cameras to view the same surface location. The view angles relative to the surface reference ellipsoid are 0, 26.1, 45.6, 60.0, and 70.5 degrees. The spectral band shapes are nominally Gaussian, centered at 443, 555, 670, and 865 nm.

    MISR is designed to view Earth with cameras pointed in 9 different directions. As the instrument flies overhead, each piece of Earth's surface below is successively imaged by all nine cameras in 4 wavelengths (blue, green, red, and near-infrared). The goal of MISR is to improve our understanding of the effects of sunlight on Earth and distinguish different types of clouds, particles, and surfaces. Specifically, MISR monitors the monthly, seasonal, and long-term trends in three areas: 1) amount and type of atmospheric particles (aerosols), including those formed by natural sources and by human activities; 2) amounts, types, and heights of clouds; and 3) distribution of land surface cover, including vegetation canopy structure.

  7. NTD Annual Data View - Employees (By Mode)

    • data.transportation.gov
    csv, xlsx, xml
    Updated Oct 17, 2025
    + more versions
    Cite
    Federal Transit Administration (2025). NTD Annual Data View - Employees (By Mode) [Dataset]. https://data.transportation.gov/w/wsxw-2rpq/m7rw-edbr?cur=I_mJSalBXzx&from=08liZNRsmtN
    Available download formats: csv, xml, xlsx
    Dataset updated
    Oct 17, 2025
    Dataset authored and provided by
    Federal Transit Administration
    License

    https://www.usa.gov/government-works

    Description

    This dataset details data on hours worked by public transportation employees and the head counts of employees for each applicable agency reporting to the National Transit Database in the 2022 and 2023 report years at the mode and type of service level.

    NTD Data Tables organize and summarize data from the 2022 and 2023 National Transit Database in a manner that is more useful for quick reference and summary analysis. This dataset is based on the 2022 and 2023 Transit Agency Employees database files.

    For the years 2015-2021, you can find this data in the "Employees" data table on the NTD Program website, at https://transit.dot.gov/ntd/ntd-data.

    If you have any other questions about this table, please contact the NTD Help Desk at NTDHelp@dot.gov.

  8. Risky Business: Factor Analysis of Survey Data – Assessing the Probability...

    • plos.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Cees van der Eijk; Jonathan Rose (2023). Risky Business: Factor Analysis of Survey Data – Assessing the Probability of Incorrect Dimensionalisation [Dataset]. http://doi.org/10.1371/journal.pone.0118900
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Cees van der Eijk; Jonathan Rose
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper undertakes a systematic assessment of the extent to which factor analysis identifies the correct number of latent dimensions (factors) when applied to ordered-categorical survey items (so-called Likert items). We simulate 2400 data sets of uni-dimensional Likert items that vary systematically over a range of conditions such as the underlying population distribution, the number of items, the level of random error, and characteristics of items and item-sets. Each of these datasets is factor analysed in a variety of ways that are frequently used in the extant literature, or that are recommended in current methodological texts. These include exploratory factor retention heuristics such as Kaiser’s criterion, Parallel Analysis and a non-graphical scree test, and (for exploratory and confirmatory analyses) evaluations of model fit. These analyses are conducted on the basis of Pearson and polychoric correlations. We find that, irrespective of the particular mode of analysis, factor analysis applied to ordered-categorical survey data very often leads to over-dimensionalisation. The magnitude of this risk depends on the specific way in which factor analysis is conducted, the number of items, the properties of the set of items, and the underlying population distribution. The paper concludes with a discussion of the consequences of over-dimensionalisation, and a brief mention of alternative modes of analysis that are much less prone to such problems.

  9. pick-the-cup-hard-mode

    • huggingface.co
    Updated Nov 24, 2025
    Cite
    Javier (2025). pick-the-cup-hard-mode [Dataset]. https://huggingface.co/datasets/Javiertxu22/pick-the-cup-hard-mode
    Dataset updated
    Nov 24, 2025
    Authors
    Javier
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

    Dataset Structure

    meta/info.json: { "codebase_version": "v3.0", "robot_type": "so101_follower", "total_episodes": 31, "total_frames": 6450, "total_tasks": 1, "chunks_size": 1000, "data_files_size_in_mb": 100, "video_files_size_in_mb": 500, "fps": 30, "splits": { "train": "0:31" }, "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/Javiertxu22/pick-the-cup-hard-mode.
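
    The `data_path` entry above is a Python format-string template. As a small sketch, the snippet below expands it for a given chunk and file index and parses the `"train": "0:31"` split. The helper name `parquet_path` is hypothetical, and reading the split as a half-open range is an assumption (it is consistent with `total_episodes` being 31).

```python
# Template copied from the meta/info.json excerpt above.
data_path = "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet"


def parquet_path(chunk_index, file_index):
    """Expand the data_path template (hypothetical helper name)."""
    return data_path.format(chunk_index=chunk_index, file_index=file_index)


first_file = parquet_path(0, 0)
print(first_file)  # data/chunk-000/file-000.parquet

# Assumption: "0:31" denotes a half-open episode range [0, 31).
start, stop = (int(part) for part in "0:31".split(":"))
train_episodes = range(start, stop)
print(len(train_episodes))  # 31
```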

  10. COVID-19 High Frequency Phone Survey of Households 2020 - World Bank LSMS...

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated Oct 25, 2021
    + more versions
    Cite
    Central Statistics Agency of Ethiopia (2021). COVID-19 High Frequency Phone Survey of Households 2020 - World Bank LSMS Harmonized Dataset - Ethiopia [Dataset]. https://microdata.worldbank.org/index.php/catalog/4072
    Dataset updated
    Oct 25, 2021
    Dataset authored and provided by
    Central Statistics Agency of Ethiopia
    Time period covered
    2018 - 2021
    Area covered
    Ethiopia
    Description

    Abstract

    To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.

    The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.

    Two harmonized datafiles are prepared for each survey. The two datafiles are: 1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors and crop sales. 2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.

    Geographic coverage

    National coverage

    Analysis unit

    • Households
    • Individuals

    Universe

    The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Cleaning operations

    Ethiopia Socioeconomic Survey (ESS) 2018-2019 and Ethiopia COVID-19 High Frequency Phone Survey of Households (HFPS) 2020 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).

    The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, a variable with “_r3” indicates that the variable was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.
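
    A minimal sketch of how this naming convention might be decoded is shown below. It assumes the “_rX” marker appears as a suffix at the end of the variable name, which the documentation does not state explicitly, and the variable names used are hypothetical.

```python
import re

# Assumption: the round marker is a suffix such as "_r3" at the end of the name.
_ROUND_SUFFIX = re.compile(r"_r(\d+)$")


def survey_round(variable_name):
    """Return the originating round; names without a marker map to Round 0."""
    match = _ROUND_SUFFIX.search(variable_name)
    return int(match.group(1)) if match else 0


# Hypothetical variable names, for illustration only.
for name in ["food_insecure_r3", "hhsize", "employed_r12"]:
    print(name, "-> Round", survey_round(name))
```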

    Response rate

    See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.

  11. Data from: VG2 NEP PLS DERIVED RDR ION OUTBND MAGSHTH L-MODE 48SEC V1.0

    • data.nasa.gov
    • gimi9.com
    • +2more
    Updated Jul 17, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    nasa.gov (2025). VG2 NEP PLS DERIVED RDR ION OUTBND MAGSHTH L-MODE 48SEC V1.0 [Dataset]. https://data.nasa.gov/dataset/vg2-nep-pls-derived-rdr-ion-outbnd-magshth-l-mode-48sec-v1-0
    Dataset updated
    Jul 17, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    This data set gives the best available values for ion densities, temperatures, and velocities near Neptune derived from data obtained by the Voyager 2 plasma experiment. All parameters are obtained by fitting the observed spectra (current as a function of energy) with Maxwellian plasma distributions, using a non-linear least squares fitting routine to find the plasma parameters which, when coupled with the full instrument response, best simulate the data. The PLS instrument measures energy/charge, so composition is not uniquely determined but can be deduced in some cases by the separation of the observed current peaks in energy (assuming the plasma is co-moving).

  12. High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon...

    • microdata.pacificdata.org
    Updated Mar 19, 2025
    + more versions
    Cite
    Darian Naidoo and William Seitz (2025). High Frequency Phone Survey, Continuous Data Collection 2023 - Solomon Islands [Dataset]. https://microdata.pacificdata.org/index.php/catalog/875
    Dataset updated
    Mar 19, 2025
    Dataset authored and provided by
    Darian Naidoo and William Seitz
    Time period covered
    2023 - 2024
    Area covered
    Solomon Islands
    Description

    Abstract

    Access to up-to-date socio-economic data is a widespread challenge in Solomon Islands and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.

    For Solomon Islands, after five rounds of data collection from 2020-2020, in April 2023 a monthly HFPS data collection commenced and continued for 18 months (ending September 2024) on topics including employment, income, food security, health, food prices, assets and well-being. Fieldwork took place in two non-consecutive weeks of each month. Data for April 2023-December 2023 were a repeated cross section, while January 2024 established the first month of a panel, which was continued to September 2024. Each month has approximately 550 households in the sample and is representative of urban and rural areas, but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in Solomon Islands. There is one data file for household-level data with a unique household ID, and a separate file for individual-level data within each household, which can be matched to the household file using the household ID. The individual file also has an individual ID that is unique within the household and can be used to track individuals over time within households where the data are panel data.

    Geographic coverage

    Urban and rural areas of Solomon Islands.

    Analysis unit

    Household, individual.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The initial sample was drawn from information provided by a major phone service provider in Solomon Islands, covering all the provinces in the country. It had a probability-based weighted design, with a proportionate stratification to achieve geographical representation. The geographical distribution compared to the 2019 Census is listed below for the first month of the HFPS monthly survey:

    | Province    | Census | HFPS  |
    |:------------|:-------|:------|
    | Choiseul    | 4.3%   | 5.2%  |
    | Western     | 14.4%  | 13.7% |
    | Isabel      | 4.8%   | 4.7%  |
    | Central     | 3.6%   | 5.2%  |
    | Ren Bell    | 0.6%   | 1.4%  |
    | Guadalcanal | 19.8%  | 21.1% |
    | Malaita     | 23.1%  | 18.7% |
    | Makira      | 5.6%   | 5.6%  |
    | Temotu      | 3.0%   | 3%    |
    | Honiara     | 20.7%  | 21.3% |

    Source: Census of Population and Housing 2019

    Note: The values in the HFPS column represent the proportion of survey participants residing in each province, based on the raw HFPS data from April.
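
    To make the comparison concrete, the sketch below computes the gap between the HFPS share and the Census share for each province, using the percentages listed above. It is plain arithmetic on the published figures, not part of the survey's own methodology.

```python
# (Census %, HFPS %) per province, copied from the distribution above.
shares = {
    "Choiseul": (4.3, 5.2),
    "Western": (14.4, 13.7),
    "Isabel": (4.8, 4.7),
    "Central": (3.6, 5.2),
    "Ren Bell": (0.6, 1.4),
    "Guadalcanal": (19.8, 21.1),
    "Malaita": (23.1, 18.7),
    "Makira": (5.6, 5.6),
    "Temotu": (3.0, 3.0),
    "Honiara": (20.7, 21.3),
}

# Positive gap: province is overrepresented in the HFPS sample.
gaps = {
    province: round(hfps - census, 1)
    for province, (census, hfps) in shares.items()
}
largest = max(gaps, key=lambda province: abs(gaps[province]))

for province, gap in gaps.items():
    print(f"{province:<12} {gap:+.1f} percentage points")
print("Largest gap:", largest)
```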

    In April 2023, the geographic distribution of World Bank HFPS participants was generally similar to that of the census data at the province level, though within provinces, areas with less mobile phone connectivity are likely to be underrepresented. One indication of this is that urban areas constituted 38.2 percent of the survey sample, which is a slight overrepresentation, compared to 32.5 percent in the Census 2019.

    A monthly panel was established in January 2024 and is ongoing as of March 2025. In each subsequent month after January 2024, the survey firm would first attempt to contact all households from the previous month and then attempt to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households. Across all months of the survey, a total of 9,926 interviews were completed.

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire, which can be found in the External Resources of this documentation, is available in English, with Solomons Pijin translation. There were few changes to the questionnaire across the survey months, but some sections were only introduced in 2024, namely energy access questions and questions to inform the baseline data of the Solomon Islands Government Integrated Economic Development and Climate Resilience (IEDCR) project.

    Cleaning operations

    The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The total number of observations is 9,926 in the household dataset and 62,054 in the individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household data set contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.
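
    As a hedged illustration of how the two datasets relate, the sketch below attaches household-level fields to individual records through `hhid`. Only `hhid` and `id_member` come from the documentation; the other field names and all values are hypothetical.

```python
# Hypothetical household-level records; only "hhid" is a documented name.
households = [
    {"hhid": "H001", "urban": 1},
    {"hhid": "H002", "urban": 0},
]

# Hypothetical individual-level records; "hhid" and "id_member" are documented.
individuals = [
    {"hhid": "H001", "id_member": 1, "age": 34},
    {"hhid": "H001", "id_member": 2, "age": 31},
    {"hhid": "H002", "id_member": 1, "age": 52},
]

# Index households by hhid, then join each individual to its household.
by_hhid = {household["hhid"]: household for household in households}
merged = [
    {**by_hhid[person["hhid"]], **person}
    for person in individuals
    if person["hhid"] in by_hhid
]

for row in merged:
    print(row)
```

    The pair (`hhid`, `id_member`) then serves as the key for following one person across months in the panel portion of the data.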

  13. cumcm_test

    • huggingface.co
    Cite
    sxj1024, cumcm_test [Dataset]. https://huggingface.co/datasets/sxj1024/cumcm_test
    Explore at:
    Authors
    sxj1024
    Description

    Download Dataset

    from datasets import load_dataset

    # 1. Specify the dataset's "repository ID"
    # Replace "your-username/your-dataset-name" with the actual ID of the dataset you want to download
    repo_id = "sxj1024/cumcm_test"

    # 2. Call load_dataset()
    # This will automatically download the data from the Hub (if not cached locally),
    # then load it into memory (or in streaming mode)
    dataset = load_dataset(repo_id)

    # 3. View and use the dataset
    print(dataset)

    You can access… See the full description on the dataset page: https://huggingface.co/datasets/sxj1024/cumcm_test.

  14. Osu! Standard Rankings

    • kaggle.com
    zip
    Updated Jan 30, 2023
    + more versions
    Cite
    Julliane Pierre (2023). Osu! Standard Rankings [Dataset]. https://www.kaggle.com/datasets/jullianepierre/osu-standard-rankings/data
    Explore at:
    Available download formats: zip (3788 bytes)
    Dataset updated
    Jan 30, 2023
    Authors
    Julliane Pierre
    Description

    Context:

    osu! is a music rhythm game that has 4 modes (check for more info). In this dataset, you can examine the rankings of the standard mode, taken on 30/01/2023 around 3 PM. The ranking is based on pp (performance points) awarded after every play, which are influenced by play accuracy and score; pps are then summed with weights: your top play will award you the whole pp points of the map, then the percentage is decreased (this can maintain balance between strong players and players who play too much). You can find here many other statistics.
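    The weighting described above can be sketched as a geometrically decaying sum. The decay factor of 0.95 per position is an assumption, since the description only says the percentage decreases, and any bonus pp the game may add is ignored. `weighted_pp` is a hypothetical helper name.

```python
def weighted_pp(play_pps, decay=0.95):
    """Sum per-play pp values with geometrically decreasing weights.

    decay=0.95 is an assumed value; the dataset description does not state it.
    """
    # Best play first, so it counts in full; each later play is scaled down.
    ranked = sorted(play_pps, reverse=True)
    return sum(pp * decay ** position for position, pp in enumerate(ranked))


print(weighted_pp([300.0, 250.0, 100.0]))
```

    Under this scheme, adding many low-value plays changes the total very little, which matches the balancing effect the description mentions.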

    Contents:

    The dataset contains the columns listed below, reporting statistics for every player in the top 100 of the game in the standard mode. The ranking is ordered by pp. Some players appear to have the same points, but the ranking chart on the site does not show the decimals that separate them.

    Variables:

    • rank: global rank (you can use this like an id too)
    • player_name: in-game nickname
    • country: country of origin
    • accuracy: mean accuracy of your top plays
    • play_count: lifetime plays
    • level: level (has little influence on stats)
    • hours: total hours played
    • performance_points: pp which determine the rankings
    • ss: number of ss plays (accuracy=100% and no miss)
    • s: number of s plays (accuracy>=93% and no miss)
    • a: number of a plays (accuracy>=93% but there are misses)
    • watched_by: number of replays of the player watched by others

    Acknowledgements:

    I created this dataset for an upcoming Data Science project.

    I used the 2017 osu! rankings and description by Svidon as a reference to produce the 2023 osu! ranking of the top 100 as of January 30, 2023.

    The data is public and can be accessed at https://osu.ppy.sh/rankings/osu/performance.

    Svidon's Kaggle profile: https://www.kaggle.com/svidon

  15. Data from: Preclinical PET data

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 22, 2021
    Cite
    Ville-Veikko Wettenhovi; Kimmo Jokivarsi (2021). Preclinical PET data [Dataset]. http://doi.org/10.5281/zenodo.3528056
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 22, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ville-Veikko Wettenhovi; Kimmo Jokivarsi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An open preclinical PET dataset. This dataset has been measured with the preclinical Siemens Inveon PET machine. The measured target is a (naive) rat with an injected dose of 21.4 MBq of FDG. The injection was done intravenously (IV) to the tail vein. No specific organ was investigated, but rather the glucose metabolism as a whole. The examination is a 60 minute dynamic acquisition. The measurement was conducted according to the ethical standards set by the University of Eastern Finland.

    The dataset contains the original list-mode data, the (dynamic) sinogram created by the Siemens Inveon Acquisition Workplace (IAW) software (28 frames), the (dynamic) scatter sinogram created by the IAW software (28 frames), the attenuation sinogram created by the IAW software and the normalization coefficients created by the IAW software. Header files are included for all the different data files.

    For documentation on reading the list-mode binary data, please ask Siemens.

    This dataset can be used in the OMEGA software, including the list-mode data, to import the data to MATLAB/Octave, create sinograms from the list-mode data and reconstruct the imported data. For help on using the dataset with OMEGA, see the wiki.

  16. Dataset for "Identification of Soft Modes Across the...

    • researchdata.bath.ac.uk
    zip
    Updated Oct 9, 2024
    Cite
    Daniel Wolverson (2024). Dataset for "Identification of Soft Modes Across the Commensurate-to-Incommensurate Charge Density Wave Transition in 1T-TaSe2" [Dataset]. http://doi.org/10.15125/BATH-01357
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 9, 2024
    Dataset provided by
    University of Bath
    Authors
    Daniel Wolverson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    Horizon 2020 Framework Programme (H2020)
    Description

    The dataset contains the inputs necessary to reproduce the theoretical calculations presented in the associated paper, the abstract of which is as follows:

    1T-TaSe2 is a prototypical charge density wave (CDW) material for which electron-phonon coupling and associated lattice reconstruction play an important role in driving and stabilising the CDW phase. Here, we investigate the lattice dynamics of bulk 1T-TaSe2 using angle-resolved ultralow wavenumber Raman spectroscopy down to 10 cm−1. Our high-resolution Raman spectra allow us to identify at least 27 peaks in the commensurate (CCDW) phase in the region 50 - 300 cm−1. Contrary to other layered materials, we do not find evidence of interlayer breathing or shear modes, suggestive of AA stacking in the bulk. Polarisation dependence of the mode intensities allows the assignment of their symmetry, which is supported by calculations of the phonon frequencies for the bulk structure using density functional theory. A detailed temperature dependence in the range T = 80 - 500 K allows us to clearly identify the soft modes associated with the CDW superlattice. Above the commensurate (CCDW) to incommensurate (ICCDW) phase transition at 473 K, we observe a dramatic loss of resolution of all modes, and significant linewidth broadening associated with a reduced phonon lifetime as the charge-order becomes incommensurate with the lattice.

  17. Replication Data for: Integrating online data collection in a household...

    • demo-b2find.dkrz.de
    Updated May 28, 2021
    + more versions
    Cite
    (2021). Replication Data for: Integrating online data collection in a household panel study: effects on second-wave participation - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/f3984387-c555-5198-86f1-daa3cd0a0cc4
    Explore at:
    Dataset updated
    May 28, 2021
    Description

    Received wisdom in survey practice suggests that using web mode in the first wave of a panel study is not as effective as using interviewers. Based on data from a two-wave mode experiment for the Swiss Household Panel (SHP), this study examines how the use of online data collection in the first wave affects participation in the second wave, and if so, who is affected. The experiment compared the traditional SHP design of telephone interviewing to a mixed-mode design combining a household questionnaire by telephone with individual questionnaires by web and to a web-only design for the household and individual questionnaires. We looked at both participation of the household reference person (HRP) and of all household members in multi-person households. We find no support for a higher dropout at wave 2 of HRPs who followed the mixed-mode protocol or who participated online. Neither do we find much evidence that the association between mode and dropout varies by socio-demographic characteristics. The only exception was that of higher dropout rates among HRPs of larger households in the telephone group, compared to the web-only group. Moreover, the mixed-mode and web-only designs were more successful than the telephone design in enrolling and keeping all eligible household members in multi-person households in the study. In conclusion, the results suggest that using web mode (whether alone or combined with telephone) when starting a new panel shows no clear disadvantage with respect to second wave participation compared with telephone interviews.

  18. High Frequency Phone Survey, Continuous Data Collection 2023 - Papua New...

    • microdata.pacificdata.org
    Updated Apr 30, 2025
    + more versions
    Cite
    William Seitz (2025). High Frequency Phone Survey, Continuous Data Collection 2023 - Papua New Guinea [Dataset]. https://microdata.pacificdata.org/index.php/catalog/877
    Explore at:
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Darian Naidoo
    William Seitz
    Time period covered
    2023 - 2025
    Area covered
    Papua New Guinea
    Description

    Abstract

    Access to up-to-date socio-economic data is a widespread challenge in Papua New Guinea and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.

    For PNG, after five rounds of data collection from 2020-2022, a monthly HFPS data collection commenced in April 2023 and continued for 18 months (ending September 2024), covering topics including employment, income, food security, health, food prices, assets and well-being. This followed an initial pilot of the data collection from January 2023-March 2023. Data for April 2023-September 2023 were a repeated cross section, while October 2023 established the first month of a panel, which is ongoing as of March 2025. For each month, approximately 550-1000 households were interviewed. The sample is representative of urban and rural areas but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in PNG. There is one data file for household-level data with a unique household ID, and separate files for individual-level data within each household and for household food price data, which can be matched to the household file using the household ID. A unique individual ID within the household data can be used to track individuals over time within households.

    Geographic coverage

    Urban and rural areas of Papua New Guinea

    Analysis unit

    Household, Individual

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification from a large random sample of Digicel’s subscribers. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The resulting overall sample has a probability-based weighted design, with a proportionate stratification to achieve a proper geographical representation. More information on sampling for the cross-sectional monthly sample can be found in previous documentation for the PNG HFPS data.

    A monthly panel was established in October 2023 and is ongoing as of March 2025. In each subsequent round of data collection after October 2023, the survey firm would first attempt to contact all households from the previous month, and then attempt to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households.
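    The contact priority described above can be sketched as an ordered, de-duplicated call queue. This is a simplified illustration rather than the survey firm's actual procedure: `build_call_queue`, its arguments and the `target` cap are hypothetical, and the geographic stratification of the RDD replacements is not modelled.

```python
def build_call_queue(previous_month, earlier_dropouts, fresh_rdd_numbers, target):
    """Order phone numbers by contact priority, without duplicates, up to `target`.

    Priority: last month's households, then earlier dropouts, then new RDD numbers.
    """
    queue, seen = [], set()
    for source in (previous_month, earlier_dropouts, fresh_rdd_numbers):
        for number in source:
            if len(queue) >= target:
                return queue
            if number not in seen:
                seen.add(number)
                queue.append(number)
    return queue


print(build_call_queue(["a", "b"], ["c", "b"], ["d", "e"], target=4))
```

    In practice the replacement step would be driven by completed interviews rather than by a fixed list length, so the cap here is only a stand-in for the monthly sample target.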

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire, which can be found in the External Resources of this documentation, is in English with a Pidgin translation.

    The survey instrument for Q1 2025 consists of the following modules:

    • 1. Basic Household information
    • 2. Household Roster
    • 3. Labor
    • 4a. Food security
    • 4b. Food prices
    • 5. Household income
    • 6. Agriculture
    • 8. Access to services
    • 9. Assets
    • 10. Wellbeing and shocks
    • 10a. WASH

    Cleaning operations

    The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household data set contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.

  19. 2022 - 2024 NTD Annual Data - Service (by Mode and Time Period)

    • data.virginia.gov
    • data.transportation.gov
    • +1more
    csv, json, rdf, xsl
    Updated Oct 28, 2025
    + more versions
    Cite
    U.S. Department of Transportation (2025). 2022 - 2024 NTD Annual Data - Service (by Mode and Time Period) [Dataset]. https://data.virginia.gov/dataset/2022-2024-ntd-annual-data-service-by-mode-and-time-period
    Explore at:
    Available download formats: json, xsl, csv, rdf
    Dataset updated
    Oct 28, 2025
    Dataset provided by
    Federal Transit Administration
    Authors
    U.S. Department of Transportation
    Description

    This represents the Service data reported to the National Transit Database by transit agencies in the 2022, 2023, and 2024 report years.

    In versions of the data tables from before 2014, you can find data on service in the file called "Transit Operating Statistics: Service Supplied and Consumed."

    If you have any other questions about this table, please contact the NTD Help Desk at NTDHelp@dot.gov.

  20. Data from: FISBe: A real-world benchmark dataset for instance segmentation...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    Updated Apr 2, 2024
    Cite
    Mais, Lisa; Hirsch, Peter; Managan, Claire; Kandarpa, Ramya; Rumberger, Josef Lorenz; Reinke, Annika; Maier-Hein, Lena; Ihrke, Gudrun; Kainmueller, Dagmar (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10875062
    Explore at:
    Dataset updated
    Apr 2, 2024
    Dataset provided by
    German Cancer Research Center
    Howard Hughes Medical Institute - Janelia Research Campus
    Max Delbrück Center
    Max Delbrück Center for Molecular Medicine
    Authors
    Mais, Lisa; Hirsch, Peter; Managan, Claire; Kandarpa, Ramya; Rumberger, Josef Lorenz; Reinke, Annika; Maier-Hein, Lena; Ihrke, Gudrun; Kainmueller, Dagmar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General

    For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

    Summary

    A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains

    30 completely labeled (segmented) images

    71 partly labeled images

    altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30-60 min on average, yet a difficult one can take up to 4 hours)

    To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects

    A set of metrics and a novel ranking score for respective meaningful method benchmarking

    An evaluation of three baseline methods in terms of the above metrics and score

    Abstract

    Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

    Dataset documentation:

    We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

    FISBe Datasheet

    Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

    Files

    fisbe_v1.0_{completely,partly}.zip

    contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.

    fisbe_v1.0_mips.zip

    maximum intensity projections of all samples, for convenience.

    sample_list_per_split.txt

    a simple list of all samples and the subset they are in, for convenience.

    view_data.py

    a simple python script to visualize samples, see below for more information on how to use it.

    dim_neurons_val_and_test_sets.json

    a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.

    Readme.md

    general information

    How to work with the image files

    • Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
    • For each image, we provide a pixel-wise instance segmentation for all separable neurons.
    • Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
    • The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
    • The segmentation mask for each neuron is stored in a separate channel.
    • The order of dimensions is CZYX.

    We recommend working in a virtual environment, e.g., by using conda:

    conda create -y -n flylight-env -c conda-forge python=3.9
    conda activate flylight-env

    How to open zarr files

    Install the python zarr package:

    pip install zarr

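    Open a zarr file with:

    import zarr
    # "<path_to_zarr_file>" is a placeholder: replace it with the path to a sample's .zarr file
    raw = zarr.open("<path_to_zarr_file>", mode='r', path="volumes/raw")
    seg = zarr.open("<path_to_zarr_file>", mode='r', path="volumes/gt_instances")

    # optional:
    import numpy as np
    raw_np = np.array(raw)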

    Zarr arrays are read lazily on-demand. Many functions that expect numpy arrays also work with zarr arrays. Optionally, the arrays can also explicitly be converted to numpy arrays.
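    Once the segmentation has been converted to a numpy array, the CZYX layout with one channel per neuron can be handled as in the sketch below. It uses a small synthetic array in place of a real sample and assumes that nonzero voxels mark the neuron in each channel, which the description does not spell out.

```python
import numpy as np

# Synthetic stand-in for np.array(seg): 2 neurons, dimensions ordered C, Z, Y, X.
seg = np.zeros((2, 4, 5, 5), dtype=np.uint8)
seg[0, 1, 1:3, 1:3] = 1  # neuron in channel 0
seg[1, 2, 3:5, 3:5] = 1  # neuron in channel 1

mask = seg > 0                               # assumption: nonzero means "neuron present"
num_neurons = seg.shape[0]                   # one channel per neuron
voxels_per_neuron = mask.sum(axis=(1, 2, 3))
mip = mask.any(axis=1)                       # per-neuron projection along Z, shape (C, Y, X)

# Collapse the channels into one label volume: channel index + 1 where that
# neuron is present. Where neurons overlap, the later channel wins.
labels = np.zeros(seg.shape[1:], dtype=np.uint16)
for channel in range(num_neurons):
    labels[mask[channel]] = channel + 1

print(num_neurons, voxels_per_neuron, mip.shape, labels.max())
```

    Collapsing to a single label volume loses overlap information, which is presumably why the dataset keeps one channel per neuron.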

    How to view zarr image files

    We recommend using napari to view the image data.

    Install napari:

    pip install "napari[all]"

    Save the following Python script:

    import zarr, sys, napari

    raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
    gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

    viewer = napari.Viewer(ndisplay=3)
    for idx, gt in enumerate(gts):
        viewer.add_labels(
            gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
    viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
    viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
    viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
    napari.run()

    Execute:

    python view_data.py /R9F03-20181030_62_B5.zarr

    Metrics

    S: Average of avF1 and C

    avF1: Average F1 Score

    C: Average ground truth coverage

    clDice_TP: Average true positives clDice

    FS: Number of false splits

    FM: Number of false merges

    tp: Relative number of true positives

    For more information on our selected metrics and formal definitions please see our paper.
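    As a small worked example of the ranking score listed above, S is the plain average of avF1 and C. The helper name `fisbe_s` and the input values are hypothetical; how avF1 and C themselves are computed is defined in the paper and not reproduced here.

```python
def fisbe_s(avf1: float, coverage: float) -> float:
    """Ranking score S: the average of avF1 and ground truth coverage C."""
    return 0.5 * (avf1 + coverage)


# Illustrative values only, not results from the paper.
print(fisbe_s(0.4, 0.6))
```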

    Baseline

    To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al. For detailed information on the methods and the quantitative results please see our paper.

    License

    The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    Citation

    If you use FISBe in your research, please use the following BibTeX entry:

    @misc{mais2024fisbe,
      title = {FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures},
      author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
      year = 2024,
      eprint = {2404.00130},
      archivePrefix = {arXiv},
      primaryClass = {cs.CV}
    }

    Acknowledgments

    We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions. P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program. This work was co-funded by Helmholtz Imaging.

    Changelog

    There have been no changes to the dataset so far. All future changes will be listed on the changelog page.

    Contributing

    If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

    All contributions are welcome!

Cite
The Devastator (2023). Weather and Housing in North America [Dataset]. https://www.kaggle.com/datasets/thedevastator/weather-and-housing-in-north-america
Weather and Housing in North America

Exploring the Relationship between Weather and Housing Conditions in 2012

Explore at:
Available download formats: zip (512280 bytes)
Dataset updated
Feb 13, 2023
Authors
The Devastator
License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered
North America
Description

Weather and Housing in North America

Exploring the Relationship between Weather and Housing Conditions in 2012

By [source]

About this dataset

This dataset explores the relationship between housing and weather conditions across North America in 2012. Through a range of climate variables such as temperature, wind speed, humidity, pressure and visibility, it describes the weather conditions of numerous regions. Housing parameters such as longitude, latitude, median income, median house value and ocean proximity make it possible to examine how distinct climates relate to real estate valuations in an area. Analyzing the two data sets together can help identify factors that affect the value and comfort level of residential areas throughout North America.

How to use the dataset

This dataset offers plenty of insights into the effects of weather and housing on North American regions. To explore these relationships, you can perform data analysis on the variables provided.

First, start by examining descriptive statistics (i.e., mean, median, mode). This can help show you the general trend and distribution of each variable in this dataset. For example, what is the most common temperature in a given region? What is the average wind speed? How does this vary across different regions? By looking at descriptive statistics, you can get an initial idea of how various weather conditions and housing attributes interact with one another.

Next, explore correlations between variables. Are certain weather variables correlated with specific housing attributes? Is there a link between wind speeds and median house value? Or between humidity and ocean proximity? Analyzing correlations allows for deeper insights into how different aspects may influence one another for a given region or area. These correlations may also inform broader patterns that are present across multiple North American regions or countries.

Finally, use visualizations to further investigate the relationship between climate and housing attributes in North America in 2012. Graphs make trends such as seasonal variations or long-term changes easier to see, so they are useful for interpreting large amounts of data quickly and for providing context beyond what numbers alone reveal about the relationships within this dataset.
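The first two steps above (descriptive statistics, then correlations) can be sketched with pandas. The column names follow the Weather.csv column list, but the values here are made up for illustration; with the real file, the DataFrame would presumably come from `pd.read_csv("Weather.csv")` instead.

```python
import pandas as pd

# Made-up sample rows using column names from the Weather.csv description.
weather = pd.DataFrame({
    "Temp_C": [-1.8, -1.5, 0.2, 3.1, 5.0],
    "Rel Hum_%": [86, 87, 80, 72, 65],
    "Wind Speed_km/h": [4, 7, 9, 15, 20],
})

summary = weather.describe()   # count, mean, std, min, quartiles (50% is the median), max
modes = weather.mode().iloc[0] # most frequent value per column (first one if tied)
corr = weather.corr()          # pairwise Pearson correlation between numeric columns

print(summary)
print(modes)
print(corr)
```

A correlation matrix like `corr` is also a convenient input for the visualization step, for example as a heatmap.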

Research Ideas

  • Analyzing the effect of climate change on housing markets across North America. By looking at temperature and weather trends in combination with housing values, researchers can better understand how climate change may be impacting certain regions differently than others.
  • Investigating the relationship between median income, house values and ocean proximity in coastal areas. Understanding how ocean proximity plays into housing prices may help inform real estate investment decisions and urban planning initiatives related to coastal development.
  • Utilizing differences in weather patterns across different climates to determine optimal seasonal rental prices for property owners. By analyzing changes in temperature, wind speed, humidity, pressure and visibility from season to season, an investor could gain insights into seasonal market trends and use them to maximize profits from rentals or Airbnb listings over time.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: Weather.csv

| Column name | Description |
|:---------------------|:-----------------------------------------------|
| Date/Time | Date and time of the observation. (Date/Time) |
| Temp_C | Temperature in Celsius. (Numeric) |
| Dew Point Temp_C | Dew point temperature in Celsius. (Numeric) |
| Rel Hum_% | Relative humidity in percent. (Numeric) |
| Wind Speed_km/h | Wind speed in kilometers per hour. (Numeric) |
| Visibility_km | Visibilit...
