Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.
Even more than with other data sets that Kaggle has featured, a huge amount of data cleaning and preparation goes into putting together a long-term study of climate trends. Early data were collected by technicians reading mercury thermometers, so any variation in the time of the observation visit affected the measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.
Given this complexity, a range of organizations collate climate trends data. The three most cited land and ocean temperature data sets are NOAA's MLOST, NASA's GISTEMP, and the UK's HadCRUT.
We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example, by country). Berkeley Earth publishes the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.
In this dataset, we have included several files:
Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):
Other files include:
The raw data comes from the Berkeley Earth data page.
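For orientation, here is a minimal pandas sketch for working with GlobalTemperatures.csv. The column names used below (dt, LandAverageTemperature, LandAndOceanAverageTemperature) are assumptions; check them against the header of the file you download.

```python
# Minimal sketch: load GlobalTemperatures.csv and compute annual means.
# Column names are assumptions; verify against the actual CSV header.
import pandas as pd

df = pd.read_csv("GlobalTemperatures.csv", parse_dates=["dt"])

# Average the monthly values into one figure per year.
annual = df.groupby(df["dt"].dt.year)[
    ["LandAverageTemperature", "LandAndOceanAverageTemperature"]
].mean()
print(annual.tail())
```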
CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The World Bank is an international financial institution that provides loans to countries of the world for capital projects. The World Bank's stated goal is the reduction of poverty. Source: https://en.wikipedia.org/wiki/World_Bank
This dataset combines key education statistics from a variety of sources to provide a look at global literacy, spending, and access.
For more information, see the World Bank website.
Fork this kernel to get started with this dataset.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:world_bank_health_population
http://data.worldbank.org/data-catalog/ed-stats
https://cloud.google.com/bigquery/public-data/world-bank-education
Citation: The World Bank: Education Statistics
Dataset Source: World Bank. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @till_indeman from Unsplash.
Of total government spending, what percentage is spent on education?
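One way to approach that question is with a short BigQuery script. In the sketch below, the table ID (`bigquery-public-data.world_bank_intl_education.international_education`) and the indicator code (`SE.XPD.TOTL.GB.ZS`, government expenditure on education as a share of total government expenditure) are assumptions to verify in the BigQuery console before relying on the results.

```python
# Hedged sketch: share of total government spending that goes to education, by country.
# Table name and indicator code are assumptions; confirm them in the BigQuery UI.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT country_name, year, value AS pct_of_gov_spending
FROM `bigquery-public-data.world_bank_intl_education.international_education`
WHERE indicator_code = 'SE.XPD.TOTL.GB.ZS'
  AND year = 2015
ORDER BY value DESC
LIMIT 10
"""

for row in client.query(query).result():
    print(row.country_name, row.year, row.pct_of_gov_spending)
```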
Our National Footprint Accounts (NFAs) measure the ecological resource use and resource capacity of nations from 1961 to 2014. The calculations in the National Footprint Accounts are primarily based on United Nations data sets, including those published by the Food and Agriculture Organization, United Nations Commodity Trade Statistics Database, and the UN Statistics Division, as well as the International Energy Agency. The 2018 edition of the NFA features some exciting updates from last year’s 2017 edition, including data for more countries and improved data sources and methodology. Methodology changes:
To visualize our data in our data explorer click here. Dataset provides Ecological Footprint per capita data for years 1961-2014 in global hectares (gha). Ecological Footprint is a measure of how much area of biologically productive land and water an individual, population, or activity requires to produce all the resources it consumes and to absorb the waste it generates, using prevailing technology and resource management practices. The Ecological Footprint is measured in global hectares. Since trade is global, an individual or country's Footprint tracks area from all over the world. Without further specification, Ecological Footprint generally refers to the Ecological Footprint of consumption (rather than only production or export). Ecological Footprint is often referred to in short form as Footprint.
This data includes total and per capita national biocapacity, ecological footprint of consumption, ecological footprint of production and total area in hectares. This dataset, however, does not include any of our yield factors (national or world) nor any equivalence factors. To view these click here.
Revealing links between human consumption and other human behaviors, geographic characteristics, political landscapes,
How can others contribute?
- [ ] Join this table on other data.world datasets (preferably country-level data)
- [ ] Write queries
- [ ] Create graphics
- [ ] Post and share discoveries
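As one example of the kind of query suggested above, here is a hedged pandas sketch that computes each country's ecological reserve or deficit (biocapacity minus Ecological Footprint of consumption). The file name and the column/record labels (`record`, `BiocapTotGHA`, `EFConsTotGHA`, `total`) are assumptions about the export format; adjust them to match the actual table on data.world.

```python
# Sketch: ecological reserve (+) or deficit (-) in global hectares, per country and year.
# File name and column/record labels are assumptions about the NFA export format.
import pandas as pd

nfa = pd.read_csv("NFA_2018_edition.csv")  # hypothetical file name

wide = (
    nfa[nfa["record"].isin(["BiocapTotGHA", "EFConsTotGHA"])]
    .pivot_table(index=["country", "year"], columns="record", values="total")
    .reset_index()
)
# Positive values indicate an ecological reserve, negative values a deficit (gha).
wide["reserve_gha"] = wide["BiocapTotGHA"] - wide["EFConsTotGHA"]
print(wide.sort_values("year").tail())
```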
Altosight | AI Custom Web Scraping Data
✦ Altosight provides global web scraping data services with AI-powered technology that bypasses CAPTCHAs, blocking mechanisms, and handles dynamic content.
We extract data from marketplaces like Amazon, aggregators, e-commerce, and real estate websites, ensuring comprehensive and accurate results.
✦ Our solution offers free unlimited data points across any project, with no additional setup costs.
We deliver data through flexible methods such as API, CSV, JSON, and FTP, all at no extra charge.
― Key Use Cases ―
➤ Price Monitoring & Repricing Solutions
🔹 Automatic repricing, AI-driven repricing, and custom repricing rules
🔹 Receive price suggestions via API or CSV to stay competitive
🔹 Track competitors in real-time or at scheduled intervals
➤ E-commerce Optimization
🔹 Extract product prices, reviews, ratings, images, and trends
🔹 Identify trending products and enhance your e-commerce strategy
🔹 Build dropshipping tools or marketplace optimization platforms with our data
➤ Product Assortment Analysis
🔹 Extract the entire product catalog from competitor websites
🔹 Analyze product assortment to refine your own offerings and identify gaps
🔹 Understand competitor strategies and optimize your product lineup
➤ Marketplaces & Aggregators
🔹 Crawl entire product categories and track best-sellers
🔹 Monitor position changes across categories
🔹 Identify which eRetailers sell specific brands and which SKUs for better market analysis
➤ Business Website Data
🔹 Extract detailed company profiles, including financial statements, key personnel, industry reports, and market trends, enabling in-depth competitor and market analysis
🔹 Collect customer reviews and ratings from business websites to analyze brand sentiment and product performance, helping businesses refine their strategies
➤ Domain Name Data
🔹 Access comprehensive data, including domain registration details, ownership information, expiration dates, and contact information. Ideal for market research, brand monitoring, lead generation, and cybersecurity efforts
➤ Real Estate Data
🔹 Access property listings, prices, and availability
🔹 Analyze trends and opportunities for investment or sales strategies
― Data Collection & Quality ―
► Publicly Sourced Data: Altosight collects web scraping data from publicly available websites, online platforms, and industry-specific aggregators
► AI-Powered Scraping: Our technology handles dynamic content, JavaScript-heavy sites, and pagination, ensuring complete data extraction
► High Data Quality: We clean and structure unstructured data, ensuring it is reliable, accurate, and delivered in formats such as API, CSV, JSON, and more
► Industry Coverage: We serve industries including e-commerce, real estate, travel, finance, and more. Our solution supports use cases like market research, competitive analysis, and business intelligence
► Bulk Data Extraction: We support large-scale data extraction from multiple websites, allowing you to gather millions of data points across industries in a single project
► Scalable Infrastructure: Our platform is built to scale with your needs, allowing seamless extraction for projects of any size, from small pilot projects to ongoing, large-scale data extraction
― Why Choose Altosight? ―
✔ Unlimited Data Points: Altosight offers unlimited free attributes, meaning you can extract as many data points from a page as you need without extra charges
✔ Proprietary Anti-Blocking Technology: Altosight utilizes proprietary techniques to bypass blocking mechanisms, including CAPTCHAs, Cloudflare, and other obstacles. This ensures uninterrupted access to data, no matter how complex the target websites are
✔ Flexible Across Industries: Our crawlers easily adapt across industries, including e-commerce, real estate, finance, and more. We offer customized data solutions tailored to specific needs
✔ GDPR & CCPA Compliance: Your data is handled securely and ethically, ensuring compliance with GDPR, CCPA and other regulations
✔ No Setup or Infrastructure Costs: Start scraping without worrying about additional costs. We provide a hassle-free experience with fast project deployment
✔ Free Data Delivery Methods: Receive your data via API, CSV, JSON, or FTP at no extra charge. We ensure seamless integration with your systems
✔ Fast Support: Our team is always available via phone and email, resolving over 90% of support tickets within the same day
― Custom Projects & Real-Time Data ―
✦ Tailored Solutions: Every business has unique needs, which is why Altosight offers custom data projects. Contact us for a feasibility analysis, and we’ll design a solution that fits your goals
✦ Real-Time Data: Whether you need real-time data delivery or scheduled updates, we provide the flexibility to receive data when you need it. Track price changes, monitor product trends, or gather...
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.
Storage capacity also growing
Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
Databank (databank.worldbank.org) is an online web resource that provides simple and quick access to collections of time series data. It has advanced functions for selecting and displaying data, performing customized queries, downloading data, and creating charts and maps. Users can create dynamic custom reports based on their selection of countries, indicators and years. They offer a growing range of free, easy-to-access tools, research and knowledge to help people address the world's development challenges. For example, the Open Data website offers free access to comprehensive, downloadable indicators about development in countries around the globe.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Developed by SOLARGIS and provided by the Global Solar Atlas (GSA), this data resource contains diffuse horizontal irradiation (DIF) in kWh/m² covering the globe. Data is provided in a geographic spatial reference (EPSG:4326). The resolution (pixel size) of the solar resource data (GHI, DIF, GTI, DNI) is 9 arcsec (nominally 250 m), PVOUT and TEMP 30 arcsec (nominally 1 km), and OPTA 2 arcmin (nominally 4 km). The data is hyperlinked under 'resources' with the following characteristics:
DIF LTAy_AvgDailyTotals (GeoTIFF)
Data format: GEOTIFF
File size: 198.94 MB
There are two temporal representations of solar resource and PVOUT data available:
• Long-term yearly/monthly average of daily totals (LTAym_AvgDailyTotals)
• Long-term average of yearly/monthly totals (LTAym_YearlyMonthlyTotals)
Both types of data are equivalent, so you can select the summarization of your preference. The relation between the datasets is described by simple equations:
• LTAy_YearlyTotals = LTAy_DailyTotals * 365.25
• LTAy_MonthlyTotals = LTAy_DailyTotals * Number_of_Days_In_The_Month
For individual country or regional data downloads, please see: https://globalsolaratlas.info/download (use the drop-down menu to select the country or region of interest). For data provided in AAIGrid, please see: https://globalsolaratlas.info/download/world. For more information and terms of use, please read the metadata provided in PDF and XML format for each data layer in a download file. For other data formats, resolution, or time aggregation, please visit the Solargis website. Data can be used for visualization, further processing, and geo-analysis in all mainstream GIS software with raster data processing capabilities (such as open-source QGIS, commercial ESRI ArcGIS products, and others).
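Applying the daily-to-yearly relation above to the GeoTIFF is a simple raster operation. The sketch below uses rasterio; the input and output file names are assumptions based on the layer name, so substitute the file you actually download.

```python
# Sketch: convert long-term average daily totals (DIF GeoTIFF) into yearly totals
# using LTAy_YearlyTotals = LTAy_DailyTotals * 365.25. File names are assumptions.
import rasterio

with rasterio.open("DIF_LTAy_AvgDailyTotals.tif") as src:
    daily = src.read(1).astype("float32")   # kWh/m² per day (long-term average)
    profile = src.profile                    # georeferencing/metadata to reuse

yearly = (daily * 365.25).astype("float32")  # kWh/m² per year
profile.update(dtype="float32", count=1)

with rasterio.open("DIF_LTAy_YearlyTotals.tif", "w", **profile) as dst:
    dst.write(yearly, 1)
```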
Public Domain Mark 1.0https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
The World Database on Protected Areas (WDPA) is the most comprehensive global database of marine and terrestrial protected areas, updated on a monthly basis, and is one of the key global biodiversity data sets being widely used by scientists, businesses, governments, International secretariats and others to inform planning, policy decisions and management.
The WDPA is a joint project between UN Environment and the International Union for Conservation of Nature (IUCN). The compilation and management of the WDPA is carried out by UN Environment World Conservation Monitoring Centre (UNEP-WCMC), in collaboration with governments, non-governmental organisations, academia and industry. There are monthly updates of the data which are made available online through the Protected Planet website where the data is both viewable and downloadable.
Data and information on the world's protected areas compiled in the WDPA are used for reporting to the Convention on Biological Diversity on progress towards reaching the Aichi Biodiversity Targets (particularly Target 11), to the UN to track progress towards the 2030 Sustainable Development Goals, to some of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) core indicators, and other international assessments and reports including the Global Biodiversity Outlook, as well as for the publication of the United Nations List of Protected Areas. Every two years, UNEP-WCMC releases the Protected Planet Report on the status of the world's protected areas and recommendations on how to meet international goals and targets.
Many platforms are incorporating the WDPA to provide integrated information to diverse users, including businesses and governments, in a range of sectors including mining, oil and gas, and finance. For example, the WDPA is included in the Integrated Biodiversity Assessment Tool, an innovative decision support tool that gives users easy access to up-to-date information that allows them to identify biodiversity risks and opportunities within a project boundary.
The reach of the WDPA is further enhanced in services developed by other parties, such as the Global Forest Watch and the Digital Observatory for Protected Areas, which provide decision makers with access to monitoring and alert systems that allow whole landscapes to be managed better. Together, these applications of the WDPA demonstrate the growing value and significance of the Protected Planet initiative.
The Pilot Analysis of Global Ecosystems (PAGE): Agroecosystems was one of four pilot studies undertaken as precursors to the Millennium Ecosystem Assessment. The study identifies linkages between crop production systems and environmental services such as food, soil resources, water, biodiversity, and carbon cycling, in the hope that a better understanding of these linkages might lead to policies that can contribute both to improved food output and to improved ecosystem service provision. The PAGE Agroecosystems report includes a series of 24 maps that provide a detailed spatial perspective on agroecosystems and agroecosystem services. The Pilot Analysis of Global Ecosystems (PAGE): Agroecosystems Dataset offers the 9 geospatial datasets used to build these maps. The datasets are:
PAGE Global Agricultural Extent. The data describe the location and extent of global agriculture and are derived from GLLCCD 1998; USGS EDC 1999a.
PAGE Global Agricultural Extent version 2. The data are an update of the original PAGE Global Agricultural Extent, based on version 2 of the Global Land Cover Characteristics Dataset (GLCCD v2.0, USGS/EDC 2000). The methods used to create this dataset were the same as those employed to create the original PAGE Global Agricultural Extent.
Mask of the Global Extent of Agriculture. This dataset displays the global extent of agricultural areas as defined by the PAGE study. The other datasets made available on this site (e.g. tree cover, soil carbon, area free of soil constraints) only show values for areas within this agricultural extent.
PAGE Global Agroecosystems. These data characterize agroecosystems, defined as "a biological and natural resource system managed by humans for the primary purpose of producing food as well as other socially valuable nonfood products and environmental services."
Percentage Tree Cover within the Extent of Agriculture. This is a raster dataset that shows the proportion of land area within the PAGE agricultural extent that is occupied by "woody vegetation" (mature vegetation whose approximate height is greater than 5 meters).
Carbon Storage in Soils within the PAGE Agricultural Extent. The data give a global estimate of soil organic carbon storage in agricultural lands, calculated by applying Batjes' (1996 and 2000) soil organic carbon content values by soil type to the area share of each 5 x 5 minute cell of the Digital Soil Map of the World (FAO 1995).
Agriculture Share of Watershed. This dataset depicts agricultural area as a share of total watershed area. The share of each watershed that is agricultural was calculated by applying a weighted percentage to each PAGE agricultural land cover class.
Area Free of Soil Constraints. The data show the proportional area within the PAGE agricultural extent that is free from soil constraints. The area free of soil constraints is based on fertility capability classification (FCC) applied to FAO's Digital Soil Map of the World (1995).
Outline of Land and Water Area. These data are used to provide a boundary for land areas and facilitate the readability of maps.
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.
February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
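The per-capita rolling-average logic behind the hotspot queries above can also be reproduced locally. The sketch below is a hedged pandas version; the file name and column names (fips, date, cumulative_cases, population) are assumptions, so map them to the actual columns in the AP/Johns Hopkins table you export.

```python
# Sketch: 7-day rolling average of new cases per 100,000 people, by county.
# File and column names are assumptions about the exported table.
import pandas as pd

cases = pd.read_csv("jhu_county_cases.csv", parse_dates=["date"])
cases = cases.sort_values(["fips", "date"])

# Daily new cases from the cumulative counts, then a 7-day rolling mean per county.
cases["new_cases"] = cases.groupby("fips")["cumulative_cases"].diff().clip(lower=0)
cases["avg_7day"] = cases.groupby("fips")["new_cases"].transform(
    lambda s: s.rolling(7).mean()
)
cases["avg_7day_per_100k"] = cases["avg_7day"] / cases["population"] * 100_000

latest = cases[cases["date"] == cases["date"].max()]
print(latest.nlargest(10, "avg_7day_per_100k")[["fips", "avg_7day_per_100k"]])
```

As the note above warns, very small counties can dominate a per-capita ranking, so it is worth printing raw caseloads alongside the rate.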
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
An embeddable Datawrapper version of the map is available at https://datawrapper.dwcdn.net/nRyaf/15/.
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
This data should be credited to Johns Hopkins University COVID-19 tracking project
Product provided by Wappalyzer. Instant access to website technology stacks.
Lookup API Perform near-instant technology lookups with the Lookup API. Results are fetched from our comprehensive database of millions of websites. If we haven't seen a domain before, we'll index it immediately and report back within minutes.
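A lookup is a single HTTP request. The sketch below illustrates the general pattern; the endpoint URL, query parameter, and header name are assumptions for illustration only, so check them against Wappalyzer's API documentation before use.

```python
# Hedged sketch of a technology lookup over HTTP.
# Endpoint, parameter, and auth header below are assumptions; verify in the API docs.
import requests

API_KEY = "your-api-key"  # placeholder

resp = requests.get(
    "https://api.wappalyzer.com/v2/lookup/",   # assumed endpoint
    params={"urls": "https://example.com"},
    headers={"x-api-key": API_KEY},            # assumed auth header
    timeout=30,
)
resp.raise_for_status()
for site in resp.json():
    print(site.get("url"), [t.get("name") for t in site.get("technologies", [])])
```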
LinkedIn companies use datasets to access public company data for machine learning, ecosystem mapping, and strategic decisions. Popular use cases include competitive analysis, CRM enrichment, and lead generation.
Use our LinkedIn Companies Information dataset to access comprehensive data on companies worldwide, including business size, industry, employee profiles, and corporate activity. This dataset provides key company insights, organizational structure, and competitive landscape, tailored for market researchers, HR professionals, business analysts, and recruiters.
Leverage the LinkedIn Companies dataset to track company growth, analyze industry trends, and refine your recruitment strategies. By understanding company dynamics and employee movements, you can optimize sourcing efforts, enhance business development opportunities, and gain a strategic edge in your market. Stay informed and make data-backed decisions with this essential resource for understanding global company ecosystems.
This dataset is ideal for:
- Market Research: Identifying key trends and patterns across different industries and geographies.
- Business Development: Analyzing potential partners, competitors, or customers.
- Investment Analysis: Assessing investment potential based on company size, funding, and industries.
- Recruitment & Talent Analytics: Understanding the workforce size and specialties of various companies.
CUSTOM
Please review the respective licenses below:
World Ocean Atlas 2018 (WOA18) is a set of objectively analyzed (one degree grid and quarter degree grid) climatological fields of in situ temperature, salinity, dissolved oxygen, Apparent Oxygen Utilization (AOU), percent oxygen saturation, phosphate, silicate, and nitrate at standard depth levels for annual, seasonal, and monthly compositing periods for the World Ocean. Quarter degree fields are for temperature and salinity only. It also includes associated statistical fields of observed oceanographic profile data interpolated to standard depth levels on quarter degree, one degree, and five degree grids. Temperature and salinity fields are available for six decades (1955-1964, 1965-1974, 1975-1984, 1985-1994, 1995-2004, and 2005-2017), an average of all decades representing the period 1955-2017, as well as a thirty-year "climate normal" period (1981-2010). Oxygen fields (as well as AOU and percent oxygen saturation) are available using all quality-controlled data from 1960-2017, and nutrient fields use all quality-controlled data from the entire sampling period, 1878-2017. This accession is a product generated by the National Centers for Environmental Information's (NCEI) Ocean Climate Laboratory Team. The analyses are derived from the NCEI World Ocean Database 2018.
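The WOA18 fields are typically distributed as netCDF and can be sliced by compositing period and standard depth. The sketch below shows one way to pull the analyzed temperature field with xarray; the file name, variable name (`t_an`), and coordinate names are assumptions to confirm against the NCEI documentation for the file you use.

```python
# Sketch: read an annual WOA18 temperature file and select the analyzed mean at a depth.
# File, variable, and coordinate names are assumptions; inspect the dataset to confirm.
import xarray as xr

# decode_times=False avoids issues with WOA's non-standard climatology time axis.
ds = xr.open_dataset("woa18_decav_t00_01.nc", decode_times=False)
t_an = ds["t_an"]                                  # analyzed mean temperature field

surface = t_an.isel(time=0).sel(depth=0)           # annual mean at the surface level
print(float(surface.sel(lat=0.5, lon=-30.5)))      # example point on the 1° grid
```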
The Associated Press is sharing data from the COVID Impact Survey, which provides statistics about physical health, mental health, economic security and social dynamics related to the coronavirus pandemic in the United States.
Conducted by NORC at the University of Chicago for the Data Foundation, the probability-based survey provides estimates for the United States as a whole, as well as in 10 states (California, Colorado, Florida, Louisiana, Minnesota, Missouri, Montana, New York, Oregon and Texas) and eight metropolitan areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix and Pittsburgh).
The survey is designed to allow for an ongoing gauge of public perception, health and economic status to see what is shifting during the pandemic. When multiple sets of data are available, it will allow for the tracking of how issues ranging from COVID-19 symptoms to economic status change over time.
The survey is focused on three core areas of research:
Instead, use our queries linked below or statistical software such as R or SPSS to weight the data.
If you'd like to create a table to see how people nationally or in your state or city feel about a topic in the survey, use the survey questionnaire and codebook to match a question (the variable label) to a variable name. For instance, "How often have you felt lonely in the past 7 days?" is variable "soc5c".
Nationally: Go to this query and enter soc5c as the variable. Hit the blue Run Query button in the upper right hand corner.
Local or State: To find figures for that response in a specific state, go to this query and type in a state name and soc5c as the variable, and then hit the blue Run Query button in the upper right hand corner.
From these queries, you could write a sentence such as: "People in some states are less likely to report loneliness than others. For example, 66% of Louisianans report feeling lonely on none of the last seven days, compared with 52% of Californians. Nationally, 60% of people said they hadn't felt lonely."
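If you prefer to compute a weighted table locally rather than through the hosted queries, the idea is a weight-sum per response category. In the sketch below, the file name and the weight column (`national_weight`) are hypothetical; take the actual weight variable name from the codebook.

```python
# Sketch: weighted percentage of respondents per soc5c response category.
# File name and weight column are hypothetical; use the weight variable in the codebook.
import pandas as pd

svy = pd.read_csv("01_April_30_covid_impact_survey.csv")

weighted = svy.groupby("soc5c")["national_weight"].sum()
shares = (weighted / weighted.sum() * 100).round(1)
print(shares)  # weighted share of respondents giving each soc5c response
```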
The margin of error for the national and regional surveys is found in the attached methods statement. You will need the margin of error to determine if the comparisons are statistically significant. If the difference is:
The survey data will be provided under embargo in both comma-delimited and statistical formats.
Each set of survey data will be numbered and have the date the embargo lifts in front of it, in the format: 01_April_30_covid_impact_survey. The survey has been organized by the Data Foundation, a non-profit, non-partisan think tank, and is sponsored by the Federal Reserve Bank of Minneapolis and the Packard Foundation. It is conducted by NORC at the University of Chicago, a non-partisan research organization. (NORC is not an abbreviation; it is part of the organization's formal name.)
Data for the national estimates are collected using the AmeriSpeak Panel, NORC’s probability-based panel designed to be representative of the U.S. household population. Interviews are conducted with adults age 18 and over representing the 50 states and the District of Columbia. Panel members are randomly drawn from AmeriSpeak with a target of achieving 2,000 interviews in each survey. Invited panel members may complete the survey online or by telephone with an NORC telephone interviewer.
Once all the study data have been made final, an iterative raking process is used to adjust for any survey nonresponse as well as any noncoverage or under and oversampling resulting from the study specific sample design. Raking variables include age, gender, census division, race/ethnicity, education, and county groupings based on county level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2020 Current Population Survey. The count of COVID-19 deaths by county was obtained from USA Facts. The weighted data reflect the U.S. population of adults age 18 and over.
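For readers unfamiliar with raking, the sketch below illustrates the iterative adjustment in a few lines: weights are repeatedly rescaled so the weighted margins match population targets for each raking variable. This is a generic illustration of the technique, not NORC's weighting code, and the example margins in the usage comment are made up.

```python
# Generic sketch of iterative raking (iterative proportional fitting) on survey weights.
import pandas as pd

def rake(df, targets, weight_col="weight", iterations=25):
    """targets: {variable_name: {category: target_proportion}}.
    Targets must cover every category present in the data for each variable."""
    w = df[weight_col].astype(float).copy()
    for _ in range(iterations):
        for var, target in targets.items():
            current = w.groupby(df[var]).sum() / w.sum()       # weighted shares now
            factors = {k: target[k] / current[k] for k in target}
            w = w * df[var].map(factors)                        # scale toward targets
    return w

# Hypothetical usage with made-up margins:
# df["raked_weight"] = rake(df, {"gender": {"F": 0.51, "M": 0.49},
#                                "region": {"NE": 0.17, "MW": 0.21, "S": 0.38, "W": 0.24}})
```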
Data for the regional estimates are collected using a multi-mode address-based sampling (ABS) approach that allows residents of each area to complete the interview via web or with an NORC telephone interviewer. All sampled households are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Interviews are conducted with adults age 18 and over with a target of achieving 400 interviews in each region in each survey. Additional details on the survey methodology and the survey questionnaire are attached below or can be found at https://www.covid-impact.org.
Results should be credited to the COVID Impact Survey, conducted by NORC at the University of Chicago for the Data Foundation.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.
Key Features:
✅ 214M+ Job Postings Tracked – Data sourced from 92 million company websites worldwide.
✅ 7.1M+ Active Job Openings – Updated in real-time to reflect hiring demand.
✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.
Primary Attributes:
Job Metadata:
Salary Data (salary_data)
Occupational Data (onet_data) (object, nullable)
Additional Attributes:
📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.
PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset
The Global Historical Climatology Network - Daily (GHCN-Daily/GHCNd) dataset integrates daily climate observations from approximately 30 different data sources. Version 3 was released in September 2012 with the addition of data from two additional station networks. Changes to the processing system associated with the version 3 release also allowed for updates to occur 7 days a week rather than only on most weekdays. Version 3 contains station-based measurements from well over 90,000 land-based stations worldwide, about two thirds of which are for precipitation measurement only. Other meteorological elements include, but are not limited to, daily maximum and minimum temperature, temperature at the time of observation, snowfall and snow depth. Over 25,000 stations are regularly updated with observations from within roughly the last month. The dataset is also routinely reconstructed (usually every week) from its roughly 30 data sources to ensure that GHCNd is generally in sync with its growing list of constituent sources. During this process, quality assurance checks are applied to the full dataset. Where possible, GHCNd station data are also updated daily from a variety of data streams. Station values for each daily update also undergo a suite of quality checks.
International Data & Economic Analysis (IDEA) is USAID's comprehensive source of economic and social data and analysis. IDEA brings together over 12,000 data series from over 125 sources into one location for easy access by USAID and its partners through the USAID public website. The data are broken down by countries, years and the following sectors: Economy, Country Ratings and Rankings, Trade, Development Assistance, Education, Health, Population, and Natural Resources. IDEA regularly updates the database as new data become available. Examples of IDEA sources include the Demographic and Health Surveys, STATcompiler; UN Food and Agriculture Organization, Food Price Index; IMF, Direction of Trade Statistics; Millennium Challenge Corporation; and World Bank, World Development Indicators. The database can be queried by navigating to the site displayed in the Home Page field below.
This version has been superseded by a newer version. It is highly recommended for users to access the current version. Users should only access this superseded version for special cases, such as reproducing studies. If necessary, this version can be accessed by contacting NCEI. The NOAA Global Surface Temperature Dataset (NOAAGlobalTemp) is a blended product from two independent analysis products: the Extended Reconstructed Sea Surface Temperature (ERSST) analysis and the land surface temperature (LST) analysis using the Global Historical Climatology Network (GHCN) temperature database. The data is merged into a monthly global surface temperature dataset dating back from 1880 to the present. The monthly product output is in gridded (5 degree x 5 degree) and time series formats. The product is used in climate monitoring assessments of near-surface temperatures on a global scale. The changes from version 4 to version 5 include an update to the primary input datasets: ERSST version 5 (updated from v4), and GHCN-M version 4 (updated from v3.3.3). Version 5 updates also include a new netCDF file format with CF conventions. This dataset is formerly known as Merged Land-Ocean Surface Temperature (MLOST).
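The gridded monthly product can be examined directly with xarray. The sketch below computes an area-weighted global mean anomaly series from the 5° x 5° grid; the file name and the anomaly variable name (`anom`) are assumptions, so inspect the dataset (e.g. print(ds)) to confirm them.

```python
# Sketch: area-weighted global mean anomaly from a NOAAGlobalTemp-style netCDF file.
# File and variable names are assumptions; inspect the dataset to confirm.
import numpy as np
import xarray as xr

ds = xr.open_dataset("NOAAGlobalTemp_v5_monthly.nc")   # hypothetical file name
anom = ds["anom"]                                       # gridded temperature anomalies

# Weight each grid cell by the cosine of its latitude before averaging.
weights = np.cos(np.deg2rad(ds["lat"]))
global_mean = anom.weighted(weights).mean(dim=("lat", "lon"))
print(global_mean.to_series().tail())
```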
This dataset contains the following files for California influenza surveillance data: 1) Outpatient Influenza-like Illness Surveillance Data by Region and Influenza Season from volunteer sentinel providers; 2) Clinical Sentinel Laboratory Influenza and Other Respiratory Virus Surveillance Data by Region and Influenza Season from volunteer sentinel laboratories; and 3) Public Health Laboratory Influenza Respiratory Virus Surveillance Data by Region and Influenza Season from California public health laboratories. The Immunization Branch at the California Department of Public Health (CDPH) collects, compiles and analyzes information on influenza activity year-round in California and produces a weekly influenza surveillance report during October through May. The California influenza surveillance system is a collaborative effort between CDPH and its many partners at local health departments, public health and clinical laboratories, vital statistics offices, healthcare providers, clinics, emergency departments, and the Centers for Disease Control and Prevention (CDC). California data are also included in the CDC weekly influenza surveillance report, FluView, and help contribute to the national picture of Influenza activity in the United States. The information collected allows CDPH and CDC to: 1) find out when and where influenza activity is occurring; 2) track influenza-related illness; 3) determine what influenza viruses are circulating; 4) detect changes in influenza viruses; and 5) measure the impact influenza is having on hospitalizations and deaths.
Source: https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/Immunization/Influenza.aspx
Last updated at https://data.chhs.ca.gov : 2021-06-22
License: https://data.chhs.ca.gov/pages/terms
World Weather Records (WWR) is an archived publication and digital data set. WWR is meteorological data from locations around the world. Through most of its history, WWR has been a publication, first published in 1927. Data includes monthly mean values of pressure, temperature, precipitation, and where available, station metadata notes documenting observation practices and station configurations. In recent years, data were supplied by National Meteorological Services of various countries, many of which became members of the World Meteorological Organization (WMO). The First Issue included data from earliest records available at that time up to 1920. Data have been collected for periods 1921-30 (2nd Series), 1931-40 (3rd Series), 1941-50 (4th Series), 1951-60 (5th Series), 1961-70 (6th Series), 1971-80 (7th Series), 1981-90 (8th Series), 1991-2000 (9th Series), and 2001-2011 (10th Series). The most recent Series 11 continues, insofar as possible, the record of monthly mean values of station pressure, sea-level pressure, temperature, and monthly total precipitation for stations listed in previous volumes. In addition to these parameters, mean monthly maximum and minimum temperatures have been collected for many stations and are archived in digital files by NCEI. New stations have also been included. In contrast to previous series, the 11th Series is available for the partial decade, so as to limit waiting period for new records. It begins in 2010 and is updated yearly, extending into the entire decade.