PredictLeads Job Openings Data provides real-time hiring insights sourced directly from company websites, ensuring the highest level of accuracy and freshness. Unlike job boards that rely on aggregated listings, our dataset delivers unmatched granularity on job postings, salary trends, and workforce demand - making it a powerful tool for HR, talent acquisition, and market analysis.
Use Cases:
✅ Job Boards Enhancement – Improve job listings with high-quality postings.
✅ HR Consulting – Analyze hiring trends to guide workforce planning strategies.
✅ Employment Analytics – Track job market shifts, salary benchmarks, and demand for skills.
✅ HR Operations – Optimize recruitment pipelines with direct employer-sourced data.
✅ Competitive Intelligence – Monitor hiring activities of competitors for strategic insights.
Key API Attributes:
API example: https://docs.predictleads.com/v3/api_endpoints/job_openings_dataset
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Department actively seeks to expand the range of datasets it shares through the Government’s Open Data portal. In this regard, it is the first government department to publish gender-based statistics. This data is drawn from the Department’s HR database system and provides, on a headcount basis, a female-male breakdown at grade and grade-equivalent level. These statistics will be published annually, using the position at the end of December. For baseline and comparison purposes, and to assist the reader in forming a picture of how the Department’s gender balance has evolved over the last 20+ years, we have also provided these reports as at December 2000, 2010, 2020, 2021, and 2022. The Department is unusual in the broad range of grades of its staff. At the end of 2022 there were 75 grades spread across three distinct grade streams: General Service, Professional & Technical, and Industrial. We have uploaded a table which provides a breakdown of these grades within their respective streams and which, in the case of the Professional & Technical grades, also shows their General Service grade equivalent. By way of illustration of the diversity of the grades in the Department, our headcount at the end of December 2022 was 1,604, of which 905 were General Service staff serving across 18 different grades and representing 56.42% of our workforce. The Professional & Technical headcount was 533. These staff were spread across 44 grades and accounted for 33.23% of our staffing complement at that time. We had 166 Industrial staff at the end of December. These staff represented 10.35% of our workforce and were spread across 13 different grades.
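The percentages quoted above follow directly from the headcounts. A minimal sketch of the arithmetic, using only the figures given in the text (the variable names are illustrative):

```python
# Headcounts by grade stream at the end of December 2022, as quoted in the text.
headcount = {
    "General Service": 905,
    "Professional & Technical": 533,
    "Industrial": 166,
}

total = sum(headcount.values())

# Share of each stream in the total workforce, rounded to two decimals.
shares = {stream: round(100 * n / total, 2) for stream, n in headcount.items()}

for stream, pct in shares.items():
    print(f"{stream}: {headcount[stream]} of {total} = {pct}%")
```

The three streams sum to the stated headcount of 1,604, and the rounded shares match the 56.42%, 33.23% and 10.35% reported.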
Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior (DOI) under GPRA requires NPS to track the percentage of its managed surface waters that meet Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages, and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GPRA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the author gender dataset (as a comma-delimited .csv file) originally created in association with the paper entitled 'The Impact of Gender on Conference Authorship in Audio Engineering: Analysis Using a New Data Collection Method', but since extended to include conferences up to the end of 2019. The original dataset is available at: https://doi.org/10.5281/zenodo.1249693. Please cite both the paper and the relevant dataset if used. Visualisation is available at: http://tibbakoi.github.io/aesgender.
The dataset was produced using a novel method which used self-identified pronouns, therefore allowing for as many groups as necessary to describe the population.
A list of authors was generated from conference proceedings.
An email was sent to each author to acquire their pronoun.
If no email was available/no response was received, a pronoun was acquired from a biography.
If no biography was available, a pronoun was inferred from traditional gender markers and gender presentation.
If no gender marker/photograph was available, the entry was labelled as 'Information Unavailable'. For brevity, the label 'Unknown' is used in the paper.
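The fallback order described above can be sketched as a small function. This is only an illustration of the cascade; the function and argument names are hypothetical and do not come from the dataset or the paper.

```python
from typing import Optional


def resolve_pronoun(email_reply: Optional[str],
                    biography: Optional[str],
                    inferred: Optional[str]) -> str:
    """Return the pronoun from the first available source.

    Sources are tried in the order given in the text: self-identified
    reply by email, then biography, then inference from gender markers
    and presentation. Empty or missing sources are skipped.
    """
    for candidate in (email_reply, biography, inferred):
        if candidate:
            return candidate
    # No source available: label used in the dataset.
    return "Information Unavailable"


print(resolve_pronoun(None, "she", "he"))
print(resolve_pronoun(None, None, None))
```

A self-identified reply always takes precedence, so a later, less reliable source never overrides it.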
The columns in the dataset are as follows:
ID: unique identifier of entry
Pronoun: pronoun of entry
Position (abs): numerical absolute position within author list for entry
Position (relative): relative position within author list for entry (either First, Last, or Middle)
Single/multi-author: whether the publication for that entry has a single author or has multiple authors (single author publications are excluded from author position analysis)
Conference: Full conference name of entry
Topic: Topic of conference of entry, taken from conference name
Year: Year of conference of entry
Type: Type of publication for that entry as listed on the online conference proceedings
Grouped Type: Grouping of publication types for that entry for easier analysis due to inconsistencies in online conference proceedings (groups are: workshop, poster, paper, panel, keynote, invited speaker, invited paper, demo)
Inc. for author pos?: True/False as to whether to include the entry for analysis over author position (included types are: paper, invited paper, poster (all with multiple authors) as these have meaningful author orders)
Inc. for single/multi-author?: True/False as to whether to include the entry for analysis over single/multi author (included types are: paper, invited paper, poster as these have meaningful author orders)
Invited paper status: Grouping of the types to allow statistical analysis over invited vs non-invited types (invited types are: invited speaker, invited paper, keynote, panel. Non-invited types are: poster, paper, demo, workshop)
NB: Some grouping of the data is required as online conference proceedings are not always consistent (Column 10). Some labelling of the data is required to determine which entries to include in certain types of analysis (Columns 11-13).
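As a rough illustration of how the inclusion flags might be used, the sketch below counts pronouns by relative author position for entries flagged for author-position analysis. The sample rows are hypothetical and are not taken from the dataset, and the encoding of the flag as the text `True`/`False` is an assumption that should be checked against the actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; not real entries from the dataset.
SAMPLE = """ID,Pronoun,Position (relative),Grouped Type,Inc. for author pos?
1,she,First,paper,True
2,he,Last,paper,True
3,they,Middle,poster,True
4,he,First,keynote,False
"""


def count_by_position(rows):
    """Count (relative position, pronoun) pairs for included entries only."""
    counts = Counter()
    for row in rows:
        # Assumption: the flag is stored as the text "True" or "False".
        if row["Inc. for author pos?"].strip().lower() == "true":
            counts[(row["Position (relative)"], row["Pronoun"])] += 1
    return counts


counts = count_by_position(csv.DictReader(io.StringIO(SAMPLE)))
for (position, pronoun), n in sorted(counts.items()):
    print(position, pronoun, n)
```

To use the real file, `io.StringIO(SAMPLE)` would be replaced by an opened handle to the downloaded .csv.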
This dataset is distributed under the Creative Commons Attribution 4.0 licence in the hope that it will prove useful, but with no warranty, including any implied warranty of merchantability or fitness for a particular purpose.
Dataset curated by: Kat Young and Michael Lovedee-Turner, formerly at the AudioLab, Dept. of Electronic Engineering, University of York. Contact: kathryn.ae.young@gmail.com
The GAR15 global exposure database is based on a top-down approach in which national-level statistical information, including socio-economic data, building types, and capital stock, is transposed onto grids of 5x5 km or 1x1 km, using the geographic distribution of population and gross domestic product (GDP) as proxies.
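The idea of a top-down disaggregation can be sketched as a proportional allocation: a national total is split across grid cells according to each cell's share of a proxy such as population. This is a simplified illustration under that assumption, not the GAR15 procedure itself; the function name and the sample figures are hypothetical.

```python
def disaggregate(national_total, proxy_by_cell):
    """Distribute a national total over grid cells in proportion to a proxy.

    national_total: value known only at national level (e.g. capital stock).
    proxy_by_cell:  mapping of cell id -> proxy value (e.g. population).
    """
    proxy_sum = sum(proxy_by_cell.values())
    if proxy_sum <= 0:
        raise ValueError("the proxy must have a positive total")
    return {
        cell: national_total * value / proxy_sum
        for cell, value in proxy_by_cell.items()
    }


# Hypothetical example: 1000 units of capital stock, three cells.
population = {"cell_a": 50, "cell_b": 30, "cell_c": 20}
allocated = disaggregate(1000.0, population)
print(allocated)
```

By construction the cell values sum back to the national total, which is the defining property of a top-down approach.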
This dataset was created to pilot techniques for creating synthetic data from datasets containing sensitive and protected information in the local government context. Synthetic data generation replaces actual data with representative data generated from statistical models; this preserves the key data properties that allow insights to be drawn from the data while protecting the privacy of the people included in the data. We invite you to read the Understanding Synthetic Data white paper for a concise introduction to synthetic data.
This effort was a collaboration of the Urban Institute, Allegheny County’s Department of Human Services (DHS) and CountyStat, and the University of Pittsburgh’s Western Pennsylvania Regional Data Center.
The source data for this project consisted of 1) month-by-month records of services included in Allegheny County's data warehouse and 2) demographic data about the individuals who received the services. As the County’s data warehouse combines this service and client data, this data is referred to as “Integrated Services data”. Read more about the data warehouse and the kinds of services it includes here.
Synthetic data are typically generated from probability distributions or models identified as being representative of the confidential data. For this dataset, a model of the Integrated Services data was used to generate multiple versions of the synthetic dataset. These different candidate datasets were evaluated to select for publication the dataset version that best balances utility and privacy. For high-level information about this evaluation, see the Synthetic Data User Guide.
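To make the general idea concrete, the sketch below estimates an empirical distribution from a single categorical column and draws synthetic values from it. It is a deliberately simplified illustration of sampling from a fitted distribution, not the model used for this dataset; the function names and service labels are hypothetical.

```python
import random
from collections import Counter


def fit_categorical(values):
    """Estimate category probabilities from observed (confidential) values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def sample_synthetic(distribution, size, seed=0):
    """Draw synthetic values from the fitted distribution."""
    rng = random.Random(seed)
    categories = list(distribution)
    weights = [distribution[c] for c in categories]
    return rng.choices(categories, weights=weights, k=size)


# Hypothetical confidential column.
confidential = ["housing", "housing", "mental_health", "aging", "housing"]
distribution = fit_categorical(confidential)
synthetic = sample_synthetic(distribution, size=10)
print(distribution)
print(synthetic)
```

A real synthesis also has to preserve relationships between variables and be evaluated for disclosure risk, which this single-column sketch does not attempt.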
For more information about the creation of the synthetic version of this data, see the technical brief for this project, which discusses the technical decision making and modeling process in more detail.
This disaggregated synthetic data allows for many analyses that are not possible with aggregate data (summary statistics). Broadly, this synthetic version of this data could be analyzed to better understand the usage of human services by people in Allegheny County, including the interplay in the usage of multiple services and demographic information about clients.
Some amount of deviation from the original data is inherent to the synthetic data generation process. Specific examples of limitations (including undercounts and overcounts for the usage of different services) are given in the Synthetic Data User Guide and the technical report describing this dataset's creation.
Please reach out to this dataset's data steward (listed below) to let us know how you are using this data and if you found it to be helpful. Please also provide any feedback on how to make this dataset more applicable to your work, any suggestions of future synthetic datasets, or any additional information that would make this more useful. Also, please copy wprdc@pitt.edu on any such feedback (as the WPRDC always loves to hear about how people use the data that they publish and how the data could be improved).
1) A high-level overview of synthetic data generation as a method for protecting privacy can be found in the Understanding Synthetic Data white paper.
2) The Synthetic Data User Guide provides high-level information to help users understand the motivation, evaluation process, and limitations of the synthetic version of Allegheny County DHS's Human Services data published here.
3) Generating a Fully Synthetic Human Services Dataset: A Technical Report on Synthesis and Evaluation Methodologies describes the full technical methodology used for generating the synthetic data, evaluating the various options, and selecting the final candidate for publication.
4) The WPRDC also hosts the Allegheny County Human Services Community Profiles dataset, which provides annual updates on human-services usage, aggregated by neighborhood/municipality. That data can be explored using the County's Human Services Community Profile web site.
The dataset provides noise data to facilitate the tracking of trends in transportation-related noise. It includes results from simplified noise modeling methods and should not be used to evaluate noise levels in individual locations. See the documentation for a full description of methodologies and assumptions: https://doi.org/10.21949/1519111
The 2016 National Transportation Noise Map dataset was modeled from 2016 transportation mode input data and is current as of October 2020. It is published by the Bureau of Transportation Statistics (BTS) and is part of the U.S. Department of Transportation (USDOT)/BTS National Transportation Atlas Database (NTAD). See https://www.bts.gov/geospatial/national-transportation-noise-map for downloads and more information about these datasets. For web services of these data, navigate to https://geo.dot.gov/server/rest/services/Hosted and search for service names beginning with "Noise."
Data within the National Transportation Noise Map represent potential noise levels across the nation for an average annual day for the specified year. These data are intended to facilitate the tracking of trends in transportation-related noise by mode collectively over time and should not be used to evaluate noise levels in individual locations and/or at specific times. The dataset is developed using a 24-hr equivalent A-weighted sound level (denoted by LAeq) noise metric. The results represent the approximate average noise energy due to transportation noise sources over a 24-hour period at the receptor locations where noise is computed. Layers include Aviation and Road Noise for the Lower 48 States as well as Alaska and Hawaii. The full listing can be found below.
2016 National Transportation Noise
Alaska: Alaska Aviation Noise; Alaska Road and Aviation Noise; Alaska Road Noise
Lower 48 States (CONUS): Lower 48 States (CONUS) Aviation Noise; Lower 48 States (CONUS) Road and Aviation Noise; Lower 48 States (CONUS) Road Noise
Hawaii: Hawaii Aviation Noise; Hawaii Road and Aviation Noise; Hawaii Road Noise
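For readers unfamiliar with the LAeq metric, an equivalent sound level is an energy average rather than an arithmetic average of decibel values. For N equal-length intervals with A-weighted levels L_i, it is 10·log10((1/N)·Σ 10^(L_i/10)). The sketch below shows that general formula with hypothetical hourly levels; it is not the modeling method used to produce the noise map.

```python
import math


def laeq(levels_db):
    """Equivalent continuous sound level for equal-length intervals.

    Each level is converted to relative energy, the energies are averaged,
    and the mean energy is converted back to decibels.
    """
    if not levels_db:
        raise ValueError("at least one level is required")
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


# Hypothetical example: a constant level gives back the same value.
print(laeq([60.0] * 24))
# Louder intervals dominate the energy average.
print(laeq([60.0, 70.0]))
```

Because the averaging is done on energy, a few loud hours raise the 24-hour value more than an arithmetic mean of the decibel values would suggest.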
Precursor to the Enterprise Human Resources Integration-Statistical Data Mart (EHRI-SDM). It contains information about employees and the history of their personnel actions. The SDM data is extracted and placed into the CPDF format to allow longitudinal analysis over a longer term than is available in EHRI-SDM. It serves as the source for research data going back to 1973.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the main purposes of the ICARIA project was to develop coherent, reliable and usable downscaled climate projections from the latest CMIP6 generation, in order to provide a basis for efficient support to climate adaptation and decision-making by the relevant stakeholders, and to support the adaptation of critical assets within the project. These projections were also produced with the aim of making them freely available for use in subsequent studies and, hence, fostering adaptation to climate change in more areas. Accordingly, ICARIA’s climate information is based on CMIP6 models and incorporates the current SSPs in its workflow. The high-resolution future climate projections presented here constitute a unique dataset. These models will provide the scenarios to be considered within the Risk Assessment and in the design and development of all adaptation measures resulting from ICARIA.
A brief summary of the methodology follows:
The statistical downscaling methodology applied in ICARIA by FIC, named FICLIMA (Ribalaygua et al. 2013), is a two-step analogue/regression statistical method which has been used in national and international projects with good verification results (e.g. Monjo et al. 2016). The first step is common to all simulated climate variables and is based on an analogue stratification (Zorita et al. 1993). The analogue method rests on the hypothesis that ‘analogue’ atmospheric patterns (predictors) should cause analogue local effects (predictands); accordingly, the n days most similar to the day to be downscaled were selected. The similarity between any two days was measured over three nested synoptic windows (with different weights) and four large-scale fields, using a pseudo-Euclidean distance between the large-scale fields used as predictors. For each predictor, the weighted Euclidean distance was calculated and standardised by substituting it with the closest percentile of a reference population of weighted Euclidean distances for that predictor. This method reproduces nonlinear relationships between predictors and predictands well, but it cannot simulate values outside the range of observed values. To overcome this limitation and obtain a better simulation, a second step was required.
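The core of the analogue step, selecting the n archive days closest to a target day under a weighted Euclidean distance, can be sketched as follows. This is a simplified illustration with a single predictor field; the nested synoptic windows and the percentile standardisation described above are omitted, and all names and values are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two predictor vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def most_analogous_days(target, archive, weights, n):
    """Return the ids of the n archive days closest to the target day.

    archive: mapping of day id -> predictor vector (same length as target).
    """
    ranked = sorted(
        archive,
        key=lambda day: weighted_distance(target, archive[day], weights),
    )
    return ranked[:n]


# Hypothetical archive of three past days with two predictor values each.
archive = {"d1": [1.0, 1.0], "d2": [5.0, 5.0], "d3": [1.5, 0.5]}
analogues = most_analogous_days([1.0, 1.0], archive, weights=[1.0, 1.0], n=2)
print(analogues)
```

Because the downscaled value is built from observed analogue days, the result cannot leave the range of past observations, which is the limitation the second step addresses.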
For this second step, the procedures applied depend on the variable of interest. To determine the temperature, multiple linear regression analysis for the selected number of most analogous days was performed for each station and for each problem day. From a group of potential predictors, the linear regression selected those with the highest correlation, using a forward and backward stepwise approach.
For precipitation, a group of m problem days (we use all the days of a month) is downscaled together. For each problem day we obtain a “preliminary precipitation amount” by averaging the rain amounts of its n most analogous days, so that the m problem days can be sorted from the highest to the lowest “preliminary precipitation amount”. To assign the final precipitation amounts, all amounts from the m×n analogous days are sorted and clustered into m groups. The resulting quantities are then assigned, in order, to the m days previously sorted by “preliminary precipitation amount”.
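The precipitation procedure can be sketched as below. Two points are assumptions made for illustration and are not stated in the text: the m groups are taken as consecutive blocks of n sorted amounts, and the quantity assigned from each group is its mean. The function name and sample values are hypothetical.

```python
def assign_precipitation(analogue_rain):
    """Assign final daily precipitation to m problem days.

    analogue_rain: list of m lists, each holding the rain amounts of the
    n most analogous days of one problem day.
    """
    m = len(analogue_rain)
    if m == 0:
        return []
    n = len(analogue_rain[0])
    if n == 0 or any(len(day) != n for day in analogue_rain):
        raise ValueError("every problem day needs the same number n > 0 of analogues")

    # Preliminary amount: mean of each day's analogues.
    preliminary = [sum(day) / n for day in analogue_rain]
    # Problem days ordered from wettest to driest preliminary amount.
    order = sorted(range(m), key=lambda i: preliminary[i], reverse=True)

    # Pool all m*n amounts, sort them, and split into m blocks of n (assumption).
    pooled = sorted((x for day in analogue_rain for x in day), reverse=True)
    group_means = [sum(pooled[k * n:(k + 1) * n]) / n for k in range(m)]

    # The wettest group goes to the day with the highest preliminary amount.
    final = [0.0] * m
    for rank, day_index in enumerate(order):
        final[day_index] = group_means[rank]
    return final


rain = [[0.0, 0.0, 2.0], [10.0, 4.0, 1.0], [0.0, 0.0, 0.0]]
final_amounts = assign_precipitation(rain)
print(final_amounts)
```

Under these assumptions the monthly total is preserved, while the pooling sharpens the contrast between wet and dry days compared with plain analogue averaging, which would otherwise spread small amounts over too many days.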
For wind or relative humidity, the second step is a transfer function between the observed probability distribution and the simulated one, using the averaged values from the n = 30 analogous days. In particular, a parametric bias correction was applied to the time series obtained from the analogue stratification (first step). To estimate the improvement due to this procedure, the bias correction was also applied to the direct model outputs.
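A transfer function between two distributions can be illustrated with empirical quantile mapping: a simulated value is replaced by the observed value that has the same rank in its own distribution. Note that this is a swapped-in, non-parametric technique chosen for brevity; the text describes a parametric bias correction, whose exact form is not given here. Names and values are hypothetical.

```python
import bisect


def quantile_map(value, simulated, observed):
    """Map a simulated value onto the observed distribution by rank.

    The non-exceedance rank of `value` within the simulated sample is
    located, and the observed value at the same relative rank is returned
    (nearest-rank rule, clamped to the observed range).
    """
    sim = sorted(simulated)
    obs = sorted(observed)
    if not sim or not obs:
        raise ValueError("both samples must be non-empty")
    rank = bisect.bisect_right(sim, value)
    # Ceiling of rank * len(obs) / len(sim), computed with integers.
    index = (rank * len(obs) + len(sim) - 1) // len(sim) - 1
    index = min(len(obs) - 1, max(0, index))
    return obs[index]


simulated = [1.0, 2.0, 3.0, 4.0]
observed = [2.0, 4.0, 6.0, 8.0]
print(quantile_map(3.0, simulated, observed))
```

In this toy case the simulated series is biased low by a factor of two, and the mapping recovers the observed scale.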
This second step, carried out at a daily scale with a thorough internal verification procedure, is essential and is the main differentiating feature of the FICLIMA method. The verification extends beyond mean values to include extremes and covers all time scales, including daily intervals. It shows whether the method correctly simulates changes from one day to the next, which would indicate that it effectively captures the underlying physical connections between predictors and predictands. These physical links remain relatively consistent even in the face of climate change (as opposed to purely empirical relationships, which might shift). In essence, this approach theoretically addresses the primary challenge in statistical downscaling, known as the non-stationarity problem: whether predictor/predictand relationships established in the past will persist in the future.
The dataset shared here includes information for the three case studies tackled in ICARIA: Barcelona Metropolitan Area (AMB), Salzburg Region (SLZ), and South Aegean Region (SAR). The information provided covers data and outcomes from 10 models belonging to CMIP6. Each model has a historical archive, from 01/01/1950 to 31/12/2014, and 4 future scenarios (ssp126, ssp245, ssp370 and ssp585) ranging from 01/01/2015 to 31/12/2100. The selected models are listed in the following table:
Table 1. Information about the 10 climate models belonging to the sixth phase of the Coupled Model Intercomparison Project (CMIP6), corresponding to the IPCC AR6. Models were retrieved from the Earth System Grid Federation (ESGF) portal in support of the Program for Climate Model Diagnosis and Intercomparison (PCMDI).
| CMIP6 models | Resolution | Responsible centre | References |
| --- | --- | --- | --- |
| ACCESS-CM2 | 1.875° x 1.250° | Australian Community Climate and Earth System Simulator (ACCESS), Australia | Bi, D. et al. (2020) |
| BCC-CSM2-MR | 1.125° x 1.121° | Beijing Climate Center (BCC), China Meteorological Administration, China | Wu, T. et al. (2019) |
| CanESM5 | 2.812° x 2.790° | Canadian Centre for Climate Modeling and Analysis (CC-CMA), Canada | Swart, N.C. et al. (2019) |
| CMCC-ESM2 | 1.000° x 1.000° | Centro Mediterraneo sui Cambiamenti Climatici (CMCC) | Cherchi et al. (2018) |
| CNRM-ESM2-1 | 1.406° x 1.401° | CNRM (Centre National de Recherches Meteorologiques), Meteo-France, France | Seferian, R. (2019) |
| EC-EARTH3 | 0.703° x 0.702° | EC-EARTH Consortium | EC-Earth Consortium (2019) |
| MPI-ESM1-2-HR | 0.938° x 0.935° | Max-Planck Institute for Meteorology (MPI-M), Germany | Müller et al. (2018) |
| MRI-ESM2-0 | 1.125° x 1.121° | Meteorological Research Institute (MRI), Japan | Yukimoto, S. et al. (2019) |
| NorESM2-MM | 1.250° x 0.942° | Norwegian Climate Centre (NCC), Norway | Bentsen, M. et al. (2019) |
| UKESM1-0-LL | 1.875° x 1.250° | UK Met Office, Hadley Centre, United Kingdom | Good, P. et al. (2019) |
The results shared here were developed for each of the observational locations retrieved to run the statistical downscaling. Both the observational datasets and the future climate change projections are provided in TXT format for each of these locations. Observations include the main variables retained after quality and homogeneity control; the climate projections, together with the extreme indicators, cover each of the 10 models and the 4 Tier 1 SSPs, with data until the year 2100. The variables treated are the main climate variables and their related extreme indicators as defined during the ICARIA project. The table below summarises all the variables and indicators used to develop the projections.
Table 2. Summary of selected thermal and precipitation indicators, grouped by the main hazards they inform. “nd” = number of days; “ne” = number of events.
| Index/name | Short description | Source | Variable | Units | Threshold |
| --- | --- | --- | --- | --- | --- |
| Thermal indicators | | | | | |
| TX90 / TX10 | Warm/cold days | Zhang et al. (2011) | TX | nd | 90 / 10% |
| HD | Heat day | ICARIA | TX | nd | 30 °C |
| EHD | Extreme heat day | ICARIA | TX | nd | 35 °C |
| TR | Tropical nights | Zhang et al. (2011) | TN | nd | 20 °C |
| EQ | Equatorial nights | AEMet 2020, ICARIA | TN | nd | 25 °C |
| IN | Infernal nights | ICARIA | TN | nd | 30 °C |
| FD | Frost days | Zhang et al. (2011) | TN | nd | < 0 °C |
| Max consec | Max spell length for above thermal indicators | ICARIA | - | nd | - |
| No. events | Number of above thermal indicators events | ICARIA | - | ne | 3 days |
| TXm | Mean maximum temperatures | ICARIA | TX | °C | - |
| TNm | Mean minimum temperatures | ICARIA | TN | °C | - |
| TM | Mean temperatures | ICARIA | TA | °C | - |
| HWle | Heatwave length | ICARIA | TX | nd | 3d > 95% TX |
| HWim/HWix | Mean and maximum heatwave intensity | ICARIA | TX | °C | 3d > 95% TX |
| HWf | Heatwave frequency | ICARIA | TX | ne | 3d > 95% TX |
| HWd | Heatwave days | ICARIA | TX | nd | 3d > 95% TX |
| HI - P90 | Heat Index (percentile 90) | NWS (1994) | TX, RH | °C | TX > 27 °C, RH > 40% |
| UTCI | Universal Thermal Climate Index | Bröde et al. (2012) | TA, RH, W | - | - |
| UHI | Urban heat island (BCN), annual and seasonal | AMB, Metrobs 2015 | T | °C | TM1-TM2 > 0 °C |
| Precipitation indicators | | | | | |
| R20 | Number of heavy precipitation days | Zhang et al. (2011) | P | nd | 20 mm |
| R50, R100 | Days with extreme heavy rain | AMB et al. (2017) | P | nd | 50 mm, 100 mm |
| Ra | Yearly and seasonal rainfall relative change | ICARIA | P | mm | ≥ 0.1 mm |
| IDF - CCF | IDF Curves - Climate Change Factor | Arnbjerg-Nielsen (2012) | P | - | ≥ 0.1 mm |
| Forest fire indicators | | | | | |
| Mean FWI | Mean Canadian FWI in fire season | Stock, B.J. et al. (1989) | RHn, TX, P, W | - | June-September |
| Very High FWI | Very High Canadian FWI | Stock, B.J. et al. | | | |
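Several of the thermal indicators above are simple threshold counts on a daily series, and the heatwave indicators rely on spells of at least three consecutive days above a threshold. The sketch below illustrates both with hypothetical daily values. Treating the thresholds as strict inequalities and passing the heatwave threshold in directly (rather than computing the 95th percentile of TX) are assumptions made for illustration.

```python
def count_days_above(series, threshold):
    """Number of days (nd) with a value strictly above the threshold."""
    return sum(1 for value in series if value > threshold)


def spells_above(series, threshold, min_length=3):
    """Lengths of runs of at least min_length consecutive days above threshold."""
    spells = []
    run = 0
    for value in series:
        if value > threshold:
            run += 1
        else:
            if run >= min_length:
                spells.append(run)
            run = 0
    # Close a spell that lasts until the end of the series.
    if run >= min_length:
        spells.append(run)
    return spells


# Hypothetical daily maximum (TX) and minimum (TN) temperatures in °C.
tx = [29, 31, 36, 33, 28, 31, 32, 27]
tn = [19, 21, 22, 20, 26]

heat_days = count_days_above(tx, 30)          # HD
extreme_heat_days = count_days_above(tx, 35)  # EHD
tropical_nights = count_days_above(tn, 20)    # TR
spells = spells_above(tx, 30)                 # spells of 3+ days above 30 °C

print(heat_days, extreme_heat_days, tropical_nights)
print(len(spells), sum(spells), max(spells, default=0))
```

From the list of spell lengths, the number of events (ne), the total days in spells (nd) and the maximum spell length follow as its length, sum and maximum.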
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Numbers of potential redundancies from HR1 forms and employers submitting them, by region and industry, Great Britain. These are official statistics in development.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Traffic-related data collected by the Boston Transportation Department, as well as other City departments and State agencies. Various types of counts: Turning Movement Counts, Automated Traffic Recordings, Pedestrian Counts, Delay Studies, and Gap Studies.
~_Turning Movement Counts (TMC)_ present the number of motor vehicles, pedestrians, and cyclists passing through the particular intersection. Specific movements and crossings are recorded for all street approaches involved with the intersection. This data is used in traffic signal retiming programs and for signal requests. Counts are typically conducted for 2-, 4-, 11-, and 12-Hr periods.
~_Automated Traffic Recordings (ATR)_ record the volume of motor vehicles traveling along a particular road, measures of travel speeds, and approximations of the class of the vehicles (motorcycle, 2-axle, large box truck, bus, etc). This type of count is conducted only along a street link/corridor, to gather data between two intersections or points of interest. This data is used in travel studies, as well as to review concerns about street use, speeding, and capacity. Counts are typically conducted for 12- & 24-Hr periods.
~_Pedestrian Counts (PED)_ record the volume of individual persons crossing a given street, whether at an existing intersection or a mid-block crossing. This data is used to review concerns about crossing safety, as well as for access analysis for points of interest. Counts are typically conducted for 2-, 4-, 11-, and 12-Hr periods.
~_Delay Studies (DEL)_ measure the delay experienced by motor vehicles due to the effects of congestion. Counts are typically conducted for a 1-Hr period at a given intersection or point of intersecting vehicular traffic.
~_Gap Studies (GAP)_ record the number of gaps which are typically present between groups of vehicles traveling through an intersection or past a point on a street. This data is used to assess opportunities for pedestrians to cross the street and for analyses on vehicular “platooning”. Counts are typically conducted for a specific 1-Hr period at a single point of crossing.
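As a rough illustration of what a gap study measures, the sketch below counts the gaps between successive vehicle passages that are long enough for a pedestrian to cross. The timestamps and the minimum acceptable gap are hypothetical; the format of the City's actual count files is not described here.

```python
def count_gaps(arrival_times_s, min_gap_s):
    """Count gaps between consecutive vehicles of at least min_gap_s seconds.

    arrival_times_s: times (in seconds) at which vehicles pass the point.
    """
    times = sorted(arrival_times_s)
    return sum(
        1 for earlier, later in zip(times, times[1:])
        if later - earlier >= min_gap_s
    )


# Hypothetical passage times within an observation period, in seconds.
passages = [0, 3, 12, 14, 30]
usable_gaps = count_gaps(passages, min_gap_s=7)
print(usable_gaps)
```

Closely spaced passages followed by long gaps are also what the text refers to as vehicular “platooning”.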
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the main purposes of the ICARIA project was to develop coherent, reliable and usable downscaled climate projections from the latest CMIP6 generation, in order to provide a basis for efficient support to climate adaptation and decision-making by the relevant stakeholders, and to support the adaptation of critical assets within the project. These projections were also produced with the aim of making them freely available for use in subsequent studies and, hence, fostering adaptation to climate change in more areas. Accordingly, ICARIA’s climate information is based on CMIP6 models and incorporates the current SSPs in its workflow. The high-resolution future climate projections presented here constitute a unique dataset: they are obtained from a high-quality, high-density set of weather observations and then interpolated to the case studies of interest on a 100x100 m resolution grid, which is the main outcome offered in this publication. These models will provide the scenarios to be considered within the Risk Assessment and in the design and development of all adaptation measures resulting from ICARIA.
For further details, find here a brief of the methodology followed:
The statistical downscaling methodology applied in ICARIA by FIC, named FICLIMA (Ribalaygua et al. 2013), consists of a two-step analogue/regression statistical method which has been used in national and international projects with good verification results (i.e.: Monjo et al. 2016). The first step is common for all simulated climate variables and it is based on an analogue stratification (Zorita et al. 1993). An analogue method was applied based on the hypothesis that ‘analogue’ atmospheric patterns (predictors) should cause analogue local effects (predictands), which means that the number of days that were most similar to the day to be downscaled was selected. The similarity between any two days was measured according to three nested synoptic windows (with different weights) and four large-scale fields using a pseudo-Euclidean distance between the large-scale fields used as predictors. For each predictor, the weighted Euclidean distance was calculated and standardised by substituting it with the closest percentile of a reference population of weighted Euclidean distances for that predictor. This method is a good method for reproducing nonlinear relationships between predictors and the predictands, but it could not be used to simulate values outside of the range of observed values. In order to overcome this problem and obtain a better simulation, a second step was required.
For this second step, the procedures applied depend on the variable of interest. To determine the temperature, multiple linear regression analysis for the selected number of most analogous days was performed for each station and for each problem day. From a group of potential predictors, the linear regression selected those with the highest correlation, using a forward and backward stepwise approach.
For precipitation, a group of m problem days (we use the whole days of a month) is downscaled. For each problem day we obtain a “preliminary precipitation amount” averaging the rain amount of its n most analogous days, so we can sort the m problem days from the highest to the lowest “preliminary precipitation amount”. For assigning the final precipitation amount, all amounts of the m×n analogous days are sorted and clustered in m groups. Every quantity is finally assigned, orderly, to the m days previously sorted by the “preliminary precipitation amount”.
For wind or relative humidity, the second step is a transfer function between the observed probability distribution and the simulated one using the averaged values from the n = 30 analogous days. Particularly, a parametric bias correction was performed to the time series obtained from the analogue stratification (first step). In order to estimate the improvement of this procedure, the bias correction was also applied to the direct model outputs.
This second step done at a daily scale with an inner thorough verification procedure is essential and the main differentiating process of FICLIMA method. It extends beyond mean values to include extremes and covers all time scales, including daily intervals. With the verification it can be proven If the method correctly simulates changes from one day to the next, indicating an effective capture of the underlying physical connections between predictors and predictands. These physical links remain relatively consistent, even in the face of climate change (as opposed to purely empirical relationships that might shift). In essence, this approach theoretically addresses the primary challenge in statistical downscaling known as the non-stationarity problem. This problem questions the stability of predictor/predictand relationships established in the past, probing whether these relationships will persist in the future.
The dataset shared here includes information for the three case studies tackled in ICARIA: Barcelona Metropolitan Area (AMB), Salzburg Region (SLZ), and South Aegean Region (SAR). The information provided covers data and outcomes from 10 models belonging to CMIP6. Each model has a historical archive, from 01/01/1950 to 31/12/2014, and 4 future scenarios (ssp126, ssp245, ssp370 and ssp585) ranging from 01/01/2015 to 31/12/2100. The selected models are listed in Table 1:
Table 1. Information about the 10 climate models belonging to phase 6 of the Coupled Model Intercomparison Project (CMIP6), corresponding to the IPCC AR6. Models were retrieved from the Earth System Grid Federation (ESGF) portal in support of the Program for Climate Model Diagnosis and Intercomparison (PCMDI).
| CMIP6 model | Resolution | Responsible centre | Reference |
|---|---|---|---|
| ACCESS-CM2 | 1.875° x 1.250° | Australian Community Climate and Earth System Simulator (ACCESS), Australia | Bi, D. et al. (2020) |
| BCC-CSM2-MR | 1.125° x 1.121° | Beijing Climate Center (BCC), China Meteorological Administration, China | Wu, T. et al. (2019) |
| CanESM5 | 2.812° x 2.790° | Canadian Centre for Climate Modeling and Analysis (CC-CMA), Canada | Swart, N.C. et al. (2019) |
| CMCC-ESM2 | 1.000° x 1.000° | Centro Mediterraneo sui Cambiamenti Climatici (CMCC) | Cherchi et al. (2018) |
| CNRM-ESM2-1 | 1.406° x 1.401° | CNRM (Centre National de Recherches Meteorologiques), Meteo-France, France | Seferian, R. (2019) |
| EC-EARTH3 | 0.703° x 0.702° | EC-EARTH Consortium | EC-Earth Consortium (2019) |
| MPI-ESM1-2-HR | 0.938° x 0.935° | Max-Planck Institute for Meteorology (MPI-M), Germany | Müller et al. (2018) |
| MRI-ESM2-0 | 1.125° x 1.121° | Meteorological Research Institute (MRI), Japan | Yukimoto, S. et al. (2019) |
| NorESM2-MM | 1.250° x 0.942° | Norwegian Climate Centre (NCC), Norway | Bentsen, M. et al. (2019) |
| UKESM1-0-LL | 1.875° x 1.250° | UK Met Office, Hadley Centre, United Kingdom | Good, P. et al. (2019) |
The climate projections have been developed for each of the observational locations that were retrieved to run the statistical downscaling. The results from these projections have been spatially interpolated onto a 100 x 100 m grid with a multilinear regression model that includes several adjustments and topographic corrections. The results presented here are the median of the 10 models used, obtained for each of the 4 SSPs and each of the time periods considered in ICARIA until the year 2100. The variables treated are the main climate variables and their related extreme indicators, as defined during the ICARIA project. Table 2 summarises all the variables and indicators that were used to develop the projections.
Table 2. Summary of selected thermal and precipitation indicators, grouped by the main hazards they feed. “nd” = number of days; “ne” = number of events.
| Index/name | Short description | Source | Variable | Units | Threshold |
|---|---|---|---|---|---|
| Thermal indicators | | | | | |
| TX90 / TX10 | Warm/cold days | Zhang et al. (2011) | TX | nd | 90 / 10% |
| HD | Heat day | ICARIA | TX | nd | 30 °C |
| EHD | Extreme heat day | ICARIA | TX | nd | 35 °C |
| TR | Tropical nights | Zhang et al. (2011) | TN | nd | 20 °C |
| EQ | Equatorial nights | AEMet 2020, ICARIA | TN | nd | 25 °C |
| IN | Infernal nights | ICARIA | TN | nd | 30 °C |
| FD | Frost days | Zhang et al. (2011) | TN | nd | < 0 °C |
| Max consec | Max spell length for above thermal indicators | ICARIA | - | nd | - |
| Nº events | Number of above thermal indicators events | ICARIA | - | ne | 3 days |
| TXm | Mean maximum temperatures | ICARIA | TX | °C | - |
| TNm | Mean minimum temperatures | ICARIA | TN | °C | - |
| TM | Mean temperatures | ICARIA | TA | °C | - |
| HWle | Heatwave length | ICARIA | TX | nd | 3d > 95% TX |
| HWim/HWix | Mean and maximum heatwave intensity | ICARIA | TX | °C | 3d > 95% TX |
| HWf | Heatwave frequency | ICARIA | TX | ne | 3d > 95% TX |
| HWd | Heatwave days | ICARIA | TX | nd | 3d > 95% TX |
| HI - P90 | Heat Index (percentile 90) | NWS (1994) | TX, RH | °C | TX > 27 °C, RH > 40% |
| UTCI | Universal Thermal Climate Index | Bröde et al. (2012) | TA, RH, W | - | - |
| UHI | Urban heat island (BCN), annual and seasonal | AMB, Metrobs 2015 | T | °C | TM1-TM2 > 0 °C |
| Precipitation indicators | | | | | |
| R20 | Number of heavy precipitation days | Zhang et al. (2011) | P | nd | 20 mm |
| R50, R100 | Days with extreme heavy rain | AMB et al. (2017) | P | nd | 50 mm, 100 mm |
| Ra | Yearly and seasonal rainfall relative change | ICARIA | P | mm | ≥ 0.1 mm |
| IDF - CCF | IDF Curves - Climate Change Factor | Arnbjerg-Nielsen (2012) | P | - | ≥ 0.1 mm |
Forest fire
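Many of the thermal indicators in Table 2 reduce to counting days above a threshold or finding spells of consecutive days above it. The following Python sketch illustrates both operations under stated assumptions: the helper names are hypothetical, thresholds are treated as strict inequalities, and the fixed value 35 °C stands in for the 95th percentile of TX, which would in practice be computed from a reference period.

```python
def count_days_above(values, threshold):
    """Number of days whose value is strictly above the threshold (e.g. HD, EHD, TR)."""
    return sum(1 for v in values if v > threshold)


def spells_above(values, threshold, min_length=3):
    """Lengths of runs of at least min_length consecutive days above the threshold."""
    spells, run = [], 0
    for v in values:
        if v > threshold:
            run += 1
        else:
            if run >= min_length:
                spells.append(run)
            run = 0
    if run >= min_length:  # close a spell that reaches the end of the series
        spells.append(run)
    return spells


# Made-up daily maximum (TX) and minimum (TN) temperatures in °C.
tx = [29, 31, 36, 37, 36, 30, 28, 36, 36, 29]
tn = [19, 21, 22, 20, 25]

heat_days = count_days_above(tx, 30)          # HD
extreme_heat_days = count_days_above(tx, 35)  # EHD
tropical_nights = count_days_above(tn, 20)    # TR

heatwaves = spells_above(tx, 35, min_length=3)  # 35 °C is an assumed P95 value
hw_frequency = len(heatwaves)                   # HWf (ne)
hw_days = sum(heatwaves)                        # HWd (nd)
hw_length = max(heatwaves, default=0)           # HWle (nd)
```

In this series only the first run of three days above 35 °C qualifies as a heatwave; the later two-day run is too short to count.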
The GAR15 global exposure database is based on a top-down approach in which national-level statistical information, including socio-economic, building-type, and capital-stock data, is transposed onto grids of 5x5 or 1x1 using the geographic distribution of population and gross domestic product (GDP) as proxies.
This database contains tobacco consumption data from 1970-2015 collected through a systematic search coupled with consultation with country and subject-matter experts. Data quality appraisal was conducted by at least two research team members in duplicate, with greater weight given to official government sources. All data were standardized into units of cigarettes consumed, and a detailed accounting of data quality and sourcing was prepared. Data were found for 82 of 214 countries for which searches for national cigarette consumption data were conducted, representing over 95% of global cigarette consumption and 85% of the world’s population. Cigarette consumption fell in most countries over the past three decades, but trends in country-specific consumption were highly variable. For example, China consumed 2.5 million metric tonnes (MMT) of cigarettes in 2013, more than Russia (0.36 MMT), the United States (0.28 MMT), Indonesia (0.28 MMT), Japan (0.20 MMT), and the next 35 highest consuming countries combined. The US and Japan achieved reductions of more than 0.1 MMT from a decade earlier, whereas Russian consumption plateaued, and Chinese and Indonesian consumption increased by 0.75 MMT and 0.1 MMT, respectively. These data generally concord with modelled country-level data from the Institute for Health Metrics and Evaluation and have the additional advantage of not smoothing year-over-year discontinuities that are necessary for robust quasi-experimental impact evaluations. Before this study, publicly available data on cigarette consumption were limited: either inappropriate for quasi-experimental impact evaluations (modelled data), held privately by companies (proprietary data), or widely dispersed across many national statistical agencies and research organisations (disaggregated data).
This new dataset confirms that cigarette consumption has decreased in most countries over the past three decades, but that secular country specific consumption trends are highly variable. The findings underscore the need for more robust processes in data reporting, ideally built into international legal instruments or other mandated processes. To monitor the impact of the WHO Framework Convention on Tobacco Control and other tobacco control interventions, data on national tobacco production, trade, and sales should be routinely collected and openly reported. The first use of this database for a quasi-experimental impact evaluation of the WHO Framework Convention on Tobacco Control is: Hoffman SJ, Poirier MJP, Katwyk SRV, Baral P, Sritharan L. Impact of the WHO Framework Convention on Tobacco Control on global cigarette consumption: quasi-experimental evaluations using interrupted time series analysis and in-sample forecast event modelling. BMJ. 2019 Jun 19;365:l2287. doi: https://doi.org/10.1136/bmj.l2287 Another use of this database was to systematically code and classify longitudinal cigarette consumption trajectories in European countries since 1970 in: Poirier MJ, Lin G, Watson LK, Hoffman SJ. Classifying European cigarette consumption trajectories from 1970 to 2015. Tobacco Control. 2022 Jan. DOI: 10.1136/tobaccocontrol-2021-056627. 
Statement of Contributions:
Conceived the study: GEG, SJH
Identified multi-country datasets: GEG, MP
Extracted data from multi-country datasets: MP
Quality assessment of data: MP, GEG
Selection of data for final analysis: MP, GEG
Data cleaning and management: MP, GL
Internet searches: MP (English, French, Spanish, Portuguese), GEG (English, French), MYS (Chinese), SKA (Persian), SFK (Arabic); AG, EG, BL, MM, YM, NN, EN, HR, KV, CW, and JW (English), GL (English)
Identification of key informants: GEG, GP
Project Management: LS, JM, MP, SJH, GEG
Contacts with Statistical Agencies: MP, GEG, MYS, SKA, SFK, GP, BL, MM, YM, NN, HR, KV, JW, GL
Contacts with key informants: GEG, MP, GP, MYS, GP
Funding: GEG, SJH
SJH: Hoffman, SJ; JM: Mammone, J; SRVK: Rogers Van Katwyk, S; LS: Sritharan, L; MT: Tran, M; SAK: Al-Khateeb, S; AG: Grjibovski, A; EG: Gunn, E; SKA: Kamali-Anaraki, S; BL: Li, B; MM: Mahendren, M; YM: Mansoor, Y; NN: Natt, N; EN: Nwokoro, E; HR: Randhawa, H; MYS: Yunju Song, M; KV: Vercammen, K; CW: Wang, C; JW: Woo, J; MJPP: Poirier, MJP; GEG: Guindon, EG; GP: Paraje, G; GL: Gigi Lin
Key informants who provided data: Corne van Walbeek (South Africa, Jamaica); Frank Chaloupka (US); Ayda Yurekli (Turkey); Dardo Curti (Uruguay); Bungon Ritthiphakdee (Thailand); Jakub Lobaszewski (Poland); Guillermo Paraje (Chile, Argentina)
Key informants who provided useful insights: Carlos Manuel Guerrero López (Mexico); Muhammad Jami Husain (Bangladesh); Nigar Nargis (Bangladesh); Rijo M John (India); Evan Blecher (Nigeria, Indonesia, Philippines, South Africa); Yagya Karki (Nepal); Anne CK Quah (Malaysia); Nery Suarez Lugo (Cuba)
Agencies providing assistance: Iranian Tobacco Co.; Institut National de la Statistique (Tunisia); HM Revenue & Customs (UK); Eidgenössisches Finanzdepartement EFD/Département fédéral des finances DFF (Switzerland); Bureau of Internal Revenue (Philippines); National Statistical Office of Mongolia
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This article explains the changes to methods used in estimating Capital Stocks and Consumption of Fixed Capital. Source agency: Office for National Statistics Designation: National Statistics Language: English Alternative title: Cap Stock
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents aggregated data on employed persons within the relevant statistical regions, including the number of employee jobs and median employee income per job by sex, classified by Greater Capital City Statistical Area (GCCSA). The data span the 2011-12 to 2017-18 financial years and are aggregated to the 2016 GCCSA boundaries.
Jobs in Australia provides aggregate statistics and is sourced from the Linked Employer-Employee Dataset (LEED). It provides new information about filled jobs in Australia, the people who hold them, and their employers. An employed person is any person with one or more jobs. Employed persons in this publication can be employees, owner-managers of unincorporated enterprises, or both. Employed persons are persons who have employment income in the reference year, excluding those whose employment income is made up entirely of an employment termination payment. Employed persons have one or more jobs on the job file.
The job counts in this release differ from the filled job estimates from other sources such as the Australian Labour Account and the Labour Force Australia. The Jobs in Australia release provides insights into all jobs held throughout the year, while the Labour Account data provides the number of filled jobs at a point-in-time each quarter (and annually for the financial year reference period), and Labour Force Survey data measures the number of people employed each month.
For more information on the release please visit the Australian Bureau of Statistics
This release provides statistics on the number and nature of jobs, the people who hold them, and their employers. These statistics can be used to understand regional labour markets or to identify the impact of major changes in local communities. The release also provides new insights into the number of jobs people hold, the duration of jobs, and the industries and employment income of concurrent jobs.
The scope of these data includes individuals who submitted an individual tax return to the Australian Taxation Office (ATO), individuals who had a Pay As You Go (PAYG) payment summary issued by an employer and their employers.
AURIN has spatially enabled the original data. The following additional changes were made:
Where data were not published for confidentiality reasons (marked "np" in the original data), the records have been set to null.
Total values may be higher than the sum of the published components due to this confidentialisation.
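The effect of this confidentialisation on totals can be shown with a small Python sketch. The helper names and the figures are made up for illustration and are not taken from the dataset; the only element taken from the description is that suppressed cells appear as "np" in the original data and as null after processing.

```python
def parse_cell(cell):
    """Convert a published cell to int, mapping the confidentiality marker 'np' to None."""
    cell = cell.strip()
    return None if cell == "np" else int(cell)


def sum_published(cells):
    """Sum only the components that were actually published (non-null)."""
    return sum(v for v in cells if v is not None)


# Hypothetical row: three component counts and the published total.
components = [parse_cell(c) for c in ["120", "np", "45"]]
published_total = parse_cell("180")

# The suppressed component is missing from the sum, so the total is higher.
gap = published_total - sum_published(components)
```

A positive gap in a row therefore signals suppressed components rather than an error in the data.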
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents aggregated data on employee jobs and median employee income per job, classified by industry subdivision at Statistical Area Level 4 (SA4). The data cover the 2013-14 financial year and are aggregated to the 2016 SA4 boundaries.
Jobs in Australia provides aggregate statistics and is sourced from the Linked Employer-Employee Dataset (LEED). It provides new information about filled jobs in Australia, the people who hold them, and their employers. An 'employee job' refers to a job for which the occupant receives remuneration in wages, salary, payment in kind, or piece rates. This excludes self-employment jobs held by Owner-Managers of Unincorporated Enterprises (OMUE).
The job counts in this release differ from the filled job estimates from other sources such as the Australian Labour Account and the Labour Force Australia. The Jobs in Australia release provides insights into all jobs held throughout the year, while the Labour Account data provides the number of filled jobs at a point-in-time each quarter (and annually for the financial year reference period), and Labour Force Survey data measures the number of people employed each month.
For more information on the release please visit the Australian Bureau of Statistics
This release provides statistics on the number and nature of jobs, the people who hold them, and their employers. These statistics can be used to understand regional labour markets or to identify the impact of major changes in local communities. The release also provides new insights into the number of jobs people hold, the duration of jobs, and the industries and employment income of concurrent jobs.
The scope of these data includes individuals who submitted an individual tax return to the Australian Taxation Office (ATO), individuals who had a Pay As You Go (PAYG) payment summary issued by an employer and their employers.
AURIN has spatially enabled the original data. The following additional changes were made:
Where data were not published for confidentiality reasons (marked "np" in the original data), the records have been set to null.
Total values may be higher than the sum of the published components due to this confidentialisation.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This User Guide contains information about the ONSPD including: directory content; data currency; the methodology for assigning areas to postcodes; data formats; data quality and limitations and details of recent changes that have impacted on the data. Various annexes and tables provide more detailed supporting information. (File size - 962 KB)
PredictLeads Job Openings Data provides real-time hiring insights sourced directly from company websites, ensuring the highest level of accuracy and freshness. Unlike job boards that rely on aggregated listings, our dataset delivers unmatched granularity on job postings, salary trends, and workforce demand - making it a powerful tool for HR, talent acquisition, and market analysis.
Use Cases: ✅ Job Boards Enhancement – Improve job listings with high-quality postings. ✅ HR Consulting – Analyze hiring trends to guide workforce planning strategies. ✅ Employment Analytics – Track job market shifts, salary benchmarks, and demand for skills. ✅ HR Operations – Optimize recruitment pipelines with direct employer-sourced data. ✅ Competitive Intelligence – Monitor hiring activities of competitors for strategic insights.
Key API Attributes:
API example: https://docs.predictleads.com/v3/api_endpoints/job_openings_dataset