Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These three datasets contain data and R code to assign a proxy variable for office worker, based on responses to an open-ended question (OEQ) about occupation in Swedish surveys. The R code and proxy variable can be applied to any dataset with Swedish OEQ about occupation; the R code is also adaptable for OEQ in any language, provided there is a standard classification of occupations in that language.
The R code can be found in the dataset Assigning_office_worker_proxy.R, and the proxy variable in the dataset SSYK12_modified.xlsx.
The dataset Occupation_response.xlsx gives an example of what can be extracted from a Swedish questionnaire with an OEQ about occupation. It can be replaced with other data as long as the replacement includes two variables named “ID” and “Occupation_swe” (i.e., the occupation title given by the respondent).
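The matching idea behind the proxy variable can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' R code: the lookup entries, the 1/0 coding of the `office_worker` flag, and the helper names `normalize` and `assign_proxy` are hypothetical. Only the column names `ID` and `Occupation_swe` come from the description above.

```python
from typing import Optional

# Hypothetical excerpt of a classification lookup: normalized title -> office-worker flag.
# Real values would come from SSYK12_modified.xlsx; these entries are invented for illustration.
LOOKUP = {
    "ekonomiassistent": 1,
    "snickare": 0,
    "systemutvecklare": 1,
}

def normalize(title: str) -> str:
    """Lower-case and collapse whitespace so free-text answers compare cleanly."""
    return " ".join(title.lower().split())

def assign_proxy(rows: list[dict]) -> list[dict]:
    """Attach an office_worker flag (1, 0, or None when unmatched) to each response."""
    out = []
    for row in rows:
        flag: Optional[int] = LOOKUP.get(normalize(row["Occupation_swe"]))
        out.append({"ID": row["ID"], "office_worker": flag})
    return out

# Invented responses in the shape the description requires (ID, Occupation_swe).
responses = [
    {"ID": 1, "Occupation_swe": "  Snickare"},
    {"ID": 2, "Occupation_swe": "Systemutvecklare"},
    {"ID": 3, "Occupation_swe": "okänd titel"},
]
result = assign_proxy(responses)
```

Unmatched titles are kept with a `None` flag rather than dropped, so they can be reviewed by hand or matched with a looser rule later.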
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This synthetic dataset is suitable for learning decision tree algorithms due to its mixture of categorical and numerical features. Decision trees can handle various data types and are effective at capturing the relationships between different features and the target variable, Income. Additionally, this dataset provides a good opportunity to learn how to handle null values, which are commonly encountered in real-world data.
You can find the original dataset on Kaggle: Adult Census Income (https://www.kaggle.com/datasets/uciml/adult-census-income).
Age - Description: The age of the individual. - Data Type: Numerical
Workclass - Description: The classification of the individual's employment. - Categories: Private, Self-emp-not-inc, Local-gov, etc.
FinalWeight - Description: The final weight, a numeric value representing the sampling weight. - Data Type: Numerical
Education - Description: The individual's level of education. - Categories: Bachelors, HS-grad, 11th, Masters, 9th, etc.
EducationYears - Description: The number of years of education completed. - Data Type: Numerical
MaritalStatus - Description: The marital status of the individual. - Categories: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, etc.
Occupation - Description: The type of occupation the individual holds. - Categories: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, etc.
Relationship - Description: The individual's relationship status within a family. - Categories: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried
Race - Description: The race of the individual. - Categories: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black
Gender - Description: The gender of the individual. - Categories: Male, Female
CapitalGain - Description: Capital gains, a numeric value. - Data Type: Numerical
CapitalLoss - Description: Capital losses, a numeric value. - Data Type: Numerical
HoursPerWeek - Description: The number of hours worked per week. - Data Type: Numerical
NativeCountry - Description: The native country of the individual. - Categories: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, etc.
Income (Label) - Description: The income level, categorizing whether the individual's income is above or below $50,000. - Categories: >50K, <=50K
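Since the description highlights null handling, here is a minimal sketch of one common approach: filling numeric nulls with the median and categorical nulls with the most frequent value. The sample rows are invented, the `impute` helper is hypothetical, and this particular imputation strategy is an assumption rather than something the dataset prescribes. Only the column names are taken from the list above.

```python
from statistics import median, mode

# Tiny invented sample using column names from the list above; None marks a null.
rows = [
    {"Age": 39, "Workclass": "Private", "Income": "<=50K"},
    {"Age": None, "Workclass": "Private", "Income": ">50K"},
    {"Age": 51, "Workclass": None, "Income": ">50K"},
    {"Age": 28, "Workclass": "Local-gov", "Income": "<=50K"},
]

def impute(rows, numeric, categorical):
    """Return copies of rows with nulls filled: median for numeric, mode for categorical."""
    fill = {}
    for col in numeric:
        fill[col] = median(r[col] for r in rows if r[col] is not None)
    for col in categorical:
        fill[col] = mode(r[col] for r in rows if r[col] is not None)
    # Build new dicts so the original rows are left untouched.
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]

clean = impute(rows, numeric=["Age"], categorical=["Workclass"])
```

Decision tree implementations differ in whether they accept nulls directly, so imputing before training is a conservative default; dropping incomplete rows is the simpler alternative when few values are missing.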
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.
Data content areas include:
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Are you curious about the factors that influence career changes? Do you want to explore how someone's field of study impacts their likelihood to stick with or switch careers? Look no further! The Field Of Study vs Occupation dataset is designed to help you predict whether individuals are likely to change their occupation based on their academic background, job experience, and other demographic factors.
This rich dataset contains 30,000+ records and features 22 attributes, including personal details, job satisfaction, skills gap, industry growth, and much more. The dataset’s target variable, “Likely to Change Occupation”, indicates whether a person is at risk of switching careers.
This dataset offers a unique opportunity to apply machine learning models to predict career changes based on a variety of factors that impact job satisfaction, professional growth, and personal circumstances. It's a great dataset for anyone interested in:
This layer shows median earnings by occupational group broken down by sex. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. Only full-time, year-round workers are included. Median earnings are based on earnings in the past 12 months of the survey. Occupation groups are based on the Bureau of Labor Statistics' (BLS) Standard Occupational Classification (SOC). This layer is symbolized to show median earnings of the full-time, year-round civilian employed population. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2019-2023
ACS Table(s): B24022 (Not all lines of this ACS table are available in this feature layer.)
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: December 12, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule.
Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico.
Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
The estimate is controlled. A statistical test for sampling variability is not appropriate.
The data for this geographic area cannot be displayed because the number of sample cases is too small.
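The margin-of-error interpretation in the data note can be made concrete with a short sketch. It computes the lower and upper confidence bounds as the estimate minus and plus the margin of error, and converts a 90 percent margin of error to a standard error by dividing by 1.645, the z-score for a 90 percent confidence level. The function names and the sample figures are hypothetical, and returning `None` for null inputs is an assumption that mirrors how the layer sets sentinel values to null.

```python
Z_90 = 1.645  # z-score corresponding to a 90 percent confidence level

def confidence_bounds(estimate, moe):
    """Return (lower, upper) bounds, or None when either input is null."""
    if estimate is None or moe is None:
        return None
    return (estimate - moe, estimate + moe)

def standard_error(moe):
    """Convert a 90 percent margin of error to a standard error (None stays None)."""
    if moe is None:
        return None
    return moe / Z_90

# Invented figures: a median-earnings estimate and its 90 percent margin of error.
bounds = confidence_bounds(52000, 3290)
se = standard_error(3290)
missing = confidence_bounds(None, 3290)
```

Checking for nulls first matters here because the layer replaces the negative sentinel values with null, so a margin of error may be absent even when an estimate is present.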
The Annual Population Survey (APS) household datasets are produced annually and are available from 2004 (Special Licence) and 2006 (End User Licence). They allow production of family and household labour market statistics at local areas and for small sub-groups of the population across the UK. The household data comprise key variables from the Labour Force Survey (LFS) and the APS 'person' datasets. The APS household datasets include all the variables on the LFS and APS person datasets, except for the income variables. They also include key family and household-level derived variables. These variables allow for an analysis of the combined economic activity status of the family or household. In addition, they also include more detailed geographical, industry, occupation, health and age variables.
For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation. For variable and value labelling and coding frames that are not included either in the data or in the current APS documentation, users are advised to consult the latest versions of the LFS User Guides, which are available from the ONS Labour Force Survey - User Guidance webpages.
Occupation data for 2021 and 2022
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022
End User Licence and Secure Access APS data
Users should note that there are two versions of each APS dataset. One is available under the standard End User Licence (EUL) agreement, and the other is a Secure Access version. The EUL version includes Government Office Region geography, banded age, 3-digit SOC and industry sector for main, second and last job. The Secure Access version contains more detailed variables relating to:
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains information about individuals' demographic and employment attributes to predict whether their income exceeds $50,000 per year. It originates from the 1994 U.S. Census database and has been widely used in classification problems, making it an excellent resource for machine learning, data analysis, and statistical modeling.
The dataset includes various features related to personal and work-related attributes. The target variable is whether an individual's income exceeds $50,000 annually.
Key features include:
Age: Age of the individual.
Workclass: Employment type (e.g., private, government, self-employed).
Education: Highest level of education achieved.
Education-Num: Number corresponding to the level of education.
Marital Status: Marital status of the individual.
Occupation: Profession or job role.
Relationship: Family role (e.g., husband, wife, not in family).
Race: Race of the individual.
Sex: Gender of the individual.
Capital Gain: Income from investment sources other than salary.
Capital Loss: Losses from investment sources.
Hours Per Week: Average number of hours worked per week.
Native Country: Country of origin of the individual.
Age: Continuous variable representing the age of the individual.
Workclass: Categorical variable indicating the type of employment (e.g., Private, Self-Employed, Government).
Education: Categorical variable showing the highest level of education achieved (e.g., Bachelors, Masters).
Education-Num: Numerical representation of the education level.
Marital Status: Categorical variable representing marital status (e.g., Married, Never-Married).
Occupation: Categorical variable indicating the job role or occupation.
Relationship: Categorical variable describing the family relationship (e.g., Husband, Wife).
Race: Categorical variable showing the race of the individual.
Sex: Categorical variable indicating the gender of the individual.
Capital Gain: Continuous variable representing income from capital gains.
Capital Loss: Continuous variable representing losses from investments.
Hours Per Week: Continuous variable showing the average working hours per week.
Native Country: Categorical variable indicating the country of origin.
Income: Target variable (binary), indicating whether the individual earns more than $50,000 (>50K) or not (<=50K).
This dataset was derived from the 1994 U.S. Census database and has been made publicly available for research and educational purposes. It is not affiliated with any specific organization. Users are encouraged to comply with ethical data usage guidelines while working with this dataset.
The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at the local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost. The APS aims to provide enhanced annual data for England, covering a target sample of at least 510 economically active persons for each Unitary Authority (UA)/Local Authority District (LAD) and at least 450 in each Greater London Borough. In combination with local LFS boost samples, the survey provides estimates for a range of indicators down to Local Education Authority (LEA) level across the United Kingdom.
For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation. For variable and value labelling and coding frames that are not included either in the data or in the current APS documentation, users are advised to consult the latest versions of the LFS User Guides, which are available from the ONS Labour Force Survey - User Guidance webpages.
Occupation data for 2021 and 2022
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. The affected datasets have now been updated. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022
APS Well-Being Datasets
From 2012-2015, the ONS published separate APS datasets aimed at providing initial estimates of subjective well-being, based on the Integrated Household Survey. In 2015 these were discontinued. A separate set of well-being variables and a corresponding weighting variable have been added to the April-March APS person datasets from A11M12 onwards. Further information on the transition can be found in the Personal well-being in the UK: 2015 to 2016 article on the ONS website.
APS disability variables
Over time, there have been some updates to disability variables in the APS. An article explaining the quality assurance investigations on these variables that have been conducted so far is available on the ONS Methodology webpage.
End User Licence and Secure Access APS data
Users should note that there are two versions of each APS dataset. One is available under the standard End User Licence (EUL) agreement, and the other is a Secure Access version. The EUL version includes Government Office Region geography, banded age, 3-digit SOC and industry sector for main, second and last job. The Secure Access version contains more detailed variables relating to:
age: single year of age, year and month of birth, age completed full-time education and age obtained highest qualification, age of oldest dependent child and age of youngest dependent child
family unit and household: including a number of variables concerning the number of dependent children in the family according to their ages, relationship to head of household and relationship to head of family
nationality and country of origin
geography: including county, unitary/local authority, place of work, Nomenclature of Territorial Units for Statistics 2 (NUTS2) and NUTS3 regions, and whether lives and works in same local authority district
health: including main health problem, and current and past health problems
education and apprenticeship: including numbers and subjects of various qualifications and variables concerning apprenticeships
industry: including industry, industry class and industry group for main, second and last job, and industry made redundant from
occupation: including 4-digit Standard Occupational Classification (SOC) for main, second and last job and job made redundant from
system variables: including week number when interview took place and number of households at address
The Secure Access data have more restrictive access conditions than those made available under the standard EUL. Prospective users will need to gain ONS Accredited Researcher status, complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables.
Background
The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The Annual Population Survey, also held at the UK Data Archive, is derived from the LFS.
The LFS was first conducted biennially from 1973-1983, then annually between 1984 and 1991, comprising a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter. From 1992 it moved to a quarterly cycle with a sample size approximately equivalent to that of the previous annual data. Northern Ireland was also included in the survey from December 1994. Further information on the background to the QLFS may be found in the documentation.
The UK Data Service also holds a Secure Access version of the QLFS (see below); household datasets; two-quarter and five-quarter longitudinal datasets; LFS datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.
LFS Documentation
The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned (the latest questionnaire available covers July-September 2022). Volumes are updated periodically, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.
LFS response to COVID-19
From April 2020 to May 2022, additional non-calendar quarter LFS microdata were made available to cover the pandemic period. The first additional microdata to be released covered February to April 2020 and the final non-calendar dataset covered March-May 2022. Publication then returned to calendar quarters only. Within the additional non-calendar COVID-19 quarters, pseudonymised variables Casenop and Hserialp may contain a significant number of missing cases (set as -9). These variables may not be available in full for the additional COVID-19 datasets until the next standard calendar quarter is produced. The income weight variable, PIWT, is not available in the non-calendar quarters, although the person weight (PWT) is included. Please consult the documentation for full details.
Occupation data for 2021 and 2022 data files
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022.
2024 Reweighting
In February 2024, reweighted person-level data from July-September 2022 onwards were released. Up to July-September 2023, only the person weight was updated (PWT23); the income weight remains at 2022 (PIWT22). The 2023 income weight (PIWT23) was included from the October-December 2023 quarter. Users are encouraged to read the ONS methodological note of 5 February, Impact of reweighting on Labour Force Survey key indicators: 2024, which includes important information on the 2024 reweighting exercise.
End User Licence and Secure Access QLFS data
Two versions of the QLFS are available from UKDS. One is available under the standard End User Licence (EUL) agreement, and the other is a Secure Access version. The EUL version includes country and Government Office Region geography, 3-digit Standard Occupational Classification (SOC) and 3-digit industry group for main, second and last job (from July-September 2015, 4-digit industry class is available for main job only).
The Secure Access version contains more detailed variables relating to:
age: single year of age, year and month of birth, age completed full-time education and age obtained highest qualification, age of oldest dependent child and age of youngest dependent child
family unit and household: including a number of variables concerning the number of dependent children in the family according to their ages, relationship to head of household and relationship to head of family
nationality and country of origin
finer detail geography: including county, unitary/local authority, place of work, Nomenclature of Territorial Units for Statistics 2 (NUTS2) and NUTS3 regions, and whether lives and works in same local authority district, and other categories
health: including main health problem, and current and past health problems
education and apprenticeship: including numbers and subjects of various qualifications and variables concerning apprenticeships
industry: including industry, industry class and industry group for main, second and last job, and industry made redundant from
occupation: including 5-digit industry subclass and 4-digit SOC for main, second and last job and job made redundant from
system variables: including week number when interview took place and number of households at address
other additional detailed variables may also be included.
The Secure Access datasets (SNs 6727 and 7674) have more restrictive access conditions than those made available under the standard EUL. Prospective users will need to gain ONS Accredited Researcher status, complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables. Users are strongly advised to first obtain the standard EUL version of the data to see if they are sufficient for their research requirements.
This study was deposited in 2008, as a result of the move from seasonal to calendar quarters for the QLFS, and the reweighting process to 2007-2008 population figures. It combines data from previously-available QLFS seasonal quarter datasets. The depositor has advised that small revisions to the data may have been made during this process, but they should not be significant.
Variables Refwkd, Refwkm, Refwky and Calweek amended: During November 2009, the ONS supplied syntax to resolve issues discovered in variables Refwkd, Refwkm, Refwky (reference week date, month and year) and Calweek (calendar week), which affected Northern Ireland cases. The issues had arisen due to misalignment between week number and Refwkd/Refwkm/Refwky, and had meant that when week number was used to create calendar quarters from seasonal quarters, for some cases Refwkd, Refwkm and Refwky fell outside the target calendar quarter. The syntax supplied has been used to correct the issue; users whose analysis has been adversely affected should download a new version of the dataset.
This layer shows median earnings by occupational group. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. Only full-time, year-round workers are included. Median earnings are based on earnings in the past 12 months of the survey. Occupation groups are based on the Bureau of Labor Statistics' (BLS) Standard Occupational Classification (SOC). This layer is symbolized to show median earnings of the full-time, year-round civilian employed population. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2019-2023
ACS Table(s): B24021
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: December 12, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule.
Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico.
Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
The estimate is controlled. A statistical test for sampling variability is not appropriate.
The data for this geographic area cannot be displayed because the number of sample cases is too small.
This file contains original variables from Theme 13 Occupations from Census 2011 and a series of additional variables produced by AIRO such as percentage rates, ratios etc. The file includes data on Persons At Work by Occupation for the 18,488 Small Areas in the Republic of Ireland.
Background
The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.
Longitudinal data
The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves.
Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
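As a rough illustration of the gross-flows idea described above, the sketch below tallies transitions between labour force categories from linked two-quarter records. The field names `status_q1`, `status_q2` and `weight` are hypothetical and do not correspond to actual LFS variable names; the real datasets supply their own longitudinal weights and should be analysed following the ONS guidance.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def gross_flows(linked: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Sum weights for each (status in quarter 1, status in quarter 2) pair.

    Assumption: each record is one person linked across two quarters;
    a missing weight defaults to 1.0 (an unweighted count).
    """
    flows: Dict[Tuple[str, str], float] = defaultdict(float)
    for person in linked:
        transition = (person["status_q1"], person["status_q2"])
        flows[transition] += person.get("weight", 1.0)
    return dict(flows)


# Made-up records, purely for illustration.
people = [
    {"status_q1": "employed", "status_q2": "employed"},
    {"status_q1": "employed", "status_q2": "unemployed"},
    {"status_q1": "inactive", "status_q2": "employed", "weight": 2.0},
]
flows = gross_flows(people)
```

Each key of the result is one cell of a transition table, so off-diagonal cells describe people who moved between categories.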
Secure Access data
Secure Access longitudinal datasets for the LFS are available for two-quarters (SN 7908) and five-quarters (SN 7909). The two-quarter datasets are available from April 2001 and the five-quarter datasets are available from June 2010. The Secure Access versions include additional, detailed variables not included in the standard 'End User Licence' (EUL) longitudinal datasets (see under GNs 33315 and 33316).
Extra variables that typically can be found in the Secure Access versions but not in the EUL versions relate to:
day, month and year of birth
standard occupational classification (SOC) relating to second job, job made redundant from, last job, apprenticeships and occupation one year ago
five digit industry subclass relating to main job, last job, second job and job one year ago
These extra variables are not available for every quarter or dataset. Users are advised to consult the 'LFS Variable Catalogue' file available in the Documentation section below for further information.
Occupation data for 2021 and 2022 data files
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022.
2022 Weighting
The population totals used for the latest LFS estimates use projected growth rates from Real Time Information (RTI) data for UK, EU and non-EU populations based on 2021 patterns. The total population used for the LFS therefore does not take into account any changes in migration, birth rates, death rates, and so on since June 2021, and hence levels estimates may be under- or over-estimating the true values and should be used with caution. Estimates of rates will, however, be robust.
https://doi.org/10.17026/fp39-0x58
This survey is part of Centerdata's Telepanel project. Telepanel consists of approx. 2000 households, surveyed weekly. In addition, the Centerdatabase offers opportunities to compose tailor-made datasets.
Meaning of having a job
Background Variables: Age, year of birth / Sex / Ownership of house / Nr. of children living with family/household / Position in family/household / Size of family/household / Other: presence of partner in family/household / Respondent: occupational status / Respondent: gross income / Respondent: net income / Total family/household: gross income / Total family/household: net income / Respondent: highest grade attained / Respondent: highest type attended / Other: constructed variable (social economic class) according to GFK * Dongen.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The National Health and Nutrition Examination Survey (NHANES) provides data that have considerable potential for studying the health and environmental exposure of the non-institutionalized US population. However, as NHANES data are plagued with multiple inconsistencies, processing these data is required before deriving new insights through large-scale analyses. Thus, we developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey
demographics (281 variables),
dietary consumption (324 variables),
physiological functions (1,040 variables),
occupation (61 variables),
questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood),
medications (29 variables),
mortality information linked from the National Death Index (15 variables),
survey weights (857 variables),
environmental exposure biomarker measurements (598 variables), and
chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).
csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file.
The curated NHANES datasets involve 20 .csv formatted files, two for each module, with one as the uncleaned version and the other as the cleaned version. The modules are labeled as the following: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments.
"dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS Number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES.
"dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables.
“dictionary_drug_codes.csv” contains the dictionary for descriptors on the drug codes.
“nhanes_inconsistencies_documentation.xlsx” is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.
R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded as a .zip file, which includes an .RData file and an .R file.
“w - nhanes_1988_2018.RData” contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data.
“m - nhanes_1988_2018.R” shows how we used the customized functions (i.e. our pipeline) to curate the original NHANES data.
Example starter codes: The set of starter code to help users conduct exposome analysis consists of four R markdown files (.Rmd). We recommend going through the tutorials in order.
“example_0 - merge_datasets_together.Rmd” demonstrates how to merge the curated NHANES datasets together.
“example_1 - account_for_nhanes_design.Rmd” demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model.
“example_2 - calculate_summary_statistics.Rmd” demonstrates how to calculate summary statistics for one variable and multiple variables with and without accounting for the NHANES sampling design.
“example_3 - run_multiple_regressions.Rmd” demonstrates how to run multiple regression models with and without adjusting for the sampling design.
Abstract copyright UK Data Service and data collection copyright owner.
Background
The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.
Longitudinal data
The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
Secure Access data
Secure Access longitudinal datasets for the LFS are available for two-quarters (SN 7908) and five-quarters (SN 7909). The two-quarter datasets are available from April 2001 and the five-quarter datasets are available from June 2010. The Secure Access versions include additional, detailed variables not included in the standard 'End User Licence' (EUL) longitudinal datasets (see under GNs 33315 and 33316).
Extra variables that typically can be found in the Secure Access versions but not in the EUL versions relate to:
Occupation data for 2021 and 2022 data files
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022.
2022 Weighting
The population totals used for the latest LFS estimates use projected growth rates from Real Time Information (RTI) data for UK, EU and non-EU populations based on 2021 patterns. The total population used for the LFS...
Every four years, the Wasatch Front’s two metropolitan planning organizations (MPOs), Wasatch Front Regional Council (WFRC) and Mountainland Association of Governments (MAG), collaborate to update a set of annual small area (traffic analysis zone and ‘city area’, see descriptions below) population and employment projections for the Salt Lake City-West Valley City (WFRC), Ogden-Layton (WFRC), and Provo-Orem (MAG) urbanized areas.
These projections are primarily developed for the purpose of informing long-range transportation infrastructure and services planning done as part of the 4 year Regional Transportation Plan update cycle, as well as Utah’s Unified Transportation Plan, 2023-2050. Accordingly, the foundation for these projections is largely data describing existing conditions for a 2019 base year, the first year of the latest RTP process. The projections are included in the official travel models, which are publicly released at the conclusion of the RTP process.
Projections within the Wasatch Front urban area (SUBAREAID = 1) were produced using the Real Estate Market Model as described below. Socioeconomic forecasts produced for Cache MPO (Cache County, SUBAREAID = 2), Dixie MPO (Washington County, SUBAREAID = 3), Summit County (SUBAREAID = 4), and UDOT (other areas of the state, SUBAREAID = 0) all adhere to the University of Utah Gardner Policy Institute's county-level projection controls, but other modeling methods are used to arrive at the TAZ-level forecasts for these areas.
As these projections may be a valuable input to other analyses, this dataset is made available here as a public service for informational purposes only. It is solely the responsibility of the end user to determine the appropriate use of this dataset for other purposes.
Wasatch Front Real Estate Market Model (REMM) Projections
WFRC and MAG have developed a spatial statistical model using the UrbanSim modeling platform to assist in producing these annual projections. This model is called the Real Estate Market Model, or REMM for short. REMM is used for the urban portion of Weber, Davis, Salt Lake, and Utah counties. REMM relies on extensive inputs to simulate future development activity across the greater urbanized region. Key inputs to REMM include:
Demographic data from the decennial census
County-level population and employment projections -- used as REMM control totals -- are produced by the University of Utah’s Kem C. Gardner Policy Institute (GPI) funded by the Utah State Legislature
Current employment locational patterns derived from the Utah Department of Workforce Services
Land use visioning exercises and feedback, especially in regard to planned urban and local center development, with city and county elected officials and staff
Current land use and valuation GIS-based parcel data stewarded by County Assessors
Traffic patterns and transit service from the regional Travel Demand Model that together form the landscape of regional accessibility to workplaces and other destinations
Calibration of model variables to balance the fit of current conditions and dynamics at the county and regional level
‘Traffic Analysis Zone’ Projections
The annual projections are forecasted for each of the Wasatch Front’s 3,546 Traffic Analysis Zone (TAZ) geographic units. TAZ boundaries are set along roads, streams, and other physical features and average about 600 acres (0.94 square miles). TAZ sizes vary, with some TAZs in the densest areas representing only a single city block (25 acres).
‘City Area’ Projections
The TAZ-level output from the model is also available for ‘city areas’ that sum the projections for the TAZ geographies that roughly align with each city’s current boundary. As TAZs do not align perfectly with current city boundaries, the ‘city area’ summaries are not projections specific to a current or future city boundary, but the ‘city area’ summaries may be suitable surrogates or starting points upon which to base city-specific projections.
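The ‘city area’ summaries described above are sums over TAZ-level values, which could be sketched along these lines. The field names `city_area`, `year` and `households` are hypothetical and are not taken from the published dataset schema; the actual TAZ-to-city assignment is defined by WFRC and MAG.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sum_by_city_area(taz_rows: Iterable[dict], field: str) -> Dict[Tuple[str, int], float]:
    """Sum one projection field over all TAZ rows sharing a city area and year.

    Assumption: each row already carries the city area its TAZ is assigned to.
    """
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    for row in taz_rows:
        totals[(row["city_area"], row["year"])] += row[field]
    return dict(totals)


# Made-up rows, purely for illustration.
rows = [
    {"city_area": "A", "year": 2030, "households": 10},
    {"city_area": "A", "year": 2030, "households": 5},
    {"city_area": "B", "year": 2030, "households": 7},
]
city_totals = sum_by_city_area(rows, "households")
```

Because TAZs only roughly align with city boundaries, totals produced this way are surrogates, as the description notes, not boundary-exact figures.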
Summary Variables in the Datasets
Annual projection counts are available for the following variables (please read Key Exclusions note below):
Demographics
Household Population Count (excludes persons living in group quarters)
Household Count (excludes group quarters)
Employment
Typical Job Count (includes job types that exhibit typical commuting and other travel/vehicle use patterns)
Retail Job Count (retail, food service, hotels, etc)
Office Job Count (office, health care, government, education, etc)
Industrial Job Count (manufacturing, wholesale, transport, etc)
Non-Typical Job Count* (includes agriculture, construction, mining, and home-based jobs). This can be calculated by subtracting Typical Job Count from All Employment Count.
All Employment Count* (all jobs, this sums jobs from typical and non-typical sectors).
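The relationship between the job-count variables listed above (Non-Typical Job Count equals All Employment Count minus Typical Job Count) can be sketched as a small helper. The dictionary keys `all_employment`, `typical_jobs` and `non_typical_jobs` are hypothetical stand-ins, not the dataset's actual column names.

```python
def add_non_typical_jobs(record: dict) -> dict:
    """Return a copy of the record with the derived non-typical job count.

    Non-typical jobs = all employment - typical jobs, as stated in the
    variable list. Key names are assumptions for illustration only.
    """
    derived = dict(record)  # copy so the input record is not modified
    derived["non_typical_jobs"] = record["all_employment"] - record["typical_jobs"]
    return derived


# Made-up values, purely for illustration.
example = add_non_typical_jobs({"taz_id": 1, "all_employment": 120, "typical_jobs": 100})
```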
Key Exclusions from TAZ and ‘City Area’ Projections
As the primary purpose for the development of these population and employment projections is to model future travel in the region, REMM-based projections do not include population or households that reside in group quarters (prisons, senior centers, dormitories, etc), as residents of these facilities typically have a very low impact on regional travel. USTM-based projections also exclude group quarter populations. Group quarters population estimates are available at the county-level from GPI and at various sub-county geographies from the Census Bureau.
Statewide Projections
Population and employment projections for the Wasatch Front area can be combined with those developed by Dixie MPO (St. George area), Cache MPO (Logan area), and the Utah Department of Transportation (for the remainder of the state) into one database for use in the Utah Statewide Travel Model (USTM). While projections for the areas outside of the Wasatch Front use different forecasting methods, they contain the same summary-level population and employment projections making similar TAZ and ‘City Area’ data available statewide. WFRC plans, in the near future, to add additional areas to these projections datasets by including the projections from the USTM model.
https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This dataset offers a comprehensive and varied analysis of an organization's employees, focusing on areas such as employee attrition, personal and job-related factors, and financials. Included are numerous parameters such as Age, Gender, Marital Status, Business Travel Frequency, Daily Rate of Pay, and Department, as well as Distance From Home Office and Education Level Obtained by the employee in question. Also included is a series of parameters related to the job being performed, such as Job Involvement (level), Job Level (relative to similar roles within the same organization), Job Role (the function or task assigned to that individual), and total working hours in a week, month, or year, whether overtime or standard hours for a given role. Further details include Percent Salary Hike during their tenure with the company, from promotion or otherwise; Performance Rating, based on specific criteria established by leadership; Relationship Satisfaction among peers at the workplace, also taking into account outside family members who can influence stress levels in varying capacities; Monthly Income, considered at its starting point once hired and then compared against the monthly pay rate with overtime hours included if applicable; and Number of Companies Worked at before, if any. Lastly, the Retirement Status, commonly known as Attrition, is highlighted, covering whether there was an intent to stay with one employer through retirement age or whether attrition took place earlier than expected for reasons beyond one's control. Through this dataset you can gain insight into various major aspects of today's workforce management philosophies, which have changed drastically over time due to advancements in technology.
- Understand the variables that make up this dataset. The dataset includes several personal and job-related variables such as Age, Gender, Marital Status, Business Travel, Daily Rate, Department, Distance From Home, Education, Education Field, Employee Count, Employee Number, Environment Satisfaction, Hourly Rate and so on. Knowing what each variable means individually will help when exploring employee attrition as a whole.
- Analyze the data for patterns as well as outliers or anomalies either at an individual level or across all of the data points together. Identifying these patterns or discrepancies can offer insight into factors that are related to employee attrition.
- Visualize the data using charts and graphs to make it easy to see which relationships might be driving higher levels of employees leaving the organization over time. Dimensions like age or job role can be key factors in employee attrition rates, and visually displaying how they relate to one another can provide clarity into what needs to change within an organization in order to reduce attrition rates.
- Explore relationships between pairs of variables through correlation analysis. Correlations are measures of how strongly two variables are related. When looking at employment retention, it’s important to analyze correlations both at an individual level and for all variables together, showing which pairings have more influence than others on employee decisions.
- Use descriptive analytics methods such as scatter plots, histograms, boxplots, etc. with aggregated values from each field, like average age, average monthly income, etc. These analytics help gain a deeper understanding of where changes need to be made internally.
- Utilize predictive analytics with more advanced techniques such as regressions, clustering, and decision trees in order to identify trends from past data points, then build models on those insights from different perspectives, helping further prepare organizations against potentially high levels of employee departures.
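The pairwise correlation analysis suggested in the list above can be illustrated with a plain Pearson coefficient. This is a minimal sketch using only the standard library; the function name `pearson` and the sample values are made up for illustration and are not drawn from the dataset.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values standing in for two numeric columns such as
# age and monthly income.
ages = [25, 32, 41, 50]
incomes = [2500, 3200, 4100, 5000]
r = pearson(ages, incomes)
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear association; correlation alone does not show that one variable causes attrition.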
- Identifying performance profiles of employees at risk for attrition through predictive analytics and using this insight to create personalized development plans or retention strategies.
- Using the data to assess the impact of different financial incentives or variations in job role/structure on employee attitudes, satisfaction and ultimately attrition rates.
- Analyzing different age groups' responses to various perks or turnover patterns in order to understand how organizations can better engage different demographic segments
If you use this dataset in your research, pl...
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Hand transcribed content from the United States Bureau of Labour Statistics Dictionary of Titles (DoT). The DoT is a record of occupations and a description of the tasks performed. Five editions exist, from 1939, 1949, 1965, 1977 and 1991. The DoT was replaced by O*NET structured data on jobs, workers and their characteristics. However, apart from the 1991 data, the data in the DoT is not easily ingestible, existing only in scalar PDF documents. Attempts at Optical Character Recognition led to low accuracy. For that reason we present here hand transcribed textual data from these documents. Various data are available for each occupation, e.g. numerical codes, references to other occupations, as well as the free text description. For that reason the data for each edition is presented in 'long' format with a variable number of lines, with a blank line between occupations. Consult the transcription instructions for more details. Structured meta-data (see here) on occupations is also available for the 1965, 1977 and 1991 editions. For these editions, this data can be extracted from the numerical codes with the occupational entries; the key for these codes is found in separate tables in the 1965 edition, which were also transcribed. The instructions provided to transcribers for this edition are also added to the repository. The original documents are freely available in PDF format (e.g. here). This data accompanies the paper 'Longitudinal Complex Dynamics of Labour Markets Reveal Increasing Polarisation' by Althobaiti et al.
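The 'long' format described above (a variable number of lines per occupation, with a blank line between occupations) could be read with a sketch like the one below. It is an illustration under assumptions only: the function name `split_occupations` is hypothetical, the sample lines are placeholders and not real DoT entries, and the meaning of each line within a record should be taken from the transcription instructions.

```python
from typing import List


def split_occupations(text: str) -> List[List[str]]:
    """Group the non-blank lines of a transcription into one list per occupation.

    A blank (or whitespace-only) line ends the current occupation record.
    """
    records: List[List[str]] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            records.append(current)
            current = []
    if current:  # the last record may not be followed by a blank line
        records.append(current)
    return records


# Placeholder text, not taken from the DoT.
sample = "TITLE ONE\ncode line\ndescription line\n\nTITLE TWO\ndescription line\n"
occupations = split_occupations(sample)
```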
Abstract copyright UK Data Service and data collection copyright owner.
The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at the local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost. The APS aims to provide enhanced annual data for England, covering a target sample of at least 510 economically active persons for each Unitary Authority (UA)/Local Authority District (LAD) and at least 450 in each Greater London Borough. In combination with local LFS boost samples, the survey provides estimates for a range of indicators down to Local Education Authority (LEA) level across the United Kingdom.
For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation. For variable and value labelling and coding frames that are not included either in the data or in the current APS documentation, users are advised to consult the latest versions of the LFS User Guides, which are available from the ONS Labour Force Survey - User Guidance webpages.
Occupation data for 2021 and 2022
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. The affected datasets have now been updated. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022
APS Well-Being Datasets
From 2012-2015, the ONS published separate APS datasets aimed at providing initial estimates of subjective well-being, based on the Integrated Household Survey. In 2015 these were discontinued. A separate set of well-being variables and a corresponding weighting variable have been added to the April-March APS person datasets from A11M12 onwards. Further information on the transition can be found in the Personal well-being in the UK: 2015 to 2016 article on the ONS website.
APS disability variables
Over time, there have been some updates to disability variables in the APS. An article explaining the quality assurance investigations on these variables that have been conducted so far is available on the ONS Methodology webpage.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Important Dataset Update 6/24/2020: Summit and Wasatch Counties updated.
Important Dataset Update 6/12/2020: MAG area updated.
Important Dataset Update 7/15/2019: This dataset now includes projections for all populated statewide traffic analysis zones (TAZs). Projections within the Wasatch Front urban area (SUBAREAID = 1) were produced using the Real Estate Market Model as described below. Socioeconomic forecasts produced for Cache MPO (Cache County, SUBAREAID = 2), Dixie MPO (Washington County, SUBAREAID = 3), Summit County (SUBAREAID = 4), and UDOT (other areas of the state, SUBAREAID = 0) all adhere to the University of Utah Gardner Policy Institute's county-level projection controls, but other modeling methods are used to arrive at the TAZ-level forecasts for these areas.
As with any dataset that presents projections into the future, it is important to have a full understanding of the data before using it. Before using this data, you are strongly encouraged to read the metadata description below and direct any questions or feedback about this data to analytics@wfrc.org.
Every four years, the Wasatch Front’s two metropolitan planning organizations (MPOs), Wasatch Front Regional Council (WFRC) and Mountainland Association of Governments (MAG), collaborate to update a set of annual small area (traffic analysis zone and ‘city area’, see descriptions below) population and employment projections for the Salt Lake City-West Valley City (WFRC), Ogden-Layton (WFRC), and Provo-Orem (MAG) urbanized areas. These projections are primarily developed for the purpose of informing long-range transportation infrastructure and services planning done as part of the 4 year Regional Transportation Plan update cycle, as well as Utah’s Unified Transportation Plan, 2019-2050. Accordingly, the foundation for these projections is largely data describing existing conditions for a 2015 base year, the first year of the latest RTP process.
The projections are included in the official travel models, which are publicly released at the conclusion of the RTP process. As these projections may be a valuable input to other analyses, this dataset is made available at http://data.wfrc.org/search?q=projections as a public service for informational purposes only. It is solely the responsibility of the end user to determine the appropriate use of this dataset for other purposes.
Wasatch Front Real Estate Market Model (REMM) Projections
WFRC and MAG have developed a spatial statistical model using the UrbanSim modeling platform to assist in producing these annual projections. This model is called the Real Estate Market Model, or REMM for short. REMM is used for the urban portion of Weber, Davis, Salt Lake, and Utah counties. REMM relies on extensive inputs to simulate future development activity across the greater urbanized region. Key inputs to REMM include:
Demographic data from the decennial census;
County-level population and employment projections -- used as REMM control totals -- are produced by the University of Utah’s Kem C. Gardner Policy Institute (GPI) funded by the Utah State Legislature;
Current employment locational patterns derived from the Utah Department of Workforce Services;
Land use visioning exercises and feedback, especially in regard to planned urban and local center development, with city and county elected officials and staff;
Current land use and valuation GIS-based parcel data stewarded by County Assessors;
Traffic patterns and transit service from the regional Travel Demand Model that together form the landscape of regional accessibility to workplaces and other destinations; and
Calibration of model variables to balance the fit of current conditions and dynamics at the county and regional level.
‘Traffic Analysis Zone’ Projections
The annual projections are forecasted for each of the Wasatch Front’s 2,800+ Traffic Analysis Zone (TAZ) geographic units.
TAZ boundaries are set along roads, streams, and other physical features and average about 600 acres (0.94 square miles). TAZ sizes vary, with some TAZs in the densest areas representing only a single city block (25 acres).
‘City Area’ Projections
The TAZ-level output from the model is also available for ‘city areas’ that sum the projections for the TAZ geographies that roughly align with each city’s current boundary. As TAZs do not align perfectly with current city boundaries, the ‘city area’ summaries are not projections specific to a current or future city boundary, but they may be suitable surrogates or starting points upon which to base city-specific projections.
Summary Variables in the Datasets
Annual projection counts are available for the following variables (please read the Key Exclusions note below):
Demographics
- Household Population Count (excludes persons living in group quarters)
- Household Count (excludes group quarters)
Employment
- Typical Job Count (includes job types that exhibit typical commuting and other travel/vehicle use patterns)
- Retail Job Count (retail, food service, hotels, etc.)
- Office Job Count (office, health care, government, education, etc.)
- Industrial Job Count (manufacturing, wholesale, transport, etc.)
- Non-Typical Job Count* (includes agriculture, construction, mining, and home-based jobs). This can be calculated by subtracting Typical Job Count from All Employment Count.
- All Employment Count* (all jobs; this sums jobs from typical and non-typical sectors)
* These variables include REMM’s attempt to estimate construction jobs in areas that experience new development and redevelopment activity. Areas may see short-term fluctuations in Non-Typical and All Employment counts due to the temporary location of construction jobs.
Population and employment projections for the Wasatch Front area can be combined with those developed by Dixie MPO (St. George area), Cache MPO (Logan area), and the Utah Department of Transportation (for the remainder of the state) into one database for use in the Utah Statewide Travel Model (USTM). While projections for the areas outside of the Wasatch Front use different forecasting methods, they contain the same summary-level population and employment projections, making similar TAZ and ‘City Area’ data available statewide. WFRC plans, in the near future, to add additional areas to these projections datasets by including the projections from the USTM model.
Key Exclusions from TAZ and ‘City Area’ Projections
As the primary purpose for the development of these population and employment projections is to model future travel in the region, REMM-based projections do not include population or households that reside in group quarters (prisons, senior centers, dormitories, etc.), as residents of these facilities typically have a very low impact on regional travel. USTM-based projections also exclude group quarters populations. Group quarters population estimates are available at the county level from GPI and at various sub-county geographies from the Census Bureau.
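The two derivations described above (Non-Typical Job Count as All Employment Count minus Typical Job Count, and ‘city area’ totals as sums over TAZs) can be sketched as follows. This is only an illustration: the field names (taz_id, city_area, year, typical_jobs, all_jobs) and all numbers are hypothetical and are not taken from the published dataset, whose actual field names may differ.

```python
from collections import defaultdict

# Hypothetical TAZ-level records; field names and values are illustrative only,
# not taken from the published dataset.
taz_rows = [
    {"taz_id": 1, "city_area": "City A", "year": 2030, "typical_jobs": 120, "all_jobs": 150},
    {"taz_id": 2, "city_area": "City A", "year": 2030, "typical_jobs": 80, "all_jobs": 95},
    {"taz_id": 3, "city_area": "City B", "year": 2030, "typical_jobs": 40, "all_jobs": 40},
]


def non_typical_jobs(row):
    """Non-Typical Job Count = All Employment Count - Typical Job Count."""
    return row["all_jobs"] - row["typical_jobs"]


def city_area_totals(rows):
    """Sum TAZ-level counts into (city_area, year) totals."""
    totals = defaultdict(lambda: {"typical_jobs": 0, "all_jobs": 0, "non_typical_jobs": 0})
    for row in rows:
        key = (row["city_area"], row["year"])
        totals[key]["typical_jobs"] += row["typical_jobs"]
        totals[key]["all_jobs"] += row["all_jobs"]
        totals[key]["non_typical_jobs"] += non_typical_jobs(row)
    return dict(totals)


summary = city_area_totals(taz_rows)
for (city, year), counts in sorted(summary.items()):
    print(city, year, counts)
```

Because TAZs only roughly align with city boundaries, totals produced this way inherit the same caveat as the published ‘city area’ summaries: they are starting points, not projections for an exact city boundary.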