69 datasets found
  1. Public Sector Apprenticeship Target - Dataset - York Open Data

    • data.yorkopendata.org
    • ckan.york.staging.datopian.com
    Updated Sep 18, 2018
    Cite
    (2018). Public Sector Apprenticeship Target - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/public-sector-apprenticeship-target
    Dataset updated
    Sep 18, 2018
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    York
    Description

    This data is published in accordance with the Public Sector Apprenticeship Target. Public Bodies employing more than 250 people are required to have regard to the target of 2.3% of employees being new starting apprentices.
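
    The arithmetic behind the target can be sketched as follows. This is a minimal illustration, not an official calculation: the function name, the result keys, and the example figures are hypothetical, and the way headcount and new starts are counted over a reporting period is an assumption.

```python
TARGET_SHARE = 0.023        # 2.3% of employees, as stated in the description
HEADCOUNT_THRESHOLD = 250   # bodies employing more than 250 people are in scope


def apprenticeship_position(headcount: int, new_apprentices: int) -> dict:
    """Summarise one public body's position against the target (illustrative)."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    in_scope = headcount > HEADCOUNT_THRESHOLD
    share = new_apprentices / headcount
    return {
        "in_scope": in_scope,
        "share": share,
        # Only meaningful for bodies in scope; None otherwise.
        "meets_target": (share >= TARGET_SHARE) if in_scope else None,
    }


# Hypothetical figures: 50 new apprentices among 2000 employees.
example = apprenticeship_position(headcount=2000, new_apprentices=50)
print(example)
```

    With these made-up figures the share is 50 / 2000 = 2.5%, which is above the 2.3% target.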

  2. Employee Dataset

    • kaggle.com
    zip
    Updated Nov 10, 2024
    Cite
    Abhisek Das (2024). Employee Dataset [Dataset]. https://www.kaggle.com/datasets/intellabhi/employee-dataset
    Available download formats: zip (243351 bytes)
    Dataset updated
    Nov 10, 2024
    Authors
    Abhisek Das
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains employee-related information from a company, with the goal of predicting employee attrition (i.e., whether an employee leaves the company or not). The dataset includes various features related to employee demographics, job satisfaction, work environment, compensation, and more. This can be used to build machine learning models to predict employee attrition, identify patterns, and assist in making data-driven decisions to improve retention strategies.

    status: This is the target variable. The value 'left' indicates that the employee left the company (attrition), and 'stayed' indicates that the employee stayed.
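
    A minimal sketch of turning the status column into a binary label is shown below. The helper names are hypothetical, and the choice of 1 for 'left' is an assumption made for illustration; the dataset itself only defines the two string values.

```python
def encode_status(status: str) -> int:
    """Map the target strings to 1 (left, i.e. attrition) or 0 (stayed)."""
    mapping = {"left": 1, "stayed": 0}
    return mapping[status.strip().lower()]


def attrition_rate(statuses: list[str]) -> float:
    """Share of rows whose status is 'left'."""
    labels = [encode_status(s) for s in statuses]
    return sum(labels) / len(labels)


# Hypothetical sample of four rows.
sample = ["left", "stayed", "stayed", "left"]
print(attrition_rate(sample))
```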

  3. Public Sector Apprenticeship Target - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Oct 28, 2018
    Cite
    ckan.publishing.service.gov.uk (2018). Public Sector Apprenticeship Target - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/public-sector-apprenticeship-target
    Dataset updated
    Oct 28, 2018
    Dataset provided by
    CKAN: https://ckan.org/
    Description

    This data is published in accordance with the Public Sector Apprenticeship Target. Public Bodies employing more than 250 people are required to have regard to the target of 2.3% of employees being new starting apprentices.

  4. Employee Churn at Dunder Mifflin Paper Company

    • kaggle.com
    zip
    Updated Sep 24, 2023
    Cite
    Soheil Changizi (2023). Employee Churn at Dunder Mifflin Paper Company [Dataset]. https://www.kaggle.com/datasets/cocolicoq4/employee-churn-at-dunder-mifflin-paper-company/code
    Available download formats: zip (33026 bytes)
    Dataset updated
    Sep 24, 2023
    Authors
    Soheil Changizi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Predicting Employee Churn at Dunder Mifflin Paper Company

    In the quaint town of Scranton, Pennsylvania, lies the regional branch of the Dunder Mifflin Paper Company, a well-established and somewhat quirky paper company. Dunder Mifflin has been a staple of the local community for years, providing paper products to businesses and individuals alike. However, the company is facing a unique challenge: employee churn.

    The regional manager, Michael Scott, is deeply concerned about the high turnover rate among the employees. He believes that by understanding the factors contributing to employee churn, the company can take steps to improve employee satisfaction and retention.

    Dataset Features:
    1. EmployeeID: A unique identifier for each employee.
    2. Tenure: The number of years the employee has been with the company.
    3. Salary: The employee's annual salary.
    4. Department: The department in which the employee works (e.g., Sales, Accounting, Customer Service).
    5. JobSatisfaction: The employee's self-reported job satisfaction level (on a scale from 1 to 5, with 5 being highly satisfied).
    6. WorkLifeBalance: The employee's self-reported work-life balance rating (on a scale from 1 to 5, with 5 being excellent).
    7. CommuteDistance: The distance the employee commutes to work (e.g., Short, Medium, Long).
    8. MaritalStatus: The marital status of the employee (e.g., Single, Married, Divorced).
    9. Education: The highest level of education attained by the employee (e.g., High School, Bachelor's, Master's).
    10. PerformanceRating: The employee's performance rating (on a scale from 1 to 5, with 5 being excellent).
    11. TrainingHours: The number of hours of training the employee has received.
    12. OverTime: Whether the employee works overtime or not.
    13. NumProjects: The number of projects the employee is currently working on.
    14. YearsSincePromotion: The number of years since the employee's last promotion.
    15. EnvironmentSatisfaction: The employee's self-reported environment satisfaction (on a scale from 1 to 5, with 5 being highly satisfied).
    16. Branch: The "Branch" feature represents the geographic location of each employee within one of the 12 Dunder Mifflin branches across the United States.

    Classes (Target Variable): Employees will be classified into four classes based on their likelihood to leave the company:
    - Class A: Highly likely to leave.
    - Class B: Moderately likely to leave.
    - Class C: Slightly likely to leave.

    This classification problem can help Dunder Mifflin Paper Company identify key factors contributing to employee turnover and implement strategies to improve employee retention and workplace satisfaction, all in a setting reminiscent of the beloved TV show "The Office."

    Usage Note:

    This fictional dataset is intended solely for educational and illustrative purposes. While it may resemble real-world data in some aspects, it should not be used for making real business decisions or drawing conclusions about real employees or organizations.

    Any analysis or modeling performed on this dataset should be considered fictional and should not be extrapolated to real-world scenarios.

    Please keep in mind that the dataset is purely fictional and is meant to provide a lighthearted and relatable context for learning and practicing data analysis and machine learning techniques.

  5. SASP Target 53 - Aboriginal Employees - Dataset - data.sa.gov.au

    • data.sa.gov.au
    Updated Jul 2, 2015
    + more versions
    Cite
    data.sa.gov.au (2015). SASP Target 53 - Aboriginal Employees - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/sasp-target-53-aboriginal-employees
    Dataset updated
    Jul 2, 2015
    Dataset provided by
    Government of South Australia: http://sa.gov.au/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Australia, South Australia
    Description

    Increase the participation of Aboriginal people in the South Australian public sector.

  6. Jobs and Job Density, Borough

    • data.europa.eu
    + more versions
    Cite
    Greater London Authority, Jobs and Job Density, Borough [Dataset]. https://data.europa.eu/data/datasets/jobs-and-job-density-borough~~1?locale=et
    Dataset authored and provided by
    Greater London Authority
    Description

    Data shows the number of jobs and job density by borough.

    The number of jobs in an area is composed of jobs done by residents (of any age) and jobs done by workers (of any age) who commute into the area.

    Total jobs is a workplace based measure of jobs and comprises:

    - employees (from the Annual Business Inquiry),
    - self-employment jobs (from the Annual Population Survey); people who are self-employed in a second job are included in the self-employed totals,
    - government-supported trainees (from DfES and DWP), and
    - HM Forces (from MoD).

    Job density is the number of jobs per resident of working age (male and female: 16-64). For example, a job density of 1.0 would mean that there is one job for every resident of working age in the population.
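
    The two definitions above (total jobs as a sum of four components, and job density as jobs per working-age resident) can be sketched as follows. The function and parameter names are hypothetical, and the figures are made up to reproduce the density of 1.0 used as the example in the description.

```python
def total_jobs(employees: int, self_employment_jobs: int,
               government_supported_trainees: int, hm_forces: int) -> int:
    """Workplace-based total jobs: the sum of the four listed components."""
    return (employees + self_employment_jobs
            + government_supported_trainees + hm_forces)


def job_density(jobs: int, working_age_residents: int) -> float:
    """Jobs per resident of working age (16-64)."""
    if working_age_residents <= 0:
        raise ValueError("working_age_residents must be positive")
    return jobs / working_age_residents


# Hypothetical borough: 1000 jobs in total and 1000 working-age residents.
jobs = total_jobs(employees=900, self_employment_jobs=80,
                  government_supported_trainees=15, hm_forces=5)
density = job_density(jobs, working_age_residents=1000)
print(jobs, density)
```

    A density above 1.0 would indicate more jobs than working-age residents, which is consistent with the description's note that jobs include those done by workers who commute into the area.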

    More information on jobs is available in Workplace Employment by Sex and Status, Borough, and modelled estimates and projections of jobs are available in the GLA Employment Projections. These are considered to be the most accurate jobs estimates at borough level.

    Download this data from NOMIS

  7. Employee Promotion Prediction Dataset

    • kaggle.com
    zip
    Updated May 1, 2025
    Cite
    AV Sohan Aiyappa (2025). Employee Promotion Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/avsohanaiyappa/employee-promotion-prediction-dataset/data
    Available download formats: zip (2817 bytes)
    Dataset updated
    May 1, 2025
    Authors
    AV Sohan Aiyappa
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This is a synthetic dataset that contains information about 300 employees, each represented by three features: date_of_birth, date_of_joining, and gender. The date_of_birth and date_of_joining columns are provided in dd-mm-yyyy format; the employee's age and tenure with the company can be derived from them respectively. The gender column includes values such as male, female, and other. The target variable, promoted, indicates whether an employee received a promotion (yes) or not (no). The dataset is logically structured such that employees who are older, have spent more time in the company, and identify as female have a higher likelihood of being promoted.
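
    Since the two date columns use dd-mm-yyyy, age and tenure have to be derived before modelling. A minimal sketch using only the standard library follows; the helper names, the example dates, and the reference date are hypothetical.

```python
from datetime import date, datetime


def parse_dmy(text: str) -> date:
    """Parse a dd-mm-yyyy string such as '15-06-1990'."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def whole_years_between(start: date, end: date) -> int:
    """Completed years from start to end (used for both age and tenure)."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet been reached in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


# Hypothetical employee and reference date.
reference = date(2025, 5, 1)
age = whole_years_between(parse_dmy("15-06-1990"), reference)
tenure = whole_years_between(parse_dmy("01-03-2015"), reference)
print(age, tenure)
```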

  8. Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat...

    • datarade.ai
    Updated Oct 25, 2024
    + more versions
    Cite
    Success.ai (2024). Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat files available – 170M+, Verified Profiles - Best Price Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-us-premium-b2b-emails-phone-numbers-dataset-success-ai
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 25, 2024
    Area covered
    United States
    Description

    Success.ai offers a comprehensive, enterprise-ready B2B leads data solution, ideal for businesses seeking access to over 150 million verified employee profiles and 170 million work emails. Our data empowers organizations across industries to target key decision-makers, optimize recruitment, and fuel B2B marketing efforts. Whether you're looking for UK B2B data, B2B marketing data, or global B2B contact data, Success.ai provides the insights you need with pinpoint accuracy.

    Tailored for B2B Sales, Marketing, Recruitment and more: Our B2B contact data and B2B email data solutions are designed to enhance your lead generation, sales, and recruitment efforts. Build hyper-targeted lists based on job title, industry, seniority, and geographic location. Whether you’re reaching mid-level professionals or C-suite executives, Success.ai delivers the data you need to connect with the right people.

    API Features:

    • Real-Time Updates: Our APIs deliver real-time updates, ensuring that the contact data your business relies on is always current and accurate.
    • High Volume Handling: Designed to support up to 860k API calls per day, our system is built for scalability and responsiveness, catering to enterprises of all sizes.
    • Flexible Integration: Easily integrate with CRM systems, marketing automation tools, and other enterprise applications to streamline your workflows and enhance productivity.
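
    To put the stated daily volume in perspective, it can be converted into an average rate. The sketch below is plain arithmetic on the 860k figure from the list above; the constant and function names are hypothetical, and nothing here reflects how the vendor's API actually enforces limits.

```python
DAILY_CALL_LIMIT = 860_000        # "up to 860k API calls per day" from the listing
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400


def average_calls_per_second(daily_limit: int = DAILY_CALL_LIMIT) -> float:
    """Average sustained rate if calls are spread evenly over a day."""
    return daily_limit / SECONDS_PER_DAY


def min_interval_seconds(daily_limit: int = DAILY_CALL_LIMIT) -> float:
    """Spacing between calls that would exactly use up the daily volume."""
    return SECONDS_PER_DAY / daily_limit


print(round(average_calls_per_second(), 2), round(min_interval_seconds(), 3))
```

    Spread evenly, the figure works out to just under ten calls per second.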

    Key Categories Served:
    • B2B sales leads – Identify decision-makers in key industries
    • B2B marketing data – Target professionals for your marketing campaigns
    • Recruitment data – Source top talent efficiently and reduce hiring times
    • CRM enrichment – Update and enhance your CRM with verified, updated data
    • Global reach – Coverage across 195 countries, including the United States, United Kingdom, Germany, India, Singapore, and more

    Global Coverage with Real-Time Accuracy: Success.ai’s dataset spans a wide range of industries such as technology, finance, healthcare, and manufacturing. With continuous real-time updates, your team can rely on the most accurate data available:
    • 150M+ Employee Profiles: Access professional profiles worldwide with insights including full name, job title, seniority, and industry.
    • 170M Verified Work Emails: Reach decision-makers directly with verified work emails, available across industries and geographies, including Singapore and UK B2B data.
    • GDPR-Compliant: Our data is fully compliant with GDPR and other global privacy regulations, ensuring safe and legal use of B2B marketing data.

    Key Data Points for Every Employee Profile: Every profile in Success.ai’s database includes over 20 critical data points, providing the information needed to power B2B sales and marketing campaigns: Full Name, Job Title, Company, Work Email, Location, Phone Number, LinkedIn Profile, Experience, Education, Technographic Data, Languages, Certifications, Industry, Publications & Awards.

    Use Cases Across Industries: Success.ai’s B2B data solution is incredibly versatile and can support various enterprise use cases, including:
    • B2B Marketing Campaigns: Reach high-value professionals in industries such as technology, finance, and healthcare.
    • Enterprise Sales Outreach: Build targeted B2B contact lists to improve sales efforts and increase conversions.
    • Talent Acquisition: Accelerate hiring by sourcing top talent with accurate and updated employee data, filtered by job title, industry, and location.
    • Market Research: Gain insights into employment trends and company profiles to enrich market research.
    • CRM Data Enrichment: Ensure your CRM stays accurate by integrating updated B2B contact data.
    • Event Targeting: Create lists for webinars, conferences, and product launches by targeting professionals in key industries.

    Use Cases for Success.ai's Contact Data:
    • Targeted B2B Marketing: Create precise campaigns by targeting key professionals in industries like tech and finance.
    • Sales Outreach: Build focused sales lists of decision-makers and C-suite executives for faster deal cycles.
    • Recruiting Top Talent: Easily find and hire qualified professionals with updated employee profiles.
    • CRM Enrichment: Keep your CRM current with verified, accurate employee data.
    • Event Targeting: Create attendee lists for events by targeting relevant professionals in key sectors.
    • Market Research: Gain insights into employment trends and company profiles for better business decisions.
    • Executive Search: Source senior executives and leaders for headhunting and recruitment.
    • Partnership Building: Find the right companies and key people to develop strategic partnerships.

    Why Choose Success.ai’s Employee Data? Success.ai is the top choice for enterprises looking for comprehensive and affordable B2B data solutions. Here’s why: Unmatched Accuracy: Our AI-powered validation process ensures 99% accuracy across all data points, resulting in higher engagement and fewer bounces. Global Scale: With 150M+ employee profiles and 170M veri...

  9. HR Analytics: Job Change of Data Scientists

    • kaggle.com
    zip
    Updated Sep 11, 2021
    Cite
    Abhishek Kumar (2021). HR Analytics: Job Change of Data Scientists [Dataset]. https://www.kaggle.com/uniabhi/hr-analytics-job-change-of-data-scientists
    Available download formats: zip (301600 bytes)
    Dataset updated
    Sep 11, 2021
    Authors
    Abhishek Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context and Content
    A company active in Big Data and Data Science wants to hire data scientists from among the people who successfully pass courses conducted by the company. Many people sign up for the training. The company wants to know which of these candidates really want to work for the company after training and which are looking for new employment, because this helps to reduce cost and time, improve the quality of training, and plan the courses and the categorization of candidates. Information on demographics, education, and experience is available from candidates' signup and enrollment.

    This dataset is also designed to help HR research understand the factors that lead a person to leave their current job. Using model(s) built on the current credentials, demographics, and experience data, you will predict the probability that a candidate will look for a new job or will work for the company, and interpret the factors affecting the employee's decision.

    The data are divided into train and test sets. The target is not included in the test set, but a file with the test target values is available for related tasks. A sample submission corresponding to the enrollee_id values of the test set is also provided, with the columns enrollee_id and target.

    Note:

    The dataset is imbalanced. Most features are categorical (Nominal, Ordinal, Binary), some with high cardinality. Missing-value imputation can be a part of your pipeline as well.

    Features

    enrollee_id: Unique ID for candidate

    city: City code

    city_development_index: Development index of the city (scaled)

    gender: Gender of candidate

    relevent_experience: Relevant experience of candidate

    enrolled_university: Type of University course enrolled if any

    education_level: Education level of candidate

    major_discipline: Education major discipline of candidate

    experience: Candidate total experience in years

    company_size: No of employees in current employer's company

    company_type: Type of current employer

    last_new_job: Difference in years between previous job and current job

    training_hours: Training hours completed

    target: 0 – Not looking for job change, 1 – Looking for a job change
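
    The note about imbalance and missing values suggests two common preprocessing steps, sketched here with the standard library only. The helper names and sample values are hypothetical. Mode imputation and the "balanced" weighting heuristic (n_samples / (n_classes * class_count)) are one reasonable choice among several, not something the dataset prescribes.

```python
from collections import Counter


def impute_mode(values: list, missing=("", None)) -> list:
    """Replace missing categorical entries with the most frequent observed value."""
    observed = [v for v in values if v not in missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v in missing else v for v in values]


def balanced_class_weights(targets: list) -> dict:
    """Weight each class inversely to its frequency."""
    counts = Counter(targets)
    n_samples, n_classes = len(targets), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical values for the gender and target columns.
genders = impute_mode(["Male", None, "Male", "Female", ""])
weights = balanced_class_weights([0, 0, 0, 1])
print(genders, weights)
```

    The resulting weights give the minority class (target = 1) a larger contribution to the loss, which is one way to counter the imbalance when fitting a classifier.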

    Inspiration
    - Predict the probability that a candidate will work for the company.
    - Interpret the model(s) in a way that illustrates which features affect the candidate's decision.

    Please refer to the following task for more details: https://www.kaggle.com/arashnic/hr-analytics-job-change-of-data-scientists/tasks?taskId=3015

  10. SASP Target 50 - People with Disability - Dataset - data.sa.gov.au

    • data.sa.gov.au
    Updated Jul 2, 2015
    + more versions
    Cite
    (2015). SASP Target 50 - People with Disability - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/sasp-target-50-people-with-disability
    Dataset updated
    Jul 2, 2015
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Australia, South Australia
    Description

    Increase by 10% the number of people with a disability employed in South Australia by 2020.

  11. Data from: LFS

    • datacatalogue.ukdataservice.ac.uk
    Updated May 20, 2025
    + more versions
    Cite
    Office for National Statistics (2025). LFS [Dataset]. http://doi.org/10.5255/UKDA-SN-9389-1
    Dataset updated
    May 20, 2025
    Dataset provided by
    UK Data Service: https://ukdataservice.ac.uk/
    Authors
    Office for National Statistics
    Area covered
    United Kingdom
    Description

    Background
    The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.

    Longitudinal data
    The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. A full series of longitudinal data has been produced, going back to winter 1992. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
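
    The rotating panel described above (five consecutive quarters per household, a fifth of the sample replaced each quarter) determines how many cohorts can be linked in the two-quarter and five-quarter datasets. The sketch below models this with integer quarter and cohort indices, which are hypothetical labels; it ignores non-response and attrition, which the weighting is meant to address.

```python
WAVES = 5  # each household stays in the sample for five consecutive quarters


def cohorts_in_sample(quarter: int) -> list[int]:
    """Cohort c enters in quarter c and is interviewed in quarters c..c+4."""
    return list(range(quarter - WAVES + 1, quarter + 1))


def linkable_cohorts(first_quarter: int, span: int) -> list[int]:
    """Cohorts present in every quarter of a window of `span` quarters."""
    per_quarter = [set(cohorts_in_sample(q))
                   for q in range(first_quarter, first_quarter + span)]
    return sorted(set.intersection(*per_quarter))


two_quarter = linkable_cohorts(first_quarter=10, span=2)
five_quarter = linkable_cohorts(first_quarter=10, span=5)
print(two_quarter, five_quarter)
```

    Under this simplified model, four of the five cohorts overlap between consecutive quarters, while only one cohort spans a full five-quarter window.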

    New reweighting policy
    Following the new reweighting policy, ONS has reviewed the latest population estimates made available during 2019 and has decided not to carry out a 2019 LFS and APS reweighting exercise. Therefore, the next reweighting exercise will take place in 2020. This will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published towards the end of 2020/early 2021.

    LFS Documentation
    The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned. However, volumes are updated periodically by ONS, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.

    Additional data derived from the QLFS
    The Archive also holds further QLFS series: End User Licence (EUL) quarterly data; Secure Access datasets; household datasets; quarterly, annual and ad hoc module datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.

    Variables DISEA and LNGLST
    Dataset A08 (Labour market status of disabled people) which ONS suspended due to an apparent discontinuity between April to June 2017 and July to September 2017 is now available. As a result of this apparent discontinuity and the inconclusive investigations at this stage, comparisons should be made with caution between April to June 2017 and subsequent time periods. However users should note that the estimates are not seasonally adjusted, so some of the change between quarters could be due to seasonality. Further recommendations on historical comparisons of the estimates will be given in November 2018 when ONS are due to publish estimates for July to September 2018.

    An article explaining the quality assurance investigations that have been conducted so far is available on the ONS Methodology webpage. For any queries about Dataset A08 please email Labour.Market@ons.gov.uk.

    Occupation data for 2021 and 2022 data files

    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022 (https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/articles/revisionofmiscodedoccupationaldataintheonslabourforcesurveyuk/january2021toseptember2022).

    2022 Weighting

    The population totals used for the latest LFS estimates use projected growth rates from Real Time Information (RTI) data for UK, EU and non-EU populations based on 2021 patterns. The total population used for the LFS therefore does not take into account any changes in migration, birth rates, death rates, and so on since June 2021, and hence levels estimates may be under- or over-estimating the true values and should be used with caution. Estimates of rates will, however, be robust.

    Production of two-quarter longitudinal data resumed, April 2024

    In April 2024, ONS resumed production of the two-quarter longitudinal data, along with quarterly household data. As detailed in the ONS Labour Market Transformation update of April 2024, for longitudinal data, flows between October to December 2023 and January to March 2024 will similarly mark the start of a new time series. This will be consistent with LFS weighting from equivalent person quarterly datasets, but will not be consistent with historic longitudinal data before this period.

  12. Labour Force Survey Two-Quarter Longitudinal Dataset, July - December, 2024

    • datacatalogue.cessda.eu
    Updated Feb 28, 2025
    + more versions
    Cite
    Office for National Statistics (2025). Labour Force Survey Two-Quarter Longitudinal Dataset, July - December, 2024 [Dataset]. http://doi.org/10.5255/UKDA-SN-9348-1
    Dataset updated
    Feb 28, 2025
    Authors
    Office for National Statistics
    Time period covered
    Jul 1, 2024 - Dec 31, 2024
    Area covered
    United Kingdom
    Variables measured
    Individuals
    Measurement technique
    Compilation or synthesis of existing material. The datasets were created from existing LFS data. They do not contain all records, but only those of respondents of working age who have responded to the survey in all the periods being linked. The data therefore comprise a subset of variables representing approximately one third of all QLFS variables. Cases were linked using the QLFS panel design.
    Description

    Abstract copyright UK Data Service and data collection copyright owner.

    Background
    The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.

    Longitudinal data
    The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. A full series of longitudinal data has been produced, going back to winter 1992. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.

    New reweighting policy
    Following the new reweighting policy, ONS has reviewed the latest population estimates made available during 2019 and has decided not to carry out a 2019 LFS and APS reweighting exercise. Therefore, the next reweighting exercise will take place in 2020. It will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published towards the end of 2020/early 2021.

    LFS Documentation
    The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned. However, volumes are updated periodically by ONS, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.

    Additional data derived from the QLFS
    The Archive also holds further QLFS series: End User Licence (EUL) quarterly data; Secure Access datasets; household datasets; quarterly, annual and ad hoc module datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.

    Variables DISEA and LNGLST
    Dataset A08 (Labour market status of disabled people) which ONS suspended due to an apparent discontinuity between April to June 2017 and July to September 2017 is now available. As a result of this apparent discontinuity and the inconclusive...

  13. Employed population below international poverty line by sex and age...

    • hub.arcgis.com
    • global-fistula-hub-ucsf.hub.arcgis.com
    Updated Feb 3, 2021
    Cite
    Direct Relief (2021). Employed population below international poverty line by sex and age (percent) [Dataset]. https://hub.arcgis.com/datasets/dced3a1f900c4c99ba0f3dc7de86b869
    Explore at:
    Dataset updated
    Feb 3, 2021
    Dataset authored and provided by
    Direct Relief
    Description

    Series Name: Employed population below international poverty line by sex and age (percent)
    Series Code: SI_POV_EMP1
    Release Version: 2020.Q2.G.03

    This dataset is part of the Global SDG Indicator Database compiled through the UN System in preparation for the Secretary-General's annual report on Progress towards the Sustainable Development Goals.

    Indicator 1.1.1: Proportion of the population living below the international poverty line by sex, age, employment status and geographic location (urban/rural)
    Target 1.1: By 2030, eradicate extreme poverty for all people everywhere, currently measured as people living on less than $1.25 a day
    Goal 1: End poverty in all its forms everywhere

    For more information on the compilation methodology of this dataset, see https://unstats.un.org/sdgs/metadata/

  14. United States Unemployment Rate

    • tradingeconomics.com
    • pt.tradingeconomics.com
    csv, excel, json, xml
    Updated Nov 20, 2025
    Cite
    TRADING ECONOMICS (2025). United States Unemployment Rate [Dataset]. https://tradingeconomics.com/united-states/unemployment-rate
    Explore at:
    Available download formats: excel, xml, csv, json
    Dataset updated
    Nov 20, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1948 - Sep 30, 2025
    Area covered
    United States
    Description

    The unemployment rate in the United States increased to 4.40 percent in September 2025 from 4.30 percent in August. This dataset provides the latest reported value for the United States Unemployment Rate, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  15. Data Use in Academia Dataset

    • datacatalog.worldbank.org
    csv, utf-8
    Updated Nov 27, 2023
    Cite
    Semantic Scholar Open Research Corpus (S2ORC) (2023). Data Use in Academia Dataset [Dataset]. https://datacatalog.worldbank.org/search/dataset/0065200/data_use_in_academia_dataset
    Explore at:
    Available download formats: utf-8, csv
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Brian William Stacy
    Semantic Scholar Open Research Corpus (S2ORC)
    License

    https://datacatalog.worldbank.org/public-licenses?fragment=cc

    Description

    This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.


    Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.


    We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and a parsed PDF or LaTeX file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF or LaTeX file is needed to extract key information such as the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical systems: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.


    Due to the intensive computer resources required, a set of 1,037,748 articles was randomly selected from the 10 million articles in our restricted corpus as a convenience sample.


    The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.


    To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO 3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled in a non-standard way.
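The regular expression approach can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the short `COUNTRY_NAMES` list and the `find_countries` helper are hypothetical stand-ins for a full ISO 3166 name list and whatever matching logic was actually used.

```python
import re

# Hypothetical, abbreviated list; a real run would use the full ISO 3166 name list.
COUNTRY_NAMES = ["Kenya", "Brazil", "Viet Nam", "Republic of Korea"]

# Try longer names first so multi-word names win over shorter alternatives.
_names_by_length = sorted(COUNTRY_NAMES, key=len, reverse=True)
_pattern = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in _names_by_length) + r")\b",
    flags=re.IGNORECASE,
)
_canonical = {name.lower(): name for name in COUNTRY_NAMES}


def find_countries(text):
    """Return the set of listed country names that appear in the text."""
    return {_canonical[m.group(1).lower()] for m in _pattern.finditer(text)}


abstract = "We use household survey data from Kenya and Brazil."
print(sorted(find_countries(abstract)))

# A non-standard spelling ("Vietnam" rather than "Viet Nam") is not matched,
# which is the exclusion error described above.
print(sorted(find_countries("A study of rural Vietnam.")))
```

The second call returns an empty result, which shows why a complementary method that tolerates spelling variants is useful.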


    The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.


    The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:


    Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.

    There are two classification tasks in this exercise:

    1. Identifying whether an academic article is using data from any country.

    2. Identifying from which country that data came.

    For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study does use data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.

    After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]

    For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.

    We expect between 10 and 35 percent of all articles to use data.


    The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted, was 25.4 minutes. If human raters were used exclusively rather than machine learning tools, the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review, at a cost of $3,113,244, assuming the $3 per article paid to MTurk workers.
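The cost and time figures can be checked with a short calculation. This is a sketch under two assumptions that are not stated in the description: the median of 25.4 minutes is applied to every article, and "years" means continuous, round-the-clock time (24 hours a day, 365 days a year).

```python
N_ARTICLES = 1_037_748
COST_PER_ARTICLE_USD = 3   # rate paid to MTurk workers, per the description
MEDIAN_MINUTES = 25.4      # median time per article, used here as a per-article figure

total_cost = N_ARTICLES * COST_PER_ARTICLE_USD
total_hours = N_ARTICLES * MEDIAN_MINUTES / 60
# Assumption: years of uninterrupted time, not working years.
years_continuous = total_hours / (24 * 365)

print(total_cost)                  # 3113244
print(round(total_hours))
print(round(years_continuous, 1))
```

Under these assumptions the total comes to roughly 50 years, which matches the quoted figure. Measured in standard working hours instead, the same workload would span considerably longer.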


    A model is next trained on the 3,500 labeled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of its language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. 900 articles were selected because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
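The 90% decision rule can be illustrated with a small sketch. It assumes, hypothetically, that the classifier emits two raw scores (logits) per article, ordered as [does not use data, uses data], and that "confidence" means the softmax probability of the "uses data" class; the description does not say how confidence was computed, and `uses_data` is an illustrative helper, not the authors' function.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uses_data(logits, threshold=0.90):
    """Assumed layout: logits = [no_data, uses_data].

    Returns True only when the probability of "uses data" reaches the threshold.
    """
    return softmax(logits)[1] >= threshold


print(uses_data([0.0, 3.0]))  # probability of about 0.95, above the threshold
print(uses_data([0.0, 1.0]))  # probability of about 0.73, below the threshold
```

A threshold this high trades recall for precision: articles the model is only moderately sure about are not counted as using data.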


    The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model make the same kinds of errors, then the performance reported here will be overestimated.


    The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of

  16. Labour Force Survey Five-Quarter Longitudinal Dataset, October 1997 -...

    • commons.datacite.org
    Updated 2008
    Cite
    Social Office For National Statistics; Northern Ireland Statistics (2008). Labour Force Survey Five-Quarter Longitudinal Dataset, October 1997 - December 1998 [Dataset]. http://doi.org/10.5255/ukda-sn-5984-1
    Explore at:
    Dataset updated
    2008
    Dataset provided by
    DataCite https://www.datacite.org/
    UK Data Service https://ukdataservice.ac.uk/
    Authors
    Social Office For National Statistics; Northern Ireland Statistics
    Description

    This study was deposited in 2008, as a result of the move from seasonal to calendar quarters for the QLFS, and the reweighting process to 2007-2008 population figures. It combines data from previously-available QLFS seasonal five-quarter longitudinal datasets. The depositor has advised that small revisions to the data may have been made during this process, but they should not be significant.

    Background
    The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.

    Longitudinal data
    The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. A full series of longitudinal data has been produced, going back to winter 1992. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
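As a rough illustration of the gross flows mentioned above, the sketch below counts transitions between labour force categories in a two-quarter linked file. The record layout (`q1`, `q2`) and the sample records are hypothetical, and the counts are unweighted; an analysis of the real datasets would apply the longitudinal weights supplied with them.

```python
from collections import Counter

STATES = ("employed", "unemployed", "inactive")


def gross_flows(linked_records):
    """Count transitions as (state in first quarter, state in second quarter)."""
    return Counter((r["q1"], r["q2"]) for r in linked_records)


# Hypothetical linked records, one per individual observed in both quarters.
records = [
    {"id": 1, "q1": "employed", "q2": "employed"},
    {"id": 2, "q1": "unemployed", "q2": "employed"},
    {"id": 3, "q1": "inactive", "q2": "unemployed"},
    {"id": 4, "q1": "employed", "q2": "inactive"},
    {"id": 5, "q1": "unemployed", "q2": "employed"},
]

flows = gross_flows(records)
# Print a 3x3 transition table: rows are origin states, columns are destinations.
for origin in STATES:
    print(origin, [flows[(origin, dest)] for dest in STATES])
```

Each row of the printed table shows where people who started in one category ended up a quarter later, which is detail that cross-sectional totals alone cannot reveal.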

    LFS Documentation
    The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned. However, volumes are updated periodically by ONS, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.

    Additional data derived from the QLFS
    The Archive also holds further QLFS series: End User Licence (EUL) quarterly data; Secure Access datasets; household datasets; quarterly, annual and ad hoc module datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.

    Variables DISEA and LNGLST
    Dataset A08 (Labour market status of disabled people), which ONS suspended due to an apparent discontinuity between April to June 2017 and July to September 2017, is now available. As a result of this apparent discontinuity and the inconclusive investigations at this stage, comparisons should be made with caution between April to June 2017 and subsequent time periods. However, users should note that the estimates are not seasonally adjusted, so some of the change between quarters could be due to seasonality. Further recommendations on historical comparisons of the estimates will be given in November 2018 when ONS are due to publish estimates for July to September 2018.

    An article explaining the quality assurance investigations that have been conducted so far is available on the ONS Methodology webpage. For any queries about Dataset A08 please email Labour.Market@ons.gov.uk.

  17. Dataset for: The use of the internet and digital technologies in Slovenia...

    • demo-b2find.dkrz.de
    Updated Nov 11, 2025
    Cite
    (2025). Dataset for: The use of the internet and digital technologies in Slovenia among people aged 55 and over - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/ddff31ac-25d7-5d87-a118-440d2dadacaf
    Explore at:
    Dataset updated
    Nov 11, 2025
    Area covered
    Slovenia
    Description

    The dataset contains original data for a wide range of indicators that capture information about the (non-)use of the internet and digital technologies among people aged 55 and over. The target population was the population of the Republic of Slovenia aged 55 and over. A representative sample of participants (N=636) was surveyed between 24/10/2023 and 25/01/2024 via an online survey or a paper-and-pencil survey. The questionnaire consists of several thematic sections: Internet uses, proxy internet use, Internet skills, consequences of Internet use, and motivation to participate in digital skills training. One part of the questionnaire was divided into two sections, one for Internet users and one for Internet non-users. The purpose of this survey was to gather information on internet and digital technology (non-)use among individuals aged 55 and over, including their attitudes, related services, and motivation for acquiring digital skills.

  18. Dept of the Premier and Cabinet - SASP Target 50 - People with Disability |...

    • gimi9.com
    Updated Jul 1, 2025
    Cite
    (2025). Dept of the Premier and Cabinet - SASP Target 50 - People with Disability | gimi9.com [Dataset]. https://gimi9.com/dataset/au_sasp-target-50-people-with-disability/
    Explore at:
    Dataset updated
    Jul 1, 2025
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Increase by 10% the number of people with a disability employed in South Australia by 2020.

  19. Synthetic Demographic Dataset

    • kaggle.com
    zip
    Updated Dec 30, 2023
    Cite
    AnthonyTherrien (2023). Synthetic Demographic Dataset [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/synthetic-population-demographics-dataset/discussion?sort=undefined
    Explore at:
    Available download formats: zip (204018419 bytes)
    Dataset updated
    Dec 30, 2023
    Authors
    AnthonyTherrien
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Description:

    Introducing the Synthetic Demographic Dataset, a large-scale, simulated dataset encompassing 5,000,000 rows. This dataset is a fictional yet intricate assembly of individual profiles, each characterized by various demographic and lifestyle attributes such as name, gender, country, age, income, education level, occupation, and more. It is designed to illustrate the potential of the data generation script, available in the 'Code' section.

    Purpose and Use:

    The dataset's primary purpose is to display the versatility and depth of the data generation script. It exemplifies how diverse demographic data can be synthesized. This dataset is ideal for understanding the structure and potential of synthetic data but is not intended for predictive modeling or statistical analysis due to the lack of a target variable or real-world correlation.

    Key Note:

    This dataset does not correlate with any real-world data or target values. It is artificially generated for demonstration purposes only and should not be employed for machine learning models or statistical analyses intending to derive real-world insights or predictions.

    Data Format and Attributes:

    Each of the 5,000,000 rows represents an individual with attributes including:

    • Name
    • Gender
    • Country
    • Age
    • Income
    • Education Level
    • Occupation
    • Marital Status
    • Number of Children
    • Location Type (Urban/Suburban/Rural)
    • Health Index
    • Exercise Frequency
    • Diet Quality Score
    • Credit Score
    • Car Ownership Status

    Dataset Size:

    5,000,000 Rows

    Code Availability:

    Access the code used for generating this dataset in the 'Code' section. It offers insight into synthetic data generation techniques, valuable for educational and demonstration purposes.

  20. Indicator 1.1.1: Employed population below international poverty line, by...

    • hub.arcgis.com
    • data.amerigeoss.org
    Updated Nov 15, 2018
    Cite
    SDGs (2018). Indicator 1.1.1: Employed population below international poverty line, by sex and age (percent) [Dataset]. https://hub.arcgis.com/datasets/61ccf35396a142f4840e25bb23496ab2
    Explore at:
    Dataset updated
    Nov 15, 2018
    Dataset authored and provided by
    SDGs
    Description

    Series SI_POV_EMP1: Employed population below international poverty line, by sex and age (%)
    Indicator 1.1.1: Proportion of population below the international poverty line, by sex, age, employment status and geographical location (urban/rural)
    Target 1.1: By 2030, eradicate extreme poverty for all people everywhere, currently measured as people living on less than $1.25 a day
    Goal 1: End poverty in all its forms everywhere
    Release Version: 2018.Q2.G.01
