3 datasets found
  1. Gender Pay Gap Dataset

    • kaggle.com
    zip
    Updated Feb 2, 2022
    Cite
    fedesoriano (2022). Gender Pay Gap Dataset [Dataset]. https://www.kaggle.com/datasets/fedesoriano/gender-pay-gap-dataset
    Available download formats: zip (61650632 bytes)
    Dataset updated
    Feb 2, 2022
    Authors
    fedesoriano
    Description

    Similar Datasets

    • Company Bankruptcy Prediction: LINK
    • The Boston House-Price Data: LINK
    • California Housing Prices Data (5 new features!): LINK
    • Spanish Wine Quality Dataset: LINK

    Context

    The gender pay gap or gender wage gap is the average difference between the remuneration of working men and women. Women are generally considered to be paid less than men. There are two distinct measures of the pay gap: non-adjusted and adjusted. The latter typically takes into account differences in hours worked, occupations chosen, education, and job experience. In the United States, for example, the non-adjusted average annual salary of women is 79% of the average male salary, compared to 95% for the adjusted average salary.

    The reasons for the gap are linked to legal, social, and economic factors, and extend beyond "equal pay for equal work".

    The gender pay gap can be a problem from a public policy perspective because it reduces economic output and means that women are more likely to be dependent upon welfare payments, especially in old age.

    This dataset aims to replicate the data used in the well-known paper "The Gender Wage Gap: Extent, Trends, and Explanations", which provides new empirical evidence on the extent of and trends in the gender wage gap. The gap declined considerably during the 1980–2010 period.

    Citation

    fedesoriano. (January 2022). Gender Pay Gap Dataset. Retrieved [Date Retrieved] from https://www.kaggle.com/fedesoriano/gender-pay-gap-dataset.

    Content

    There are 2 files in this dataset: a) Panel Study of Income Dynamics (PSID) microdata covering the 1980-2010 period, and b) Current Population Survey (CPS) data, which provide some additional US national data on the gender pay gap.

    PSID variables:

    Notes: Variables with fz added to their name refer to experience where some zeros in the missing PSID years have been filled in with data from the respondents’ answers to questions about jobs worked during those years. The fz variables were used in the regression analyses. Variables with a predict prefix refer to the computation of actual experience accumulated during the years in which the PSID did not survey the respondents. In some cases, more predicted experience levels are needed to impute experience in the missing years. Note that the variables yrsexpf, yrsexpfsz, etc., include these computations, so if you want to use full-time or part-time experience, you do not need to add the predict variables in; they are included in the data set to illustrate the results of the computation process. Variables with an orig prefix are the original PSID variables, which have been processed and in some cases renamed for convenience. The hd suffix means that the variable refers to the head of the family, and the wf suffix means that it refers to the wife or female cohabitor, if there is one. As shown in the accompanying regression program, these orig variables are not used directly in the regressions. There are more of the original PSID variables, which were used to construct the variables used in the regressions.
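The naming conventions in the notes above can be checked mechanically when scanning the PSID file's columns. The following is a minimal sketch, not part of the dataset's own tooling: the function name `describe_variable` and the example column names are hypothetical, and matching `fz` anywhere in the name is an assumption, since the notes only say that `fz` is "added" to the name.

```python
def describe_variable(name: str) -> list[str]:
    """Return the naming-convention notes that apply to a column name."""
    notes = []
    # Prefix conventions described in the notes.
    if name.startswith("orig"):
        notes.append("original PSID variable")
    if name.startswith("predict"):
        notes.append("predicted experience for unsurveyed years")
    # Assumption: "fz" may appear anywhere in the name.
    if "fz" in name:
        notes.append("experience with zeros filled in for missing years")
    # Suffix conventions: hd = head of family, wf = wife or female cohabitor.
    if name.endswith("hd"):
        notes.append("refers to head of family")
    elif name.endswith("wf"):
        notes.append("refers to wife or female cohabitor")
    return notes


# Hypothetical column names, used only to exercise each rule.
for column in ["origexamplehd", "predictexamplewf", "examplefz", "age"]:
    print(column, describe_variable(column))
```

A column that matches none of the rules, such as `age`, returns an empty list. Because the substring and suffix checks are deliberately loose, the result is best treated as a hint to verify against the dataset's documentation.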

    1. intnum68: 1968 INTERVIEW NUMBER
    2. pernum68: PERSON NUMBER 68
    3. wave: Current Wave of the PSID
    4. sex: gender SEX OF INDIVIDUAL (1=male, 2=female)
    5. intnum: Wave-specific Interview Number
    6. farminc: Farm Income
    7. region: regLab Region of Current Interview
    8. famwgt: this is the PSID’s family weight, which is used in all analyses
    9. relhead: ER34103L this is the relation to the head of household (10=head; 20=legally married wife; 22=cohabiting partner)
    10. age: Age
    11. employed: ER34116L Whether or not employed or on temp leave (everyone gets a 1 for this variable, since our wage analyses use only the currently employed)
    12. sch: schLbl Highest Year of Schooling
    13. annhrs: Annual Hours Worked
    14. annlabinc: Annual Labor Income
    15. occ: 3 Digit Occupation 2000 codes
    16. ind: 3 Digit Industry 2000 codes
    17. white: White, nonhispanic dummy variable
    18. black: Black, nonhispanic dummy variable
    19. hisp: Hispanic dummy variable
    20. othrace: Other Race dummy variable
    21. degree: degreeLbl Agent's Degree Status (0=no college degree; 1=bachelor’s without advanced degree; 2=advanced degree)
    22. degupd: degreeLbl Agent's Degree Status (Updated with 2009 values)
    23. schupd: schLbl Schooling (updated years of schooling)
    24. annwks: Annual Weeks Worked
    25. unjob: unJobLbl Union Coverage dummy variable
    26. usualhrwk: Usual Hrs Worked Per Week
    27. labincbus: Labor Income from...
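As an illustration of how a few of the variables listed above fit together, the sketch below computes a non-adjusted female-to-male ratio of weighted mean hourly wages. It is a simplified example under stated assumptions, not the method of the paper the dataset replicates: hourly wage is assumed to be `annlabinc / annhrs`, `famwgt` is used as a simple weight, and the rows are invented for illustration rather than taken from the dataset.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; these are NOT values from the dataset.
rows = [
    {"sex": 1, "annhrs": 2000, "annlabinc": 50000, "famwgt": 1.0},
    {"sex": 1, "annhrs": 1000, "annlabinc": 30000, "famwgt": 3.0},
    {"sex": 2, "annhrs": 2000, "annlabinc": 40000, "famwgt": 2.0},
    {"sex": 2, "annhrs": 1500, "annlabinc": 30000, "famwgt": 2.0},
]

# Coding taken from the variable list: 1=male, 2=female.
SEX_LABELS = {1: "male", 2: "female"}


def weighted_mean_hourly_wage(rows):
    """Return the famwgt-weighted mean of annlabinc / annhrs for each sex."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for row in rows:
        if row["annhrs"] <= 0:
            continue  # skip rows where an hourly wage cannot be computed
        wage = row["annlabinc"] / row["annhrs"]  # assumed wage definition
        label = SEX_LABELS[row["sex"]]
        totals[label] += row["famwgt"] * wage
        weights[label] += row["famwgt"]
    return {label: totals[label] / weights[label] for label in totals}


means = weighted_mean_hourly_wage(rows)
ratio = means["female"] / means["male"]
print(means, round(ratio, 3))
```

This yields only a non-adjusted ratio. An adjusted figure would additionally control for variables such as schooling, occupation, industry, and experience, typically through regression.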
  2. Budget Allocation

    • kaggle.com
    zip
    Updated Jul 13, 2024
    Cite
    Shrinolo (2024). Budget Allocation [Dataset]. https://www.kaggle.com/datasets/shrinolo/budget-allocation
    Available download formats: zip (143327 bytes)
    Dataset updated
    Jul 13, 2024
    Authors
    Shrinolo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides detailed budget allocation insights for urban and rural households in India, capturing present living standards. The data includes various spending areas such as housing, food, transportation, healthcare, education, and discretionary expenses. The dataset is designed to help researchers, policymakers, and individuals understand spending habits and optimize budget planning.

    Context: The dataset is derived from various government reports, surveys, and market research studies that provide a snapshot of the current economic conditions and living standards in India. It includes average income levels, typical expenses, and common savings patterns for both urban and rural households.

    Sources:

    • National Sample Survey Office (NSSO)
    • Ministry of Statistics and Programme Implementation (MoSPI)
    • Various market research reports and publications

    Inspiration: The inspiration behind this dataset is to provide a clear and detailed picture of how households in different regions of India allocate their budgets. This can be a valuable resource for economists, social scientists, financial advisors, and anyone interested in understanding the financial behavior of Indian households.

  3. Data from: Aggregate Dataset Eastern Europe

    • search.gesis.org
    • da-ra.de
    Updated Apr 13, 2010
    Cite
    Tausch, Arno (2010). Aggregate Dataset Eastern Europe [Dataset]. http://doi.org/10.4232/1.1159
    Available download formats: application/x-spss-sav (49194), application/x-stata-dta (79671), application/x-spss-por (25249)
    Dataset updated
    Apr 13, 2010
    Dataset provided by
    GESIS search
    GESIS Data Archive
    Authors
    Tausch, Arno
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Time period covered
    1970 - 1980
    Area covered
    Europe, Eastern Europe
    Variables measured
    v135 - ISLAMIC, v218 - DOCTORS, v4 - ICPSR-NO., v95 - TV'S 1977, v133 - CATHOLICS, v2 - ABBREVATIONS, v87 - STOCK: CARS, v94 - RADIOS 1977, v132 - PROTESTANTS, v24 - AVERAGE WAGE, and 319 more
    Description

    Aggregate indicators at the level of the country for 7 countries of the East Bloc from the areas of economy, defense, population and society.

    Topics:

    1. Population and society: population density; population growth from 1970 to 1978; infant mortality and life expectancy; degree of urbanization; rate of provision with running water and sanitary facilities; residential furnishings and housing conditions; hospital beds and doctors per capita; proportion of children in kindergartens; proportion of women in various branches of the economy; religious affiliation; divorce rate; training level of the population; education expenditures; employees in technology and science; scientific book production; social mobility.

    2. Economy: growth rate of the gross national product; GNP per capita; public investments; merchandise import and export; proportion of employees and proportion of production in the individual sectors of the economy; average income; meat consumption and supply of calories; trade with Comecon countries, capitalist and under-developed countries; trade deficit and foreign debt; growth of import and export as well as of income; work productivity; working hours needed for selected goods; capital intensity; provision of households with telephone, television, cars and other durable economic goods; energy import and energy use; employee-worker relationship; development of real income as well as prices; private savings; income concentration; retail trade index; hectare yields and proportion of private agriculture.

    3. Military: defense expenditures; export of weapons; strength of military forces; proportion of defense expenditures in gross national product; number of disturbances and protest demonstrations; armed attacks and persons killed; sanctions of the government; internal security forces.

    4. Miscellaneous: content analysis of newspapers regarding reports about human rights, disarmament, economic as well as technical cooperation and conflicts after adoption of the final agreement of Helsinki and Belgrade.

Gender Pay Gap Dataset

The Gender Wage Gap: Extent, Trends, and Explanations for differences in Salary

3 scholarly articles cite this dataset (View in Google Scholar)