100+ datasets found
  1. Connecticut Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Connecticut Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/connecticut-population-by-gender/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Connecticut
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Connecticut by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Connecticut across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 50.95% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Connecticut.
    • % of Total Population: This column displays each gender's share of Connecticut's total population as a percentage. Please note that the percentages may not sum to exactly 100 due to rounding.
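As a rough illustration of how a "% of Total Population" column relates to a "Population" column, the sketch below computes each group's percentage share from raw counts. The function name `percent_shares` and the counts are hypothetical: the counts are made-up numbers chosen so that the female share matches the 50.95% quoted in the key observations, and they are not values from the dataset file.

```python
def percent_shares(counts):
    """Return each group's share of the total as a percentage, rounded to 2 places."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total population must be positive")
    return {group: round(100 * n / total, 2) for group, n in counts.items()}


# Hypothetical counts for illustration only (not taken from the dataset).
example_counts = {"Male": 4905, "Female": 5095}
shares = percent_shares(example_counts)
print(shares)

# Because each share is rounded separately, the sum can differ slightly from 100.
print(sum(shares.values()))
```

Rounding each share independently is what can make the published percentages add up to slightly more or less than 100.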

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Connecticut Population by Race & Ethnicity.

  2. Largest female population share 2024, by country

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Largest female population share 2024, by country [Dataset]. https://www.statista.com/statistics/1238987/female-population-share-by-country/
    Explore at:
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    World
    Description

    Worldwide, the male population is slightly higher than the female population, although this varies by country. As of 2024, Hong Kong has the highest share of women worldwide with almost ** percent. Moldova followed behind with around ** percent. Among the countries with the largest share of women in the total population, several were former Soviet states or were located in Eastern Europe. By contrast, Qatar, the United Arab Emirates, and Oman had some of the highest proportions of men in their populations.

  3. Madison, FL Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Madison, FL Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b24186ef-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Madison, Florida
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Madison by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Madison across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a majority of female population, with 53.38% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Madison.
    • % of Total Population: This column displays each gender's share of Madison's total population as a percentage. Please note that the percentages may not sum to exactly 100 due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Madison Population by Race & Ethnicity.

  4. Male-female ratio expressed as men per 100 women in Mexico City 1910-2020

    • statista.com
    Updated Sep 15, 2024
    Cite
    Statista (2024). Male-female ratio expressed as men per 100 women in Mexico City 1910-2020 [Dataset]. https://www.statista.com/statistics/1407939/male-female-ratio-mexico-city/
    Explore at:
    Dataset updated
    Sep 15, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Mexico
    Description

    The male-female ratio expressed as men per 100 women in Mexico City stood at approximately ***** in 2020. Between 1910 and 2020, the ratio rose by around ****, though the increase followed an uneven trajectory rather than a consistent upward trend.
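For reference, a ratio "expressed as men per 100 women" is 100 × men / women. The snippet below is a minimal sketch of that conversion; the function name `men_per_100_women` and the counts are hypothetical, since the actual Mexico City figures are masked in the description above.

```python
def men_per_100_women(men, women):
    """Sex ratio expressed as the number of men per 100 women."""
    if women <= 0:
        raise ValueError("number of women must be positive")
    return 100 * men / women


# Hypothetical counts for illustration only.
print(men_per_100_women(950, 1000))   # 95.0
print(men_per_100_women(1000, 1000))  # 100.0
```

A value below 100 means women outnumber men; a value above 100 means the reverse.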

  5. Ratio of Female to Male Tertiary School Enrollment for the United States

    • fred.stlouisfed.org
    json
    Updated Jun 4, 2024
    Cite
    (2024). Ratio of Female to Male Tertiary School Enrollment for the United States [Dataset]. https://fred.stlouisfed.org/series/SEENRTERTFMZSUSA
    Explore at:
    Available download formats: json
    Dataset updated
    Jun 4, 2024
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    United States
    Description

    Graph and download economic data for Ratio of Female to Male Tertiary School Enrollment for the United States (SEENRTERTFMZSUSA) from 1971 to 2022 about enrolled, ratio, tertiary schooling, females, males, education, and USA.

  6. male-female

    • huggingface.co
    Updated Dec 7, 2023
    Cite
    Sonish Maharjan (2023). male-female [Dataset]. https://huggingface.co/datasets/SonishMaharjan/male-female
    Explore at:
    Dataset updated
    Dec 7, 2023
    Authors
    Sonish Maharjan
    License

    https://choosealicense.com/licenses/unknown/

    Description

    SonishMaharjan/male-female dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Alabama Population Breakdown by Gender Dataset: Male and Female Population...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Alabama Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/alabama-population-by-gender/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Alabama
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Alabama by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Alabama across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 51.46% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Alabama.
    • % of Total Population: This column displays each gender's share of Alabama's total population as a percentage. Please note that the percentages may not sum to exactly 100 due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Alabama Population by Race & Ethnicity.

  8. Facebook: distribution of global audiences 2025, by gender

    • statista.com
    Cite
    Statista, Facebook: distribution of global audiences 2025, by gender [Dataset]. https://www.statista.com/statistics/699241/distribution-of-users-on-facebook-worldwide-gender/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Oct 2025
    Area covered
    Worldwide
    Description

    As of October 2025, approximately 2.35 billion people worldwide used Facebook. Around 56.6 percent of the platform’s user base were male.

  9. Population of Estonia, by gender 1950-2020

    • statista.com
    Updated Apr 25, 2014
    Cite
    Statista (2014). Population of Estonia, by gender 1950-2020 [Dataset]. https://www.statista.com/statistics/1009074/male-female-population-estonia-1950-2020/
    Explore at:
    Dataset updated
    Apr 25, 2014
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Estonia
    Description

    In 1950, when Estonia's population was estimated at 1.1 million people, approximately 57 percent of the population was female, while 43 percent was male; this equated to a difference of more than 160,000 people. In the past century, as with many former Soviet states, Estonia has consistently had one of the most disproportionate gender ratios in the world. This was due to the large number of men who were killed in wars during the first half of the twentieth century, which was particularly high across the Soviet Union, as well as a much higher life expectancy among women. The difference in the number of men and women in Estonia has gradually decreased over the past seven decades, but in 2020 there were still 70,000 more females than males in a population of 1.3 million people; this equates to shares of roughly 53 percent and 47 percent of the total population respectively.

  10. Gender ratios in select countries after the Second World War 1950

    • statista.com
    Updated Aug 12, 2024
    Cite
    Statista (2024). Gender ratios in select countries after the Second World War 1950 [Dataset]. https://www.statista.com/statistics/1261433/post-wwii-gender-ratios-in-select-countries/
    Explore at:
    Dataset updated
    Aug 12, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    1950
    Area covered
    Asia, North America, CEE, Europe, World
    Description

    The Second World War had a severe impact on gender ratios across European countries, particularly in the Soviet Union. While the United States had a balanced gender ratio of one man for every woman, in the Soviet Union the ratio was below 5:4 in favor of women, and in Soviet Russia this figure was closer to 4:3.

    As young men were disproportionately killed during the war, this had long-term implications for demographic development, where the generation who would have typically started families in the 1940s was severely depleted in many countries.

  11. Population of Lithuania 1950-2020, by gender

    • statista.com
    Updated Jul 17, 2019
    Cite
    Statista (2019). Population of Lithuania 1950-2020, by gender [Dataset]. https://www.statista.com/statistics/1016406/male-female-population-lithuania-1950-2020/
    Explore at:
    Dataset updated
    Jul 17, 2019
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Lithuania
    Description

    This statistic shows the total male and female population of Lithuania from 1950 to 2020. From the graph we can see that there is a relatively large difference in the number of males and females, particularly when put in context with the total overall population. The number of women exceeded the number of men by over 260 thousand in 1950, which is one of the long-term effects of the Second World War. During the war, Lithuania lost over 14 percent of its overall population. The number of women was already higher than the number of men before this; however, the war caused this gap to widen much further. From 1950 onwards both the male and female populations grew, and by 1990 the gap had shrunk to 200 thousand people. In 1990 Lithuania gained its independence from the Soviet Union, and from this point both populations began to decline, falling to 1.26 million men and 1.46 million women in 2020, a difference of 200 thousand.

  12. Jacksonville, NC Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Jacksonville, NC Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/jacksonville-nc-population-by-gender/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    North Carolina, Jacksonville
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Jacksonville by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Jacksonville across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a majority of male population, with 63.05% of total population being male. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Jacksonville.
    • % of Total Population: This column displays each gender's share of Jacksonville's total population as a percentage. Please note that the percentages may not sum to exactly 100 due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Jacksonville Population by Race & Ethnicity.

  13. The proportion of males and females by the classification scheme shown in...

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Daniel B. Wright; Elin M. Skagerberg (2023). The proportion of males and females by the classification scheme shown in Figure 1. [Dataset]. http://doi.org/10.1371/journal.pone.0031661.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Daniel B. Wright; Elin M. Skagerberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The proportion of males and females by the classification scheme shown in Figure 1.

  14. Gender Detection And Labelling Dataset

    • universe.roboflow.com
    zip
    Updated Aug 18, 2024
    Cite
    Gender Classification (2024). Gender Detection And Labelling Dataset [Dataset]. https://universe.roboflow.com/gender-classification-aflfc/gender-detection-and-labelling
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 18, 2024
    Dataset authored and provided by
    Gender Classification
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Male Female FOCG Bounding Boxes
    Description

    Gender Detection And Labelling

    ## Overview
    
    Gender Detection And Labelling is a dataset for object detection tasks - it contains Male Female FOCG annotations for 551 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  15. Pure gender bias detection (Male vs Female)

    • kaggle.com
    zip
    Updated Oct 15, 2025
    Cite
    Krishna GSVV (2025). Pure gender bias detection (Male vs Female) [Dataset]. https://www.kaggle.com/datasets/krishnagsvv/pure-gender-bias-detection-male-vs-female
    Explore at:
    Available download formats: zip (102799 bytes)
    Dataset updated
    Oct 15, 2025
    Authors
    Krishna GSVV
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Equilens Gender Bias

    • Purpose: This corpus was generated by the EquiLens Corpus Generator to enable controlled, reproducible experiments testing how language models respond when only the name varies across prompts. Each row is a single prompt where profession, trait, and template are fixed while the name varies (Male vs Female).
    • Scope: ~1,680 prompts for gender bias across multiple professions, competence/social trait categories, and four template variants.
    • Intended use: Model-response collection, parsing/cleaning experiments, statistical testing for demographic differences, visualisation, and reproducible research.
    • Sources & provenance: Names, professions, and trait lists are curated and combined deterministically by the project's JSON config (word_lists.json). The generator and metadata are included in the repository for reproducibility.
    • License: MIT

    Column descriptions

    • comparison_type — Audit category (e.g., gender_bias)
    • name — First name used in the prompt (Male or Female)
    • name_category — Name group label (Male / Female)
    • profession — Profession used in the prompt (engineer, nurse, doctor, etc.)
    • trait — Trait word inserted into the template (analytical, caring, etc.)
    • trait_category — Trait class (Competence / Social)
    • template_id — Template variant id (0–3)
    • full_prompt_text — Final full prompt text presented to the model

    Quick reproducibility & validation (PowerShell)

    ```powershell
    # from the dataset folder
    Test-Path .\corpus\audit_corpus_gender_bias.csv
    Get-Content .\corpus\audit_corpus_gender_bias.csv | Measure-Object -Line

    # Create venv and install deps
    python -m venv .venv
    .venv\Scripts\Activate.ps1
    pip install pandas tqdm
    ```

    Quick start: load and basic stats (Python)

    ```python
    import pandas as pd

    df = pd.read_csv("corpus/audit_corpus_gender_bias.csv")

    # counts per category
    print(df['name_category'].value_counts())

    # sample prompts
    print(df.sample(5)['full_prompt_text'].to_list())
    ```

    Recommended evaluation workflow (high level)

    1. Use this CSV to generate model responses for each prompt (consistent model settings).
    2. Clean & parse outputs into numeric/label format as appropriate (use structured prompting where possible).
    3. Aggregate responses grouped by name_category (Male vs Female) while holding profession/trait/template constant.
    4. Compute descriptive stats per group (mean, median, sd) and per stratum (profession × trait_category).
    5. Run statistical tests and effect-size estimates:
        • Permutation test or Mann-Whitney U (non-parametric)
        • Bootstrap confidence intervals for medians/means
        • Cohen’s d or Cliff’s delta for effect size
    6. Correct for multiple comparisons (Benjamini–Hochberg) when testing many strata.
    7. Visualise with violin + boxplots and difference plots with CIs.
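The testing step of the workflow above (permutation test plus an effect size) could be sketched as follows, using only the Python standard library. This is an illustrative sketch, not the dataset author's code: the function names, the add-one p-value correction, and the scores are assumptions, and the scores are hypothetical stand-ins for numeric values parsed from model responses.

```python
import random
import statistics


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(sum(left) / len(left) - sum(right) / len(right)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical scores for one stratum (same profession, trait, and template).
male_scores = [7.0, 6.5, 8.0, 7.5, 6.0, 7.0]
female_scores = [6.5, 6.0, 7.5, 7.0, 6.5, 6.0]

print("Cohen's d:", cohens_d(male_scores, female_scores))
print("permutation p-value:", permutation_p_value(male_scores, female_scores))
```

When this is repeated over many profession × trait strata, the resulting p-values would still need the multiple-comparison correction mentioned in step 6.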

    Suggested quantitative metrics

    • Mean/median differences (Male − Female)
    • Bootstrap 95% CI on difference
    • Cohen’s d or Cliff’s delta
    • p-values from permutation test / Mann-Whitney U
    • Proportion of model outputs that deviate from the expected neutral baseline (for categorical outputs)
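One way to obtain a bootstrap 95% CI on the Male − Female mean difference is the percentile bootstrap, sketched below with the standard library only. The function name, the resample count, and the scores are hypothetical; other bootstrap variants (for example BCa) may be preferable in practice.

```python
import random


def bootstrap_ci_mean_diff(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        ra = rng.choices(a, k=len(a))
        rb = rng.choices(b, k=len(b))
        diffs.append(sum(ra) / len(ra) - sum(rb) / len(rb))
    diffs.sort()
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical scores parsed from model responses, for illustration only.
male_scores = [7.0, 6.5, 8.0, 7.5, 6.0, 7.0]
female_scores = [6.5, 6.0, 7.5, 7.0, 6.5, 6.0]

low, high = bootstrap_ci_mean_diff(male_scores, female_scores)
print(f"95% CI for mean difference (Male - Female): [{low:.3f}, {high:.3f}]")
```

An interval that contains 0 suggests the data are compatible with no difference between the two name groups in that stratum.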

    Suggested visualizations

    • Grouped violin plots (by profession) split by name_category
    • Difference-in-means bar with bootstrap CI per profession
    • Heatmap of effect sizes (profession × trait_category)
    • Distribution overlay of raw responses

    Recommended analysis notebooks/kernels to provide on Kaggle

    • 01_data_load_and_summary.ipynb: load CSV, sanity checks, counts
    • 02_model_response_collection.ipynb: how to call a model endpoint safely (placeholders)
    • 03_cleaning_and_parsing.ipynb: parsing rules and robustness tests
    • 04_statistical_tests.ipynb: permutation tests, bootstrap CI, effect sizes
    • 05_visualizations.ipynb: plots and interpretation

    Security & best practices

    • Never commit API keys in notebooks. Use environment variables and secrets built into Kaggle.
    • Keep model calls rate-limited and log failures; use retry/backoff.
    • Use fixed random seeds for reproducibility where sampling occurs.
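The practices above might look roughly like this in a notebook. Everything named here is an assumption made for illustration: the environment variable `MODEL_API_KEY`, the helpers `get_api_key` and `call_with_backoff`, and the `flaky_call` stand-in for a real model request are all hypothetical.

```python
import os
import random
import time


def get_api_key(var_name="MODEL_API_KEY"):
    """Read the key from an environment variable instead of hard-coding it."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return key


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff and logging each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Delay doubles each attempt; a little jitter avoids synchronized retries.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay:.2f}s")
            sleep(delay)


# Stand-in for a model call that fails twice and then succeeds.
attempts = {"n": 0}


def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Sleeping is disabled here so the demonstration runs instantly.
result = call_with_backoff(flaky_call, sleep=lambda seconds: None)
print(result, attempts["n"])
```

In a real notebook the `sleep` argument would be left at its default so that retries actually wait between attempts.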

    Limitations & caveats (must show on dataset page)

    - Cultural and name recognition: names may suggest different demographics across regions; results are context-sensitive.
    - Only Male vs Female: dataset intentionally isolates binary gender categories; extend carefully for broader demographic categories.
    - Controlled prompts reduce ecological validity: real interactions may be longer and noisier.
    - Parsing risk: models sometimes add explanatory text; structured prompting or requesting a JSON response is recommended.
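
To make the parsing risk concrete, the following sketch pulls the first valid JSON object out of a reply that also contains explanatory text. The function name and the `score` key in the usage example are hypothetical; the dataset does not prescribe a response schema.

```python
import json


def extract_json_object(text):
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)  # try the next opening brace
    return None
```

For a reply such as `Sure! Here is my rating: {"score": 7} Hope this helps.` the helper returns the dictionary and discards the surrounding prose. Replies that yield `None` are best logged and counted rather than silently dropped, since a skewed failure rate between groups would itself bias the comparison.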

    How this dataset differs from academic prototypes

    This corpus is deterministic and template-driven to ensure strict control over confounds (only the name varies). Use it when you require reproducibility and controlled comparisons rather than open-ended, real-world prompts.

    Suggested Kaggle tags and categor...

  16. Gender distribution at the world's leading universities 2024-2025

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Gender distribution at the world's leading universities 2024-2025 [Dataset]. https://www.statista.com/statistics/1345939/gender-distribution-world-leading-universities/
    Explore at:
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    In Autumn 2024, among the students enrolled in the highest ranked university in the world, Oxford in the United Kingdom, 51 percent were female. See here for an overview of the highest-ranked universities in the world.

  17. Waukesha, WI Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Waukesha, WI Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/waukesha-wi-population-by-gender/
    Explore at:
    csv, json (available download formats)
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Waukesha, Wisconsin
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Waukesha by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Waukesha across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females form a slight majority, making up 51.34% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Waukesha.
    • % of Total Population: This column displays each gender's share of Waukesha's total population as a percentage. Please note that the percentages may not sum to exactly 100% due to rounding.
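
As a rough illustration of how the percentage column relates to the population column, the sketch below derives each group's share from raw counts and identifies the majority. The counts are hypothetical values chosen for the example, not figures from the dataset, and the function name is an assumption.

```python
def percent_of_total(populations):
    """Map each group to its share of the total, rounded to two decimals."""
    total = sum(populations.values())
    return {group: round(100 * count / total, 2) for group, count in populations.items()}


# Hypothetical counts for illustration only; not taken from the dataset.
example = {"Male": 48660, "Female": 51340}
shares = percent_of_total(example)
majority = max(shares, key=shares.get)

# Rounding can leave the shares slightly short of (or above) the whole:
thirds = percent_of_total({"a": 1, "b": 1, "c": 1})
thirds_sum = sum(thirds.values())
```

The `thirds` case shows why rounded shares need not add up exactly: three equal groups each round to 33.33, which sums to 99.99.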

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Waukesha Population by Race & Ethnicity. You can refer to it here.

  18. Gender Pay Gap Dataset

    • kaggle.com
    zip
    Updated Feb 2, 2022
    Cite
    fedesoriano (2022). Gender Pay Gap Dataset [Dataset]. https://www.kaggle.com/datasets/fedesoriano/gender-pay-gap-dataset
    Explore at:
    zip, 61650632 bytes (available download formats)
    Dataset updated
    Feb 2, 2022
    Authors
    fedesoriano
    Description

    Similar Datasets

    • Company Bankruptcy Prediction: LINK
    • The Boston House-Price Data: LINK
    • California Housing Prices Data (5 new features!): LINK
    • Spanish Wine Quality Dataset: LINK

    Context

    The gender pay gap or gender wage gap is the average difference between the remuneration for men and women who are working. Women are generally considered to be paid less than men. There are two distinct numbers regarding the pay gap: the non-adjusted and the adjusted pay gap. The latter typically takes into account differences in hours worked, occupations chosen, education, and job experience. In the United States, for example, the non-adjusted average female annual salary is 79% of the average male salary, compared to 95% for the adjusted average salary.
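
The relationship between the two ratios quoted above and the corresponding gaps is simple arithmetic, sketched here. The function names are illustrative, and the "share closed by adjustment" figure is a derived quantity that does not appear in the dataset description.

```python
def pay_gap_percent(female_to_male_ratio):
    """Convert a female-to-male pay ratio into a gap in percentage points."""
    return (1 - female_to_male_ratio) * 100


# Ratios quoted in the description for the United States.
non_adjusted_gap = pay_gap_percent(0.79)
adjusted_gap = pay_gap_percent(0.95)

# Derived for illustration: fraction of the raw gap that disappears
# once hours, occupation, education, and experience are accounted for.
share_closed_by_adjustment = (non_adjusted_gap - adjusted_gap) / non_adjusted_gap
```

Under these figures the non-adjusted gap is about 21 percentage points and the adjusted gap about 5, so roughly three quarters of the raw gap is associated with the adjustment factors.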

    The reasons link to legal, social, and economic factors, and extend beyond "equal pay for equal work".

    The gender pay gap can be a problem from a public policy perspective because it reduces economic output and means that women are more likely to be dependent upon welfare payments, especially in old age.

    This dataset aims to replicate the data used in the famous paper "The Gender Wage Gap: Extent, Trends, and Explanations", which provides new empirical evidence on the extent of and trends in the gender wage gap, which declined considerably during the 1980–2010 period.

    Citation

    fedesoriano. (January 2022). Gender Pay Gap Dataset. Retrieved [Date Retrieved] from https://www.kaggle.com/fedesoriano/gender-pay-gap-dataset.

    Content

    There are 2 files in this dataset: a) the Panel Study of Income Dynamics (PSID) microdata over the 1980-2010 period, and b) the Current Population Survey (CPS) to provide some additional US national data on the gender pay gap.

    PSID variables:

    Notes:

    • Variables with fz added to their name refer to experience where we have filled in some zeros in the missing PSID years with data from the respondents’ answers to questions about jobs worked on during these missing years. The fz variables were used in the regression analyses.
    • Variables with a predict prefix refer to the computation of actual experience accumulated during the years in which the PSID did not survey the respondents. There are more predicted experience levels that are needed to impute experience in the missing years in some cases. Note that the variables yrsexpf, yrsexpfsz, etc., include these computations, so if you want to use full-time or part-time experience, you don’t need to add these predict variables in. They are included in the data set to illustrate the results of the computation process.
    • Variables with an orig prefix are the original PSID variables. These have been processed and in some cases renamed for convenience. The hd suffix means that the variable refers to the head of the family, and the wf suffix means that it refers to the wife or female cohabitor if there is one. As shown in the accompanying regression program, these orig variables aren’t used directly in the regressions. There are more of the original PSID variables, which were used to construct the variables used in the regressions. HD means head and WF means wife or female cohabitor.

    1. intnum68: 1968 INTERVIEW NUMBER
    2. pernum68: PERSON NUMBER 68
    3. wave: Current Wave of the PSID
    4. sex: gender SEX OF INDIVIDUAL (1=male, 2=female)
    5. intnum: Wave-specific Interview Number
    6. farminc: Farm Income
    7. region: regLab Region of Current Interview
    8. famwgt: this is the PSID’s family weight, which is used in all analyses
    9. relhead: ER34103L this is the relation to the head of household (10=head; 20=legally married wife; 22=cohabiting partner)
    10. age: Age
    11. employed: ER34116L Whether or not employed or on temp leave (everyone gets a 1 for this variable, since our wage analyses use only the currently employed)
    12. sch: schLbl Highest Year of Schooling
    13. annhrs: Annual Hours Worked
    14. annlabinc: Annual Labor Income
    15. occ: 3 Digit Occupation 2000 codes
    16. ind: 3 Digit Industry 2000 codes
    17. white: White, nonhispanic dummy variable
    18. black: Black, nonhispanic dummy variable
    19. hisp: Hispanic dummy variable
    20. othrace: Other Race dummy variable
    21. degree: degreeLbl Agent's Degree Status (0=no college degree; 1=bachelor’s without advanced degree; 2=advanced degree)
    22. degupd: degreeLbl Agent's Degree Status (Updated with 2009 values)
    23. schupd: schLbl Schooling (updated years of schooling)
    24. annwks: Annual Weeks Worked
    25. unjob: unJobLbl Union Coverage dummy variable
    26. usualhrwk: Usual Hrs Worked Per Week
    27. labincbus: Labor Income from...
  19. Population of Latvia, by gender 1950-2020

    • statista.com
    Cite
    Statista, Population of Latvia, by gender 1950-2020 [Dataset]. https://www.statista.com/statistics/1016282/male-female-population-latvia-1950-2020/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Latvia
    Description

    Since 1950 there has been a relatively large difference in the number of males and females in Latvia, particularly when put in context with the total overall population. The number of women exceeded the number of men by over 260 thousand in 1950, which is one of the long-term effects of the Second World War. During the war, Latvia lost approximately 12.5 percent of its overall population. The number of women was already higher than the number of men before this, but the war caused the gap to widen much further. From 1950 onwards both male and female populations grew, and by 1990 the gap had shrunk to 180,000 people. In 1990 Latvia gained its independence from the Soviet Union, and from this point both populations began to decline, falling to 870 thousand men and just over one million women in 2020, a difference of 150 thousand people.

  20. Population in China 2014-2024, by gender

    • statista.com
    Updated Jan 15, 2025
    Cite
    Statista (2025). Population in China 2014-2024, by gender [Dataset]. https://www.statista.com/statistics/251129/population-in-china-by-gender/
    Explore at:
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    China
    Description

    In 2024, there were around 719 million male inhabitants and 689 million female inhabitants living in China, amounting to around 1.41 billion people in total. China's total population decreased for the first time in decades in 2022, and population decline is expected to accelerate in the upcoming years.

    Birth control in China

    From the beginning of the 1970s on, having many children was no longer encouraged in mainland China. The one-child policy was then introduced in 1979 to control the total size of the Chinese population. Under the one-child policy, a married couple was only allowed to have one child. Over time, modifications were added to the policy; for example, parents living in rural areas were allowed to have a second child if the first was a daughter, and most ethnic minorities were exempted from the policy.

    Population ageing

    Birth control led to a decreasing birth rate in China and a more skewed gender ratio of new births due to boy preference. As the negative economic and social effects of an aging population were increasingly felt in China, the one-child policy came to be considered an obstacle to the country’s further economic development. From 2014, the one-child policy was gradually relaxed, and it was fully eliminated at the end of 2015. However, many young Chinese people are not willing to have more children due to the high costs of raising a child, especially in urban areas.
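
The headline figures in this description can be cross-checked with a few lines of arithmetic. The sketch uses only the rounded values quoted in the text, so the results are approximate; the variable names are chosen for the example.

```python
# Rounded 2024 figures quoted in the description, in millions.
male_millions = 719
female_millions = 689

total_millions = male_millions + female_millions
total_billions = round(total_millions / 1000, 2)

# Surplus of males over females, in millions.
male_surplus_millions = male_millions - female_millions

# Sex ratio expressed as males per 100 females.
males_per_100_females = 100 * male_millions / female_millions
```

The sum of the two groups is consistent with the stated total of around 1.41 billion, and the ratio works out to a little over 104 males per 100 females for the population as a whole. The skewed ratio among new births mentioned in the text is a separate statistic and cannot be derived from these totals.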

Cite
Neilsberg Research (2025). Connecticut Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/connecticut-population-by-gender/

Connecticut Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition

Explore at:
json, csv (available download formats)
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Neilsberg Research
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Connecticut
Variables measured
Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Connecticut by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Connecticut across both sexes and to determine which sex constitutes the majority.

Key observations

Females form a slight majority, making up 50.95% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Scope of gender:

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

Variables / Data Columns

  • Gender: This column displays the gender (Male / Female).
  • Population: This column shows the population of each gender in Connecticut.
  • % of Total Population: This column displays each gender's share of Connecticut's total population as a percentage. Please note that the percentages may not sum to exactly 100% due to rounding.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Connecticut Population by Race & Ethnicity. You can refer to it here.
