3 datasets found
  1. HNWI worldwide 2024, by country

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). HNWI worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1171539/hnwi-by-country
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2024 - Dec 31, 2024
    Area covered
    Albania
    Description

    The United States leads the ranking by number of high-net-worth individuals, recording **** million individuals. Following closely behind is China with **** million individuals, while Lesotho trails the ranking with * thousand individuals, a difference of **** million individuals from the ranking leader, the United States. High-net-worth individuals are defined here as persons with investible assets of at least *********** U.S. dollars in current exchange rate terms. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in more than *** countries and regions worldwide. All input data are sourced from international institutions, national statistical offices, and trade associations. All data have been processed to generate comparable datasets (see supplementary notes under details for more information).

  2. SynthPAI Dataset

    • paperswithcode.com
    Updated Jun 10, 2024
    Cite
    Hanna Yukhymenko; Robin Staab; Mark Vero; Martin Vechev (2024). SynthPAI Dataset [Dataset]. https://paperswithcode.com/dataset/synthpai
    Dataset updated
    Jun 10, 2024
    Authors
    Hanna Yukhymenko; Robin Staab; Mark Vero; Martin Vechev
    Description

    SynthPAI was created to provide a dataset that can be used to investigate the personal attribute inference (PAI) capabilities of LLMs on online texts. Due to the privacy concerns associated with real-world data, open datasets are rare (non-existent) in the research community. SynthPAI is a synthetic dataset that aims to fill this gap.

    Dataset Details

    Dataset Description

    SynthPAI was created using 300 GPT-4 agents seeded with individual personalities interacting with each other in a simulated online forum and consists of 103 threads and 7823 comments. For each profile, we further provide a set of personal attributes that a human could infer from the profile. We additionally conducted a user study to evaluate the quality of the synthetic comments, establishing that humans can barely distinguish between real and synthetic comments.

    Curated by: The dataset was created by SRILab at ETH Zurich. It was not created on behalf of any outside entity.

    Funded by: Two authors of this work are supported by the Swiss State Secretariat for Education, Research and Innovation (SERI) (SERI-funded ERC Consolidator Grant). This project did not, however, receive explicit funding from SERI and was devised independently. Views and opinions expressed are those of the authors only and do not necessarily reflect those of the SERI-funded ERC Consolidator Grant.

    Shared by: SRILab at ETH Zurich

    Language(s) (NLP): English

    License: CC-BY-NC-SA-4.0

    Dataset Sources

    Repository: https://github.com/eth-sri/SynthPAI

    Paper: https://arxiv.org/abs/2406.07217

    Uses

    The dataset is intended to be used as a privacy-preserving method of (i) evaluating PAI capabilities of language models and (ii) aiding the development of potential defenses against such automated inferences.

    Direct Use

    As in the associated paper, where we include an analysis of the personal attribute inference (PAI) capabilities of 18 state-of-the-art LLMs across different attributes and on anonymized texts.

    Out-of-Scope Use

    The dataset shall not be used as part of any system that performs attribute inferences on real natural persons without their consent or otherwise maliciously.

    Dataset Structure

    We provide the instance descriptions below. Each data point consists of a single comment (that can be a top-level post):

    Comment

    author str: unique identifier of the person writing

    username str: corresponding username

    parent_id str: unique identifier of the parent comment

    thread_id str: unique identifier of the thread

    children list[str]: unique identifiers of children comments

    profile Profile: profile making the comment - described below

    text str: text of the comment

    guesses list[dict]: Dict containing model estimates of attributes based on the comment. Only contains attributes for which a prediction exists.

    reviews dict: Dict containing human estimates of attributes based on the comment. Each guess contains a corresponding hardness rating (and certainty rating). Contains all attributes.

    The associated profiles are structured as follows

    Profile

    username str: identifier

    attributes: set of personal attributes that describe the user (directly listed below)
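    The Comment and Profile records described above could be modeled roughly as follows. This is a minimal sketch, not the authors' loader: it assumes each raw record is a dict whose keys match the field names listed above and that `profile` is a nested dict with `username` and `attributes`. The helper names `comment_from_dict` and `group_by_thread` are hypothetical.

    ```python
    from __future__ import annotations

    from collections import defaultdict
    from dataclasses import dataclass, field
    from typing import Any


    @dataclass
    class Profile:
        username: str
        # Personal attributes of the user; keys are not fixed by this sketch.
        attributes: dict[str, Any] = field(default_factory=dict)


    @dataclass
    class Comment:
        author: str
        username: str
        parent_id: str
        thread_id: str
        children: list[str]
        profile: Profile
        text: str
        guesses: list[dict[str, Any]] = field(default_factory=list)
        reviews: dict[str, Any] = field(default_factory=dict)


    def comment_from_dict(raw: dict[str, Any]) -> Comment:
        """Build a Comment from a raw dict (assumed key layout, see lead-in)."""
        profile_raw = raw["profile"]
        profile = Profile(
            username=profile_raw["username"],
            attributes=dict(profile_raw.get("attributes", {})),
        )
        return Comment(
            author=raw["author"],
            username=raw["username"],
            parent_id=raw["parent_id"],
            thread_id=raw["thread_id"],
            children=list(raw.get("children", [])),
            profile=profile,
            text=raw["text"],
            guesses=list(raw.get("guesses", [])),
            reviews=dict(raw.get("reviews", {})),
        )


    def group_by_thread(comments: list[Comment]) -> dict[str, list[Comment]]:
        """Collect comments under their thread_id, preserving input order."""
        threads: dict[str, list[Comment]] = defaultdict(list)
        for comment in comments:
            threads[comment.thread_id].append(comment)
        return dict(threads)
    ```

    Grouping by `thread_id` is used here because it is the only thread-level key named in the field list; rebuilding the full reply tree would additionally need each comment's own identifier, which the description above does not list.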

    The corresponding attributes and values are

    Attributes

    Age continuous [18-99]: The age of a user in years.

    Place of Birth tuple [city, country]: The place of birth of a user. We create tuples jointly for city and country in free-text format. (field name: birth_city_country)

    Location tuple [city, country]: The current location of a user. We create tuples jointly for city and country in free-text format. (field name: city_country)

    Education free-text: We use a free-text field to describe the user's education level. This includes additional details such as the degree and major. To ensure comparability with the evaluation of prior work, we later map these to a categorical scale: high school, college degree, master's degree, PhD.

    Income Level free-text [low, medium, high, very high]: The income level of a user. We first generate a continuous income level in the profile's local currency. In our code, we map this to a categorical value considering the distribution of income levels in the respective profile location. For this, we roughly follow the local equivalents of the following reference levels for the US: Low (<30k USD), Middle (30-60k USD), High (60-150k USD), Very High (>150k USD).

    Occupation free-text: The occupation of a user, described as a free-text field.

    Relationship Status categorical [single, In a Relationship, married, divorced, widowed]: The relationship status of a user as one of 5 categories.

    Sex categorical [Male, Female]: Biological Sex of a profile.
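    The US reference levels given for Income Level can be illustrated with a small bucketing function. This is only a sketch of the US reference thresholds, not the authors' code, which maps local-currency incomes using each location's income distribution. The function name `income_level_us` is hypothetical, the returned labels follow the bracketed list above ("medium" rather than "Middle"), and the handling of the exact boundary values (30k counts as medium, 60k and 150k count as high) is an assumption, since the stated ranges leave it open.

    ```python
    def income_level_us(annual_income_usd: float) -> str:
        """Map a yearly income in USD to one of the four reference levels.

        Boundary handling is an assumption of this sketch:
        30k -> "medium", 60k -> "high", 150k -> "high".
        """
        if annual_income_usd < 0:
            raise ValueError("income must be non-negative")
        if annual_income_usd < 30_000:
            return "low"
        if annual_income_usd < 60_000:
            return "medium"
        if annual_income_usd <= 150_000:
            return "high"
        return "very high"


    if __name__ == "__main__":
        for income in (12_000, 45_000, 90_000, 200_000):
            print(income, income_level_us(income))
    ```

    For a profile outside the US, the thresholds would first have to be replaced by their local equivalents, which this sketch does not attempt.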

    Dataset Creation

    Curation Rationale

    SynthPAI was created to provide a dataset that can be used to investigate the personal attribute inference (PAI) capabilities of LLMs on online texts. Due to the privacy concerns associated with real-world data, open datasets are rare (non-existent) in the research community. SynthPAI is a synthetic dataset that aims to fill this gap. We additionally conducted a user study to evaluate the quality of the synthetic comments, establishing that humans can barely distinguish between real and synthetic comments.

    Source Data

    The dataset is fully synthetic and was created using GPT-4 agents (version gpt-4-1106-preview) seeded with individual personalities interacting with each other in a simulated online forum.

    Data Collection and Processing

    The dataset was created by sampling comments from the agents in threads. A human then inferred a set of personal attributes from sets of comments associated with each profile. Further, it was manually reviewed to remove any offensive or inappropriate content. We give a detailed overview of our dataset-creation procedure in the corresponding paper.

    Annotations

    Annotations are provided by authors of the paper.

    Personal and Sensitive Information

    All contained personal information is purely synthetic and does not relate to any real individual.

    Bias, Risks, and Limitations

    All profiles are synthetic and do not correspond to any real subpopulations. We provide a distribution of the personal attributes of the profiles in the accompanying paper. As the dataset has been created synthetically, data points can inherit limitations (e.g., biases) from the underlying model, GPT-4. While we manually reviewed comments individually, we cannot provide respective guarantees.

    Citation

    BibTeX:

    @misc{2406.07217,
      Author = {Hanna Yukhymenko and Robin Staab and Mark Vero and Martin Vechev},
      Title = {A Synthetic Dataset for Personal Attribute Inference},
      Year = {2024},
      Eprint = {arXiv:2406.07217},
    }

    APA:

    Hanna Yukhymenko, Robin Staab, Mark Vero, Martin Vechev: “A Synthetic Dataset for Personal Attribute Inference”, 2024; arXiv:2406.07217.

    Dataset Card Authors

    Hanna Yukhymenko, Robin Staab, Mark Vero

  3. Data from: The impact of income, land, and wealth inequality on agricultural...

    • dataone.org
    • datadryad.org
    • +1 more
    Updated Jun 23, 2025
    Cite
    Michele Graziano Ceddia (2025). The impact of income, land, and wealth inequality on agricultural expansion in Latin America [Dataset]. http://doi.org/10.5061/dryad.0sn4046
    Dataset updated
    Jun 23, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Michele Graziano Ceddia
    Time period covered
    Jan 14, 2020
    Area covered
    Latin America
    Description

    Agricultural expansion remains the most prominent proximate cause of tropical deforestation in Latin America, a region characterized by deforestation rates substantially above the world average and extremely high inequality. This paper deploys several multivariate statistical models to test whether different aspects of inequality, within a context of increasing agricultural productivity, promote agricultural expansion (Jevons paradox) or contraction (land-sparing) in 10 Latin American countries over 1990–2010. Here I show the existence of distinct patterns between the instantaneous and the overall (i.e., accounting for temporal lags) effect of increasing agricultural productivity, conditional on the degree of income, land, and wealth inequality. In a context of perfect equality, the instantaneous effect of increases in agricultural productivity is to promote agricultural expansion (Jevons paradox). When temporal lags are accounted for, agricultural productivity appears to be mainly land...
