https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Billionaires Statistics Dataset (2023) is a comprehensive set of personal and business information, including rankings of billionaires worldwide, net assets, industries, businesses, nationalities, birth and residence information, and asset sources.
2) Data Utilization
(1) The Billionaires Statistics Dataset (2023) has the following characteristics:
• The dataset consists of more than 35 columns, including the billionaire's rank, final Worth, industry, country, age, country of residence, source of assets, related industries, citizenship, organization, selfMade, birth information, data collection date, and economic and social indicators (GDP, CPI, education enrollment, life expectancy, tax revenue, population, etc.).
• In addition to individual asset information, economic indicators and demographic data by country are combined, allowing a multidimensional analysis of billionaires and each country's economic and social environment.
(2) The Billionaires Statistics Dataset (2023) can be used for:
• Wealth distribution and industry analysis: Using billionaires' net worth, industry, and national data, one can analyze global wealth concentration and wealth distribution by industry and region.
• Studies linking demographics and economic indicators: Billionaire data can be combined with economic and social indicators such as GDP, CPI, tax revenue, education, and life expectancy for in-depth research on wealth formation, social background, the ratio of self-made to inherited wealth, and regional characteristics.
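The wealth-distribution analysis described above can be sketched in a few lines. This is a minimal illustration, not the dataset publisher's method: the rows are invented, and the key names `finalWorth`, `industry`, and `selfMade` are assumed spellings of the columns named in the description.

```python
from collections import defaultdict

# Hypothetical sample rows. The values are invented for illustration, and the
# key names are assumed spellings of columns listed in the description.
SAMPLE_ROWS = [
    {"finalWorth": 100.0, "industry": "Technology", "selfMade": True},
    {"finalWorth": 50.0, "industry": "Technology", "selfMade": True},
    {"finalWorth": 30.0, "industry": "Retail", "selfMade": False},
    {"finalWorth": 20.0, "industry": "Retail", "selfMade": True},
]


def wealth_share_by_industry(rows):
    """Return each industry's share of the total net worth in `rows`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["industry"]] += row["finalWorth"]
    grand_total = sum(totals.values())
    return {industry: total / grand_total for industry, total in totals.items()}


def self_made_ratio(rows):
    """Return the fraction of rows flagged as self-made."""
    return sum(1 for row in rows if row["selfMade"]) / len(rows)


shares = wealth_share_by_industry(SAMPLE_ROWS)
print(shares)
print(self_made_ratio(SAMPLE_ROWS))
```

With the real file, the same two functions would run over rows loaded with `csv.DictReader`, after converting the worth column to numbers and the self-made flag to booleans.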
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Structure (image): https://imgur.com/AYzsmYU.jpg
I read an article yesterday that got my mind storming. An article published by the World Bank on August 15th, 2022 explains it better; it is quoted below.
I have been working on a project since Feb 2021 that tries to solve this problem; it is listed in my datasets.
This dataset showcases statistics over the past 6-7 decades, covering the production of 150+ unique crops, 50+ livestock elements, and land distribution by usage and population. Aspiring data scientists can try to extract insights that incentivize the optimal use and distribution of natural resources.
Record high food prices have triggered a global crisis that will drive millions more into extreme poverty, magnifying hunger and malnutrition, while threatening to erase hard-won gains in development. The war in Ukraine, supply chain disruptions, and the continued economic fallout of the COVID-19 pandemic are reversing years of development gains and pushing food prices to all-time highs. Rising food prices have a greater impact on people in low- and middle-income countries, since they spend a larger share of their income on food than people in high-income countries. This brief looks at rising food insecurity and World Bank responses to date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘World's Billionaires’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/seriadiallo1/world-billionaires on 30 September 2021.
--- Dataset description provided by original source is as follows ---
This dataset contains 200 rows and 7 columns.
The World's Billionaires is an annual ranking by documented net worth of the world's wealthiest billionaires compiled and published in March annually by the American business magazine Forbes. The list was first published in March 1987. The total net worth of each individual on the list is estimated and is cited in United States dollars, based on their documented assets and accounting for debt. Royalty and dictators whose wealth comes from their positions are excluded from these lists. This ranking is an index of the wealthiest documented individuals, excluding and ranking against those with wealth that is not able to be completely ascertained. (wikipedia)
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
- Country: Name of the country.
- Density (P/Km2): Population density measured in persons per square kilometer.
- Abbreviation: Abbreviation or code representing the country.
- Agricultural Land (%): Percentage of land area used for agricultural purposes.
- Land Area (Km2): Total land area of the country in square kilometers.
- Armed Forces Size: Size of the armed forces in the country.
- Birth Rate: Number of births per 1,000 population per year.
- Calling Code: International calling code for the country.
- Capital/Major City: Name of the capital or major city.
- CO2 Emissions: Carbon dioxide emissions in tons.
- CPI: Consumer Price Index, a measure of inflation and purchasing power.
- CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
- Currency_Code: Currency code used in the country.
- Fertility Rate: Average number of children born to a woman during her lifetime.
- Forested Area (%): Percentage of land area covered by forests.
- Gasoline_Price: Price of gasoline per liter in local currency.
- GDP: Gross Domestic Product, the total value of goods and services produced in the country.
- Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
- Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
- Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
- Largest City: Name of the country's largest city.
- Life Expectancy: Average number of years a newborn is expected to live.
- Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
- Minimum Wage: Minimum wage level in local currency.
- Official Language: Official language(s) spoken in the country.
- Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
- Physicians per Thousand: Number of physicians per thousand people.
- Population: Total population of the country.
- Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
- Tax Revenue (%): Tax revenue as a percentage of GDP.
- Total Tax Rate: Overall tax burden as a percentage of commercial profits.
- Unemployment Rate: Percentage of the labor force that is unemployed.
- Urban Population: Percentage of the population living in urban areas.
- Latitude: Latitude coordinate of the country's location.
- Longitude: Longitude coordinate of the country's location.
Potential Use Cases
- Analyze population density and land area to study spatial distribution patterns.
- Investigate the relationship between agricultural land and food security.
- Examine carbon dioxide emissions and their impact on climate change.
- Explore correlations between economic indicators such as GDP and various socio-economic factors.
- Investigate educational enrollment rates and their implications for human capital development.
- Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
- Study labor market dynamics through indicators such as labor force participation and unemployment rates.
- Investigate the role of taxation and its impact on economic development.
- Explore urbanization trends and their social and environmental consequences.
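As a small illustration of the first use case, the sketch below derives population density from population and land area and ranks countries by it. The rows and their values are hypothetical; only the key names mirror the columns listed under Key Features.

```python
# Hypothetical rows with invented values; the keys mirror column names
# from the Key Features list above.
COUNTRIES = [
    {"Country": "A", "Population": 1_000_000, "Land Area (Km2)": 10_000},
    {"Country": "B", "Population": 500_000, "Land Area (Km2)": 50_000},
    {"Country": "C", "Population": 2_000_000, "Land Area (Km2)": 8_000},
]


def density(row):
    """Persons per square kilometer, derived from population and land area."""
    return row["Population"] / row["Land Area (Km2)"]


# Rank countries from most to least densely populated.
ranked = sorted(COUNTRIES, key=density, reverse=True)
for row in ranked:
    print(row["Country"], round(density(row), 1))
```

Comparing a derived value like this with the dataset's own Density (P/Km2) column is also a quick consistency check on the data.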
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This layer was created by Duncan Smith and based on work by the European Commission JRC and CIESIN. A description from his website follows:
A brilliant new dataset produced by the European Commission JRC and CIESIN Columbia University was recently released: the Global Human Settlement Layer (GHSL). This is the first time that detailed and comprehensive population density and built-up area for the world has been available as open data. As usual, my first thought was to make an interactive map, now online at http://luminocity3d.org/WorldPopDen/
The World Population Density map is exploratory, as the dataset is very rich and new, and I am also testing out new methods for navigating statistics at both national and city scales on this site. There are clearly many applications of this data in understanding urban geographies at different scales, urban development, sustainability and change over time.
ICT has profound implications for education, both because ICT can facilitate new forms of learning and because it has become important for young people to master ICT in preparation for adult life. But how extensive is access to ICT in schools and informal settings and how is it used by students? Drawing on data from the OECD’s Programme for International Student Assessment (PISA), Are Students Ready for a Technology-Rich World? What PISA Studies Tell Us, examines whether access to computers for students is equitable across countries and student groups; how students use ICT and what their attitudes are towards ICT; the relationship between students’ access to and use of ICT and their performance in PISA 2003; and the implications for educational policy.
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Hindi Healthcare Chat Dataset is a rich collection of over 12,000 text-based conversations between customers and call center agents, focused on real-world healthcare interactions. Designed to reflect authentic language use and domain-specific dialogue patterns, this dataset supports the development of conversational AI, chatbots, and NLP models tailored for healthcare applications in Hindi-speaking regions.
The dataset captures a wide spectrum of healthcare-related chat scenarios, ensuring comprehensive coverage for training robust AI systems:
This variety helps simulate realistic healthcare support workflows and patient-agent dynamics.
This dataset reflects the natural flow of Hindi healthcare communication and includes:
These elements ensure the dataset is contextually relevant and linguistically rich for real-world use cases.
Conversations range from simple inquiries to complex advisory sessions, including:
Each conversation typically includes these structural components:
This structured flow mirrors actual healthcare support conversations and is ideal for training advanced dialogue systems.
Available in JSON, CSV, and TXT formats, each conversation includes:
https://creativecommons.org/publicdomain/zero/1.0/
These graphs are from Our World in Data:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F00b0f9cc2bd8326c60fd0ea3b5dbe4b7%2Finequality.png?generation=1710013947537354&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F1978511abe249d3081a3a95bae2ef7d5%2Fincome-share-top-1-before-tax-wid-extrapolations.png?generation=1710013977201099&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F2a5a54725f65801ba75b6ab07bc5cb9f%2Fincome-share-top-1-before-tax-wid-extrapolations%20(1).png?generation=1710013994341360&alt=media
How are incomes and wealth distributed between people? Both within countries and across the world as a whole?
On this page, you can find all our data, visualizations, and writing relating to economic inequality.
This evidence demonstrates that inequality in many countries is substantial and, in numerous instances, has been escalating. Global economic inequality is extensive and exacerbated by intersecting disparities in health, education, and various other dimensions.
However, economic inequality is not uniformly increasing. In many countries, it has declined or remained steady. Furthermore, global inequality – following two centuries of ascent – is presently decreasing as well.
The significant variations observed across countries and over time are pivotal. They indicate that high and rising inequality is not inevitable and that the current extent of inequality is subject to change.
About this data This data explorer offers various inequality indicators measured according to two distinct definitions of income sourced from different outlets.
Data from the World Inequality Database pertains to inequality prior to taxes and benefits. Data from the World Bank pertains to either income post taxes and benefits or consumption, contingent on the country and year. For additional details regarding the definitions and methodologies underlying this data, refer to the accompanying article below, where you can also delve into and juxtapose a broader spectrum of indicators from various sources.
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Punjabi Healthcare Chat Dataset is a rich collection of over 12,000 text-based conversations between customers and call center agents, focused on real-world healthcare interactions. Designed to reflect authentic language use and domain-specific dialogue patterns, this dataset supports the development of conversational AI, chatbots, and NLP models tailored for healthcare applications in Punjabi-speaking regions.
The dataset captures a wide spectrum of healthcare-related chat scenarios, ensuring comprehensive coverage for training robust AI systems:
This variety helps simulate realistic healthcare support workflows and patient-agent dynamics.
This dataset reflects the natural flow of Punjabi healthcare communication and includes:
These elements ensure the dataset is contextually relevant and linguistically rich for real-world use cases.
Conversations range from simple inquiries to complex advisory sessions, including:
Each conversation typically includes these structural components:
This structured flow mirrors actual healthcare support conversations and is ideal for training advanced dialogue systems.
Available in JSON, CSV, and TXT formats, each conversation includes:
https://crawlfeeds.com/privacy_policy
The Celebrity Net Worth Dataset offers an in-depth look at the estimated financial assets and wealth of global celebrities, extracted from CelebrityNetWorth.com by Crawl Feeds. This dataset provides the latest available financial data as of January 31, 2022, making it a valuable resource for analyzing the earnings, investments, and overall wealth of prominent figures in various industries such as entertainment, sports, music, and more.
For access to more updated celebrity net worth datasets, reach out to the Crawl Feeds team for further assistance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Besides far-reaching public health consequences, the COVID-19 pandemic had a significant psychological impact on people around the world. To gain further insight into this matter, we introduce the Real World Worry Waves Dataset (RW3D). The dataset combines rich open-ended free-text responses with survey data on emotions, significant life events, and psychological stressors in a repeated-measures design in the UK over three years (2020: n = 2441, 2021: n = 1716 and 2022: n = 1152). This paper provides background information on the data collection procedure, the recorded variables, participants’ demographics, and higher-order psychological and text-derived variables that emerged from the data. The RW3D is a unique primary data resource that could inspire new research questions on the psychological impact of the pandemic, especially those that connect modalities (here: text data, psychological survey variables and demographics) over time.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LAC is the most water-rich region in the world by most metrics; however, water resource distribution throughout the region does not correspond to demand. To understand water risk throughout the region, this dataset provides population and land area estimates for factors related to water risk, allowing users to explore vulnerability throughout the region to multiple dimensions of water risk. This dataset contains estimates of populations living in areas of water stress and risk in 27 countries in Latin America and the Caribbean (LAC) at the municipal level. The dataset contains categories of 18 factors related to water risk and 39 indices of water risk and population estimates within each, with aggregations possible at the basin, state, country, and regional level. The population data used to generate this dataset were obtained from the WorldPop project 2020 UN-adjusted population projections, while estimates of water stress and risk come from WRI’s Aqueduct 3.0 Water Risk Framework. Municipal administrative boundaries are from the Database of Global Administrative Areas (GADM). For more information on the methodology, users are invited to read IADB Technical Note IDB-TN-2411: “Scarcity in the Land of Plenty”, and WRI’s “Aqueduct 3.0: Updated Decision-relevant Global Water Risk Indicators”.
This data package contains data on key health, education, nutrition, and population statistics gathered from different international sources.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset was synthetically generated to simulate ride-sharing pricing dynamics. It includes features such as Distance, Time of Day, Demand, Weather, Base Price, Weather Multiplier, and Final Price. The dataset aims to model real-world scenarios for ride-sharing services, providing a rich resource for machine learning, data analysis, and predictive modeling tasks.
Note : "This dataset is static and will not be updated regularly."
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
The Mapping Ocean Wealth data viewer is a live online resource for sharing understanding of the value of marine and coastal ecosystems to people. It includes global maps, regionally-specific studies, reference data, and a number of “apps” providing key data analytics. Maps and apps can be opened according to key themes or geographies. The navigator to the left of the maps enables you to add or remove any additional map layers as you explore. Information keys explain how the maps were made and provide additional links. Further information and resources can be found on Oceanwealth.org.
https://datacatalog.worldbank.org/public-licenses?fragment=cc
Using national income and expenditure distribution data from 119 countries, the authors decompose total income inequality between the individuals in the world, by continent and by "region" (countries grouped by income level). They use a Gini decomposition that allows for an exact breakdown (without a residual term) of the overall Gini by recipients. Looking first at income inequality in income between countries is more important than inequality within countries. Africa, Latin America, and Western Europe and North America are quite homogeneous continents, with small differences between countries (so that most of the inequality on these continents is explained by inequality within countries). Next the authors divide the world into three groups: the rich G7 countries (and those with similar income levels), the less developed countries (those with per capita income less than or equal to Brazil's), and the middle-income countries (those with per capita income between Brazil's and Italy's). They find little overlap between such groups: very few people in developing countries have incomes in the range of those in the rich countries.
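For readers unfamiliar with the underlying measure, the sketch below computes a plain Gini coefficient from a list of incomes. It is only the standard overall index, not the authors' exact decomposition by recipients, and the function name and sample incomes are invented for illustration.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality).

    Uses the rank-based formula on sorted values x_1 <= ... <= x_n:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini([1, 1, 1, 1]))  # everyone earns the same
print(gini([0, 0, 0, 1]))  # one person holds all income
```

Decomposing this index between and within groups of countries, as the authors do, requires additional terms that this sketch does not attempt.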
First-person video dataset recorded in daily life situations of 17 participants, annotated by themselves for privacy sensitivity. The dataset of Steil et al. contains more than 90 hours of data recorded continuously from 20 participants (six females, aged 22-31) over more than four hours each. Participants were students with different backgrounds and subjects with normal or corrected-to-normal vision. During the recordings, participants roamed a university campus and performed their everyday activities, such as meeting people, eating, or working as they normally would on any day at the university. To obtain some data from multiple, and thus also “privacy-sensitive”, places on the university campus, participants were asked to not stay in one place for more than 30 minutes. Participants were further asked to stop the recording after about one and a half hours so that the laptop’s battery packs could be changed and the eye tracker re-calibrated. This yielded three recordings of about 1.5 hours per participant. Participants regularly interacted with a mobile phone provided to them and were also encouraged to use their own laptop, desktop computer, or music player if desired. The dataset thus covers a rich set of representative real-world situations, including sensitive environments and tasks. The data is only to be used for non-commercial scientific purposes.
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Vietnamese Healthcare Chat Dataset is a rich collection of over 10,000 text-based conversations between customers and call center agents, focused on real-world healthcare interactions. Designed to reflect authentic language use and domain-specific dialogue patterns, this dataset supports the development of conversational AI, chatbots, and NLP models tailored for healthcare applications in Vietnamese-speaking regions.
The dataset captures a wide spectrum of healthcare-related chat scenarios, ensuring comprehensive coverage for training robust AI systems:
This variety helps simulate realistic healthcare support workflows and patient-agent dynamics.
This dataset reflects the natural flow of Vietnamese healthcare communication and includes:
These elements ensure the dataset is contextually relevant and linguistically rich for real-world use cases.
Conversations range from simple inquiries to complex advisory sessions, including:
Each conversation typically includes these structural components:
This structured flow mirrors actual healthcare support conversations and is ideal for training advanced dialogue systems.
Available in JSON, CSV, and TXT formats, each conversation includes:
Advani, Hughson and Tarrant (2021) model the revenue that could be raised from an annual and a one-off wealth tax of the design recommended by Advani, Chamberlain and Summers in the Wealth Tax Commission’s Final Report (2020). This deposit contains the code required to replicate the revenue modelling and distributional analysis. The modelling draws on data from the Wealth and Assets Survey, supplemented with the Sunday Times Rich List, which we use to implement a Pareto correction for the under-coverage of wealth at the top.
Around the world, the unprecedented public spending required to tackle COVID-19 will inevitably be followed by a debate about how to rebuild public finances. At the same time, politicians in many countries are already facing far-reaching questions from their electorates about the widening cracks in the social fabric that this pandemic has exposed, as prior inequalities become amplified and public services are stretched to their limits. These simultaneous shocks to national politics inevitably encourage people to 'think big' on tax policy. Even before the current crisis there were widespread calls for reforms to the taxation of wealth in the UK. These proposals have so far focused on reforming existing taxes. However, other countries have begun to raise the idea of introducing a 'wealth tax': a new tax on ownership of wealth (net of debt). COVID-19 has rapidly pushed this idea higher up political agendas around the world, but existing studies fall a long way short of providing policymakers with a comprehensive blueprint for whether and how to introduce a wealth tax. Critics point to a number of legitimate issues that would need to be addressed. Would it be fair, and would the public support it? Is this type of tax justified from an economic perspective? How would you stop the wealthiest from hiding their assets? Will they all simply leave? How can you value some assets? 
What happens to people who own lots of wealth, but have little income with which to pay a wealth tax? And if wealth taxes are such a good idea, why have many countries abandoned them? These are important questions, without straightforward answers. The UK government last considered a wealth tax in the mid-1970s. This was also the last time that academics and policymakers in the UK thought seriously about how such a tax could be implemented. Over the past half century, much has changed in the mobility of people, the structure of our tax system, the availability of data, and the scope for digital solutions and coordination between tax authorities. Old plans therefore cannot be pulled 'off the shelf'. This project will evaluate whether a wealth tax for the UK would be desirable and deliverable. We will address the following three main research questions: (1) Is a wealth tax justified in principle, on economic or other grounds? (2) How should a wealth tax be designed, including definition of the tax base and solutions to administrative challenges such as valuation and liquidity? (3) What would be the revenue and distributional effects of a wealth tax in the UK, for a variety of design options and at specified rates/thresholds? To answer these questions, we will draw on a network of world-leading experts on tax policy from across academia, policy spheres, and legal practice. We will examine international experience, synthesising a large body of existing research originating in countries that already have (or have had) a wealth tax. We will add to these resources through novel research that draws on adjacent fields and disciplines to craft new solutions to the practical problems faced in delivering a wealth tax. We will also review common objections to a wealth tax. These new insights will be published in a series of 'evidence papers' made available directly to the public and policymakers. 
We will also publish a final report that states key recommendations for government and (if appropriate) delivers a 'ready to legislate' design for a wealth tax. We will not recommend specific rates or thresholds for the tax. Instead, we will create an online 'tax simulator' so that policymakers and members of the public can model the revenue and distributional effects of different options. We will also work with international partners to inform debates about wealth taxes in other countries. The modelling draws on data from the Wealth and Assets Survey, supplemented with the Sunday Times Rich List, which we use to implement a Pareto correction for the under-coverage of wealth at the top.
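The description mentions a Pareto correction for the under-coverage of top wealth but does not spell out the procedure. As a rough illustration only, and not the deposit's actual code, the sketch below fits a Pareto tail above a chosen threshold with the standard maximum-likelihood estimator and reports the implied mean wealth in the tail. The function names, threshold, and sample values are hypothetical.

```python
import math


def pareto_alpha_mle(values, x_min):
    """Maximum-likelihood estimate of the Pareto shape for values >= x_min.

    alpha = n / sum(ln(x_i / x_min)), taken over the n tail observations.
    """
    tail = [v for v in values if v >= x_min]
    log_sum = sum(math.log(v / x_min) for v in tail)
    if not tail or log_sum <= 0:
        raise ValueError("need tail observations strictly above x_min")
    return len(tail) / log_sum


def pareto_tail_mean(alpha, x_min):
    """Mean of a Pareto distribution with shape alpha > 1 and scale x_min."""
    if alpha <= 1:
        raise ValueError("the mean is infinite when alpha <= 1")
    return alpha * x_min / (alpha - 1)


# Invented sample: four observations above the threshold and one below it.
x_min = 10.0
sample = [x_min * math.exp(0.5)] * 4 + [5.0]
alpha = pareto_alpha_mle(sample, x_min)
print(alpha, pareto_tail_mean(alpha, x_min))
```

In practice the threshold choice and the way survey and rich-list observations are combined matter a great deal; those choices are documented in the deposit itself rather than here.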
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Spanish Healthcare Chat Dataset is a rich collection of over 10,000 text-based conversations between customers and call center agents, focused on real-world healthcare interactions. Designed to reflect authentic language use and domain-specific dialogue patterns, this dataset supports the development of conversational AI, chatbots, and NLP models tailored for healthcare applications in Spanish-speaking regions.
The dataset captures a wide spectrum of healthcare-related chat scenarios, ensuring comprehensive coverage for training robust AI systems:
This variety helps simulate realistic healthcare support workflows and patient-agent dynamics.
This dataset reflects the natural flow of Spanish healthcare communication and includes:
These elements ensure the dataset is contextually relevant and linguistically rich for real-world use cases.
Conversations range from simple inquiries to complex advisory sessions, including:
Each conversation typically includes these structural components:
This structured flow mirrors actual healthcare support conversations and is ideal for training advanced dialogue systems.
Available in JSON, CSV, and TXT formats, each conversation includes: