100+ datasets found

Rankings of Countries Dataset
kaggle.com
Updated Jul 17, 2023
Cite
Shuv😈 (2023). Rankings of Countries Dataset [Dataset]. https://www.kaggle.com/datasets/shuvammandal121/global-country-rankings-dataset
Dataset updated
Jul 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shuv😈
Description
Content

The "Global Country Rankings Dataset" is a comprehensive collection of metrics and indicators that ranks countries worldwide based on their socioeconomic performance. The dataset provides valuable insights into the relative standings of nations in terms of key factors such as GDP per capita, economic growth, and various other relevant criteria.

Researchers, analysts, and policymakers can leverage this dataset to gain a deeper understanding of the global economic landscape and track the progress of countries over time. The dataset covers a wide range of metrics, including but not limited to:

Economic growth (the rate of change of real GDP) - Country rankings: The average for 2021 based on 184 countries was 5.26 percent. The highest value was in the Maldives: 41.75 percent and the lowest value was in Afghanistan: -20.74 percent. The indicator is available from 1961 to 2021.

GDP per capita, Purchasing Power Parity - Country rankings: The average for 2021 based on 182 countries was 21283.21 U.S. dollars. The highest value was in Luxembourg: 115683.49 U.S. dollars and the lowest value was in Burundi: 705.03 U.S. dollars. The indicator is available from 1990 to 2021.

GDP per capita, current U.S. dollars - Country rankings: The average for 2021 based on 186 countries was 17937.03 U.S. dollars. The highest value was in Monaco: 234315.45 U.S. dollars and the lowest value was in Burundi: 221.48 U.S. dollars. The indicator is available from 1960 to 2021.

GDP per capita, constant 2010 dollars - Country rankings: The average for 2021 based on 184 countries was 15605.8 U.S. dollars. The highest value was in Monaco: 204190.16 U.S. dollars and the lowest value was in Burundi: 261.02 U.S. dollars. The indicator is available from 1960 to 2021.

source: https://www.theglobaleconomy.com/
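
The economic growth indicator in this listing is described as the rate of change of real GDP. A minimal sketch of that calculation follows, assuming two consecutive annual real GDP values in constant prices. The function name and the sample figures are illustrative assumptions and are not taken from the dataset.

```python
def real_gdp_growth(previous: float, current: float) -> float:
    """Return the percent change in real GDP between two consecutive periods."""
    if previous == 0:
        raise ValueError("previous-period GDP must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical figures: real GDP rising from 200 to 210 (same constant-price units).
growth = real_gdp_growth(200.0, 210.0)
print(f"Economic growth: {growth:.2f} percent")
```

A negative result indicates contraction, which is how a value such as the -20.74 percent quoted for Afghanistan would arise.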
Dataset of books called The biggest mining village in the world : a social...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called The biggest mining village in the world : a social history of Ashington [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+biggest+mining+village+in+the+world+%3A+a+social+history+of+Ashington
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is The biggest mining village in the world : a social history of Ashington. It features 7 columns including author, publication date, language, and book publisher.
Geonames - All Cities with a population > 1000
public.opendatasoft.com
data.smartidf.services
+2 more
csv, excel, geojson +1
Updated Mar 10, 2024
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
Available download formats: csv, json, geojson, excel
Dataset updated
Mar 10, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).
Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
List_of_countries_by_population_in_1800
kaggle.com
zip
Updated Jul 17, 2020
+ more versions
Cite
Mathurin Aché (2020). List_of_countries_by_population_in_1800 [Dataset]. https://www.kaggle.com/datasets/mathurinache/list-of-countries-by-population-in-1800
Available download formats: zip (355 bytes)
Dataset updated
Jul 17, 2020
Authors
Mathurin Aché
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_population_in_1800.
Employee Data | The Largest Dataset Of Active Profiles | Global / 1B Records...
datarade.ai
.json
Updated Apr 19, 2025
Cite
Avanteer (2025). Employee Data | The Largest Dataset Of Active Profiles | Global / 1B Records / Updated Daily [Dataset]. https://datarade.ai/data-products/employee-data-the-largest-dataset-of-active-profiles-glob-avanteer
Available download formats: .json
Dataset updated
Apr 19, 2025
Dataset authored and provided by
Avanteer
Area covered
Fiji, State of, Anguilla, Tunisia, Bulgaria, United Arab Emirates, Pitcairn, Gambia, Nicaragua, Maldives
Description
//// 🌍 Avanteer Employee Data ////

The Largest Dataset of Active Global Profiles: 1B+ Records | Updated Daily | Built for Scale & Accuracy

Avanteer’s Employee Data offers unparalleled access to the world’s most comprehensive dataset of active professional profiles. Designed for companies building data-driven products or workflows, this resource supports recruitment, lead generation, enrichment, and investment intelligence — with unmatched scale and update frequency.

//// 🔧 What You Get ////

1B+ active profiles across industries, roles, and geographies

Work history, education history, languages, skills and multiple additional datapoints.

AI-enriched datapoints include: gender, age, normalized seniority, normalized department, normalized skillset, and MBTI assessment.

Daily updates, with change-tracking fields to capture job changes, promotions, and new entries.

Flexible delivery via API, S3, or flat file.

Choice of formats: raw, cleaned, or AI-enriched.

Built-in compliance aligned with GDPR and CCPA.

//// 💡 Key Use Cases ////

✅ Smarter Talent Acquisition: Identify, enrich, and engage high-potential candidates using up-to-date global profiles.

✅ B2B Lead Generation at Scale: Build prospecting lists with confidence using job-related and firmographic filters to target decision-makers across verticals.

✅ Data Enrichment for SaaS & Platforms: Supercharge ATS, CRMs, or HR tech products by syncing enriched, structured employee data through real-time or batch delivery.

✅ Investor & Market Intelligence: Analyze team structures, hiring trends, and senior leadership signals to discover early-stage investment opportunities or evaluate portfolio companies.

//// 🧰 Built for Top-Tier Teams Who Move Fast ////

Zero duplicates, by design

<300ms API response time

99.99% guaranteed API uptime

Onboarding support including data samples, test credits, and consultations

Advanced data quality checks

//// ✅ Why Companies Choose Avanteer ////

➔ The largest daily-updated dataset of global professional profiles

➔ Trusted by sales, HR, and data teams building at enterprise scale

➔ Transparent, compliant data collection with opt-out infrastructure baked in

➔ Dedicated support with fast onboarding and hands-on implementation help

////////////////////////////////

Empower your team with reliable, current, and scalable employee data — all from a single source.
MoreFixes: Largest CVE dataset with fixes
zenodo.org
zip
Updated Sep 25, 2024
+ more versions
Cite
Jafar Akhoundali; Sajad Rahim Nouri; Kristian F. D. Rietveld; Olga GADYATSKAYA (2024). MoreFixes: Largest CVE dataset with fixes [Dataset]. http://doi.org/10.5281/zenodo.11199120
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.11199120
Dataset updated
Sep 25, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jafar Akhoundali; Sajad Rahim Nouri; Kristian F. D. Rietveld; Olga GADYATSKAYA
Description
In our work, we have designed and implemented a novel workflow with several heuristic methods to combine state-of-the-art methods related to CVE fix commits gathering. As a consequence of our improvements, we have been able to gather the largest programming language-independent real-world dataset of CVE vulnerabilities with the associated fix commits.
Our dataset containing 26,617 unique CVEs coming from 6,945 unique GitHub projects is, to the best of our knowledge, by far the biggest CVE vulnerability dataset with fix commits available today. These CVEs are associated with 31,883 unique commits that fixed those vulnerabilities. Compared to prior work, our dataset brings about a 397% increase in CVEs, a 295% increase in covered open-source projects, and a 480% increase in commit fixes.
Our larger dataset thus substantially improves over the current real-world vulnerability datasets and enables further progress in research on vulnerability detection and software security. We used the NVD (nvd.nist.gov) and the GitHub Security Advisory Database as the main sources of our pipeline.

We release to the community a 14GB PostgreSQL database that contains information on CVEs up to January 24, 2024, CWEs of each CVE, files and methods changed by each commit, and repository metadata.
Additionally, patch files related to the fix commits are available as a separate package. Furthermore, we make our dataset collection tool also available to the community.

The `cvedataset-patches.zip` file contains fix patches, and `dump_morefixes_27-03-2024_19_52_58.sql.zip` contains a PostgreSQL dump of fixes, together with several other fields such as CVEs, CWEs, repository metadata, commit data, file changes, methods changed, etc.

The MoreFixes data-storage strategy is based on CVEFixes to store CVE fix commits from open-source repositories, and uses a modified version of Prospector (part of ProjectKB from SAP) as a module to detect the fix commits of a CVE. Our full methodology is presented in the paper titled "MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery", which will be published in the PROMISE conference (2024).

For more information about usage and sample queries, visit the GitHub repository: https://github.com/JafarAkhondali/Morefixes

If you are using this dataset, please be aware that the repositories we mined are under different licenses and you are responsible for handling any licensing issues. The same applies to CVEFixes.

This product uses the NVD API but is not endorsed or certified by the NVD.

This research was partially supported by the Dutch Research Council (NWO) under the project NWA.1215.18.008 Cyber Security by Integrated Design (C-SIDe).

To restore the dataset, you can use the docker-compose file available in the GitHub repository. Default credentials after restoring the dump:

POSTGRES_USER=postgrescvedumper
POSTGRES_DB=postgrescvedumper
POSTGRES_PASSWORD=a42a18537d74c3b7e584c769152c3d
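
As a hedged illustration, the default credentials listed above could be assembled into a PostgreSQL connection URI once the dump has been restored. The host (`localhost`) and port (`5432`) are assumptions, since the listing does not state them; the actual values depend on the docker-compose file in the repository. The helper name `build_postgres_uri` is hypothetical, and the sketch uses only the Python standard library.

```python
from urllib.parse import quote

# Default credentials as quoted in the dataset description.
POSTGRES_USER = "postgrescvedumper"
POSTGRES_DB = "postgrescvedumper"
POSTGRES_PASSWORD = "a42a18537d74c3b7e584c769152c3d"


def build_postgres_uri(host: str = "localhost", port: int = 5432) -> str:
    """Build a postgresql:// URI; host and port are assumed, not documented values."""
    user = quote(POSTGRES_USER, safe="")
    password = quote(POSTGRES_PASSWORD, safe="")
    database = quote(POSTGRES_DB, safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


print(build_postgres_uri())
```

The resulting string can be passed to any client that accepts libpq-style connection URIs. For sample queries, the GitHub repository remains the authoritative source.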
Largest Glaciers and Glacier Complexes in the World, Version 1 - Dataset -...
data.nasa.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
Updated Apr 1, 2025
+ more versions
Cite
data.nasa.gov (2025). Largest Glaciers and Glacier Complexes in the World, Version 1 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/largest-glaciers-and-glacier-complexes-in-the-world-version-1-75379
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Area covered
World
Description
This data set provides a list of the three largest glaciers and glacier complexes in each of the 19 glacial regions of the world as defined by the Global Terrestrial Network for Glaciers. The data are provided in shapefile format with an outline for each of the largest ice bodies along with a number of attributes such as area in km2.
Dataset of books called The largest life-boats in the world : a history of...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called The largest life-boats in the world : a history of the 60ft Barnett class twin screw motor life-boats [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+largest+life-boats+in+the+world+%3A+a+history+of+the+60ft+Barnett+class+twin+screw+motor+life-boats
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
World
Description
This dataset is about books. It has 1 row and is filtered where the book is The largest life-boats in the world : a history of the 60ft Barnett class twin screw motor life-boats. It features 7 columns including author, publication date, language, and book publisher.
White Earth, ND Population Breakdown by Gender and Age Dataset: Male and...
neilsberg.com
csv, json
Updated Feb 19, 2024
+ more versions
Cite
Neilsberg Research (2024). White Earth, ND Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/8e8e96eb-c989-11ee-9145-3860777c1fe6/
Available download formats: json, csv
Dataset updated
Feb 19, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
White Earth, North Dakota
Variables measured
Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of White Earth by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for White Earth. The dataset can be used to understand the population distribution of White Earth by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in White Earth. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across each age group for White Earth.

Key observations

Largest age group (population): Male # 10-14 years (17) | Female # 40-44 years (13). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Scope of gender :

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

Variables / Data Columns

Age Group: This column displays the age group for the White Earth population analysis. A total of 18 values are expected, as defined above in the age groups section.

Population (Male): This column shows the male population in White Earth.

Population (Female): This column shows the female population in White Earth.

Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in White Earth for each age group.
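
The Gender Ratio column is defined above as the number of males per 100 females. A minimal sketch of that calculation follows; the function name and the sample counts are illustrative assumptions and are not taken from the dataset. Returning `None` for an age group with no females is likewise an assumption, since the listing does not say how such groups are handled.

```python
from typing import Optional


def gender_ratio(males: int, females: int) -> Optional[float]:
    """Return the number of males per 100 females, or None if there are no females."""
    if males < 0 or females < 0:
        raise ValueError("population counts must be non-negative")
    if females == 0:
        return None  # ratio is undefined when the female count is zero
    return males * 100.0 / females


# Hypothetical age group with 12 males and 10 females.
print(gender_ratio(12, 10))
```

For very small populations such as this one, the ratio can swing widely from one age group to the next, which is one reason the margin-of-error caveat matters.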

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for White Earth Population by Gender.
NCEI Standard Product: World Ocean Database (WOD)
catalog.data.gov
data.cnra.ca.gov
+1 more
Updated Feb 1, 2024
+ more versions
Cite
(Point of Contact) (2024). NCEI Standard Product: World Ocean Database (WOD) [Dataset]. https://catalog.data.gov/dataset/ncei-standard-product-world-ocean-database-wod3
Dataset updated
Feb 1, 2024
Dataset provided by
(Point of Contact)
Description
The World Ocean Database (WOD) is the world's largest publicly available uniform format quality controlled ocean profile dataset. Ocean profile data are sets of measurements of an ocean variable vs. depth at a single geographic location within a short (minutes to hours) temporal period in some portion of the water column from the surface to the bottom. To be considered a profile for the WOD, there must be more than a single depth/variable pair. Multiple profiles at the same location from the same set of instruments constitute an oceanographic cast. Ocean variables in the WOD include temperature, salinity, oxygen, nutrients, tracers, and biological variables such as plankton and chlorophyll. Quality control procedures are documented and performed on each cast and the results are included as flags on each measurement. The WOD contains the data on the originally measured depth levels (observed) and also interpolated to standard depth levels to present a more uniform set of iso-surfaces for oceanographic and climate work. The source of the WOD is more than 20,000 separate archived datasets contributed by institutions, projects, government agencies, and individual investigators from the United States and around the world. Each dataset is available in its original form in the National Centers for Environmental Information data archives. All datasets are converted to the same standard format, checked for duplication within the WOD, and assigned quality flags based on objective tests. Additional subjective flags are set upon calculation of ocean climatological mean fields which make up the World Ocean Atlas (WOA) series. The WOD consists of periodic major releases and quarterly updates to those releases. Each major release is associated with a concurrent WOA release, and contains final quality control flags used in the WOA, which includes manual as well as automated steps.
Each quarterly update release includes additional historical and recent data and preliminary quality control. The latest major release was WOD 2018 (WOD18), which includes nearly 16 million oceanographic casts, from the second voyage of Captain Cook (1772) to the modern Argo floats (end of 2017). The WOD presents data in netCDF ragged array format following the Climate and Forecast (CF) conventions for ease of use, mindful of space limitations.
Dataset of books series that contain The Millennium Development Goals (MDGs)...
workwithdata.com
Updated Nov 25, 2024
Cite
Work With Data (2024). Dataset of books series that contain The Millennium Development Goals (MDGs) : A Short History of the World’s Biggest Promise [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=The+Millennium+Development+Goals+(MDGs)+:+A+Short+History+of+the+World%E2%80%99s+Biggest+Promise&j=1&j0=books
Dataset updated
Nov 25, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book series. It has 1 row and is filtered where the book is The Millennium Development Goals (MDGs) : A Short History of the World’s Biggest Promise. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Dataset of book subjects where books equals The largest theatre in the world...
workwithdata.com
Updated Jul 2, 2024
Cite
Work With Data (2024). Dataset of book subjects where books equals The largest theatre in the world : thirty years of television drama [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=book&fop0=%3D&fval0=The+largest+theatre+in+the+world+%3A+thirty+years+of+television+drama
Dataset updated
Jul 2, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book subjects. It has 2 rows and is filtered where the book is The largest theatre in the world : thirty years of television drama. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
‘China Largest Companies’ analyzed by Analyst-2
analyst-2.ai
Updated Apr 2, 2017
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2017). ‘China Largest Companies’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-china-largest-companies-5855/latest
Dataset updated
Apr 2, 2017
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
China
Description
Analysis of ‘China Largest Companies’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/china-largest-companiese on 28 January 2022.

--- Dataset description provided by original source is as follows ---

About this dataset

From the Forbes Global 2000 list, last updated in May 2013. Forbes publishes an annual list of the world's 2000 largest publicly listed corporations. The Forbes Global 2000 weighs sales, profits, assets and market value equally so companies can be ranked by size. Figures for all companies are in US dollars.

Source: Economy Watch

This dataset was created by Finance and contains around 100 samples along with Profits ($billion), Market Value ($billion), technical information and other features such as Sales ($billion), Assets ($billion), and more.

How to use this dataset

Analyze Global Rank in relation to Profits ($billion)

Study the influence of Market Value ($billion) on Sales ($billion)

Acknowledgements

If you use this dataset in your research, please credit Finance

--- Original source retains full ownership of the source dataset ---
Global Country Information 2023
zenodo.org
data.niaid.nih.gov
csv
Updated Jun 15, 2024
+ more versions
Cite
Nidula Elgiriyewithana (2024). Global Country Information 2023 [Dataset]. http://doi.org/10.5281/zenodo.8165229
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8165229
Dataset updated
Jun 15, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Nidula Elgiriyewithana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.

Key Features

Country: Name of the country.

Density (P/Km2): Population density measured in persons per square kilometer.

Abbreviation: Abbreviation or code representing the country.

Agricultural Land (%): Percentage of land area used for agricultural purposes.

Land Area (Km2): Total land area of the country in square kilometers.

Armed Forces Size: Size of the armed forces in the country.

Birth Rate: Number of births per 1,000 population per year.

Calling Code: International calling code for the country.

Capital/Major City: Name of the capital or major city.

CO2 Emissions: Carbon dioxide emissions in tons.

CPI: Consumer Price Index, a measure of inflation and purchasing power.

CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.

Currency_Code: Currency code used in the country.

Fertility Rate: Average number of children born to a woman during her lifetime.

Forested Area (%): Percentage of land area covered by forests.

Gasoline_Price: Price of gasoline per liter in local currency.

GDP: Gross Domestic Product, the total value of goods and services produced in the country.

Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.

Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.

Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.

Largest City: Name of the country's largest city.

Life Expectancy: Average number of years a newborn is expected to live.

Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.

Minimum Wage: Minimum wage level in local currency.

Official Language: Official language(s) spoken in the country.

Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.

Physicians per Thousand: Number of physicians per thousand people.

Population: Total population of the country.

Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.

Tax Revenue (%): Tax revenue as a percentage of GDP.

Total Tax Rate: Overall tax burden as a percentage of commercial profits.

Unemployment Rate: Percentage of the labor force that is unemployed.

Urban Population: Percentage of the population living in urban areas.

Latitude: Latitude coordinate of the country's location.

Longitude: Longitude coordinate of the country's location.

Potential Use Cases

Analyze population density and land area to study spatial distribution patterns.

Investigate the relationship between agricultural land and food security.

Examine carbon dioxide emissions and their impact on climate change.

Explore correlations between economic indicators such as GDP and various socio-economic factors.

Investigate educational enrollment rates and their implications for human capital development.

Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.

Study labor market dynamics through indicators such as labor force participation and unemployment rates.

Investigate the role of taxation and its impact on economic development.

Explore urbanization trends and their social and environmental consequences.
Standardized World Income Inequality Database, SWIID
ingridportal.eu
Updated May 4, 2019
+ more versions
Cite
(2019). Standardized World Income Inequality Database, SWIID [Dataset]. http://doi.org/10.23728/b2share.d85fbdaf194c4a78aa79438e95a051fe
Unique identifier
https://doi.org/10.23728/b2share.d85fbdaf194c4a78aa79438e95a051fe
Dataset updated
May 4, 2019
Description
Cross-national research on the causes and consequences of income inequality has been hindered by the limitations of existing inequality datasets: greater coverage across countries and over time is available from these sources only at the cost of significantly reduced comparability across observations. The goal of the Standardized World Income Inequality Database (SWIID) is to overcome these limitations. A custom missing-data algorithm was used to standardize the United Nations University's World Income Inequality Database and data from other sources; data collected by the Luxembourg Income Study served as the standard. The SWIID provides comparable Gini indices of gross and net income inequality for 192 countries for as many years as possible from 1960 to the present along with estimates of uncertainty in these statistics. By maximizing comparability for the largest possible sample of countries and years, the SWIID is better suited to broadly cross-national research on income inequality than previously available sources: it offers coverage double that of the next largest income inequality dataset, and its record of comparability is three to eight times better than those of alternate datasets.
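
For readers unfamiliar with the statistic the SWIID reports, the following is a minimal sketch of how a Gini index can be computed from a list of individual incomes. It does not reproduce the SWIID's custom missing-data algorithm or its standardization against the Luxembourg Income Study; the function name and the sample incomes are illustrative assumptions. The result is on a 0 to 1 scale, where 0 means perfect equality.

```python
from typing import Sequence


def gini(incomes: Sequence[float]) -> float:
    """Gini coefficient of non-negative incomes, using the sorted-rank formula."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    if values[0] < 0:
        raise ValueError("incomes must be non-negative")
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n, ranks starting at 1
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical incomes: one person holds everything, three hold nothing.
print(gini([0, 0, 0, 1]))
# Hypothetical incomes: everyone holds the same amount.
print(gini([5, 5, 5, 5]))
```

Applying the same function to gross and to net incomes would give the two kinds of index the description mentions, though the published SWIID values also carry uncertainty estimates that this sketch does not attempt.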
World Mineral Statistics Dataset.
data.europa.eu
data.wu.ac.at
html, unknown
Updated Nov 25, 2021
British Geological Survey (BGS) (2021). World Mineral Statistics Dataset. [Dataset]. https://data.europa.eu/data/datasets/world-mineral-statistics-dataset?locale=el
Explore at:
Available download formats: html, unknown
Dataset updated
Nov 25, 2021
Dataset provided by
British Geological Survey (https://www.bgs.ac.uk/)
Authors
British Geological Survey (BGS)
Description
The British Geological Survey has one of the largest databases in the world on the production and trade of minerals. The dataset contains annual production statistics by mass for more than 70 mineral commodities covering the majority of economically important and internationally-traded minerals, metals and mineral-based materials. For each commodity the annual production statistics are recorded for individual countries, grouped by continent. Import and export statistics are also available for years up to 2002. Maintenance of the database is funded by the Science Budget and output is used by government, private industry and others in support of policy, economic analysis and commercial strategy. As far as possible the production data are compiled from primary, official sources. Quality assurance is maintained by participation in such groups as the International Consultative Group on Non-ferrous Metal Statistics. Individual commodity and country tables are available for sale on request.
COVID-19: Dataset of Global Research by Dimensions
console.cloud.google.com
Updated Jul 10, 2023
https://console.cloud.google.com/marketplace/browse?filter=partner:Digital%20Science%20%26%20Research%20Solutions%20Inc&hl=es&inv=1&invt=Ab0Iaw (2023). COVID-19: Dataset of Global Research by Dimensions [Dataset]. https://console.cloud.google.com/marketplace/product/digitalscience-public/covid-19-dataset-dimensions?hl=es
Dataset updated
Jul 10, 2023
Dataset provided by
Google (http://google.com/)
Description
This dataset from Dimensions.ai contains all published articles, preprints, clinical trials, grants and research datasets that are related to COVID-19. This growing collection of research information now amounts to hundreds of thousands of items, and it is the only dataset of its kind. You can find an overview of the content in this interactive Data Studio dashboard: https://reports.dimensions.ai/covid-19/ The full metadata includes the researchers and organizations involved in the research, as well as abstracts, open access status, research categories and much more. You may wish to use the Dimensions web application to explore the dataset: https://covid-19.dimensions.ai/.

This dataset is for researchers, universities, pharmaceutical & biotech companies, politicians, clinicians, journalists, and anyone else who wishes to explore the impact of the current COVID-19 pandemic. It is updated daily, and free for anyone to access. Please share this information with anyone you think would benefit from it. If you have any suggestions as to how we can improve our search terms to maximise the volume of research related to COVID-19, please contact us at support@dimensions.ai.

About Dimensions: Dimensions is the largest database of research insight in the world. It contains a comprehensive collection of linked data related to the global research and innovation ecosystem, all in a single platform. This includes hundreds of millions of publications, preprints, grants, patents, clinical trials, datasets, researchers and organizations. Because Dimensions maps the entire research lifecycle, you can follow academic and industry research from early stage funding, through to output and on to social and economic impact. This Covid-19 dataset is a subset of the full database. The full Dimensions database is also available on BigQuery, via subscription. Please visit www.dimensions.ai/bigquery to gain access.
Most popular database management systems worldwide 2024
statista.com
Updated Jun 19, 2024
Statista (2024). Most popular database management systems worldwide 2024 [Dataset]. https://www.statista.com/statistics/809750/worldwide-popularity-ranking-database-management-systems/
Dataset updated
Jun 19, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 2024
Area covered
Worldwide
Description
As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of 1244.08; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.

Database Management Systems

As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world's growing focus on big data and data analytics, knowledge of SQL programming languages has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
open-pii-masking-500k-ai4privacy
kaggle.com
dataverse.harvard.edu
Updated Mar 17, 2025
Michael Anthony (2025). open-pii-masking-500k-ai4privacy [Dataset]. https://www.kaggle.com/datasets/mikedoes/open-pii-masking-500k-ai4privacy
Dataset updated
Mar 17, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Michael Anthony
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
🌍 World's largest open dataset for privacy masking 🌎

The dataset is useful to train and evaluate models to remove personally identifiable and sensitive information from text, especially in the context of AI assistants and LLMs.

Dataset Analytics 📊 - ai4privacy/open-pii-masking-500k-ai4privacy

p5y Data Analytics

Total Entries: 580,227

Total Tokens: 19,199,982

Average Source Text Length: 17.37 words

Total PII Labels: 5,705,973

Number of Unique PII Classes: 20 (Open PII Labelset)

Unique Identity Values: 704,215

Language Distribution Analytics

**Number of Unique Languages**: 8

| Language | Count | Percentage |
|--------------------|----------|------------|
| English (en) 🇺🇸🇬🇧🇨🇦🇮🇳 | 150,693 | 25.97% |
| French (fr) 🇫🇷🇨🇭🇨🇦 | 112,136 | 19.33% |
| German (de) 🇩🇪🇨🇭 | 82,384 | 14.20% |
| Spanish (es) 🇪🇸 🇲🇽 | 78,013 | 13.45% |
| Italian (it) 🇮🇹🇨🇭 | 68,824 | 11.86% |
| Dutch (nl) 🇳🇱 | 26,628 | 4.59% |
| Hindi (hi)* 🇮🇳 | 33,963 | 5.85% |
| Telugu (te)* 🇮🇳 | 27,586 | 4.75% |

*these languages are in experimental stages
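As a quick consistency check on the language figures above, the percentages can be recomputed from the counts. The sketch below is illustrative only: the counts are copied from the language table, while the dictionary layout and the helper name `percentage_shares` are assumptions made for this example and are not part of the dataset or its tooling.

```python
# Per-language entry counts, copied from the language table above.
language_counts = {
    "en": 150_693,
    "fr": 112_136,
    "de": 82_384,
    "es": 78_013,
    "it": 68_824,
    "nl": 26_628,
    "hi": 33_963,
    "te": 27_586,
}

# "Total Entries" as reported in the dataset analytics.
TOTAL_ENTRIES = 580_227


def percentage_shares(counts):
    """Return each count as a percentage of the overall sum, rounded to 2 places."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 2) for key, value in counts.items()}


shares = percentage_shares(language_counts)

# The per-language counts add up to the reported total number of entries.
print(sum(language_counts.values()) == TOTAL_ENTRIES)  # True
print(shares["en"], shares["fr"])  # 25.97 19.33
```

The same helper can be applied to the region or split counts by swapping in the corresponding dictionary.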

Region Distribution Analytics

**Number of Unique Regions**: 11

| Region | Count | Percentage |
|-----------------------|----------|------------|
| Switzerland (CH) 🇨🇭 | 112,531 | 19.39% |
| India (IN) 🇮🇳 | 99,724 | 17.19% |
| Canada (CA) 🇨🇦 | 74,733 | 12.88% |
| Germany (DE) 🇩🇪 | 41,604 | 7.17% |
| Spain (ES) 🇪🇸 | 39,557 | 6.82% |
| Mexico (MX) 🇲🇽 | 38,456 | 6.63% |
| France (FR) 🇫🇷 | 37,886 | 6.53% |
| Great Britain (GB) 🇬🇧 | 37,092 | 6.39% |
| United States (US) 🇺🇸 | 37,008 | 6.38% |
| Italy (IT) 🇮🇹 | 35,008 | 6.03% |
| Netherlands (NL) 🇳🇱 | 26,628 | 4.59% |

Machine Learning Task Analytics

| Split | Count | Percentage |
|-------------|----------|------------|
| **Train** | 464,150 | 79.99% |
| **Validate**| 116,077 | 20.01% |

Usage

Option 1: Python

In a terminal:

    pip install datasets

In Python:

    from datasets import load_dataset
    dataset = load_dataset("ai4privacy/open-pii-masking-500k-ai4privacy")

Compatible Machine Learning Tasks:

Token classification. Check out Hugging Face's guide on token classification.

ALBERT, BERT, BigBird, BioGpt, BLOOM, BROS, CamemBERT, CANINE, ConvBERT, Data2VecText, DeBERTa, DeBERTa-v2, DistilBERT, ELECTRA, ERNIE, ErnieM, ESM, Falcon, FlauBERT, FNet, Funnel Transformer, GPT-Sw3, OpenAI GPT-2, GPTBigCode, GPT Neo, GPT NeoX, I-BERT, [LayoutLM](http...
Dataset of development of business during the COVID-19 crisis
narcis.nl
data.mendeley.com
Updated Nov 9, 2020
Litvinova, T (via Mendeley Data) (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
Explore at:
Unique identifier
https://doi.org/10.17632/9vvrd34f8t.1
Dataset updated
Nov 9, 2020
Dataset provided by
Data Archiving and Networked Services (DANS)
Authors
Litvinova, T (via Mendeley Data)
Description
To create the dataset, countries were selected from the top 10 by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) that are represented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, up to 10 of the largest transnational corporations included in the Global 500 rating were selected, separately for 2020 and 2019. Arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators across all countries in the sample were then found, characterizing the situation in international entrepreneurship as a whole during the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic.

The data is collected in a single Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible: it can be supplemented with data from other countries and with newer statistics on the COVID-19 pandemic. Because the data in the dataset are formulas rather than fixed numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that visualize the data.

The dataset contains both actual and forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented as a normal distribution of predicted values together with the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: various predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values observed during and after the second wave of the pandemic, to check the reliability of the earlier forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted indicators, but also their qualitative interpretation, reflecting the presence and level of risks of the pandemic and COVID-19 crisis for international entrepreneurship.

