The "Global Country Rankings Dataset" is a comprehensive collection of metrics and indicators that ranks countries worldwide based on their socioeconomic performance. It provides insights into the relative standings of nations in terms of key factors such as GDP per capita, economic growth, and various other relevant criteria.
Researchers, analysts, and policymakers can leverage this dataset to gain a deeper understanding of the global economic landscape and track the progress of countries over time. The dataset covers a wide range of metrics, including but not limited to:
Economic growth: the rate of change of real GDP - Country rankings: The average for 2021 based on 184 countries was 5.26 percent. The highest value was in the Maldives: 41.75 percent and the lowest value was in Afghanistan: -20.74 percent. The indicator is available from 1961 to 2021.
GDP per capita, Purchasing Power Parity - Country rankings: The average for 2021 based on 182 countries was 21283.21 U.S. dollars. The highest value was in Luxembourg: 115683.49 U.S. dollars and the lowest value was in Burundi: 705.03 U.S. dollars. The indicator is available from 1990 to 2021.
GDP per capita, current U.S. dollars - Country rankings: The average for 2021 based on 186 countries was 17937.03 U.S. dollars. The highest value was in Monaco: 234315.45 U.S. dollars and the lowest value was in Burundi: 221.48 U.S. dollars. The indicator is available from 1960 to 2021.
GDP per capita, constant 2010 dollars - Country rankings: The average for 2021 based on 184 countries was 15605.8 U.S. dollars. The highest value was in Monaco: 204190.16 U.S. dollars and the lowest value was in Burundi: 261.02 U.S. dollars. The indicator is available from 1960 to 2021.
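The economic growth indicator above is described as the rate of change of real GDP. As a minimal sketch, assuming annual real GDP levels for two consecutive years are available, the percentage could be computed as follows. The function name and the sample figures are illustrative assumptions and are not taken from the dataset.

```python
def real_gdp_growth(previous: float, current: float) -> float:
    """Return the year-over-year change of real GDP in percent."""
    if previous == 0:
        raise ValueError("previous-year GDP must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical real GDP levels in any constant-price unit.
growth = real_gdp_growth(previous=200.0, current=210.0)
print(round(growth, 2))
```

A negative result corresponds to a contraction, as in the lowest-ranked value quoted above.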
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is The biggest mining village in the world : a social history of Ashington. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).
Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_population_in_1800.
//// 🌍 Avanteer Employee Data ////
The Largest Dataset of Active Global Profiles
1B+ Records | Updated Daily | Built for Scale & Accuracy
Avanteer’s Employee Data offers unparalleled access to the world’s most comprehensive dataset of active professional profiles. Designed for companies building data-driven products or workflows, this resource supports recruitment, lead generation, enrichment, and investment intelligence — with unmatched scale and update frequency.
//// 🔧 What You Get ////
1B+ active profiles across industries, roles, and geographies
Work history, education history, languages, skills and multiple additional datapoints.
AI-enriched datapoints include: gender, age, normalized seniority, normalized department, normalized skillset, and MBTI assessment.
Daily updates, with change-tracking fields to capture job changes, promotions, and new entries.
Flexible delivery via API, S3, or flat file.
Choice of formats: raw, cleaned, or AI-enriched.
Built-in compliance aligned with GDPR and CCPA.
//// 💡 Key Use Cases ////
✅ Smarter Talent Acquisition: Identify, enrich, and engage high-potential candidates using up-to-date global profiles.
✅ B2B Lead Generation at Scale: Build prospecting lists with confidence using job-related and firmographic filters to target decision-makers across verticals.
✅ Data Enrichment for SaaS & Platforms: Supercharge ATS, CRMs, or HR tech products by syncing enriched, structured employee data through real-time or batch delivery.
✅ Investor & Market Intelligence: Analyze team structures, hiring trends, and senior leadership signals to discover early-stage investment opportunities or evaluate portfolio companies.
//// 🧰 Built for Top-Tier Teams Who Move Fast ////
Zero duplicates, by design
<300ms API response time
99.99% guaranteed API uptime
Onboarding support including data samples, test credits, and consultations
Advanced data quality checks
//// ✅ Why Companies Choose Avanteer ////
➔ The largest daily-updated dataset of global professional profiles
➔ Trusted by sales, HR, and data teams building at enterprise scale
➔ Transparent, compliant data collection with opt-out infrastructure baked in
➔ Dedicated support with fast onboarding and hands-on implementation help
////////////////////////////////
Empower your team with reliable, current, and scalable employee data — all from a single source.
In our work, we have designed and implemented a novel workflow with several heuristic methods to combine state-of-the-art methods related to CVE fix commits gathering. As a consequence of our improvements, we have been able to gather the largest programming language-independent real-world dataset of CVE vulnerabilities with the associated fix commits.
Our dataset containing 26,617 unique CVEs coming from 6,945 unique GitHub projects is, to the best of our knowledge, by far the biggest CVE vulnerability dataset with fix commits available today. These CVEs are associated with 31,883 unique commits that fixed those vulnerabilities. Compared to prior work, our dataset brings about a 397% increase in CVEs, a 295% increase in covered open-source projects, and a 480% increase in commit fixes.
Our larger dataset thus substantially improves over the current real-world vulnerability datasets and enables further progress in research on vulnerability detection and software security. We used the NVD (nvd.nist.gov) and the GitHub Security Advisory Database as the main sources for our pipeline.
We release to the community a 14GB PostgreSQL database that contains information on CVEs up to January 24, 2024, CWEs of each CVE, files and methods changed by each commit, and repository metadata.
Additionally, patch files related to the fix commits are available as a separate package. Furthermore, we make our dataset collection tool also available to the community.
The `cvedataset-patches.zip` file contains fix patches, and `dump_morefixes_27-03-2024_19_52_58.sql.zip` contains a PostgreSQL dump of fixes, together with several other fields such as CVEs, CWEs, repository metadata, commit data, file changes, methods changed, etc.
The MoreFixes data-storage strategy is based on CVEFixes to store CVE fix commits from open-source repositories, and uses a modified version of Prospector (part of ProjectKB from SAP) as a module to detect the fix commits of a CVE. Our full methodology is presented in the paper titled "MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery", which will be published at the PROMISE conference (2024).
For more information about usage and sample queries, visit the GitHub repository: https://github.com/JafarAkhondali/Morefixes
If you are using this dataset, please be aware that the repositories we mined are under different licenses, and you are responsible for handling any licensing issues. The same applies to CVEFixes.
This product uses the NVD API but is not endorsed or certified by the NVD.
This research was partially supported by the Dutch Research Council (NWO) under the project NWA.1215.18.008 Cyber Security by Integrated Design (C-SIDe).
To restore the dataset, you can use the docker-compose file available in the GitHub repository. The default credentials after restoring the dump are:
POSTGRES_USER=postgrescvedumper
POSTGRES_DB=postgrescvedumper
POSTGRES_PASSWORD=a42a18537d74c3b7e584c769152c3d
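As a rough sketch of how the default credentials above might be turned into a connection string, the snippet below assembles a PostgreSQL URI using only the standard library. The host `localhost` and port `5432` are assumptions about a local docker-compose setup, and `build_postgres_uri` is a hypothetical helper, not part of the MoreFixes tooling.

```python
from urllib.parse import quote

# Default credentials listed above for the restored dump.
credentials = {
    "POSTGRES_USER": "postgrescvedumper",
    "POSTGRES_DB": "postgrescvedumper",
    "POSTGRES_PASSWORD": "a42a18537d74c3b7e584c769152c3d",
}


def build_postgres_uri(creds, host="localhost", port=5432):
    """Assemble a PostgreSQL connection URI; host and port are assumptions."""
    user = quote(creds["POSTGRES_USER"], safe="")
    password = quote(creds["POSTGRES_PASSWORD"], safe="")
    database = quote(creds["POSTGRES_DB"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


uri = build_postgres_uri(credentials)
print(uri)
```

The resulting URI can be handed to `psql` or to a PostgreSQL driver of your choice. Adjust the host and port to match your environment.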
This data set provides a list of the three largest glaciers and glacier complexes in each of the 19 glacial regions of the world as defined by the Global Terrestrial Network for Glaciers. The data are provided in shapefile format with an outline for each of the largest ice bodies along with a number of attributes such as area in km2.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is The largest life-boats in the world : a history of the 60ft Barnett class twin screw motor life-boats. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of White Earth by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for White Earth. The dataset can be used to understand the population distribution of White Earth by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in White Earth. It can also be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups.
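The two analyses mentioned above, finding the largest age group per sex and tracking the gender ratio by age group, can be sketched as follows. The counts are made-up placeholders rather than ACS figures, and expressing the ratio as males per 100 females is an assumption about how the dataset defines it.

```python
# Hypothetical counts per age group: (male, female). Not real ACS figures.
population = {
    "0-4 years": (12, 10),
    "5-9 years": (8, 11),
    "10-14 years": (15, 9),
}


def largest_age_group(data, index):
    """Return the age group with the highest count (index 0 = male, 1 = female)."""
    return max(data, key=lambda group: data[group][index])


def gender_ratio(male, female):
    """Males per 100 females; None when the group has no females."""
    return None if female == 0 else male / female * 100.0


print(largest_age_group(population, 0))
print(largest_age_group(population, 1))
for group, (male, female) in population.items():
    print(group, gender_ratio(male, female))
```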
Key observations
Largest age group (population): Male # 10-14 years (17) | Female # 40-44 years (13). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Age groups:
Scope of gender :
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for White Earth Population by Gender.
The World Ocean Database (WOD) is the world's largest publicly available, uniform-format, quality-controlled ocean profile dataset. Ocean profile data are sets of measurements of an ocean variable vs. depth at a single geographic location within a short (minutes to hours) temporal period in some portion of the water column from the surface to the bottom. To be considered a profile for the WOD, there must be more than a single depth/variable pair. Multiple profiles at the same location from the same set of instruments constitute an oceanographic cast. Ocean variables in the WOD include temperature, salinity, oxygen, nutrients, tracers, and biological variables such as plankton and chlorophyll. Quality control procedures are documented and performed on each cast and the results are included as flags on each measurement. The WOD contains the data on the originally measured depth levels (observed) and also interpolated to standard depth levels to present a more uniform set of iso-surfaces for oceanographic and climate work. The source of the WOD is more than 20,000 separate archived datasets contributed by institutions, projects, government agencies, and individual investigators from the United States and around the world. Each dataset is available in its original form in the National Centers for Environmental Information data archives. All datasets are converted to the same standard format, checked for duplication within the WOD, and assigned quality flags based on objective tests. Additional subjective flags are set upon calculation of ocean climatological mean fields which make up the World Ocean Atlas (WOA) series. The WOD consists of periodic major releases and quarterly updates to those releases. Each major release is associated with a concurrent WOA release, and contains final quality control flags used in the WOA, which includes manual as well as automated steps.
Each quarterly update release includes additional historical and recent data and preliminary quality control. The latest major release was WOD 2018 (WOD18), which includes nearly 16 million oceanographic casts, from the second voyage of Captain Cook (1772) to the modern Argo floats (end of 2017). The WOD presents data in netCDF ragged array format following the Climate and Forecast (CF) conventions for ease of use mindful of space limitations.
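To illustrate the ragged array layout mentioned above, here is a minimal sketch assuming the contiguous form described by the CF conventions, in which all observations are stored back to back in one flat array and a companion count variable records how many belong to each profile. The variable names and values are hypothetical, and the code works on plain lists instead of reading a real WOD file.

```python
from itertools import accumulate


def split_ragged(flat_values, row_sizes):
    """Split a flat observation array into one list per profile."""
    if sum(row_sizes) != len(flat_values):
        raise ValueError("row sizes do not match the number of observations")
    ends = list(accumulate(row_sizes))  # exclusive end index of each profile
    starts = [0] + ends[:-1]            # start index of each profile
    return [flat_values[s:e] for s, e in zip(starts, ends)]


# Hypothetical temperatures (deg C) for three profiles stored back to back.
temperature = [18.2, 17.9, 15.1, 20.4, 19.8, 12.3, 11.7, 9.5, 8.8]
temperature_row_size = [3, 2, 4]  # number of depth levels in each profile

profiles = split_ragged(temperature, temperature_row_size)
print(profiles)
```

Storing only the counts avoids padding every profile to the length of the deepest one, which is where the space saving comes from.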
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book series. It has 1 row and is filtered where the book is The Millennium Development Goals (MDGs) : A Short History of the World’s Biggest Promise. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 2 rows and is filtered where the book is The largest theatre in the world : thirty years of television drama. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘China Largest Companies’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/china-largest-companiese on 28 January 2022.
--- Dataset description provided by original source is as follows ---
From the Forbes Global 2000 list, last updated in May 2013. Forbes publishes an annual list of the world's 2000 largest publicly listed corporations. The Forbes Global 2000 weighs sales, profits, assets and market value equally so companies can be ranked by size. Figures for all companies are in US dollars.
Source: Economy Watch
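The description above says that sales, profits, assets and market value are weighed equally. One simple way to sketch such an equal-weight ranking is to rank companies on each metric separately and then order them by their mean rank. This is an illustrative assumption and not necessarily the procedure Forbes uses, and the companies and figures below are invented.

```python
# Hypothetical figures in $billion: (sales, profits, assets, market value).
companies = {
    "Company A": (120.0, 10.0, 900.0, 150.0),
    "Company B": (200.0, 6.0, 400.0, 90.0),
    "Company C": (80.0, 12.0, 1500.0, 200.0),
}


def composite_ranking(data):
    """Order companies by the equal-weighted mean of their per-metric ranks."""
    names = list(data)
    n_metrics = len(next(iter(data.values())))
    rank_sum = {name: 0 for name in names}
    for m in range(n_metrics):
        # Rank 1 goes to the largest value of the current metric.
        ordered = sorted(names, key=lambda name: data[name][m], reverse=True)
        for position, name in enumerate(ordered, start=1):
            rank_sum[name] += position
    return sorted(names, key=lambda name: rank_sum[name] / n_metrics)


print(composite_ranking(companies))
```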
This dataset was created by Finance and contains around 100 samples along with Profits ($billion), Market Value ($billion), technical information and other features such as:
- Sales ($billion)
- Assets ($billion)
- and more.
- Analyze Global Rank in relation to Profits ($billion)
- Study the influence of Market Value ($billion) on Sales ($billion)
If you use this dataset in your research, please credit Finance
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
- Country: Name of the country.
- Density (P/Km2): Population density measured in persons per square kilometer.
- Abbreviation: Abbreviation or code representing the country.
- Agricultural Land (%): Percentage of land area used for agricultural purposes.
- Land Area (Km2): Total land area of the country in square kilometers.
- Armed Forces Size: Size of the armed forces in the country.
- Birth Rate: Number of births per 1,000 population per year.
- Calling Code: International calling code for the country.
- Capital/Major City: Name of the capital or major city.
- CO2 Emissions: Carbon dioxide emissions in tons.
- CPI: Consumer Price Index, a measure of inflation and purchasing power.
- CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
- Currency_Code: Currency code used in the country.
- Fertility Rate: Average number of children born to a woman during her lifetime.
- Forested Area (%): Percentage of land area covered by forests.
- Gasoline_Price: Price of gasoline per liter in local currency.
- GDP: Gross Domestic Product, the total value of goods and services produced in the country.
- Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
- Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
- Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
- Largest City: Name of the country's largest city.
- Life Expectancy: Average number of years a newborn is expected to live.
- Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
- Minimum Wage: Minimum wage level in local currency.
- Official Language: Official language(s) spoken in the country.
- Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
- Physicians per Thousand: Number of physicians per thousand people.
- Population: Total population of the country.
- Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
- Tax Revenue (%): Tax revenue as a percentage of GDP.
- Total Tax Rate: Overall tax burden as a percentage of commercial profits.
- Unemployment Rate: Percentage of the labor force that is unemployed.
- Urban Population: Percentage of the population living in urban areas.
- Latitude: Latitude coordinate of the country's location.
- Longitude: Longitude coordinate of the country's location.
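Several of the columns listed above are related to each other, which allows simple consistency checks. As a minimal sketch, the snippet below recomputes population density from the Population and Land Area (Km2) fields and compares it with the reported Density (P/Km2). The record is a made-up example, and the tolerance of 0.5 persons per square kilometer is an arbitrary assumption.

```python
def population_density(population: int, land_area_km2: float) -> float:
    """Persons per square kilometer."""
    if land_area_km2 <= 0:
        raise ValueError("land area must be positive")
    return population / land_area_km2


# Hypothetical record using the column names listed above.
record = {
    "Country": "Exampleland",
    "Population": 5_000_000,
    "Land Area (Km2)": 250_000.0,
    "Density (P/Km2)": 20.0,
}

computed = population_density(record["Population"], record["Land Area (Km2)"])
is_consistent = abs(computed - record["Density (P/Km2)"]) <= 0.5
print(computed, is_consistent)
```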
Potential Use Cases
- Analyze population density and land area to study spatial distribution patterns.
- Investigate the relationship between agricultural land and food security.
- Examine carbon dioxide emissions and their impact on climate change.
- Explore correlations between economic indicators such as GDP and various socio-economic factors.
- Investigate educational enrollment rates and their implications for human capital development.
- Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
- Study labor market dynamics through indicators such as labor force participation and unemployment rates.
- Investigate the role of taxation and its impact on economic development.
- Explore urbanization trends and their social and environmental consequences.
Cross-national research on the causes and consequences of income inequality has been hindered by the limitations of existing inequality datasets: greater coverage across countries and over time is available from these sources only at the cost of significantly reduced comparability across observations. The goal of the Standardized World Income Inequality Database (SWIID) is to overcome these limitations. A custom missing-data algorithm was used to standardize the United Nations University's World Income Inequality Database and data from other sources; data collected by the Luxembourg Income Study served as the standard. The SWIID provides comparable Gini indices of gross and net income inequality for 192 countries for as many years as possible from 1960 to the present along with estimates of uncertainty in these statistics. By maximizing comparability for the largest possible sample of countries and years, the SWIID is better suited to broadly cross-national research on income inequality than previously available sources: it offers coverage double that of the next largest income inequality dataset, and its record of comparability is three to eight times better than those of alternate datasets.
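For readers unfamiliar with the statistic the SWIID reports, the following is a minimal sketch of how a Gini index can be computed from a list of individual incomes, using the standard rank-weighted formula on a 0 to 1 scale. It illustrates the measure only. It is not the SWIID's standardization algorithm, and the sample incomes are invented.

```python
def gini(incomes):
    """Gini index of non-negative incomes (0 means perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


print(gini([1, 1, 1, 1]))
print(gini([0, 0, 0, 1]))
```

With this formula, a sample in which one of n people holds all income yields (n - 1) / n rather than exactly 1.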
The British Geological Survey has one of the largest databases in the world on the production and trade of minerals. The dataset contains annual production statistics by mass for more than 70 mineral commodities covering the majority of economically important and internationally-traded minerals, metals and mineral-based materials. For each commodity the annual production statistics are recorded for individual countries, grouped by continent. Import and export statistics are also available for years up to 2002. Maintenance of the database is funded by the Science Budget and output is used by government, private industry and others in support of policy, economic analysis and commercial strategy. As far as possible the production data are compiled from primary, official sources. Quality assurance is maintained by participation in such groups as the International Consultative Group on Non-ferrous Metal Statistics. Individual commodity and country tables are available for sale on request.
This dataset from Dimensions.ai contains all published articles, preprints, clinical trials, grants and research datasets that are related to COVID-19. This growing collection of research information now amounts to hundreds of thousands of items, and it is the only dataset of its kind. You can find an overview of the content in this interactive Data Studio dashboard: https://reports.dimensions.ai/covid-19/ The full metadata includes the researchers and organizations involved in the research, as well as abstracts, open access status, research categories and much more. You may wish to use the Dimensions web application to explore the dataset: https://covid-19.dimensions.ai/. This dataset is for researchers, universities, pharmaceutical & biotech companies, politicians, clinicians, journalists, and anyone else who wishes to explore the impact of the current COVID-19 pandemic. It is updated daily, and free for anyone to access. Please share this information with anyone you think would benefit from it. If you have any suggestions as to how we can improve our search terms to maximise the volume of research related to COVID-19, please contact us at support@dimensions.ai. About Dimensions: Dimensions is the largest database of research insight in the world. It contains a comprehensive collection of linked data related to the global research and innovation ecosystem, all in a single platform. This includes hundreds of millions of publications, preprints, grants, patents, clinical trials, datasets, researchers and organizations. Because Dimensions maps the entire research lifecycle, you can follow academic and industry research from early stage funding, through to output and on to social and economic impact. This Covid-19 dataset is a subset of the full database. The full Dimensions database is also available on BigQuery, via subscription. Please visit www.dimensions.ai/bigquery to gain access.
As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of 1244.08; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.
Database Management Systems
As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset is useful to train and evaluate models to remove personally identifiable and sensitive information from text, especially in the context of AI assistants and LLMs.
Option 1: Python
terminal
pip install datasets
python
from datasets import load_dataset
dataset = load_dataset("ai4privacy/open-pii-masking-500k-ai4privacy")
To create the dataset, the top 10 countries by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) were considered, and those represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating for 2020 and 2019 were selected separately. The arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators for all countries of the sample were found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are stored as formulas rather than fixed numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
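The scenario analysis described above relies on forecasts expressed as a normal distribution of predicted values. As a rough sketch, the probability that an outcome falls within a given range can be read off such a distribution with Python's standard library. The mean, standard deviation and thresholds below are invented placeholders, not values from the dataset.

```python
from statistics import NormalDist

# Hypothetical forecast of new cases: mean 50,000, standard deviation 8,000.
forecast = NormalDist(mu=50_000, sigma=8_000)


def probability_between(dist, low, high):
    """Probability that the forecast value lies in the interval [low, high]."""
    return dist.cdf(high) - dist.cdf(low)


# Probability of landing within one standard deviation of the mean.
p_within_one_sigma = probability_between(forecast, 42_000, 58_000)
# Probability of exceeding a pessimistic threshold.
p_above_threshold = 1.0 - forecast.cdf(60_000)
print(round(p_within_one_sigma, 3), round(p_above_threshold, 3))
```

Once the second wave has passed, the same distribution can be compared against the actual values to check how reliable the forecast was.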