Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
This list ranks the 50 states in the United States by Chinese population, as estimated by the United States Census Bureau. It also highlights population changes in each state over the past five years.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates, including:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
Facebook
TwitterThis uniquely granular dataset captures 13,427 development projects worth $843 billion financed by more than 300 Chinese government institutions and state-owned entities across 165 countries in every major region of the world from 2000-2017.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
This list ranks the 17 cities in the Blue Earth County, MN by Chinese population, as estimated by the United States Census Bureau. It also highlights population changes in each city over the past five years.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates, including:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The total population in China was estimated at 1409.7 million people in 2023, according to the latest census figures and projections from Trading Economics. This dataset provides - China Population - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Facebook
TwitterDelve into the dynamics of food prices in China with this dataset sourced from the World Food Programme Price Database. Covering essential food items like maize, rice, beans, fish, and sugar across various markets in China, this dataset provides a valuable resource for understanding food price trends over time. Whether you're an economist, policymaker, or researcher, explore how factors such as supply, demand, and market dynamics influence food pricing in one of the world's largest economies. With data updated weekly and spanning back to 1992, this dataset offers rich insights into the evolving landscape of food prices in China.
Headers description:
Source: https://data.humdata.org/dataset/wfp-food-prices-for-china
Facebook
Twitterhttp://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/
Description
This Dataset contains details of World Population by country. According to the worldometer, the current population of the world is 8.2 billion people. Highest populated country is India followed by China and USA.
Attribute Information
Acknowledgements
https://www.worldometers.info/world-population/population-by-country/
Facebook
TwitterThe region of present-day China has historically been the most populous region in the world; however, its population development has fluctuated throughout history. In 2022, China was overtaken as the most populous country in the world, and current projections suggest its population is heading for a rapid decline in the coming decades. Transitions of power lead to mortality The source suggests that conflict, and the diseases brought with it, were the major obstacles to population growth throughout most of the Common Era, particularly during transitions of power between various dynasties and rulers. It estimates that the total population fell by approximately 30 million people during the 14th century due to the impact of Mongol invasions, which inflicted heavy losses on the northern population through conflict, enslavement, food instability, and the introduction of bubonic plague. Between 1850 and 1870, the total population fell once more, by more than 50 million people, through further conflict, famine and disease; the most notable of these was the Taiping Rebellion, although the Miao an Panthay Rebellions, and the Dungan Revolt, also had large death tolls. The third plague pandemic also originated in Yunnan in 1855, which killed approximately two million people in China. 20th and 21st centuries There were additional conflicts at the turn of the 20th century, which had significant geopolitical consequences for China, but did not result in the same high levels of mortality seen previously. It was not until the overlapping Chinese Civil War (1927-1949) and Second World War (1937-1945) where the death tolls reached approximately 10 and 20 million respectively. Additionally, as China attempted to industrialize during the Great Leap Forward (1958-1962), economic and agricultural mismanagement resulted in the deaths of tens of millions (possibly as many as 55 million) in less than four years, during the Great Chinese Famine. This mortality is not observable on the given dataset, due to the rapidity of China's demographic transition over the entire period; this saw improvements in healthcare, sanitation, and infrastructure result in sweeping changes across the population. The early 2020s marked some significant milestones in China's demographics, where it was overtaken by India as the world's most populous country, and its population also went into decline. Current projections suggest that China is heading for a "demographic disaster", as its rapidly aging population is placing significant burdens on China's economy, government, and society. In stark contrast to the restrictive "one-child policy" of the past, the government has introduced a series of pro-fertility incentives for couples to have larger families, although the impact of these policies are yet to materialize. If these current projections come true, then China's population may be around half its current size by the end of the century.
Facebook
TwitterThis dataset geolocates Chinese Government-financed projects that were implemented between 2000-2014. It captures 3,485 projects worth $273.6 billion in total official financing. The dataset includes both Chinese aid and non-concessional official financing.
The data package available for download at the link above includes the following files:
all_flow_classes.csv oda-like_flows.csv oof-like_flows.csv vague_flows.csv project_descriptions_and_sources.csv
Each row in these datasets contain a project location. To make it easier for users to distinguish between projects that do or do not meet the strict definition of “aid,” these files provide project location records that have been pre-filtered according to the “flow_class” variable (ODA-like, OOF-like, or Vague OF). Descriptions of these flow classes and their meanings are included in the accompanying ReadMe.
Funding: This research was made possible with generous financial support from the John D. and Catherine T. MacArthur Foundation, Humanity United, the William and Flora Hewlett Foundation, the Academic Research Fund of Singapore’s Ministry of Education, the United Nations University World Institute for Development Economics Research (UNU-WIDER), the German Research Foundation (DFG), and the College of William and Mary.
This dataset is made available by AidData. They are doing some amazing work. For any Licensing related queries please refer to AidData's website. Please give them a visit on their website -> https://www.aiddata.org
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COFI database includes power-generation projects in Belt and Road Initiative (BRI) countries financed by Chinese corporations and banks that reached financial closure from 2000 to 2023. Types of financing include debt and equity investment, with the latter including greenfield foreign direct investments (FDI) and cross-border mergers and acquisitions (M&As). COFI is consolidated using nine source databases using both automated join method in R Studio, and manual joining by analysts. The database includes power plant characteristics data and investment detail data. It captures 575 power plants in 87 BRI countries, including 314 equity investment transactions and 341 debt investment transactions made by Chinese investors. Key data points for financial transactions in COFI include the financial instrument (equity or debt), investor name, amount, and financial close year. Key technical characteristics tracked for projects in COFI include name, installed capacity, commissioning year, country, and primary fuel type. This project is a collaboration among the Boston University Global Development Policy Center, the Inter-American Dialogue, the China-Africa Research Initiative at the Johns Hopkins University (CARI), and the World Resources Institute (WRI). The detailed methodology is given in the World Resources Institute publication “China Overseas Finance Inventory”. Cautions When analyzing debt investment amounts, users should be aware of the difference between loan commitment and actual disbursement. Our database records the loan commitment for a certain year and not actual disbursement. The investment amount should only provide a rough picture of where Chinese companies are investing and not how much their exact portion is. In this version of the database, all equity investment amounts are missing. This is because the equity amount is either missing or estimated in the source databases. Citation
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China % of Population with Access to Water: City data was reported at 99.433 % in 2023. This records an increase from the previous number of 99.387 % for 2022. China % of Population with Access to Water: City data is updated yearly, averaging 96.120 % from Dec 1985 (Median) to 2023, with 31 observations. The data reached an all-time high of 99.433 % in 2023 and a record low of 63.900 % in 2000. China % of Population with Access to Water: City data remains active status in CEIC and is reported by Ministry of Housing and Urban-Rural Development. The data is categorized under China Premium Database’s Utility Sector – Table CN.RCA: Percentage of Population with Access to Water.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China Population: Average Household Size data was reported at 2.800 Person in 2023. This records an increase from the previous number of 2.760 Person for 2022. China Population: Average Household Size data is updated yearly, averaging 3.150 Person from Dec 1982 (Median) to 2023, with 31 observations. The data reached an all-time high of 4.430 Person in 1982 and a record low of 2.620 Person in 2020. China Population: Average Household Size data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: No of Person per Household.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This series of 11 datasets is drawn from Rhoads, Edward J. M. Stepping Forth into the World: The Chinese Educational Mission to the United States, 1872-81. Hong Kong University Press, 2011. They document the 120 young Chinese who participated in the pioneering Chinese Educational Mission (CEM) in the United States (1872-1881). The first 8 files are drawn directly from the tables in Rhoads: Table 2.1 CEM students, by detachment (p.14-17) Table 5.1. Initial host family assignments (p.51-54) Table 7.1. CEM students in middle schools (by state and locality) (p. 90-94) Table 7.2 CEM students in public high schools (by state and locality) (p.96-99) Table 7.3 CEM students in private academies (by state and locality) (p.99-100) Table 8.1 CEM students in colleges (by academic year of enrollment) (p.116-118) Table 9.1 Deaths, dismissals, and withdrawals from the CEM (by date) (p.136) Table 9.2 CEM students in the June 1880 census (p.138-142) Based on these tables, I created three synthetic datasets which can be used for statistical and network analyses: cem_attributes: students' vital attributes, including their multiple names and transliteration, date and place of birth, and other attribute data (one row for each individual). cem_host: students' host families in the United States cem_education: students' educational curricula Each file contains two tabs, one for the data (data), one for the description of variables (key). Grey columns refer to the unstructured information given in the original source.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset is designed for fine-tuning large language models in the medical domain. It consists of a series of conversations between users (patients) and assistants (doctors). Each conversation centers around a specific medical topic, such as gynecology, male dysfunction, erectile dysfunction, endocrinology, internal medicine, hepatology, etc.
Each conversation typically includes the following components: 1. System Prompt: Provides the doctor's specialization, e.g., "You are a doctor specializing in gynecology." 2. User Query: The patient describes symptoms or asks health-related questions. 3. Doctor's Response: The doctor offers advice and a diagnostic plan based on the user's query.
By using such dialogue datasets, language models can better understand and generate medical-related text, providing more accurate and useful advice.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Exports in China decreased to 305.35 USD Billion in October from 328.46 USD Billion in September of 2025. This dataset provides - China Exports - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The USD/CNY exchange rate fell to 7.0696 on December 2, 2025, down 0.05% from the previous session. Over the past month, the Chinese Yuan has strengthened 0.81%, and is up by 3.15% over the last 12 months. Chinese Yuan - values, historical data, forecasts and news - updated on December of 2025.
Facebook
TwitterAttribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Currently, in the field of chart datasets, most existing resources are mainly in English, and there are almost no open-source Chinese chart datasets, which brings certain limitations to research and applications related to Chinese charts. This dataset draws on the construction method of the DVQA dataset to create a chart dataset focused on the Chinese environment. To ensure the authenticity and practicality of the dataset, we first referred to the authoritative website of the National Bureau of Statistics and selected 24 widely used data label categories in practical applications, totaling 262 specific labels. These tag categories cover multiple important areas such as socio-economic, demographic, and industrial development. In addition, in order to further enhance the diversity and practicality of the dataset, this paper sets 10 different numerical dimensions. These numerical dimensions not only provide a rich range of values, but also include multiple types of values, which can simulate various data distributions and changes that may be encountered in real application scenarios. This dataset has carefully designed various types of Chinese bar charts to cover various situations that may be encountered in practical applications. Specifically, the dataset not only includes conventional vertical and horizontal bar charts, but also introduces more challenging stacked bar charts to test the performance of the method on charts of different complexities. In addition, to further increase the diversity and practicality of the dataset, the text sets diverse attribute labels for each chart type. These attribute labels include but are not limited to whether they have data labels, whether the text is rotated 45 °, 90 °, etc. The addition of these details makes the dataset more realistic for real-world application scenarios, while also placing higher demands on data extraction methods. In addition to the charts themselves, the dataset also provides corresponding data tables and title text for each chart, which is crucial for understanding the content of the chart and verifying the accuracy of the extracted results. This dataset selects Matplotlib, the most popular and widely used data visualization library in the Python programming language, to be responsible for generating chart images required for research. Matplotlib has become the preferred tool for data scientists and researchers in data visualization tasks due to its rich features, flexible configuration options, and excellent compatibility. By utilizing the Matplotlib library, every detail of the chart can be precisely controlled, from the drawing of data points to the annotation of coordinate axes, from the addition of legends to the setting of titles, ensuring that the generated chart images not only meet the research needs, but also have high readability and attractiveness visually. The dataset consists of 58712 pairs of Chinese bar charts and corresponding data tables, divided into training, validation, and testing sets in a 7:2:1 ratio.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in China was worth 18743.80 billion US dollars in 2024, according to official data from the World Bank. The GDP value of China represents 17.65 percent of the world economy. This dataset provides - China GDP - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Facebook
Twitterhttps://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Mandarin Chinese General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Mandarin speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Mandarin Chinese communication.
Curated by FutureBeeAI, this 30 hours dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Mandarin speech models that understand and respond to authentic Chinese accents and dialects.
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Mandarin Chinese. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings:
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
This dataset is a versatile resource for multiple Mandarin speech and language AI applications:
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China Total Employment data was reported at 733,510.000 Person th in 2022. This records a decrease from the previous number of 746,520.000 Person th for 2021. China Total Employment data is updated yearly, averaging 746,470.000 Person th from Dec 1990 (Median) to 2022, with 33 observations. The data reached an all-time high of 763,490.000 Person th in 2014 and a record low of 647,490.000 Person th in 1990. China Total Employment data remains active status in CEIC and is reported by Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s China – Table CN.OECD.MSTI: Population, Labour Force and Employment: Non OECD Member: Annual.
The national breakdown by source of funds does not fully match with the classification defined in the Frascati Manual. The R&D financed by the government, business enterprises, and by the rest of the world can be retrieved but part of the expenditure has no specific source of financing, i.e. self-raised funding (in particular for independent research institutions), the funds from the higher education sector and left-over government grants from previous years.
The government and higher education sectors cover all fields of NSE and SSH while the business enterprise sector only covers the fields of NSE. There are only few organisations in the private non-profit sector, hence no R&D survey has been carried out in this sector and the data are not available.
From 2009, researcher data are collected according to the Frascati Manual definition of researcher. Beforehand, this was only the case for independent research institutions, while for the other sectors data were collected according to the UNESCO concept of “scientist and engineer”.
In 2009, the survey coverage in the business and the government sectors has been expanded.
Before 2000, all of the personnel data and 95% of the expenditure data in the business enterprise sector are for large and medium-sized enterprises only. Since 2000 however, the survey covers almost all industries and all enterprises above a certain threshold. In 2000 and 2004, a census of all enterprises was held, while in the intermediate years data for small enterprises are estimated.
Due to the reform of the S&T system some government institutions have become enterprises, and their R&D data have been reflected in the Business Enterprise sector since 2000.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CBFdataset is a dataset of Chinese bamboo flute (CBF) performances, created for ecologically valid analysis of music playing techniques in context.
The dataset comprises monophonic recordings of classic CBF pieces and isolated playing techniques, recorded by 10 professional CBF performers; and expert annotations of seven playing techniques: vibrato, tremolo, trill, flutter-tongue (FT), acciaccatura, portamento, and glissando. The recorded pieces include Busy Delivering Harvest (BH) 扬鞭催马运粮忙, Jolly Meeting (JM) 喜相逢, Morning (Mo) 早晨, and Flying Partridge (FP) 鹧鸪飞. All data was recorded in a professional recording studio using a Zoom H6 recorder at 44.1kHz/24-bits. The difference between different Versions 1.2, 1.1, and 1.0:
V1.2 is the complete CBFdataset with a total duration of 2.6 hours.
V1.1 splits the CBFdataset into two subsets according to playing technique types: CBF-periDB and CBF-petsDB. The former contains all the full-length pieces, isolated playing techniques, and annotations of four periodic modulations: vibrato, tremolo, trill, and flutter-tongue. The latter comprises the same full-length recordings, isolated playing techniques, and annotations of three pitch evolution-based techniques: acciaccatura, portamento, and glissando.
V1.0 includes only the CBF-periDB.
Related updates, demos, and code for reproducibility are available at http://c4dm.eecs.qmul.ac.uk/CBFdataset.html. Any queries, please feel free to contact Changhong at changhong.wang@telecom-paris.fr. Please cite the following paper when using this dataset:
Changhong Wang, Emmanouil Benetos, Vincent Lostanlen, and Elaine Chew, "Adaptive Scattering Transforms for Playing Technique Recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 30 (2022): 1407-1421.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
This list ranks the 50 states in the United States by Chinese population, as estimated by the United States Census Bureau. It also highlights population changes in each state over the past five years.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates, including:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.