Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This series of 11 datasets is drawn from Rhoads, Edward J. M. Stepping Forth into the World: The Chinese Educational Mission to the United States, 1872-81. Hong Kong University Press, 2011.
They document the 120 young Chinese who participated in the pioneering Chinese Educational Mission (CEM) in the United States (1872-1881). The first 8 files are drawn directly from the tables in Rhoads:
Based on these tables, I created three synthetic datasets which can be used for statistical and network analyses:
Each file contains two tabs, one for the data (data), one for the description of variables (key). Grey columns refer to the unstructured information given in the original source.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Higher education plays a critical role in driving an innovative economy by equipping students with knowledge and skills demanded by the workforce. While researchers and practitioners have developed data systems to track detailed occupational skills, such as those established by the U.S. Department of Labor (DOL), much less effort has been made to document which of these skills are being developed in higher education at a similar granularity. Here, we fill this gap by presenting Course-Skill Atlas, a longitudinal dataset of skills inferred from over three million course syllabi taught at nearly three thousand U.S. higher education institutions. To construct Course-Skill Atlas, we apply natural language processing to quantify the alignment between course syllabi and detailed workplace activities (DWAs) used by the DOL to describe occupations. We then aggregate these alignment scores to create skill profiles for institutions and academic majors. Our dataset offers a large-scale representation of college education's role in preparing students for the labor market. Overall, Course-Skill Atlas can enable new research on the source of skills in the context of workforce development and provide actionable insights for shaping the future of higher education to meet evolving labor demands, especially in the face of new technologies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset is derived from the Who's Who of American Returned Students 遊美同學錄 [Youmei Tongxue Lu] published in Peking [Beijing] in 1917, compiled by the Returned Students’ Information Bureau (Liumei xuesheng tongxunchu 留美學生通訊處) established at Tsinghua School in 1915. This book is crucial for documenting the early liumei's experiences during the transitional period between the late Qing dynasty and the early years of the Republic (1911-).
The dataset records all the institutions with which the students were affiliated in the course of their lives, including the educational institutions in which they studied in China, the United States, and other countries; the public or private organizations in which they were employed; and their memberships in clubs and associations. The names of organizations were retrieved automatically from the Chinese biographies using named entity recognition (a spaCy model), then manually cleaned, classified, and validated by the author.
The attached file contains three tabs: (1) the list of affiliations (data); (2) the classification of organizations (class); and (3) the description of variables (key). The dataset records a total of 2,883 affiliations, linking 401 unique individuals to 1,344 unique institutions. The institutions are distributed across categories as follows:
Category | n |
---|---|
education | 565 |
association | 271 |
administration | 132 |
business | 110 |
facility | 92 |
media | 66 |
government | 49 |
factory | 30 |
other | 22 |
military | 7 |
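The category counts in the table add up to 1,344, the number of unique institutions, which suggests that the table classifies institutions and not the 2,883 affiliations. The short Python sketch below checks that sum and converts the counts into percentage shares. The counts are copied from the table; the variable names are illustrative and are not taken from the dataset file.

```python
# Category counts copied from the table above.
counts = {
    "education": 565,
    "association": 271,
    "administration": 132,
    "business": 110,
    "facility": 92,
    "media": 66,
    "government": 49,
    "factory": 30,
    "other": 22,
    "military": 7,
}

# The total should equal the number of unique institutions (1,344).
total = sum(counts.values())

# Share of each category, in percent, rounded to one decimal place.
shares = {category: round(100 * n / total, 1) for category, n in counts.items()}

print(f"total institutions: {total}")
for category, share in shares.items():
    print(f"{category:<15}{share:>6.1f}%")
```

Educational institutions make up roughly two fifths of the organizations recorded.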
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This table presents the bilingual typology of academic disciplines used in the Modern China Biographical Database. It is based mainly on the typology created by Yuan T'ung-li in his three bibliographical volumes about the doctoral dissertations by Chinese students in the United States, the United Kingdom, and continental Europe. We adapted this typology to include other disciplines that were present in historical sources.
Dataset Description: Typology of Disciplines (Level 2)
Overview: This dataset provides a bilingual typology of academic disciplines, specifically focusing on Level 2 classifications. The terms are extracted from various Chinese sources, with English translations provided. It is structured hierarchically, connecting each Level 2 discipline to broader categories (Level 1 and Level 0), facilitating multilingual academic classification.
Structure: The dataset consists of the following key columns:
Level 2 Discipline (English & Chinese): The specific sub-discipline classification.
Level 1 Discipline (English & Chinese): A broader category that groups multiple Level 2 disciplines.
Level 0 Discipline (English & Chinese): The highest-level classification representing major academic domains.
Level 1 Code: A numerical or coded identifier for Level 1 disciplines, supporting structured data processing.
Purpose & Applications:
Hierarchical Classification: Enables structured categorization of academic fields across multiple levels.
Multilingual Standardization: Supports bilingual terminology consistency in academic and research contexts.
Main sources:
Yuan, T’ung-li. A Guide to Doctoral Dissertations by Chinese Students in America, 1905-1960. Washington, D.C.: Published under the auspices of the Sino-American Cultural Society, 1961.
———. A Guide to Doctoral Dissertations by Chinese Students in Continental Europe, 1907-1962. S.l., 1964.
———. Doctoral Dissertations by Chinese Students in Great Britain and Northern Ireland, 1916-1961. S.l.: s.n., 1963.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The following dataset contains the list of the 418 members - both Chinese and non-Chinese - of the American University Club of China (Shanghai), based on a directory published in 1936. Established around 1902, the American University Club (AUC) was one of the earliest and largest organizations of American university alumni in pre-1949 China.
The attached file comprises two tabs, one for the data and one for describing the variables (fields). The dataset includes the following variables:
Group | Name | Description | DataType |
---|---|---|---|
Identity | Name_full | Full name, as given in source (English or Wade-Giles) | Raw |
Identity | SurName | Surname as given in source | Raw |
Identity | FirstName | First name or initials, as given in source | Raw |
Identity | Name_zh | Full name in Chinese | Raw |
Identity | Name_py | Pinyin transliteration of full name | Cooked |
Identity | Nationality | Nationality or country of origin | Cooked |
Identity | Deceased | Deceased member or not | Raw |
Club Membership | Life_member | Life membership in AUC (year of admission) | Raw |
Education | University | University in which the individual studied | Raw |
Education | State | State in which the university was located | Cooked |
Education | Country_edu | Country in which the university was located | Cooked |
Education | Degree_source | Academic degree, as given in source | Raw |
Education | Degree_level | Level of qualification | Cooked |
Education | Field_main | Field of study (general category) | Cooked |
Education | Field_2 | Field of study (subcategory) | Cooked |
Education | Year_start | Year of enrollment | Raw |
Education | Year_end | Year of graduation | Raw |
Education | Honorary | Honorary degree | Cooked |
Career | Employer_main | Employer (name of institution, main level) | Raw |
Career | Employer_2 | Employer (institution sublevel) | Raw |
Career | Sector_1 | Sector of employment (main category) | Cooked |
Career | Sector_2 | Sector of employment (subcategory) | Cooked |
Career | Country_1936 | Country of residence or employment (1936) | Cooked |
Career | City | City of residence or employment (1936) | Raw |
Career | Street_name | Current address (business or residence): street name (main) | Raw |
Career | Street_2 | Current address (business or residence): street name (secondary) | Raw |
Career | Street_nbr | Current address (business or residence): street number | Raw |
Career | Building | Current address (business or residence): building name | Raw |
Metadata | Page | Source: Page number in the original source | Raw |
Note: "DataType" indicates whether the information was provided as such in the original source, or whether it was re-processed by the historian.
Reference: American University Club of Shanghai. American University Men in China. Shanghai: Comacrib Press, 1936.
https://www.icpsr.umich.edu/web/ICPSR/studies/37499/terms
From 2014 through 2019, three U.S. universities (Michigan State University; the University of Minnesota, Twin Cities; and The University of Utah) received Language Proficiency Flagship Initiative grants as part of the larger Language Flagship, a National Security Education Program (NSEP) and Defense Language and National Security Education Office (DLNSEO) initiative to improve language learning in the United States. The goal of these grants was to document language proficiency in regular tertiary foreign language programs so that the programs, and ones like them at other universities, could use the proficiency-achievement data to set programmatic learning benchmarks and recommendations, as called for by the Modern Language Association in 2007. This call was reiterated by the National Standards Collaborative Board in 2015.
During the first three years of the three university-specific five-year grants (Fall 2014 through Spring 2017), each university collected language proficiency data in academic years 2014-2015, 2015-2016, and 2016-2017 from language learners in selected, regular language programs to document the students' proficiency achievements.
University A tested Chinese, French, Russian, and Spanish with the NSEP grant funding, and German, Italian, Japanese, Korean, and Portuguese with additional (in-kind) financial support from within University A. University B tested Arabic, French, Portuguese, Russian, and Spanish with the NSEP grant funding, and German and Korean with additional (in-kind) financial support from University B. University C tested Arabic, Chinese, Portuguese, and Russian with the NSEP grant funding, and Korean with additional (in-kind) financial support from University C. Each university also gave students background questionnaires at the time of testing.
As stipulated by the grant terms, students at the universities were offered up to three proficiency tests each semester: speaking, listening, and reading. Writing was not assessed because the grants did not cover the costs of writing assessments. The grant terms required the universities to use official, nationally recognized, standardized language tests that reported scores on one of two standardized proficiency scales: either the American Council on the Teaching of Foreign Languages (ACTFL, 2012) proficiency scale or the Interagency Language Roundtable (ILR: Herzog, n.d.) proficiency scale. The three universities thus contracted mostly with Language Testing International, ACTFL's official testing subsidiary, to purchase and administer to students the Oral Proficiency Interview - computer (OPIc) for speaking, the Listening Proficiency Test (LPT) for listening, and the Reading Proficiency Test (RPT) for reading. However, early in the grant cycle, because ACTFL did not yet have tests in all of the languages to be tested, some of the testing was contracted with American Councils and Avant STAMP, even though those tests are not specifically geared to the populations of learners in this project.
Students were able to opt out of testing in certain cases; those cases varied from university to university. The speaking tests normally took place within intact classes that came into computer labs to take the tests. Students were often asked to take the listening and reading tests outside of class time, on a walk-in basis, in proctored language labs on campus; otherwise they took the listening and reading tests in a language lab during a regular class session. These decisions were often made by the language instructors and/or the language programs.
The data are cross-sectional, but certain individuals took the tests repeatedly; thus, longitudinal data sets are nested within the cross-sectional data. The three universities worked mostly independently during the initial year of data collection because the identities of the three universities receiving the grants were not announced until weeks before testing was to begin on all three campuses. Thus, each university independently designed its background questionnaire. However, because all three were guided by the same set of grant rules to use nationally recognized standardized tests for the assessments, combining all three universities' test data was