Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A shift in scientific publishing from paper-based to knowledge-based practices promotes reproducibility, machine actionability and knowledge discovery. This is important for disciplines like social science, as study indicators are often social constructs such as race or education; hypothesis tests are challenging to compare in demographic research due to their limited temporal and spatial coverage; and natural language in research papers is often imprecise and ambiguous. Therefore, we present the MIRA-KG, consisting of: (1) an ontology for capturing social demography research, which links hypotheses and findings to evidence, (2) annotations of papers on health inequality in terms of the ontology, gathered by (i) prompting a Large Language Model to annotate paper abstracts using the ontology, (ii) mapping concepts to terms from NCBO BioPortal ontologies and GeoNames, and (iii) refining the final graph by a set of SHACL constraints, developed according to data quality criteria. The utility of the resource lies in formally representing social demography research hypotheses, discovering research biases and new knowledge, and deriving novel questions.
This dataset was generated using the code available on GitHub at https://w3id.org/mira/ at version v1.0. It uses the following ontology: https://w3id.org/mira/ontology/. A dump of the requirement stories and other resources used to generate the resource can be found on the drive: https://drive.google.com/drive/folders/1QKAOVV0TXfF4vYQ7b5dkHkXQjBqnh75W?usp=sharing.
(1) United Nations Population Division. World Population Prospects: 2019 Revision, (2) Census reports and other statistical publications from national statistical offices, (3) Eurostat: Demographic Statistics, (4) United Nations Statistical Division. Population and Vital Statistics Report (various years), (5) U.S. Census Bureau: International Database, and (6) Secretariat of the Pacific Community: Statistics and Demography Programme.
Resources
Map
Teacher guide
Student worksheet
Get started
Open the map. Use the teacher guide to explore the map with your class or have students work through it on their own with the worksheet. New to GeoInquiries™? See Getting to Know GeoInquiries.
Science standards
APES: III. B. – Population biology concepts.
APES: II.B.1. – Human population dynamics - historical population sizes; distribution; fertility rates; growth rates and doubling times; demographic transition; age-structure diagrams.
Learning outcomes
Students will predict total historical population trends from age-structure information.
Students will relate population growth to k (carrying capacity) or r (reproductive factor) selective environmental conditions.
https://creativecommons.org/publicdomain/zero/1.0/
This is the Russian Demography (1990-2017) dataset. It contains demographic features such as natural population growth, birth rate, and urbanization. The data was collected from various Internet resources.
The dataset has 2380 rows and 7 columns. Keys for columns:
ЕМИСС (UIISS) - Unified interdepartmental information and statistical system
You can analyze the relationships between years, find the top regions for each feature, and compare them.
International Journal of Humanities and Social Science CiteScore 2024-2025 - ResearchHelpDesk - International Journal of Humanities and Social Science (IJHSS) is an open access, peer-reviewed and refereed journal published by the Center for Promoting Ideas (CPI), USA. The main objective of IJHSS is to provide an intellectual platform for international scholars. IJHSS aims to promote interdisciplinary studies in humanities and social science and become the leading journal in humanities and social science in the world. The journal publishes research papers in the fields of humanities and social science such as Anthropology, Business studies, Communication studies, Corporate governance, Criminology, Cross-cultural studies, Demography, Development studies, Economics, Education, Ethics, Geography, History, Industrial relations, Information science, International relations, Law, Linguistics, Library science, Media studies, Methodology, Philosophy, Political science, Population Studies, Psychology, Public administration, Sociology, Social welfare, Literature, Paralegal, Performing arts (music, theatre & dance), Religious studies, Visual arts, Women's studies and so on.
Percentage Population growth has been calculated from the change between the 2001 and the 2006 Population and Housing Census data. The 2001 data was concorded to 2006 boundaries by ABS, and the calculations were completed by BRS. The change between 2001-2006 has been presented as a percentage population growth and attributed to each Statistical Local Area and then rasterised. Capital cities have been masked out of this analysis.
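As a minimal sketch of the percentage-growth calculation described above, the following Python snippet computes the change for a single Statistical Local Area. The function name and the population counts are hypothetical and serve only as an illustration; they are not taken from the ABS or BRS data.

```python
def percentage_growth(pop_start: float, pop_end: float) -> float:
    """Return the change from pop_start to pop_end as a percentage of pop_start."""
    if pop_start <= 0:
        raise ValueError("starting population must be positive")
    return (pop_end - pop_start) / pop_start * 100.0


# Hypothetical counts for one area in 2001 and 2006 (illustrative only).
growth = percentage_growth(12_000, 12_600)
print(f"{growth:.1f}%")
```

A negative result indicates a population decline over the period.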
https://creativecommons.org/publicdomain/zero/1.0/
Demographic analysis examines and measures the dimensions and dynamics of populations; it can cover whole societies or groups defined by criteria such as education, nationality, religion, and ethnicity. Educational institutions usually treat demography as a field of sociology, though there are a number of independent demography departments. These methods have primarily been developed to study human populations, but are extended to a variety of areas where researchers want to know how populations of social actors can change across time through processes of birth, death, and migration. In the context of human biological populations, demographic analysis uses administrative records to develop an independent estimate of the population.
https://spdx.org/licenses/CC0-1.0.html
Professional organizations in STEM (science, technology, engineering, and mathematics) can use demographic data to quantify recruitment and retention (R&R) of underrepresented groups within their memberships. However, variation in the types of demographic data collected can influence the targeting and perceived impacts of R&R efforts - e.g., giving false signals of R&R for some groups. We obtained demographic surveys from 73 U.S.-affiliated STEM organizations, collectively representing 712,000 members and conference-attendees. We found large differences in the demographic categories surveyed (e.g., disability status, sexual orientation) and the available response options. These discrepancies indicate a lack of consensus regarding the demographic groups that should be recognized and, for groups that are omitted from surveys, an inability of organizations to prioritize and evaluate R&R initiatives. Aligning inclusive demographic surveys across organizations will provide baseline data that can be used to target and evaluate R&R initiatives to better serve underrepresented groups throughout STEM.
Methods
We surveyed 164 STEM organizations (73 responses, rate = 44.5%) between December 2020 and July 2021 with the goal of understanding what demographic data each organization collects from its constituents (i.e., members and conference-attendees) and how the data are used. Organizations were sourced from a list of professional societies affiliated with the American Association for the Advancement of Science, AAAS, (n = 156) or from social media (n = 8). The survey was sent to the elected leadership and management firms for each organization, and follow-up reminders were sent after one month.
The responding organizations represented a wide range of fields: 31 life science organizations (157,000 constituents), 5 mathematics organizations (93,000 constituents), 16 physical science organizations (207,000 constituents), 7 technology organizations (124,000 constituents), and 14 multi-disciplinary organizations spanning multiple branches of STEM (131,000 constituents). A list of the responding organizations is available in the Supplementary Materials. Based on the AAAS-affiliated recruitment of the organizations and the similar distribution of constituencies across STEM fields, we conclude that the responding organizations are a representative cross-section of the most prominent STEM organizations in the U.S. Each organization was asked about the demographic information they collect from their constituents, the response rates to their surveys, and how the data were used.
Survey description
The following questions are written as presented to the participating organizations.
Question 1: What is the name of your STEM organization?
Question 2: Does your organization collect demographic data from your membership and/or meeting attendees?
Question 3: When was your organization’s most recent demographic survey (approximate year)?
Question 4: We would like to know the categories of demographic information collected by your organization. You may answer this question by either uploading a blank copy of your organization’s survey (link provided in online version of this survey) OR by completing a short series of questions.
Question 5: On the most recent demographic survey or questionnaire, what categories of information were collected? (Please select all that apply)
Disability status
Gender identity (e.g., male, female, non-binary)
Marital/Family status
Racial and ethnic group
Religion
Sex
Sexual orientation
Veteran status
Other (please provide)
Question 6: For each of the categories selected in Question 5, what options were provided for survey participants to select?
Question 7: Did the most recent demographic survey provide a statement about data privacy and confidentiality? If yes, please provide the statement.
Question 8: Did the most recent demographic survey provide a statement about intended data use? If yes, please provide the statement.
Question 9: Who maintains the demographic data collected by your organization? (e.g., contracted third party, organization executives)
Question 10: How has your organization used members’ demographic data in the last five years? Examples: monitoring temporal changes in demographic diversity, publishing diversity data products, planning conferences, contributing to third-party researchers.
Question 11: What is the size of your organization (number of members or number of attendees at recent meetings)?
Question 12: What was the response rate (%) for your organization’s most recent demographic survey?
*Organizations were also able to upload a copy of their demographics survey instead of responding to Questions 5-8. If so, the uploaded survey was used (by the study authors) to evaluate Questions 5-8.
This research concerns public expectations for information disclosure quality, which we investigate via a questionnaire survey. The dataset contains the demographic characteristics of the participants.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Science Hill population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Science Hill. The dataset can be utilized to understand the population distribution of Science Hill by age. For example, using this dataset, we can identify the largest age group in Science Hill.
Key observations
The largest age group in Science Hill, KY was Under 5 years, with a population of 84 (10.63%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in Science Hill, KY was 85 years and over, with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Science Hill Population by Age. You can refer to the same here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Psychological and brain sciences explore human behavior and the human brain by studying volunteers who participate in research. Given that the mind and behavior of participants are influenced by their own biological and social factors, the generalizability of findings in these fields largely depends on the representativeness of samples. However, the representativeness of samples in psychological and brain science has long been criticized as “WEIRD” (Western, Educated, Industrialized, Rich, and Democratic). In recent years, several meta-research studies have surveyed the representativeness of samples in published studies from different sub-fields, but an overall understanding of the representativeness of samples in psychological and brain science is lacking. In this review, we analyze these meta-research studies to provide a comprehensive perspective on the current state of sample representativeness. Two common issues emerged across them. First, the demographics of participants were incomplete in most of the published studies. Most psychological and brain science studies reported participants' gender, age, and country, but participants' race/ethnicity, education level, and socioeconomic status were reported far less often. Other important demographics, such as rural/urban division, were not reported at all. Additionally, the reporting of these demographics has increased only slightly in recent years compared to decades ago. Thus, the under-reporting of demographic information in the literature has remained largely unchanged. Second, based on the reported demographics, we found that samples in the field are far from being representative of the world population: most participants are young, highly educated Caucasian females in Western countries; middle-aged and older, less educated people of color in and outside Western countries are less likely to be studied.
In terms of countries, Southeast Asian, African, Latin American, and Middle Eastern countries appear less often in psychological and brain science research. These two issues may be due to the following reasons: convenience sampling dominates psychological and brain science; Western researchers dominate the field of psychology and brain science, with most of the editors-in-chief, editorial board members, and authors coming from Europe and America; psychology and brain science have undervalued the effect of socioeconomic and cultural factors; and researchers mistakenly believe that findings from Western participants can be generalized to all human beings. Addressing the issue of sample representativeness in psychological and brain sciences requires a concerted effort by researchers, academic societies, journals, and funding agencies: researchers should collect and report detailed demographic information about participants, state the limitations of generalizability, and use sampling methods that can increase representativeness whenever possible (e.g., probability sampling); academic societies should pay attention to representativeness issues by organizing more academic symposia or workshops on this topic; journals should increase the representativeness of editorial board members and encourage more rigorous research with samples from underrepresented groups or studies that examine the generalizability of important findings; funding agencies can encourage researchers to pay more attention to study groups from underrepresented countries, and provide financial support for studying hard-to-research populations. Improving sample representativeness will enhance the value of applying psychological and brain science knowledge in real-life settings and promote the building of a community with a shared future for mankind.
A collection of population life tables covering a multitude of countries and many years. Most of the HLD life tables are life tables for national populations, which have been officially published by national statistical offices. Some of the HLD life tables refer to certain regional or ethnic sub-populations within countries. Parts of the HLD life tables are non-official life tables produced by researchers. Life tables describe the extent to which a generation of people (i.e. life table cohort) dies off with age. Life tables are the most ancient and important tool in demography. They are widely used for descriptive and analytical purposes in demography, public health, epidemiology, population geography, biology and many other branches of science. HLD includes the following types of data:
* complete life tables in text format;
* abridged life tables in text format;
* references to statistical publications and other data sources;
* scanned copies of the original life tables as they were published.
Three scientific institutions are jointly developing the HLD: the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany, the Department of Demography at the University of California at Berkeley, USA and the Institut national d'études démographiques (INED) in Paris, France. The MPIDR is responsible for maintaining the database.
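To illustrate how a life table describes a cohort dying off with age, the sketch below derives the number of survivors at each age from age-specific death probabilities. It is only an illustration under stated assumptions: the function name, the starting cohort size (radix) of 100,000 and the probabilities are hypothetical and are not HLD data or an HLD file format.

```python
def survivors(qx: list[float], radix: int = 100_000) -> list[float]:
    """Return the survivors at the start of each age interval, plus those left after the last one.

    qx[i] is the (hypothetical) probability of dying within age interval i.
    """
    lx = [float(radix)]
    for q in qx:
        if not 0.0 <= q <= 1.0:
            raise ValueError("death probabilities must lie in [0, 1]")
        # Those who survive the interval move on to the next age.
        lx.append(lx[-1] * (1.0 - q))
    return lx


# Illustrative death probabilities only; the final 1.0 closes the table.
example_qx = [0.01, 0.002, 0.005, 1.0]
for interval, alive in enumerate(survivors(example_qx)):
    print(interval, round(alive))
```

Setting the last probability to 1.0 reflects that the whole cohort has died by the end of the final, open-ended age interval.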
De facto population as of 1 July of the year indicated and in the age group 0 - 14.
Projection data presented are consistent with the Medium variant of the 2010 Revision of World Population Prospects at the national level.
Species distribution models (SDMs) are widely used to infer species-environment relationships, predict spatial distributions, and characterise species’ environmental niches. While the importance of space and spatial scales is widely acknowledged in SDM applications, temporal components of the niche are rarely addressed. We discuss how phenology and demographic stages affect model inference in plant SDMs. Ignoring conspicuousness and timing of phenological stages may bias niche estimates through increased observer bias, while ignoring stand age may bias niche estimates through temporal mismatches with environmental variables, especially during times of rapid global warming. We present different methods to consider phenology and demographic stages in plant SDMs, including the selection of causal, spatiotemporally explicit predictors, and the calibration of stage-specific SDMs. Based on a case study with citizen science data, we illustrate how spatiotemporal SDMs provide deeper insights on...
We conducted a keyword-based search in the Web of Science to quantify how often temporal components related to phenology and demographic stages are explicitly considered in plant SDMs. A full list of keywords is provided in the Supporting Information Table S1. We used a nested set of keywords to identify all studies that mentioned SDMs (or common synonyms), were focused on plants, and listed relevant keywords related to phenology or to demographic stages, respectively. The search was carried out on 5-Oct-2023 and was restricted to English-language journal articles in the period 1945-2022 (no studies using SDMs were published before that start year). Overall, we found more than 40,000 articles mentioning SDM and over 10,000 articles in our refined search for plant SDMs, with a strong increase in the number of articles over time. Among these, phenology (or related search terms) was mentioned in 970 articles and demographic stages (or related terms) in 1188 articles, each averaging c...
# The niche through time: considering phenology and demographic stages in plant distribution models
https://doi.org/10.5061/dryad.sn02v6xct
Columns from WoS (Web of Science) search – these are identical in both excel sheets
These columns are the standard columns provided as WoS search output. If the entries contain "n/a", then no information was provided by WoS because those items are not applicable. For example, a journal article does not have any entries for book authors.
Column | Explanation |
---|---|
Publication Type | Type of publication: J .. Journal article |
Authors | Authors |
Book Authors | Book Authors |
Book Editors | Book Editors ... |
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Science in (Higher) Education – data of the February 2017 survey
This data set contains:
Full raw (anonymised) data set (completed responses) of Open Science in (Higher) Education February 2017 survey. Data are in xlsx and sav format.
Survey questionnaires with variables and settings (German original and English translation) in pdf. The English questionnaire was not used in the February 2017 survey, but only serves as translation.
Readme file (txt)
Survey structure
The survey includes 24 questions, and its structure can be separated into five major themes: material used in courses (5), OER awareness, usage and development (6), collaborative tools used in courses (2), assessment and participation options (5), demographics (4). The last two questions comprise an open-text question about general issues on the topics and singular open education experiences, and a request for the respondent's e-mail address for further questions. The online survey was created with Limesurvey[1]. Several questions include filters, i.e. these questions were only shown if a participant chose a specific answer beforehand ([n/a] in the Excel file, [.] in SPSS).
Demographic questions
Demographic questions asked about the current position, the discipline, birth year and gender. The classification of research disciplines was adapted to general disciplines at German higher education institutions. As we wanted to have a broad classification, we summarised several disciplines and came up with the following list, including the option "other" for respondents who do not feel confident with the proposed classification:
Natural Sciences
Arts and Humanities or Social Sciences
Economics
Law
Medicine
Computer Sciences, Engineering, Technics
Other
The current job position classification was also chosen according to common positions in Germany, including positions with a teaching responsibility at higher education institutions. Here, we also included the option "other" for respondents who do not feel confident with the proposed classification:
Professor
Special education teacher
Academic/scientific assistant or research fellow (research and teaching)
Academic staff (teaching)
Student assistant
Other
We chose a free-text (numerical) field for a respondent's year of birth because we did not want to pre-classify respondents into age intervals. This leaves us the option of running different analyses of the answers and of possible correlations with the respondents' age. Asking about the country was left out as the survey was designed for academics in Germany.
Remark on OER question
Data from earlier surveys revealed that academics are confused about the proper definition of OER[2]. Some seem to understand OER as free resources, or only refer to open source software (Allen & Seaman, 2016, p. 11). Allen and Seaman (2016) decided to give a broad explanation of OER, avoiding details to not tempt the participant to claim "aware". Thus, there is a danger of having a bias when giving an explanation. We decided not to give an explanation, but keep this question simple. We assume that either someone knows about OER or not. If they had not heard of the term before, they probably do not use OER (at least not consciously) or create them.
Data collection
The target group of the survey was academics at German institutions of higher education, mainly universities and universities of applied sciences. To reach them we sent the survey to diverse internal and external institutional mailing lists and via personal contacts. Included lists were discipline-based lists, lists deriving from higher education and higher education didactic communities as well as lists from open science and OER communities. Additionally, personal e-mails were sent to presidents and contact persons from those communities, and Twitter was used to spread the survey.
The survey was online from Feb 6th to March 3rd 2017; e-mails were mainly sent at the beginning and around mid-term.
Data clearance
We received 360 responses, of which Limesurvey counted 208 as complete and 152 as incomplete. Two responses were marked as incomplete but turned out to be complete on checking, and we added them to the complete responses dataset. Thus, this data set includes 210 complete responses. Of the remaining 150 incomplete responses, 58 respondents did not answer the 1st question and 40 respondents discontinued after the 1st question. The data show a constant decline in answers; we did not detect any particular survey question with a high dropout rate. We deleted the incomplete responses, and they are not in this data set.
Due to data privacy reasons, we deleted seven variables automatically assigned by Limesurvey: submitdate, lastpage, startlanguage, startdate, datestamp, ipaddr, refurl. We also deleted answers to question No 24 (email address).
References
Allen, E., & Seaman, J. (2016). Opening the Textbook: Educational Resources in U.S. Higher Education, 2015-16.
First results of the survey are presented in the poster:
Heck, Tamara, Blümel, Ina, Heller, Lambert, Mazarakis, Athanasios, Peters, Isabella, Scherp, Ansgar, & Weisel, Luzian. (2017). Survey: Open Science in Higher Education. Zenodo. http://doi.org/10.5281/zenodo.400561
Contact:
Open Science in (Higher) Education working group, see http://www.leibniz-science20.de/forschung/projekte/laufende-projekte/open-science-in-higher-education/.
[1] https://www.limesurvey.org
[2] The survey question about the awareness of OER gave a broad explanation, avoiding details to not tempt the participant to claim "aware".
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets, conda environments and software for the course "Population Genomics" by Prof Kasper Munch. This course material is maintained by the health data science sandbox. This webpage shows the latest version of the course material.
The data is connected to the following repository: https://github.com/hds-sandbox/Popgen_course_aarhus. The original course material from Prof Kasper Munch is at https://github.com/kaspermunch/PopulationGenomicsCourse.
Description
After the course, the participants will have detailed knowledge of the methods and applications required to perform a typical population genomic study.
The participants must at the end of the course be able to:
The course introduces key concepts in population genomics, from the generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on the generation of population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Here topics include analysis of demography, population structure, recombination and selection. The last part of the course focuses on applications of population genetic data sets for association studies in relation to human health.
Curriculum
The curriculum for each week is listed below. "Coop" refers to a set of lecture notes by Graham Coop that we will use throughout the course.
Course plan
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There is a persistent shortage of underrepresented minority (URM) faculty who are involved in basic biomedical research at medical schools. We examined the entire training pathway of potential candidates to identify the points of greatest loss. Using a range of recent national data sources, including the National Science Foundation’s Survey of Earned Doctorates and Survey of Doctoral Recipients, we analyzed the demographics of the population of interest, specifically those from URM backgrounds with an interest in biomedical sciences. We examined the URM population from high school graduates through undergraduate, graduate, and postdoctoral training as well as the URM population in basic science tenure track faculty positions at medical schools. We find that URM and non-URM trainees are equally likely to transition into doctoral programs, to receive their doctoral degree, and to secure a postdoctoral position. However, the analysis reveals that the diversions from developing a faculty career are found primarily at two clearly identifiable places, specifically during undergraduate education and in transition from postdoctoral fellowship to tenure track faculty in the basic sciences at medical schools. We suggest focusing additional interventions on these two stages along the educational pathway.
Distribution and demographic characteristics of moose in the Refuge. This is a very low density population.
Approximately 25% of mammals are currently threatened with extinction, a risk that is amplified under climate change. Species persistence under climate change is determined by the combined effects of climatic factors on multiple demographic rates (survival, development, reproduction), and hence, population dynamics. Thus, to quantify which species and regions on Earth are most vulnerable to climate-driven extinction, a global understanding of how different demographic rates respond to climate is urgently needed. Here, we perform a systematic review of literature on demographic responses to climate, focusing on terrestrial mammals, for which extensive demographic data are available. To assess the full spectrum of responses, we synthesize information from studies that quantitatively link climate to multiple demographic rates. We find only 106 such studies, corresponding to 87 mammal species. These 87 species constitute < 1% of all terrestrial mammals. Our synthesis reveals a strong m...
For each mammal species i with available life-history information, we searched SCOPUS for studies (published before 2018) where the title, abstract, or keywords contained the following search terms:
Scientific species name_i AND (demograph* OR population OR life-history OR "life history" OR model) AND (climat* OR precipitation OR rain* OR temperature OR weather) AND (surv* OR reprod* OR recruit* OR brood OR breed* OR mass OR weight OR size OR grow* OR offspring OR litter OR lambda OR birth OR mortality OR body OR hatch* OR fledg* OR productiv* OR age OR inherit* OR sex OR nest* OR fecund* OR progression OR pregnan* OR newborn OR longevity).
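The search string above follows a fixed pattern (species name, then three OR-groups joined by AND), so it can be assembled per species programmatically. The sketch below is an illustration only: the function name and the example species are assumptions, the third term list is abbreviated from the full query, and nothing here reflects how the authors actually submitted their searches to SCOPUS.

```python
# Term groups copied from the query text; RATE_TERMS is abbreviated for brevity.
DEMOGRAPHY_TERMS = ["demograph*", "population", "life-history", '"life history"', "model"]
CLIMATE_TERMS = ["climat*", "precipitation", "rain*", "temperature", "weather"]
RATE_TERMS = ["surv*", "reprod*", "recruit*", "mortality", "fecund*"]


def build_query(species_name: str) -> str:
    """Join the species name and the OR-groups into one boolean search string."""
    groups = [DEMOGRAPHY_TERMS, CLIMATE_TERMS, RATE_TERMS]
    clauses = ["(" + " OR ".join(terms) + ")" for terms in groups]
    return " AND ".join([species_name] + clauses)


# Hypothetical example species, used only to show the resulting string.
print(build_query("Ovis aries"))
```

Keeping the term groups as lists makes it straightforward to run the same query for every scientific name associated with a species.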
We used the R package taxize (Chamberlain and Szöcs 2013) to resolve discrepancies in scientific names or taxonomic identifiers and, where applicable, searched SCOPUS using all scientific names associated with a species in the Integrated Taxonomic Information System (ITIS; http://www.itis.gov).
We did not extract information on demographic-r...
ReadMe File uploaded
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains total global population estimates from the World Bank.
Sources: (1) United Nations Population Division. World Population Prospects: 2019 Revision, (2) Census reports and other statistical publications from national statistical offices, (3) Eurostat: Demographic Statistics, (4) United Nations Statistical Division. Population and Vital Statistics Report, (5) U.S. Census Bureau: International Database, (6) Secretariat of the Pacific Community: Statistics and Demography Programme
Columns:
year: The year of the observation
population: Total global population estimate in billions