Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A shift in scientific publishing from paper-based to knowledge-based practices promotes reproducibility, machine actionability and knowledge discovery. This is important for disciplines like social science, for three reasons: study indicators are often social constructs such as race or education; hypothesis tests are challenging to compare in demographic research due to their limited temporal and spatial coverage; and natural language in research papers is often imprecise and ambiguous. Therefore, we present the MIRA-KG, consisting of: (1) an ontology for capturing social demography research, which links hypotheses and findings to evidence, and (2) annotations of papers on health inequality in terms of the ontology, produced by (i) prompting a Large Language Model to annotate paper abstracts using the ontology, (ii) mapping concepts to terms from NCBO BioPortal ontologies and GeoNames, and (iii) refining the final graph with a set of SHACL constraints, developed according to data quality criteria. The resource can be used to formally represent social demography research hypotheses, discover research biases and new knowledge, and derive novel questions.
This dataset was generated using the code available on GitHub at https://w3id.org/mira/ at version v1.0. It uses the following ontology: https://w3id.org/mira/ontology/. A dump of the requirement stories and other resources used to generate the resource can be found on Google Drive: https://drive.google.com/drive/folders/1QKAOVV0TXfF4vYQ7b5dkHkXQjBqnh75W?usp=sharing.