The Health Survey for England, 2003-2005: Multilevel Modelling Teaching Dataset has been prepared as a resource for those interested in learning multilevel modelling techniques. It was first presented as part of a workshop entitled 'Introducing multilevel models and applying them to the Health Survey for England using MLwiN'. The HSE teaching dataset is available in both Stata and MLwiN formats and is accompanied by a practical guide that includes the multilevel modelling practical exercises. A separate document provides information on the teaching dataset and materials.
The main dataset is an edited version of the Health Survey for England (HSE) data from 2003, 2004 and 2005 (the full HSEs are at the UK Data Archive under SNs 5098, 5439 and 5675). Details of the recoding of HSE variables for the teaching dataset and how the aggregate data were produced can be found in the documentation.
WARNING – Users should note that this dataset is intended as a learning resource and should not be used for research purposes. In particular the dataset uses adult measures of Body Mass Index (BMI) for children and so the results from the data should not be reported in research contexts.
https://spdx.org/licenses/CC0-1.0.html
What is the relationship between environment and democracy? The framework of cultural evolution suggests that societal development is an adaptation to ecological threats. Pertinent theories assume that democracy emerges as societies adapt to ecological factors such as higher economic wealth, lower pathogen threats, less demanding climates, and fewer natural disasters. However, previous research confused within-country processes with between-country processes and erroneously interpreted between-country findings as if they generalize to within-country mechanisms. In this article, we analyze a time-series cross-sectional dataset to study the dynamic relationship between environment and democracy (1949-2016), accounting for previous misconceptions in levels of analysis. By separating within-country processes from between-country processes, we find that the relationship between environment and democracy not only differs by countries but also depends on the level of analysis. Economic wealth predicts increasing levels of democracy in between-country comparisons, but within-country comparisons show that democracy declines as countries become wealthier over time. This relationship is only prevalent among historically wealthy countries but not among historically poor countries, whose wealth also increased over time. By contrast, pathogen prevalence predicts lower levels of democracy in both between-country and within-country comparisons. Our longitudinal analyses identifying temporal precedence reveal that not only reductions in pathogen prevalence drive future democracy, but also democracy reduces future pathogen prevalence and increases future wealth. These nuanced results contrast with previous analyses using narrow, cross-sectional data. As a whole, our findings illuminate the dynamic process by which environment and democracy shape each other.
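The separation of within-country from between-country processes described above is commonly implemented by splitting each time-varying predictor into a country mean and a yearly deviation from that mean. The following is a minimal sketch of that decomposition, not the authors' code: the function name `within_between` and the toy values are hypothetical and are not drawn from the article's data.

```python
from collections import defaultdict


def within_between(rows):
    """Split a time-varying predictor into between- and within-country parts.

    rows: list of (country, year, value) tuples.
    Returns one dict per row holding the country mean (between component)
    and the deviation from that mean (within component).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for country, _year, value in rows:
        totals[country] += value
        counts[country] += 1
    means = {country: totals[country] / counts[country] for country in totals}

    parts = []
    for country, year, value in rows:
        parts.append({
            "country": country,
            "year": year,
            "between": means[country],         # varies only across countries
            "within": value - means[country],  # varies only over time
        })
    return parts


# Hypothetical toy values for illustration only.
toy = [("A", 1950, 1.0), ("A", 1951, 3.0), ("B", 1950, 10.0), ("B", 1951, 14.0)]
parts = within_between(toy)
```

Entering both components as separate predictors lets a model estimate a between-country coefficient and a within-country coefficient, which need not share the same sign.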
Methods
Our time-series cross-sectional data combine various online databases. Country names were first identified and matched using the R package “countrycode” (Arel-Bundock, Enevoldsen, & Yetman, 2018) before all datasets were merged. Occasionally, we modified unidentified country names to be consistent across datasets. We then transformed “wide” data into “long” data and merged them using R’s Tidyverse framework (Wickham, 2014). Our analysis begins with the year 1949 because one of the key time-variant level-1 variables, pathogen prevalence, was only available from 1949 on. See our Supplemental Material for all data, Stata syntax, R Markdown for visualization, supplemental analyses, and detailed results (available at https://osf.io/drt8j/).
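The authors did this reshaping in R. Purely as an illustration of the wide-to-long transformation and the country-year merge, here is a plain-Python sketch. The helper names, column names (`country`, `gdp`, `democracy`), and values are assumptions made for the example and do not come from the supplemental material.

```python
def wide_to_long(wide_rows, id_key, value_name):
    """Turn one row per country (years as columns) into one row per country-year."""
    long_rows = []
    for row in wide_rows:
        for key, value in row.items():
            if key == id_key:
                continue
            long_rows.append({id_key: row[id_key], "year": int(key), value_name: value})
    return long_rows


def merge_on_country_year(left, right, id_key="country"):
    """Inner-join two long tables on the (country, year) pair."""
    index = {(r[id_key], r["year"]): r for r in right}
    merged = []
    for left_row in left:
        match = index.get((left_row[id_key], left_row["year"]))
        if match is not None:
            merged.append({**left_row, **match})
    return merged


# Hypothetical wide tables: one row per country, one column per year.
gdp_wide = [{"country": "A", "1949": 1.0, "1950": 1.1}]
dem_wide = [{"country": "A", "1949": 0.2, "1950": 0.3}]

panel = merge_on_country_year(
    wide_to_long(gdp_wide, "country", "gdp"),
    wide_to_long(dem_wide, "country", "democracy"),
)
```

An inner join keeps only the country-years present in both sources, so harmonising country names first matters: an unmatched spelling would silently drop rows.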
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The analysis of change is central to the study of kidney research. In the past 25 years, newer and more sophisticated methods for the analysis of change have been developed; however, as of yet these newer methods are underutilized in the field of kidney research. Repeated measures ANOVA is the traditional model that is easy to understand and simpler to interpret, but it may not be valid in complex real-world situations. Problems with the assumption of sphericity, unit of analysis, lack of consideration for different types of change, and missing data, in the repeated measures ANOVA context are often encountered. Multilevel modeling, a newer and more sophisticated method for the analysis of change, overcomes these limitations and provides a better framework for understanding the true nature of change. The present article provides a primer on the use of multilevel modeling to study change. An example from a clinical study is detailed and the method for implementation in SAS is provided.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Researchers in comparative research increasingly use multilevel models to test effects of country level factors on individual behavior and preferences. However, the justification of widely employed estimation strategies is asymptotic and applications in comparative politics routinely involve only a small number of countries. Thus researchers and reviewers often wonder if these models are applicable at all. In other words, how many countries do we need for multilevel modeling? I present results from a large scale Monte Carlo experiment comparing the performance of multilevel models when few countries are available. I find that maximum likelihood estimates and confidence intervals can be severely biased, especially in models including cross-level interactions. In contrast, the Bayesian approach proves to be far more robust, and yields considerably more conservative tests.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Example data sets for the book chapter titled "Missing Data in the Analysis of Multilevel and Dependent Data" submitted for publication in the second edition of "Dependent Data in Social Science Research" (Stemmler et al., 2015). This repository includes the data sets used in both example analyses (Examples 1 and 2) in two file formats (binary ".rda" for use in R; plain-text ".dat").
The data sets contain simulated data from 23,376 (Example 1) and 23,072 (Example 2) individuals from 2,000 groups on four variables:
ID = group identifier (1-2000)
x = numeric (Level 1)
y = numeric (Level 1)
w = binary (Level 2)
In all data sets, missing values are coded as "NA".
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
Legal Reasoning Dataset with Multilevel Human and Model-Annotated Explanations
Prepared by Mst Rafia Islam, Umong Sain, and Azmine Toushik Wasi as part of the Reasoning Datasets Competition by Bespoke Labs, Hugging Face, and Together.ai.
🧭 Purpose and Scope
The Legal Reasoning Dataset aims to support the evaluation and training of legal reasoning systems, particularly in multilingual or jurisdiction-agnostic contexts. It focuses on international acts and treaties… See the full description on the dataset page: https://huggingface.co/datasets/ciol-research/multilevel-legal-reasoning.
https://data.gov.tw/license
The multi-level marketing business has not yet prepared the required reporting materials, and according to Article 6, Paragraph 1 of the Multi-level Marketing Management Measures, it is considered as not having been reported.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Replication materials
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Nowadays, Artificial Intelligence (AI) is playing a rapidly increasing role in several fields of research and in almost all sectors of real life. However, few studies have assessed the effects of AI applications on training needs. This paper proposes an innovative multilevel modeling approach to investigate Awareness, Attitude and Trust towards AI and their reflections on learning needs. In particular, it is shown how a machine learning variable selection algorithm can support the definition of the optimal subset of all relevant covariates with respect to the outcome variable and improve the multilevel model performance for estimating the probability of educational needs. Thus, starting from a complex web survey of European citizens in eight countries, the estimation of a multilevel binary model, defined on the basis of covariates selected through the Boruta random forest algorithm, is proposed. A discussion of gender differences in the related estimated multilevel logit models is presented. A sensitivity analysis is also included in order to assess the prediction accuracy of the proposed multilevel logit modeling.
This repository contains data generated for the manuscript "A two-stage procedure for optimal modeling of the probability of training needs in artificial intelligence". It comprises: (1) the dataset Data_Boruta_Random_Forest, used to estimate variable importance; and (2) the dataset Data_Multilevel, used to compare the different multilevel binary models proposed in the paper.
U.S. Government Works https://www.usa.gov/government-works
A collection of analysis-ready Multilevel Monitoring System (MLMS) datasets for wells in the U.S. Geological Survey (USGS) aquifer-monitoring network, Idaho National Laboratory (INL), Idaho. Administered by the USGS INL Project Office in cooperation with the U.S. Department of Energy.
This article provides an overview of multilevel regression and post-stratification (MRP). It reviews the stages in estimating opinion for small areas, identifies circumstances in which MRP can go wrong, or go right, and provides a worked example for the UK using publicly available data sources and a previously published post-stratification frame. This archive contains two R source code files and one post-stratification matrix in CSV format.
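The archive itself contains R code. As a language-neutral illustration of the post-stratification stage only, the sketch below computes a population-weighted average of cell-level estimates for each area. The function name `poststratify`, the cell labels, and all numbers are hypothetical. In an actual MRP analysis the cell estimates would come from a fitted multilevel model, which is not shown here.

```python
def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of cell-level estimates within each area.

    cell_estimates: {(area, cell): estimated proportion for that cell}
    cell_counts:    {(area, cell): population count from the post-stratification frame}
    """
    weighted = {}
    totals = {}
    for (area, cell), estimate in cell_estimates.items():
        n = cell_counts[(area, cell)]
        weighted[area] = weighted.get(area, 0.0) + n * estimate
        totals[area] = totals.get(area, 0) + n
    return {area: weighted[area] / totals[area] for area in weighted}


# Hypothetical cell estimates and frame counts for two areas.
estimates = {("North", "young"): 0.6, ("North", "old"): 0.4,
             ("South", "young"): 0.5, ("South", "old"): 0.3}
counts = {("North", "young"): 100, ("North", "old"): 300,
          ("South", "young"): 200, ("South", "old"): 200}

area_opinion = poststratify(estimates, counts)
```

Because each area's estimate is weighted by its own frame counts, two areas with identical cell-level opinion can still differ in their area-level estimate when their demographic make-up differs.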
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
The objective of this work was to develop a multilevel (hierarchical) model based on isocratic-reversed-phase-high-performance-chromatographic data collected in methanol and acetonitrile for 58 chemical compounds. Such a multilevel model is a regression model of the analyte-specific chromatographic measurements, in which all the regression parameters are given a probability model. It is a fundamentally different approach from the most common approach, where parameters are separately estimated for each analyte (without sharing information across analytes and different organic modifiers). The statistical analysis was done with Stan software implementing the Bayesian-statistics inference with Markov-chain Monte Carlo sampling. During the model-building process, a series of multilevel models of different complexity were obtained, such as (1) a model with no pooling (separate models were fitted for each analyte), (2) a model with partial pooling (a common distribution was used for analyte-specific parameters), and (3) a model with partial pooling as well as a regression model relating analyte-specific parameters and analyte-specific properties (QSRR equations). All the models were compared with each other using 10-fold cross-validation. The benefits of multilevel models in inference and predictions were shown. In particular the obtained models allowed us to (i) better understand the data and (ii) solve many routine analytical problems, such as obtaining well-calibrated predictions of retention factors for an analyte in acetonitrile-containing mobile phases given zero, one, or several measurements in methanol-containing mobile phases and vice versa.
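The contrast between no pooling and partial pooling can be shown with a much simpler stand-in than the Stan models used in the study. The sketch below assumes a normal model with known within-group variance `sigma2` and known between-group variance `tau2`, and it plugs in the grand mean of all observations for the population mean. These are simplifying assumptions for illustration only: the study estimated such quantities by MCMC, and the group labels and values here are hypothetical.

```python
def no_pooling(groups):
    """Separate mean for each group, ignoring all other groups."""
    return {g: sum(ys) / len(ys) for g, ys in groups.items()}


def partial_pooling(groups, sigma2, tau2):
    """Shrink each group mean toward the grand mean.

    sigma2: within-group variance (assumed known in this sketch)
    tau2:   between-group variance (assumed known in this sketch)
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)
    pooled = {}
    for g, ys in groups.items():
        n = len(ys)
        weight = tau2 / (tau2 + sigma2 / n)  # approaches 1 as n grows
        pooled[g] = weight * (sum(ys) / n) + (1 - weight) * grand_mean
    return pooled


# Hypothetical measurements: group "b" has a single observation.
groups = {"a": [1.0, 3.0], "b": [10.0]}
separate = no_pooling(groups)
shrunk = partial_pooling(groups, sigma2=1.0, tau2=1.0)
```

Groups with fewer measurements get a smaller weight on their own mean and are pulled further toward the shared mean, which is the information sharing across analytes that the abstract describes.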
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Previous studies mainly focused on individual-level factors that influence the adoption and usage of mobile technology and social networking sites, with little attention paid to the influence of household situations. Using a multilevel modelling approach, this study merges household-level (n1 = 1,455) and individual-level (n2 = 2,570) data in the U.K. context to investigate (a) whether a household's economic capital (HEC) can affect its members’ Twitter adoption, (b) whether the influences are mediated by the member’s activity variety and self-reported efficacy with mobile technology, and (c) whether the members’ traits, including educational level, gross income and residential area, moderate the relationship between HEC and Twitter adoption. Significant direct and indirect associations were found between HEC and members’ Twitter adoption. The educational level and gross income of household members moderated the influence of HEC on individuals’ Twitter adoption.
This is the dataset to recreate Figure 3.4 in Chapter 3 of my PhD thesis.
Software for the solution of elliptic partial differential equations using finite elements with adaptive mesh refinement and multigrid techniques.
This is the R script to recreate Figure 3.4 in Chapter 3 of my PhD thesis.
https://www.icpsr.umich.edu/web/ICPSR/studies/37603/terms
The National Institute on Drug Abuse (NIDA) funded RADAR in 2014 to collect multilevel, longitudinal data and biospecimens from an ethnically and racially diverse cohort of young, sexual and gender minorities (SGM; e.g., men who have sex with men (MSM), transgender women, gender non-conforming individuals) who were assigned male at birth (AMAB) (current core cohort n=1,113). The primary objective of this study is to apply a multilevel perspective to a syndemic of health issues associated with human immunodeficiency virus (HIV) in this population. The multilevel design focuses on individual, dyadic (i.e., sexual and romantic relationships), network (i.e., social, drug, and sexual connections) and biologic factors that may be associated with HIV. The cohort contains both HIV-negative and HIV-positive individuals, which allows for the development of a repository of biospecimens and HIV sequence data from both pre-infection and post-infection visits that will help facilitate future projects evaluating substance use, HIV risk, and pathogenesis. A multiple cohort, accelerated longitudinal design was utilized by initially enrolling two existing SGM cohorts and then expanded through the use of convenience and snowball sampling methods. Enrollment criteria varied slightly based on the recruitment method, but overall inclusion criteria required participants to be AMAB, between 16 and 29 years of age, report having had sex with a man in the prior year or identify as a SGM, live in the Chicago metropolitan area, and be an English speaker. Study recruitment opened in February 2015. Participants are followed through the developmental period of late adolescence to early adulthood, which is a critical period of initiation and acceleration of sexual behavior and substance use. Study visits occur every six months.
The multilevel hidden Markov model (MHMM) is a promising vehicle to investigate latent dynamics over time in social and behavioral processes. By including continuous individual random effects, the model accommodates variability between individuals, providing individual-specific trajectories and facilitating the study of individual differences. However, the performance of the MHMM has not been sufficiently explored. Currently, there are no practical guidelines on the sample size needed to obtain reliable estimates related to categorical data characteristics. We performed an extensive simulation to assess the effect of the number of dependent variables (1-4), the number of individuals (5-90), and the number of observations per individual (100-1600) on the estimation performance of group-level parameters and between-individual variability on a Bayesian MHMM with categorical data of various levels of complexity. We found that using multivariate data generally reduces the sample size needed and improves the stability of the results. Regarding the estimation of group-level parameters, the number of individuals and observations largely compensate for each other. Meanwhile, only the former drives the estimation of between-individual variability. We conclude with guidelines on the sample size necessary based on the complexity of the data and the study objectives of the practitioners.
This repository contains data generated for the manuscript "Go multivariate: a Monte Carlo study of a multilevel hidden Markov model with categorical data of varying complexity". It comprises: (1) model outputs (maximum a posteriori estimates) for each repetition (n=100) of each scenario (n=324) of the main simulation, and (2) complete model outputs (including estimates for 4000 MCMC iterations) for two chains of each repetition (n=3) of each scenario (n=324). Please note that the empirical data used in the manuscript are not available as part of this repository.
A subsample of the data used in the empirical example is openly available as an example data set in the R package mHMMbayes on CRAN. The full data set is available on request from the authors.
This study includes five data files and corresponding exercise instructions.
Four of the five data files and instructions were produced from the National Child Development Study datasets for an ESRC-funded workshop on Multilevel Event History Analysis, held in February 2005. The workshop data includes three files in ASCII DAT format and one in SPSS SAV format. Further information and documentation beyond that included in this study, and MLwiN software downloads are available from the Centre for Multilevel Modelling web site.
In addition, for the second edition of the study, example data and documentation for fitting multilevel multiprocess event history models using aML software were added to the dataset (the data file 'amlex.raw'). The aML syntax file that accompanies these data can also be found at the Centre for Multilevel Modelling web site noted above.
The project from which these data were produced was conducted under the ESRC Research Methods programme. It involved the development of multilevel simultaneous equations models for the analysis of correlated event histories. The research was motivated by a study of the interrelationships between partnership (marriage or cohabitation) durations and decisions about childbearing, using event history data from the 1958 and 1970 British Birth Cohort studies (in the case of this dataset, NCDS).
Additional aims and objectives of the project were to develop methodology for the analysis of complex event history data; provide means for implementing methodology in existing software; and provide social scientists with practical training in advanced event history analysis.