100+ datasets found
  1. Replication data for: Unpacking the Black Box of Causality: Learning about...

    • dataverse.harvard.edu
    Updated Aug 20, 2011
    Cite
    Kosuke Imai; Luke Keele; Dustin Tingley; Teppei Yamamoto (2011). Replication data for: Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies [Dataset]. http://doi.org/10.7910/DVN/X73I3J
    Dataset updated
    Aug 20, 2011
    Dataset provided by
    Harvard Dataverse
    Authors
    Kosuke Imai; Luke Keele; Dustin Tingley; Teppei Yamamoto
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Identifying causal mechanisms is a fundamental goal of social science. Researchers seek to study not only whether one variable affects another but also how such a causal relationship arises. Yet, commonly used statistical methods for identifying causal mechanisms rely upon untestable assumptions and are often inappropriate even under those assumptions. Randomizing treatment and intermediate variables is also insufficient. Despite these difficulties, study of causal mechanisms is too important to abandon. We make three contributions to improve research on causal mechanisms. First, we present a minimum set of assumptions required under standard designs of experimental and observational studies and develop a general algorithm for estimating causal mediation effects. Second, we provide a method to assess sensitivity of conclusions to potential violations of a key assumption. Third, we offer alternative research designs for identifying causal mechanisms under weaker assumptions. The proposed approach is illustrated using media framing experiments and incumbency advantage studies.

  2. Replication data for: Understanding the Past: Statistical Analysis of Causal...

    • dataverse.harvard.edu
    Updated Aug 7, 2011
    Cite
    Teppei Yamamoto (2011). Replication data for: Understanding the Past: Statistical Analysis of Causal Attribution [Dataset]. http://doi.org/10.7910/DVN/RDAJDB
    Dataset updated
    Aug 7, 2011
    Dataset provided by
    Harvard Dataverse
    Authors
    Teppei Yamamoto
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Would the third-wave democracies have been democratized without prior modernization? What proportion of the past militarized disputes between non-democracies would have been prevented had those dyads been democratic? Although political scientists often ask these questions of causal attribution, existing quantitative methods fail to address them. This paper proposes an alternative statistical methodology based on the widely accepted counterfactual framework of causal inference. The contribution of this paper is threefold. First, the paper clarifies differences between causal attribution and causal effects by specifying the type of research questions to which each quantity is relevant. Second, it provides a clear resolution of the long-standing methodological debate on "selection on the dependent variable." Third, the paper derives new nonparametric identification results, showing that the complier probability of causal attribution can be identified using an instrumental variable. The proposed framework is illustrated via empirical examples from three subfields of political science.

  3. Data from: Detecting Causality by Combined Use of Multiple Methods: Climate...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 28, 2016
    Cite
    Amigó, José M.; Yokota, Ryo; Aihara, Kazuyuki; Hirata, Yoshito; Mushiake, Hajime; Matsuzaka, Yoshiya (2016). Detecting Causality by Combined Use of Multiple Methods: Climate and Brain Examples [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001820955
    Dataset updated
    Sep 28, 2016
    Authors
    Amigó, José M.; Yokota, Ryo; Aihara, Kazuyuki; Hirata, Yoshito; Mushiake, Hajime; Matsuzaka, Yoshiya
    Description

    Identifying causal relations from time series is the first step to understanding the behavior of complex systems. Although many methods have been proposed, few papers have applied multiple methods together to detect causal relations based on time series generated from coupled nonlinear systems with some unobserved parts. Here we propose the combined use of three methods and a majority vote to infer causality under such circumstances. Two of these methods are proposed here for the first time, and all of the three methods can be applied even if the underlying dynamics is nonlinear and there are hidden common causes. We test our methods with coupled logistic maps, coupled Rössler models, and coupled Lorenz models. In addition, we show from ice core data how the causal relations among the temperature, the CH4 level, and the CO2 level in the atmosphere changed in the last 800,000 years, a conclusion also supported by irregularly sampled data analysis. Moreover, these methods show how three regions of the brain interact with each other during the visually cued, two-choice arm reaching task. Especially, we demonstrate that this is due to bottom up influences at the beginning of the task, while there exist mutual influences between the posterior medial prefrontal cortex and the presupplementary motor area. Based on our results, we conclude that identifying causality with an appropriate ensemble of multiple methods ensures the validity of the obtained results more firmly.
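
As a rough illustration of the kind of test system named in this description, the sketch below simulates two logistic maps with one-way coupling from x to y. The coupling form, the parameter values, and the function name `coupled_logistic` are assumptions chosen for illustration; they are not taken from the dataset or the paper.

```python
def coupled_logistic(n, r=3.9, c=0.2, x0=0.3, y0=0.6):
    """Simulate two logistic maps where x drives y (hypothetical coupling form)."""
    def f(v):
        return r * v * (1.0 - v)

    xs, ys = [x0], [y0]
    for _ in range(n - 1):
        x, y = xs[-1], ys[-1]
        xs.append(f(x))                          # driver evolves on its own
        ys.append((1.0 - c) * f(y) + c * f(x))   # response mixes in the driver
    return xs, ys


xs, ys = coupled_logistic(500)
print(len(xs), len(ys))
```

Because y is a convex combination of two logistic updates, both series stay in [0, 1], and x never depends on the coupling strength c. A causality-detection method applied to such series would be expected to recover the x-to-y direction only.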

  4. Data repository for "Causal Analysis"

    • dataone.org
    • dataverse.harvard.edu
    Updated Oct 29, 2025
    Cite
    Huber, Martin (2025). Data repository for "Causal Analysis" [Dataset]. http://doi.org/10.7910/DVN/HLC4EW
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Huber, Martin
    Description

    Datasets as well as R and Python code of the empirical examples in the book "Causal Analysis" by Martin Huber (2023), published by MIT Press.

  5. Code repository for Ph.D. dissertation "Safer Causal Inference: Theory &...

    • data.4tu.nl
    zip
    Updated Nov 3, 2025
    Cite
    Rickard Karlsson (2025). Code repository for Ph.D. dissertation "Safer Causal Inference: Theory & Algorithms for Falsification, Trial Augmentation and Policy Evaluation" [Dataset]. http://doi.org/10.4121/6aded155-99fe-44c3-880d-690ded500ccc.v3
    Available download formats: zip
    Dataset updated
    Nov 3, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Rickard Karlsson
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Time period covered
    Jan 9, 2025
    Description

    Collection of source code implementing the methods and reproducing the experiments in each chapter of the Ph.D. dissertation "Safer Causal Inference: Theory & Algorithms for Falsification, Trial Augmentation and Policy Evaluation". The source code also includes methods for generating the simulated datasets used to evaluate the methods. The goal of the research was to develop methods that improve treatment effect estimation, including methods to detect unmeasured confounding from observational data, methods to integrate historical data into randomized experiments to improve data efficiency, and methods to evaluate treatment policies under treatment interference.

  6. Data from: Replication Package

    • figshare.com
    zip
    Updated Nov 2, 2022
    Cite
    Costanza Tortù (2022). Replication Package [Dataset]. http://doi.org/10.6084/m9.figshare.21463092.v1
    Available download formats: zip
    Dataset updated
    Nov 2, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Costanza Tortù
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Replication package related to the paper "Estimating Causal effects of Multi-valued Treatments accounting for Network Interference: Immigration policies and crime rates"

  7. Replication data for: 'Reasoning about Interference Between Units: A General...

    • data.niaid.nih.gov
    • dataverse.harvard.edu
    • +1more
    application/warc
    Updated Oct 2, 2014
    Cite
    Jake Bowers (2014). Replication data for: 'Reasoning about Interference Between Units: A General Framework' [Dataset]. http://doi.org/10.7910/DVN/3EGVBB
    Available download formats: application/warc
    Dataset updated
    Oct 2, 2014
    Dataset provided by
    University of Illinois at Urbana-Champaign
    Authors
    Jake Bowers
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    NA
    Description

    If an experimental treatment is experienced by both treated and control group units, tests of hypotheses about causal effects may be difficult to conceptualize let alone execute. In this paper, we show how counterfactual causal models may be written and tested when theories suggest spillover or other network-based interference among experimental units. We show that the "no interference" assumption need not constrain scholars who have interesting questions about interference. We offer researchers the ability to model theories about how treatment given to some units may come to influence outcomes for other units. We further show how to test hypotheses about these causal effects, and we provide tools to enable researchers to assess the operating characteristics of their tests given their own models, designs, test statistics, and data. The conceptual and methodological framework we develop here is particularly applicable to social networks, but may be usefully deployed whenever a researcher wonders about interference between units. Interference between units need not be an untestable assumption; instead, interference is an opportunity to ask meaningful questions about theoretically interesting phenomena.

  8. Replication Data for: Causal decomposition in the mutual causation system

    • dataverse.lib.nycu.edu.tw
    Updated Jun 1, 2022
    Cite
    NYCU Dataverse (2022). Replication Data for: Causal decomposition in the mutual causation system [Dataset]. http://doi.org/10.57770/JWMMD0
    Available download formats: bin(21504), text/x-matlab(262), bin(16896), application/matlab-mat(1836), text/x-matlab(7871), bin(22140), text/x-matlab(331), text/x-matlab(2708), text/x-matlab(3317), text/x-matlab(127)
    Dataset updated
    Jun 1, 2022
    Dataset provided by
    NYCU Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Inference of causality in time series has been principally based on the prediction paradigm. Nonetheless, the predictive causality approach may underestimate the simultaneous and reciprocal nature of causal interactions observed in real-world phenomena. Here, we present a causal-decomposition approach that is not based on prediction, but based on the covariation of cause and effect: cause is that which put, the effect follows; and removed, the effect is removed. Using empirical mode decomposition, we show that causal interaction is encoded in instantaneous phase dependency at a specific time scale, and this phase dependency is diminished when the causal-related intrinsic component is removed from the effect. Furthermore, we demonstrate the generic applicability of our method to both stochastic and deterministic systems, and show the consistency of causal-decomposition method compared to existing methods, and finally uncover the key mode of causal interactions in both modelled and actual predator–prey systems.

  9. Causal Dataset for cause-effect pairs from Tubingen repository

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 3, 2023
    Cite
    Drton, Mathias; Haug, Stephan; Reifferscheidt, David; Zadorozhnyi, Oleksandr (2023). Causal Dataset for cause-effect pairs from Tubingen repository [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7709406
    Dataset updated
    May 3, 2023
    Dataset provided by
    Technical University of Munich
    Authors
    Drton, Mathias; Haug, Stephan; Reifferscheidt, David; Zadorozhnyi, Oleksandr
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Tübingen
    Description

    Cause-effect is a two-dimensional database of two-variable cause-effect pairs chosen from different datasets created by the Max Planck Institute for Biological Cybernetics in Tuebingen, Germany.

    Size: 83 datasets of various sizes

    Number of features: 2 in every dataset

    Ground truth: available for every dataset

    Type of Graph: directed

    Extension of the datasets used in the CauseEffectPairs task. Each dataset consists of samples of a pair of statistically dependent random variables, where one variable is known to cause the other. The task is to identify, for each pair, which of the two variables is the cause and which is the effect, using the observed samples only.

    More information about the dataset is contained in the causal_description.html file.

    Reference

    J. M. Mooij, J. Peters, D. Janzing, J. Zscheischler, B. Schoelkopf: “Distinguishing cause from effect using observational data: methods and benchmarks”, Journal of Machine Learning Research 17(32):1-102, 2016
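
Since the task is to decide the causal direction for each pair, a method's output is typically compared with the ground truth pair by pair. The sketch below computes a (optionally weighted) share of correctly oriented pairs. The pair ids, the "x->y" / "y->x" label strings, and the function name `direction_accuracy` are hypothetical and are not taken from the dataset files.

```python
def direction_accuracy(predicted, truth, weights=None):
    """Weighted share of pairs whose causal direction was predicted correctly.

    predicted, truth: dicts mapping a pair id to "x->y" or "y->x".
    weights: optional dict of per-pair weights (defaults to 1.0 each).
    """
    total = 0.0
    correct = 0.0
    for pair_id, true_dir in truth.items():
        w = 1.0 if weights is None else weights.get(pair_id, 1.0)
        total += w
        if predicted.get(pair_id) == true_dir:
            correct += w
    return correct / total if total else 0.0


# Hypothetical pair ids and labels, for illustration only.
truth = {"pair_a": "x->y", "pair_b": "y->x", "pair_c": "x->y"}
predicted = {"pair_a": "x->y", "pair_b": "x->y", "pair_c": "x->y"}
print(direction_accuracy(predicted, truth))
```

Per-pair weights are useful when several pairs come from the same underlying source and should not each count as an independent success or failure.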

  10. Causal AI Market By Application (Service, Supply Chain Optimization,...

    • verifiedmarketresearch.com
    Updated Sep 15, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Causal AI Market By Application (Service, Supply Chain Optimization, Marketing & Sales Optimization), Vertical (Healthcare, BFSI, Manufacturing, Retail & E-commerce, Transportation, Automotive), & Region for 2024-2031 [Dataset]. https://www.verifiedmarketresearch.com/product/causal-ai-market/
    Dataset updated
    Sep 15, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2031
    Area covered
    Global
    Description

    Causal AI Market size was valued at USD 11.77 Million in 2024 and is projected to reach USD 256.73 Million by 2031, growing at a CAGR of 47.1% during the forecast period 2024-2031.
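
The quoted growth rate can be checked against the two market-size figures with the standard compound annual growth rate formula. The sketch below is only an arithmetic check: the listing does not say how many compounding periods were used, so the period counts tried here are assumptions.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over the given number of yearly periods."""
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Figures quoted in the listing (USD million); the period count is an assumption.
for periods in (7, 8):
    print(periods, round(cagr(11.77, 256.73, periods) * 100, 1))
```

With these inputs, eight compounding periods give roughly 47%, close to the quoted figure, while seven periods give a noticeably higher rate of about 55%.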

    Causal AI, also known as causal artificial intelligence, is a significant innovation in the fields of artificial intelligence and machine learning that focuses on identifying and harnessing cause-and-effect linkages in data. Traditional AI models generally use correlation-based methods to detect patterns and generate predictions. While these methods can be quite useful in specific applications, they frequently fall short in situations where understanding the underlying causal mechanisms is critical. Causal AI overcomes this issue by incorporating principles from causal inference, a branch of statistics and philosophy that investigates how to infer causal relationships from data.

    Causal AI is a huge leap in the field of artificial intelligence, allowing us to go beyond correlation to discover the true drivers of observed occurrences. Its applications are broad and diverse, including healthcare, finance, marketing, policymaking, operations, education, the environment, and social sciences. Causal AI improves decision-making and allows for the development of focused solutions to difficult situations by offering a richer grasp of causality.

    Causal AI (Artificial Intelligence) has the potential to change a wide range of domains by providing more precise and actionable insights than typical machine learning models. Causal AI differs from traditional AI in that it focuses on understanding the cause-and-effect relationships underlying data rather than correlations and patterns. This change from correlation to causation is a huge step forward, with the potential to improve decision-making processes, make better forecasts, and maximize outcomes in a variety of industries including healthcare, finance, marketing, and others.

  11. Replication Data for: Causal Inference with Latent Treatments

    • dataverse.harvard.edu
    Updated Apr 26, 2023
    Cite
    Christian Fong; Justin Grimmer (2023). Replication Data for: Causal Inference with Latent Treatments [Dataset]. http://doi.org/10.7910/DVN/MVDWCS
    Dataset updated
    Apr 26, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Christian Fong; Justin Grimmer
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/MVDWCS

    Description

    Social scientists are interested in the effects of low-dimensional latent treatments within texts, such as the effect of an attack on a candidate in a political advertisement. We provide a framework for causal inference with latent treatments in high-dimensional interventions. Using this framework, we show that the randomization of texts alone is insufficient to identify the causal effects of latent treatments, because other unmeasured treatments in the text could confound the measured treatment's effect. We provide a set of assumptions that is sufficient to identify the effect of latent treatments and a set of strategies to make these assumptions more plausible, including explicitly adjusting for potentially confounding text features and non-traditional experimental designs involving many versions of the text. We apply our framework to a survey experiment and an observational study, demonstrating how our framework makes text-based causal inferences more credible.

  12. Semiparametric Causal Inference Methods for Adaptive Statistical Learning in...

    • icpsr.umich.edu
    Updated Aug 26, 2025
    Cite
    Hubbard, Alan (2025). Semiparametric Causal Inference Methods for Adaptive Statistical Learning in Trauma Patient-Centered Outcomes Research [Methods Study], 2013-2018 [Dataset]. http://doi.org/10.3886/ICPSR39471.v1
    Dataset updated
    Aug 26, 2025
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Hubbard, Alan
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/39471/terms

    Time period covered
    2013 - 2018
    Area covered
    United States
    Description

    Electronic health records store a lot of data about a patient. These data often include age, health problems, current medicines, and lab results. Looking at these data may help doctors treating patients after a trauma predict how likely it is that they will respond well to a treatment and survive. This information can help doctors make better treatment decisions. But first, researchers need to figure out how to combine and analyze data to make accurate predictions. In this study, the research team created new statistical methods to combine data from patient records. They used these methods to predict patient health outcomes. Then the team used health record data collected from patients in hospital trauma centers to test their predictions. To access the methods and software, please visit the following GitHub repositories: origami, varimpact, opttx.

  13. bnlearn datasets

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 29, 2025
    Cite
    Zenodo (2025). bnlearn datasets [Dataset]. http://doi.org/10.5281/zenodo.7676616
    Available download formats: zip
    Dataset updated
    Jan 29, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This collection consists of 5 structure learning datasets from the Bayesian Network Repository (Scutari, 2010).

    Task: The dataset collection can be used to study causal discovery algorithms.

    Summary:

    • Size of collection: 5 datasets of various sizes with 3 to 56 columns
    • Task: Causal Discovery
    • Data Type: Discrete
    • Dataset Scope: Collection
    • Ground Truth: Known / Estimated
    • Temporal Structure: No
    • License: TBD
    • Missing Values: No

    Missingness Statement: There are no missing values.

    Collection:

    The alarm dataset contains the following 37 variables:

    • CVP (central venous pressure): a three-level factor with levels LOW, NORMAL and HIGH.
    • PCWP (pulmonary capillary wedge pressure): a three-level factor with levels LOW, NORMAL and HIGH.
    • HIST (history): a two-level factor with levels TRUE and FALSE.
    • TPR (total peripheral resistance): a three-level factor with levels LOW, NORMAL and HIGH.
    • ... (33 more variables, see the corresponding .html file)

    The binary synthetic asia dataset:

    • D (dyspnoea), a two-level factor with levels yes and no.
    • T (tuberculosis), a two-level factor with levels yes and no.
    • L (lung cancer), a two-level factor with levels yes and no.
    • B (bronchitis), a two-level factor with levels yes and no.
    • A (visit to Asia), a two-level factor with levels yes and no.
    • S (smoking), a two-level factor with levels yes and no.
    • X (chest X-ray), a two-level factor with levels yes and no.
    • E (tuberculosis versus lung cancer/bronchitis), a two-level factor with levels yes and no.

    The binary coronary dataset:

    • Smoking (smoking): a two-level factor with levels no and yes.
    • M. Work (strenuous mental work): a two-level factor with levels no and yes.
    • P. Work (strenuous physical work): a two-level factor with levels no and yes.
    • Pressure (systolic blood pressure): a two-level factor with levels <140 and >140.
    • Proteins (ratio of beta and alpha lipoproteins): a two-level factor with levels <3 and >3.
    • Family (family anamnesis of coronary heart disease): a two-level factor with levels neg and pos.

    The hailfinder dataset contains the following 56 variables:

    • N07muVerMo (10.7mu vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.
    • SubjVertMo (subjective judgment of vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.
    • QGVertMotion (quasigeostrophic vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.
    • CombVerMo (combined vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.
    • AreaMesoALS (area of meso-alpha): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.
    • SatContMoist (satellite contribution to moisture): a four-level factor with levels VeryWet, Wet, Neutral and Dry.
    • ... (50 more variables, see the corresponding .html file)

    The lizards dataset contains the following 3 variables:

    • Species (the species of the lizard): a two-level factor with levels Sagrei and Distichus.
    • Height (perch height): a two-level factor with levels high (greater than 4.75 feet) and low (less than or equal to 4.75 feet).
    • Diameter (perch diameter): a two-level factor with levels narrow (greater than 4 inches) and wide (less than or equal to 4 inches).
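
Because the collection is meant for causal discovery with known ground truth, a learned graph is usually scored against the reference graph. The sketch below computes a structural Hamming distance for the asia variables listed above. The edge list is the asia network as it is usually given in the literature, not something read from the dataset files, and the convention of counting a missing, extra, or reversed edge as one error each is an assumption (other conventions exist).

```python
from itertools import combinations

# Reference edges (parent, child) of the asia network as commonly published.
ASIA_EDGES = {
    ("A", "T"), ("S", "L"), ("S", "B"), ("T", "E"),
    ("L", "E"), ("B", "D"), ("E", "D"), ("E", "X"),
}
ASIA_NODES = {"A", "S", "T", "L", "B", "E", "X", "D"}


def edge_state(edges, a, b):
    """Return 1 for a->b, -1 for b->a, and 0 when the pair is not adjacent."""
    if (a, b) in edges:
        return 1
    if (b, a) in edges:
        return -1
    return 0


def shd(true_edges, learned_edges, nodes):
    """Count node pairs whose edge status differs between the two graphs."""
    return sum(
        edge_state(true_edges, a, b) != edge_state(learned_edges, a, b)
        for a, b in combinations(sorted(nodes), 2)
    )


# Hypothetical learned graph: one edge reversed, one edge missing.
learned = (ASIA_EDGES - {("A", "T"), ("E", "X")}) | {("T", "A")}
print(shd(ASIA_EDGES, learned, ASIA_NODES))
```
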
  14. Data from: Confounded Local Inference: Extending Local Moran Statistics to...

    • tandf.figshare.com
    png
    Updated Dec 9, 2024
    Cite
    Levi John Wolf (2024). Confounded Local Inference: Extending Local Moran Statistics to Handle Confounding [Dataset]. http://doi.org/10.6084/m9.figshare.25594934.v1
    Available download formats: png
    Dataset updated
    Dec 9, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Levi John Wolf
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Local statistical analysis has long been of interest to social and environmental scientists who analyze geographic data. Research into local spatial statistics experienced a step-change in the mid-1990s, which provided a large class of local statistical methods and models. The local Moran statistic is one commonly used local indicator of spatial association, able to detect both areas of similarity and observations that are very dissimilar from their surroundings. From this, many further local statistics have been developed to characterize spatial clusters and outliers. These statistics have seen limited adoption because they do not sufficiently model the relationships involved in confounded spatial data, where the analyst seeks to understand the local spatial structure of a given outcome variable that is influenced by one or more additional factors. Recent innovations used to do joint multivariate local analysis also do not model this kind of conditional local structure in data. This article provides tools to rigorously characterize confounded local inference and a new and different class of multivariate conditional local Moran statistics that can account for confounding. To do this, we return to the Moran scatterplot as the critical tool for local Moran-style covariance statistics. Extending this concept, a new method is available directly from a “Moran-form” multiple regression. We show the empirical and theoretical properties of this statistic, show how some existing heuristic approaches arise naturally from this framework, and show how the use of conditional inference can change interpretations in an empirical analysis of rent and housing stock in a rapidly changing neighborhood.

  15. Data from: Causal Inference with Spatially Disaggregated Data: Some...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated May 1, 2014
    Cite
    Marion Dumas; Johannes Castner; Petr Gocev (2014). Causal Inference with Spatially Disaggregated Data: Some Potentials and Limits [Dataset]. http://doi.org/10.7910/DVN/PRTGLC
    Dataset updated
    May 1, 2014
    Dataset provided by
    Harvard Dataverse
    Authors
    Marion Dumas; Johannes Castner; Petr Gocev
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    In studies of civil strife, the ecological fallacy seems to befall all large-n studies, and thus there has been a big push by several researchers in recent years to gather disaggregated, spatially explicit data. However, while such efforts are heroic and are likely to lead to better information, we find that the resulting data cannot be analysed in conventional ways if the estimation of causal effects is the goal. The reason is that such data bring about other dangers: the violation of the Stable Unit Treatment Value Assumption (SUTVA). To be specific, one "treated" group's enemy could hardly be its control. We get around this problem by changing the causal effect of interest and by carefully re-aggregating the lower level data so as to preserve its most salient information. Restricting our analysis to groups that are excluded from power, we find some tentative evidence that such groups are less likely to engage in conflict if they are more spatially integrated with other groups.

  16. Empathy dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, csv, html
    Updated Dec 18, 2024
    Cite
    Zenodo (2024). Empathy dataset [Dataset]. http://doi.org/10.5281/zenodo.7683907
    Available download formats: bin, html, csv
    Dataset updated
    Dec 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    The database for this study (Briganti et al. 2018; the same for the Braun study analysis) was composed of 1973 French-speaking students in several universities or schools for higher education in the following fields: engineering (31%), medicine (18%), nursing school (16%), economic sciences (15%), physiotherapy (4%), psychology (11%), law school (4%) and dietetics (1%). The subjects were 17 to 25 years old (M = 19.6 years, SD = 1.6 years); 57% were females and 43% were males. Even though the full dataset was composed of 1973 participants, only 1270 answered the full questionnaire: missing data are handled using pairwise complete observations in estimating a Gaussian Graphical Model, meaning that all available information from every subject is used.

    The feature set is composed of 28 items meant to assess the four following components: fantasy, perspective taking, empathic concern and personal distress. In the questionnaire, the items are mixed; reversed items (items 3, 4, 7, 12, 13, 14, 15, 18, 19) are present. Items are scored from 0 to 4, where “0” means “Doesn’t describe me very well” and “4” means “Describes me very well”; reverse-scoring is calculated afterwards. The questionnaires were anonymized. The reanalysis of the database in this retrospective study was approved by the ethical committee of the Erasmus Hospital.
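
The reverse-scoring step mentioned above can be sketched as follows. The list of reversed items and the 0 to 4 range come from the description; the rule "reversed score = 4 minus raw score" is the usual convention for such scales and is an assumption here, as are the names `REVERSED_ITEMS`, `MAX_SCORE`, and `reverse_score`.

```python
REVERSED_ITEMS = {3, 4, 7, 12, 13, 14, 15, 18, 19}  # item numbers listed above
MAX_SCORE = 4  # items are scored 0..4


def reverse_score(responses):
    """Return a copy of {item_number: score} with reversed items flipped.

    Missing answers (None) are left untouched.
    """
    scored = {}
    for item, value in responses.items():
        if value is not None and item in REVERSED_ITEMS:
            scored[item] = MAX_SCORE - value  # assumed max-minus-score rule
        else:
            scored[item] = value
    return scored


print(reverse_score({1: 4, 3: 0, 7: 3, 12: None}))
```

Leaving missing answers as `None` keeps the data compatible with a later pairwise-complete analysis, where each pair of items uses whatever respondents answered both.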

    Size: 1973 × 28 (participants × items)

    Number of features: 28

    Ground truth: No

    Type of Graph: Mixed graph

    The following gives the description of the variables:

    Feature | FeatureLabel | Domain | Item meaning from Davis 1980
    001 | 1FS | Green | I daydream and fantasize, with some regularity, about things that might happen to me.
    002 | 2EC | Purple | I often have tender, concerned feelings for people less fortunate than me.
    003 | 3PT_R | Yellow | I sometimes find it difficult to see things from the “other guy’s” point of view.
    004 | 4EC_R | Purple | Sometimes I don’t feel very sorry for other people when they are having problems.
    005 | 5FS | Green | I really get involved with the feelings of the characters in a novel.
    006 | 6PD | Red | In emergency situations, I feel apprehensive and ill-at-ease.
    007 | 7FS_R | Green | I am usually objective when I watch a movie or play, and I don’t often get completely caught up in it. (Reversed)
    008 | 8PT | Yellow | I try to look at everybody’s side of a disagreement before I make a decision.
    009 | 9EC | Purple | When I see someone being taken advantage of, I feel kind of protective towards them.
    010 | 10PD | Red | I sometimes feel helpless when I am in the middle of a very emotional situation.
    011 | 11PT | Yellow | I sometimes try to understand my friends better by imagining how things look from their perspective.
    012 | 12FS_R | Green | Becoming extremely involved in a good book or movie is somewhat rare for me. (Reversed)
    013 | 13PD_R | Red | When I see someone get hurt, I tend to remain calm. (Reversed)
    014 | 14EC_R | Purple | Other people’s misfortunes do not usually disturb me a great deal. (Reversed)
    015 | 15PT_R | Yellow | If I’m sure I’m right about something, I don’t waste much time listening to other people’s arguments. (Reversed)
    016 | 16FS | Green | After seeing a play or movie, I have felt as though I were one of the characters.
    017 | 17PD | Red | Being in a tense emotional situation scares me.
    018 | 18EC_R | Purple | When I see someone being treated unfairly, I sometimes don’t feel very much pity for them. (Reversed)
    019 | 19PD_R | Red | I am usually pretty effective in dealing with emergencies. (Reversed)
    020 | 20FS | Green | I am often quite touched by things that I see happen.
    021 | 21PT | Yellow | I believe that there are two sides to every question and try to look at them both.
    022 | 22EC | Purple | I would describe myself as a pretty soft-hearted person.
    023 | 23FS | Green | When I watch a good movie, I can very easily put myself in the place of a leading character.
    024 | 24PD | Red | I tend to lose control during emergencies.
    025 | 25PT | Yellow | When I’m upset at someone, I usually try to “put myself in his shoes” for a while.
    026 | 26FS | Green | When I am reading an interesting story or novel, I imagine how I would feel if the events in the story were happening to me.
    027 | 27PD | Red | When I see someone who badly needs help in an emergency, I go to pieces.
    028 | 28PT | Yellow | Before criticizing somebody, I try to imagine how I would feel if I were in their place.

    More information about the dataset is available in the empathy_description.html file.

  17. Causal Inference Platform Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 21, 2025
    Cite
    Growth Market Reports (2025). Causal Inference Platform Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/causal-inference-platform-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Aug 21, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Causal Inference Platform Market Outlook



    According to our latest research, the global causal inference platform market size reached USD 1.24 billion in 2024, reflecting a robust surge in adoption across multiple industries. The market is anticipated to expand at a compelling CAGR of 26.9% during the forecast period, with the market size projected to reach approximately USD 10.78 billion by 2033. This exceptional growth trajectory is underpinned by the increasing demand for advanced analytics solutions that can extract actionable insights from complex data, enabling organizations to make data-driven decisions with higher confidence.
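The relationship between the figures quoted above can be checked with the standard compound-growth formula, end = start × (1 + rate)^years. The sketch below assumes nine compounding periods (2024 to 2033); the report excerpt does not state the base year of its forecast period, so that choice and the helper names are assumptions.

```python
def project(start_value, annual_rate, years):
    """Value after compounding `annual_rate` for `years` periods."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that takes `start_value` to `end_value` in `years`."""
    return (end_value / start_value) ** (1 / years) - 1


start_2024 = 1.24      # USD billion, as quoted
end_2033 = 10.78       # USD billion, as quoted
stated_cagr = 0.269    # 26.9%, as quoted
years = 2033 - 2024    # assumption: nine compounding periods

projected_2033 = project(start_2024, stated_cagr, years)
implied_rate = implied_cagr(start_2024, end_2033, years)

print(f"Projected 2033 value at the stated CAGR: {projected_2033:.2f}")
print(f"CAGR implied by the two endpoints: {implied_rate:.1%}")
```

Under this assumption the projection lands slightly below the quoted 2033 figure, so the stated rate and endpoints agree only approximately; rounding or a different base year may account for the gap.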



    The primary growth factor propelling the causal inference platform market is the exponential increase in the volume and complexity of data generated across sectors such as healthcare, finance, retail, and manufacturing. As organizations strive to move beyond traditional correlation-based analytics, the need for platforms that can accurately determine causality has become paramount. Causal inference platforms are uniquely positioned to address this need, providing sophisticated statistical methodologies and machine learning algorithms that distinguish causation from mere correlation. This capability is critical for organizations seeking to optimize operations, improve customer experiences, and minimize risks, thereby fueling the widespread adoption of these platforms.



    Another significant driver is the rapid advancement of artificial intelligence (AI) and machine learning technologies, which have made causal inference platforms more accessible, scalable, and user-friendly. The integration of causal inference capabilities into existing data analytics ecosystems allows enterprises to unlock deeper insights, automate decision-making processes, and accelerate innovation. Furthermore, the growing emphasis on explainable AI and regulatory compliance is compelling organizations to adopt solutions that offer transparency and traceability in their analytical models. This trend is particularly evident in regulated industries such as healthcare and finance, where understanding the underlying causes of outcomes is essential for both operational effectiveness and legal compliance.



    The increasing collaboration between academia, research institutions, and commercial enterprises is also fueling the growth of the causal inference platform market. Academic advancements in causal inference methodologies are rapidly being translated into commercial applications, enabling a continuous cycle of innovation. Additionally, government organizations are investing in causal inference platforms to enhance policy-making, public health initiatives, and economic planning. The convergence of these factors is creating a fertile environment for the causal inference platform market to thrive, with vendors continuously enhancing their offerings to meet the evolving needs of diverse end-users.



    Regionally, North America dominates the causal inference platform market, driven by the presence of leading technology companies, high adoption of advanced analytics, and substantial investments in AI research. Europe follows closely, benefiting from a robust regulatory framework and a strong focus on data-driven innovation. The Asia Pacific region is emerging as a high-growth market, fueled by rapid digital transformation, expanding research activities, and increasing investments in AI infrastructure. Latin America and the Middle East & Africa are also witnessing steady growth, supported by government initiatives and the gradual modernization of enterprise IT environments. This global momentum underscores the transformative potential of causal inference platforms across industries and geographies.





    Component Analysis



    The component segment of the causal inference platform market is bifurcated into software and services, each playing a pivotal role in the market’s expansion. Software solutions constitute the backbone of causal inference platforms, providing users with advanced statistical modeling, simulation, and visualization

  18. Wikidata Causal Event Triple Data

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 7, 2023
    Cite
    Sola Shirai; Debarun Bhattacharjya; Oktie Hassanzadeh (2023). Wikidata Causal Event Triple Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7196048
    Dataset updated
    Feb 7, 2023
    Authors
    Sola Shirai; Debarun Bhattacharjya; Oktie Hassanzadeh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains triples curated from Wikidata surrounding news events with causal relations, and is released as part of our WWW'23 paper, "Event Prediction using Case-Based Reasoning over Knowledge Graphs".

    Starting from a set of classes that we consider to be types of "events", we queried Wikidata to collect entities that were an instanceOf an event class and that were connected to another such event entity by a causal triple (https://www.wikidata.org/wiki/Wikidata:List_of_properties/causality). For all such cause-effect event pairs, we then collected a 3-hop neighborhood of outgoing triples.
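The neighborhood-collection step described above can be sketched as a bounded breadth-first traversal over outgoing edges. This is only an illustration on an in-memory list of (subject, predicate, object) triples; it is not the authors' pipeline, it does not query Wikidata, and the function name and toy identifiers are hypothetical.

```python
from collections import defaultdict


def outgoing_neighborhood(triples, seeds, max_hops=3):
    """Collect triples reachable from `seeds` within `max_hops` outgoing edges."""
    by_subject = defaultdict(list)
    for triple in triples:
        by_subject[triple[0]].append(triple)

    collected = []
    seen_triples = set()
    visited = set(seeds)
    frontier = sorted(set(seeds))  # sorted for a deterministic order
    for _ in range(max_hops):
        next_frontier = set()
        for node in frontier:
            for triple in by_subject.get(node, []):
                if triple not in seen_triples:
                    seen_triples.add(triple)
                    collected.append(triple)
                obj = triple[2]
                if obj not in visited:
                    visited.add(obj)
                    next_frontier.add(obj)
        frontier = sorted(next_frontier)
    return collected


# Toy graph: a causal pair (event_a causes event_b) plus a chain of other facts.
toy_triples = [
    ("event_a", "has_effect", "event_b"),
    ("event_a", "located_in", "place_1"),
    ("place_1", "part_of", "region_1"),
    ("region_1", "part_of", "country_1"),
    ("country_1", "member_of", "union_1"),  # four hops away from event_a
]

neighborhood = outgoing_neighborhood(toy_triples, seeds=["event_a", "event_b"])
print(neighborhood)
```

With `max_hops=3`, the last toy triple is left out because its subject is only reached after the third hop.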

  19. Replication Data for: Generalized Synthetic Control Method: Causal Inference...

    • dataverse.harvard.edu
    Updated Aug 26, 2016
    Cite
    Yiqing Xu (2016). Replication Data for: Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models [Dataset]. http://doi.org/10.7910/DVN/8AKACJ
    Dataset updated
    Aug 26, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Yiqing Xu
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This replication file contains the data and source code needed to replicate the results in "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models" by Yiqing Xu.

  20. Doubly Robust Estimation in Missing Data and Causal Inference Models -...

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). Doubly Robust Estimation in Missing Data and Causal Inference Models - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/doubly-robust-estimation-in-missing-data-and-causal-inference-models
    Dataset updated
    Dec 16, 2024
    Description

    The paper discusses the doubly robust estimator for missing data and causal inference models.
