100+ datasets found
  1. Data from: Reliable species distributions are obtainable with sparse, patchy...

    • data.niaid.nih.gov
    • datadryad.org
    • +1 more
    zip
    Updated May 31, 2019
    Cite
    Samantha L. Peel; Nicole A. Hill; Scott D. Foster; Simon J. Wotherspoon; Claudio Ghiglione; Stefano Schiaparelli (2019). Reliable species distributions are obtainable with sparse, patchy and biased data by leveraging over species and data types [Dataset]. http://doi.org/10.5061/dryad.2226v8m
    Explore at:
    Available download formats: zip
    Dataset updated
    May 31, 2019
    Dataset provided by
    University of Tasmania
    Commonwealth Scientific and Industrial Research Organisation
    Italian National Antarctic Museum (MNA, Section of Genoa) Genoa Italy
    Authors
    Samantha L. Peel; Nicole A. Hill; Scott D. Foster; Simon J. Wotherspoon; Claudio Ghiglione; Stefano Schiaparelli
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description
    1. New methods for species distribution models (SDMs) utilise presence‐absence (PA) data to correct the sampling bias of presence‐only (PO) data in a spatial point process setting. These have been shown to improve species estimates when both data sets are large and dense. However, is a PA data set that is smaller and patchier than hitherto examined able to do the same? Furthermore, when both data sets are relatively small, is there enough information contained within them to produce a useful estimate of species’ distributions? These attributes are common in many applications.

    2. A stochastic simulation was conducted to assess the ability of a pooled data SDM to estimate the distribution of species from increasingly sparser and patchier data sets. The simulated data sets were varied by changing the number of presence‐absence sample locations, the degree of patchiness of these locations, the number of PO observations, and the level of sampling bias within the PO observations. The performance of the pooled data SDM was compared to a PA SDM and a PO SDM to assess the strengths and limitations of each SDM.

    3. The pooled data SDM successfully removed the sampling bias from the PO observations even when the presence‐absence data was sparse and patchy, and the PO observations formed the majority of the data. The pooled data SDM was, in general, more accurate and more precise than either the PA SDM or the PO SDM. All SDMs were more precise for the species responses than they were for the covariate coefficients.

    4. The emerging SDM methodology that pools PO and PA data will facilitate more certainty around species’ distribution estimates, which in turn will allow more relevant and concise management and policy decisions to be enacted. This work shows that it is possible to achieve this result even in relatively data‐poor regions.
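    The data-pooling idea summarized above can be sketched as a toy joint likelihood: presence-absence (PA) records enter through a cloglog Bernoulli model, presence-only (PO) records through a Poisson point process carrying an extra sampling-bias term, and the environmental coefficient is shared between the two. This is a minimal illustrative sketch on simulated data (all variable names, sample sizes, and coefficient values are invented here), not the authors' code.

    ```python
    # Hedged sketch of a pooled PO/PA species distribution model:
    # both likelihoods share the environmental coefficient `beta`;
    # the PO intensity carries a separate sampling-bias term `delta`.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Environment covariate x and sampling-bias covariate z on a quadrature grid
    n_quad = 500
    x_quad = rng.normal(size=n_quad)
    z_quad = rng.normal(size=n_quad)
    area = 1.0 / n_quad                       # weight of each quadrature cell

    true_beta, true_bias = 1.2, 0.8
    lam = np.exp(0.5 + true_beta * x_quad + true_bias * z_quad)

    # Simulate biased PO counts per cell and a sparse set of PA sites
    po = rng.poisson(lam * area * 50)
    pa_idx = rng.choice(n_quad, size=80, replace=False)
    p_occ = 1 - np.exp(-np.exp(0.5 + true_beta * x_quad[pa_idx]))  # cloglog link
    pa_y = rng.binomial(1, p_occ)

    def nll(theta):
        a, beta, b0, delta = theta
        # PO part: inhomogeneous Poisson likelihood with bias term delta
        log_lam = a + beta * x_quad + b0 + delta * z_quad
        po_part = np.sum(np.exp(log_lam) * area * 50) - np.sum(po * log_lam)
        # PA part: cloglog Bernoulli likelihood sharing a and beta
        eta = a + beta * x_quad[pa_idx]
        p = np.clip(1 - np.exp(-np.exp(eta)), 1e-9, 1 - 1e-9)
        pa_part = -np.sum(pa_y * np.log(p) + (1 - pa_y) * np.log(1 - p))
        return po_part + pa_part

    fit = minimize(nll, x0=np.zeros(4), method="Nelder-Mead",
                   options={"maxiter": 5000})
    print(fit.x)  # estimates of a, beta, b0, delta
    ```

    The key design point, as in the abstract, is that the PA data identify the intercept and pin down the shared coefficient, which lets the PO bias term absorb the uneven sampling instead of contaminating the species' response.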

  2. Hate Crimes by County and Bias Type: Beginning 2010

    • catalog.data.gov
    • datadiscoverystudio.org
    • +3 more
    Updated Nov 10, 2023
    + more versions
    Cite
    data.ny.gov (2023). Hate Crimes by County and Bias Type: Beginning 2010 [Dataset]. https://catalog.data.gov/dataset/hate-crimes-by-county-and-bias-type-beginning-2010
    Explore at:
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    data.ny.gov
    Description

    Under New York State’s Hate Crime Law (Penal Law Article 485), a person commits a hate crime when one of a specified set of offenses is committed targeting a victim because of a perception or belief about their race, color, national origin, ancestry, gender, religion, religious practice, age, disability, or sexual orientation, or when such an act is committed as a result of that type of perception or belief. These types of crimes can target an individual, a group of individuals, or public or private property. DCJS submits hate crime incident data to the FBI’s Uniform Crime Reporting (UCR) Program. Information collected includes number of victims, number of offenders, type of bias motivation, and type of victim.

  3. Data from: Wide range screening of algorithmic bias in word embedding models...

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Jun 2, 2022
    Cite
    David Rozado (2022). Data from: Wide range screening of algorithmic bias in word embedding models using large sentiment lexicons reveals underreported bias types [Dataset]. http://doi.org/10.5061/dryad.rbnzs7h7w
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Jun 2, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Rozado
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Concerns about gender bias in word embedding models have captured substantial attention in the algorithmic bias research literature. Other bias types however have received lesser amounts of scrutiny. This work describes a large-scale analysis of sentiment associations in popular word embedding models along the lines of gender and ethnicity but also along the less frequently studied dimensions of socioeconomic status, age, physical appearance, sexual orientation, religious sentiment and political leanings. Consistent with previous scholarly literature, this work has found systemic bias against given names popular among African-Americans in most embedding models examined. Gender bias in embedding models however appears to be multifaceted and often reversed in polarity to what has been regularly reported. Interestingly, using the common operationalization of the term bias in the fairness literature, novel types of so far unreported bias types in word embedding models have also been identified. Specifically, the popular embedding models analyzed here display negative biases against middle and working-class socioeconomic status, male children, senior citizens, plain physical appearance and intellectual phenomena such as Islamic religious faith, non-religiosity and conservative political orientation. Reasons for the paradoxical underreporting of these bias types in the relevant literature are probably manifold but widely held blind spots when searching for algorithmic bias and a lack of widespread technical jargon to unambiguously describe a variety of algorithmic associations could conceivably be playing a role. The causal origins for the multiplicity of loaded associations attached to distinct demographic groups within embedding models are often unclear but the heterogeneity of said associations and their potential multifactorial roots raises doubts about the validity of grouping them all under the umbrella term bias. 
Richer and more fine-grained terminology as well as a more comprehensive exploration of the bias landscape could help the fairness epistemic community to characterize and neutralize algorithmic discrimination more efficiently.
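    The sentiment-association measurement this abstract describes can be illustrated in a few lines: a target word's association is its mean cosine similarity to a positive lexicon minus its mean similarity to a negative lexicon. The sketch below uses randomly generated toy vectors; a real screening would load pretrained embeddings (e.g. word2vec or GloVe) and a large sentiment lexicon.

    ```python
    # Toy sketch of screening sentiment associations in an embedding space
    # (hypothetical random vectors, not the author's pipeline or data).
    import numpy as np

    rng = np.random.default_rng(1)
    dim = 50
    vocab = ["emily", "lakisha", "good", "happy", "bad", "awful"]
    emb = {w: rng.normal(size=dim) for w in vocab}   # stand-in embeddings

    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def sentiment_association(word, pos, neg):
        """Mean similarity to the positive lexicon minus mean similarity to the negative one."""
        p = np.mean([cos(emb[word], emb[w]) for w in pos])
        n = np.mean([cos(emb[word], emb[w]) for w in neg])
        return p - n

    pos, neg = ["good", "happy"], ["bad", "awful"]
    for name in ["emily", "lakisha"]:
        print(name, round(sentiment_association(name, pos, neg), 3))
    ```

    Comparing the distribution of this score across name lists for different demographic groups is the basic operationalization of "bias" the abstract is questioning.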

  4. Number of new AI fairness and bias metrics worldwide 2016-2022, by type

    • statista.com
    Updated Jul 1, 2025
    Cite
    Statista (2025). Number of new AI fairness and bias metrics worldwide 2016-2022, by type [Dataset]. https://www.statista.com/statistics/1378864/ai-fairness-bias-metrics-growth-worlwide/
    Explore at:
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2022
    Area covered
    Worldwide
    Description

    There has been a continuous growth in the number of metrics used to analyze fairness and biases in artificial intelligence (AI) platforms since 2016. Diagnostic metrics have consistently been adapted more than benchmarks, with a peak of ** in 2019. It is quite likely that this is simply because more diagnostics need to be run to analyze data to create more accurate benchmarks, i.e. the diagnostics lead to benchmarks.

  5. Replication Data for: Publication Biases in Replication Studies

    • dataverse.harvard.edu
    Updated Sep 28, 2022
    Cite
    Adam J. Berinsky; James N. Druckman; Teppei Yamamoto (2022). Replication Data for: Publication Biases in Replication Studies [Dataset]. http://doi.org/10.7910/DVN/BJMZNR
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Adam J. Berinsky; James N. Druckman; Teppei Yamamoto
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    One of the strongest findings across the sciences is that publication bias occurs. Of particular note is a “file drawer bias” where statistically significant results are privileged over non-significant results. Recognition of this bias, along with increased calls for “open science,” has led to an emphasis on replication studies. Yet, few have explored publication bias and its consequences in replication studies. We offer a model of the publication process involving an initial study and a replication. We use the model to describe three types of publication biases: 1) file drawer bias, 2) a “repeat study” bias against the publication of replication studies, and 3) a “gotcha bias” where replication results that run contrary to a prior study are more likely to be published. We estimate the model’s parameters with a vignette experiment conducted with political science professors teaching at Ph.D.-granting institutions in the United States. We find evidence of all three types of bias, although those explicitly involving replication studies are notably smaller. This bodes well for the replication movement. That said, the aggregation of all of the biases increases the number of false positives in a literature. We conclude by discussing a path for future work on publication biases.

  6. Data from: Sampling methodology influences habitat suitability modeling for...

    • datadryad.org
    • search.dataone.org
    • +1 more
    zip
    Updated Sep 5, 2023
    Cite
    Sarah Gaulke; Tara Hohoff; Brittany Rogness; Mark Davis (2023). Sampling methodology influences habitat suitability modeling for Chiropteran species [Dataset]. http://doi.org/10.5061/dryad.t1g1jwt6r
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 5, 2023
    Dataset provided by
    Dryad
    Authors
    Sarah Gaulke; Tara Hohoff; Brittany Rogness; Mark Davis
    Time period covered
    2023
    Description

    Reference Information

    Provenance for this README

    • File name: README.txt
    • Authors: Sarah M. Gaulke
    • Other contributors: Tara C. Hohoff, Brittany A. Rogness, Mark A. Davis
    • Date created: 2023-04-01
    • Date modified: 2023-04-01

    Dataset Version and Release History

    • Current Version:

      • Number: 1.0.0
      • Date: 2023-04-01
      • Persistent identifier: DOI: 10.5061/dryad.t1g1jwt6r
      • Summary of changes: n/a
    • Embargo Provenance: n/a

      • Scope of embargo: n/a
      • Embargo period: n/a

    Dataset Attribution and Usage

    • Dataset Title: Data for the article "Sampling Methodology Influences Habitat Suitability Modeling for Chiropteran Species"

    • Dataset Contributors:

      • Creators: Sarah M. Gaulke, Tara C. Hohoff, Brittany A. Rogness, Mark A. Davis
    • Date of Issue: 2023-01-16

    • Publisher: Ecology and Evolution

    • License: Use of these data is covered by the following license: ...

  7. Hate Crimes in USA: Year-wise Victim Type by Bias Motivation

    • dataful.in
    Updated May 27, 2025
    Cite
    Dataful (Factly) (2025). Hate Crimes in USA: Year-wise Victim Type by Bias Motivation [Dataset]. https://dataful.in/datasets/19757
    Explore at:
    Available download formats: application/x-parquet, xlsx, csv
    Dataset updated
    May 27, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    United States
    Variables measured
    Count
    Description

    This dataset contains the yearly statistics on the victim types by bias motivation. Major categories of victim types include individuals, government, business/financial institution, religious organization, society/public and other or multiple victims. Major categories of bias motivations include Race/Ethnicity/Ancestry, Religion, Sexual Orientation, Disability, Gender and Gender Identity.

  8. Different bias types studied in recent research.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Wenlong Sun; Olfa Nasraoui; Patrick Shafto (2023). Different bias types studied in recent research. [Dataset]. http://doi.org/10.1371/journal.pone.0235502.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Wenlong Sun; Olfa Nasraoui; Patrick Shafto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Iterated algorithmic bias arises when an algorithm interacts continuously with human responses, updating its model after each round of feedback while showing the human only a selected subset of items or options. Other bias types are static, meaning they have a one-time influence on an algorithm.
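    The feedback loop described above can be simulated in a few lines: the algorithm only learns about items it chooses to show, so early noisy estimates shape all later feedback. This is an illustrative toy (all parameters invented here), not the paper's model.

    ```python
    # Minimal simulation of an iterated-bias feedback loop: a recommender
    # shows its current top-k items, observes clicks only on those, and
    # updates estimates only for shown items.
    import numpy as np

    rng = np.random.default_rng(2)
    n_items, k, rounds = 20, 3, 200
    true_pref = np.full(n_items, 0.5)        # the user likes every item equally
    est = rng.normal(0.5, 0.05, n_items)     # noisy initial estimates
    shown = np.zeros(n_items)

    for _ in range(rounds):
        top = np.argsort(est)[-k:]           # show only the current top-k
        clicks = rng.binomial(1, true_pref[top])
        shown[top] += 1
        # running-mean-style update, applied to the shown items only
        est[top] += (clicks - est[top]) / shown[top]

    # Exposure concentrates on a few items despite identical true preferences
    print(np.sort(shown)[::-1][:5])
    ```

    Items that were never shown keep their (possibly wrong) initial estimates forever, which is exactly the iterated, self-reinforcing character that distinguishes this bias from the static kinds.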

  9. Data from: Confirmation Bias in Web-Based Search: A Randomized Online Study...

    • zenodo.org
    zip
    Updated Jan 24, 2020
    Cite
    Stefan Schweiger; Ulrike Cress; Aileen Oeberst (2020). Confirmation Bias in Web-Based Search: A Randomized Online Study on the Effects of Expert Information and Social Tags on Information Search and Evaluation [Dataset]. http://doi.org/10.5281/zenodo.3358127
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Stefan Schweiger; Ulrike Cress; Aileen Oeberst
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT

    Background: The public typically believes psychotherapy to be more effective than pharmacotherapy for depression treatments. This is not consistent with current scientific evidence, which shows that both types of treatment are about equally effective.

    Objective: The study investigates whether this bias towards psychotherapy guides online information search and whether the bias can be reduced by explicitly providing expert information (in a blog entry) and by providing tag clouds that implicitly reveal experts’ evaluations.

    Methods: A total of 174 participants completed a fully automated Web-based study after we invited them via mailing lists. First, participants read two blog posts by experts that either challenged or supported the bias towards psychotherapy. Subsequently, participants searched for information about depression treatment in an online environment that provided more experts’ blog posts about the effectiveness of treatments based on alleged research findings. These blogs were organized in a tag cloud; both psychotherapy tags and pharmacotherapy tags were popular. We measured tag and blog post selection, efficacy ratings of the presented treatments, and participants’ treatment recommendation after information search.

    Results: Participants demonstrated a clear bias towards psychotherapy (mean 4.53, SD 1.99) compared to pharmacotherapy (mean 2.73, SD 2.41; t173=7.67, P<.001, d=0.81) when rating treatment efficacy prior to the experiment. Accordingly, participants exhibited biased information search and evaluation. This bias was significantly reduced, however, when participants were exposed to tag clouds with challenging popular tags. Participants facing popular tags challenging their bias (n=61) showed significantly less biased tag selection (F2,168=10.61, P<.001, partial eta squared=0.112), blog post selection (F2,168=6.55, P=.002, partial eta squared=0.072), and treatment efficacy ratings (F2,168=8.48, P<.001, partial eta squared=0.092), compared to bias-supporting tag clouds (n=56) and balanced tag clouds (n=57). Challenging (n=93) explicit expert information as presented in blog posts, compared to supporting expert information (n=81), decreased the bias in information search with regard to blog post selection (F1,168=4.32, P=.04, partial eta squared=0.025). No significant effects were found for treatment recommendation (Ps>.33).

    Conclusions: We conclude that the psychotherapy bias is most effectively attenuated—and even eliminated—when popular tags implicitly point to blog posts that challenge the widespread view. Explicit expert information (in a blog entry) was less successful in reducing biased information search and evaluation. Since tag clouds have the potential to counter biased information processing, we recommend their insertion.

  10. Data from: Integrated species distribution models to account for sampling...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, zip
    Updated Nov 27, 2023
    Cite
    Jussi Mäkinen; Cory Merow; Walter Jetz (2023). Data from: Integrated species distribution models to account for sampling biases and improve range wide occurrence predictions [Dataset]. http://doi.org/10.5061/dryad.k98sf7mdg
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jussi Mäkinen; Cory Merow; Walter Jetz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Measurement technique
    A detailed methodology associated with the environmental variables and species data can be found in the references used in the original publication.
    Description

    Aim

    Species distribution models (SDMs) that integrate presence-only and presence-absence data offer a promising avenue to improve information on species' geographic distributions. The use of such 'integrated SDMs' on a species range-wide extent has been constrained by the often-limited presence-absence data and by the heterogeneous sampling of the presence-only data. Here, we evaluate integrated SDMs for studying species ranges with a novel expert range map-based evaluation. We build a new understanding about how integrated SDMs address issues of estimation accuracy and data deficiency and thereby offer advantages over traditional SDMs.

    Location

    South and Central America.

    Time period

    1979-2017.

    Major taxa studied

    Hummingbirds.

    Methods

    We build integrated SDMs by linking two observation models – one for each data type – to the same underlying spatial process. We validate SDMs with two schemes: i) cross-validation with presence-absence data and ii) comparison with respect to the species' whole range as defined with IUCN range maps. We also compare models relative to the estimated response curves and compute the association between the benefit of the data integration and the number of presence records in each data set.

    Results

    The integrated SDM accounting for the spatially varying sampling intensity of the presence-only data was one of the top-performing models in both model validation schemes. Presence-only data alleviated overly large niche estimates, and data integration was beneficial compared to modelling solely presence-only data for species that had few presence points when predicting the species' whole range. On the community level, integrated models improved the species richness prediction.

    Main conclusions

    Integrated SDMs combining presence-only and presence-absence data successfully borrow strength from both data types and offer improved predictions of species' ranges. Integrated SDMs can potentially alleviate the impacts of taxonomically and geographically uneven sampling and leverage the detailed sampling information in presence-absence data.

  11. Hate Crimes in USA: Year-wise Offenses by Offense Type and by Bias...

    • dataful.in
    Updated May 27, 2025
    + more versions
    Cite
    Dataful (Factly) (2025). Hate Crimes in USA: Year-wise Offenses by Offense Type and by Bias Motivation [Dataset]. https://dataful.in/datasets/19756
    Explore at:
    Available download formats: application/x-parquet, csv, xlsx
    Dataset updated
    May 27, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    United States
    Variables measured
    Count
    Description

    This dataset contains the yearly statistics on the number of offenses by offense types and by bias motivation. Major categories of offense types include crimes against persons, crimes against property and crimes against society. Each offense type is further categorized by type of crime such as murder, rape, trafficking, robbery etc. Major categories of bias motivations include Race/Ethnicity/Ancestry, Religion, Sexual Orientation, Disability, Gender and Gender Identity.

  12. Data from: Survey of Prosecutorial Response to Bias-Motivated Crime in the...

    • catalog.data.gov
    • icpsr.umich.edu
    Updated Mar 12, 2025
    + more versions
    Cite
    National Institute of Justice (2025). Survey of Prosecutorial Response to Bias-Motivated Crime in the United States, 1994-1995 [Dataset]. https://catalog.data.gov/dataset/survey-of-prosecutorial-response-to-bias-motivated-crime-in-the-united-states-1994-1995-96eb6
    Explore at:
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    National Institute of Justice (http://nij.ojp.gov/)
    Area covered
    United States
    Description

    This national survey of prosecutors was undertaken to systematically gather information about the handling of bias or hate crime prosecutions in the United States. The goal was to use this information to identify needs and to enhance the ability of prosecutors to respond effectively to hate crimes by promoting effective practices. The survey aimed to address the following research questions: (1) What was the present level of bias crime prosecution in the United States? (2) What training had been provided to prosecutors to assist them in prosecuting hate- and bias-motivated crimes and what additional training would be beneficial? (3) What types of bias offenses were prosecuted in 1994-1995? (4) How were bias crime cases assigned and to what extent were bias crime cases given priority? and (5) What factors or issues inhibited a prosecutor's ability to prosecute bias crimes? In 1995, a national mail survey was sent to a stratified sample of prosecutor offices in three phases to solicit information about prosecutors' experiences with hate crimes. Questions were asked about size of jurisdiction, number of full-time staff, number of prosecutors and investigators assigned to bias crimes, and number of bias cases prosecuted. Additional questions measured training for bias-motivated crimes, such as whether staff received specialized training, whether there existed a written policy on bias crimes, how well prosecutors knew the bias statute, and whether there was a handbook on bias crime. Information elicited on case processing included the frequency with which certain criminal acts were charged and sentenced as bias crimes, the existence of a special bias unit, case tracking systems, preparation of witnesses, jury selection, and case disposition. Other topics specifically covered bias related to racial or ethnic differences, religious differences, sexual orientation, and violence against women.

  13. Data for: Peer Review Under Scrutiny: Systematic Evidence of Bias in...

    • zenodo.org
    bin
    Updated May 28, 2025
    Cite
    Daniela Maciel Pinto; Adriana Bin; Evandro Coggo Cristofoletti; Ana Carolina Spatti; Larissa Aparecida Prevato Lopes (2025). Data for: Peer Review Under Scrutiny: Systematic Evidence of Bias in Research Funding [Dataset]. http://doi.org/10.5281/zenodo.15536550
    Explore at:
    Available download formats: bin
    Dataset updated
    May 28, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Daniela Maciel Pinto; Adriana Bin; Evandro Coggo Cristofoletti; Ana Carolina Spatti; Larissa Aparecida Prevato Lopes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports the systematic literature review conducted in the study "Peer Review Under Scrutiny: Systematic Evidence of Bias in Research Funding". The data comprise a curated collection of empirical studies that investigated the existence of biases in peer review processes within research funding agencies worldwide.

    The dataset includes detailed categorizations based on the types of biases investigated, methodologies employed, data sources, and the confirmation status of each bias identified in the selected studies. The file was structured to facilitate further analyses, replications, and methodological reviews in the field of research evaluation and science policy studies.

    Data were collected through systematic searches in Scopus and Web of Science databases, followed by rigorous screening and classification procedures. The dataset may be particularly useful for researchers, policymakers, and evaluators interested in improving transparency and equity in research funding mechanisms.

  14. Data from "Spatial filtering strategies for mitigating sampling bias in...

    • figshare.com
    txt
    Updated Aug 25, 2023
    Cite
    Yoan Fourcade; Quentin Lamboley (2023). Data from "Spatial filtering strategies for mitigating sampling bias in species distribution models" [Dataset]. http://doi.org/10.6084/m9.figshare.24032196.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 25, 2023
    Dataset provided by
    figshare
    Authors
    Yoan Fourcade; Quentin Lamboley
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results obtained and analysed in Lamboley & Fourcade, "Spatial filtering strategies for mitigating sampling bias in species distribution models". Briefly, we used two virtual species with contrasting levels of specialisation to explore the impact of spatial filtering distances on the performance of ecological niche models. This investigation was conducted across a spectrum of modelling conditions, encompassing diverse types and degrees of bias, as well as varying sample sizes.

    Results reporting the overlap between modelled and true distributions:

    • Unbiased_distribution.csv: results for the models trained from unbiased, i.e. randomly sampled, datasets
    • Biased_corrected_distribution.csv: results for the models trained from biased datasets, corrected with various spatial filtering distances

    Results reporting the overlap between modelled and true response curves:

    • Unbiased_response_curves.csv: results for the models trained from unbiased, i.e. randomly sampled, datasets
    • Biased_corrected_response_curves.csv: results for the models trained from biased datasets, corrected with various spatial filtering distances
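    Spatial filtering of the kind evaluated in this dataset can be sketched as greedy thinning: keep an occurrence record only if every already-kept record is at least some minimum distance away, so densely sampled areas stop dominating the training data. A minimal illustrative version (toy coordinates; real analyses typically use dedicated packages such as the R package spThin):

    ```python
    # Greedy spatial thinning: discard clustered records that over-represent
    # heavily sampled areas, keeping points at least `min_dist` apart.
    import numpy as np

    def spatial_thin(coords, min_dist):
        """Keep a point only if no previously kept point is closer than min_dist."""
        kept = []
        for p in coords:
            if all(np.hypot(p[0] - q[0], p[1] - q[1]) >= min_dist for q in kept):
                kept.append(p)
        return np.array(kept)

    rng = np.random.default_rng(3)
    cluster = rng.normal([0, 0], 0.01, size=(50, 2))   # densely sampled area
    sparse = rng.uniform(-1, 1, size=(10, 2))          # poorly sampled area
    pts = np.vstack([cluster, sparse])

    thinned = spatial_thin(pts, min_dist=0.2)
    print(len(pts), "->", len(thinned))   # the dense cluster collapses to few points
    ```

    The filtering distance is the tuning knob the study varies: too small and the bias persists, too large and informative records from genuinely suitable areas are thrown away.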

  15. Data from: Bivariate Analysis of Distribution Functions Under Biased...

    • tandf.figshare.com
    txt
    Updated Apr 17, 2024
    Cite
    Hsin-wen Chang; Shu-Hsiang Wang (2024). Bivariate Analysis of Distribution Functions Under Biased Sampling [Dataset]. http://doi.org/10.6084/m9.figshare.23998414.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Apr 17, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Hsin-wen Chang; Shu-Hsiang Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article compares distribution functions among pairs of locations in their domains, in contrast to the typical approach of univariate comparison across individual locations. This bivariate approach is studied in the presence of sampling bias, which has been gaining attention in COVID-19 studies that over-represent more symptomatic people. In cases with either known or unknown sampling bias, we introduce Anderson–Darling-type tests based on both the univariate and bivariate formulation. A simulation study shows the superior performance of the bivariate approach over the univariate one. We illustrate the proposed methods using real data on the distribution of the number of symptoms suggestive of COVID-19.
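    The article's bivariate Anderson-Darling-type tests are not available in standard libraries, but the univariate k-sample Anderson-Darling comparison they generalize can be run directly with SciPy. The sketch below compares toy symptom-count samples from two locations (the Poisson parameters are invented for illustration):

    ```python
    # Univariate Anderson-Darling k-sample comparison of two toy
    # symptom-count distributions (a baseline, not the paper's
    # bivariate or bias-corrected method).
    import numpy as np
    from scipy.stats import anderson_ksamp

    rng = np.random.default_rng(4)
    site_a = rng.poisson(2.0, size=200)   # symptom counts, location A
    site_b = rng.poisson(3.0, size=200)   # location B, shifted distribution

    res = anderson_ksamp([site_a, site_b])
    print(res.statistic, res.significance_level)
    ```

    Under the sampling bias the article studies (e.g. over-representation of more symptomatic people), this naive comparison is exactly what would mislead, which motivates their weighted, bias-aware versions.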

  16. A Comparison of Four Methods for the Analysis of N-of-1 Trials

    • figshare.com
    doc
    Updated Jun 2, 2023
    Cite
    Xinlin Chen; Pingyan Chen (2023). A Comparison of Four Methods for the Analysis of N-of-1 Trials [Dataset]. http://doi.org/10.1371/journal.pone.0087752
    Explore at:
    Available download formats: doc
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Xinlin Chen; Pingyan Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: To provide practical guidance for the analysis of N-of-1 trials by comparing four commonly used models.

    Methods: The four models (paired t-test, mixed effects model of difference, mixed effects model, and meta-analysis of summary data) were compared using a simulation study. The assumed 3-cycle and 4-cycle N-of-1 trials were set with sample sizes of 1, 3, 5, 10, 20 and 30 respectively under a normally distributed assumption. The data were generated based on a variance-covariance matrix under the assumption of (i) a compound symmetry structure or a first-order autoregressive structure, and (ii) no carryover effect or a 20% carryover effect. Type I error, power, bias (mean error), and mean square error (MSE) of effect differences between two groups were used to evaluate the performance of the four models.

    Results: The results from the 3-cycle and 4-cycle N-of-1 trials were comparable with respect to type I error, power, bias and MSE. The paired t-test yielded a type I error near the nominal level, higher power, comparable bias and small MSE, whether or not there was a carryover effect. Compared with the paired t-test, the mixed effects model produced a similar type I error and smaller bias, but lower power and bigger MSE. The mixed effects model of difference and the meta-analysis of summary data yielded type I errors far from the nominal level, low power, and large bias and MSE, irrespective of the presence or absence of carryover effects.

    Conclusion: We recommend the paired t-test for normally distributed data from N-of-1 trials because of its optimal statistical performance. In the presence of carryover effects, the mixed effects model could be used as an alternative.
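    The recommended paired t-test is straightforward to apply to one N-of-1 trial: pair each cycle's treatment measurement with its control measurement. A hedged sketch on simulated data (a 4-cycle trial with an assumed treatment effect of 1.0; all numbers are invented here):

    ```python
    # Paired t-test on one simulated 4-cycle N-of-1 trial.
    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(5)
    cycles = 4
    treatment = 10 + 1.0 + rng.normal(0, 0.5, cycles)   # outcomes on treatment
    control = 10 + rng.normal(0, 0.5, cycles)           # outcomes on control

    t, p = ttest_rel(treatment, control)
    print(round(t, 2), round(p, 3))
    ```

    With so few pairs, power is limited, which is why the study's comparison of models at sample sizes as small as 1 trial matters for practice.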

  17. Hate Crimes

    • kaggle.com
    Updated Jul 7, 2024
    Cite
    Melissa Monfared (2024). Hate Crimes [Dataset]. https://www.kaggle.com/datasets/melissamonfared/hate-crimes
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 7, 2024
    Dataset provided by
    Kaggle
    Authors
    Melissa Monfared
    Description

    Overview:

    This dataset contains detailed information on cases where a hate or bias crime has been reported to the Bloomington Police Department. Hate crimes are criminal offenses motivated by bias against race, religion, ethnicity, sexual orientation, gender identity, or other protected characteristics. This dataset provides insights into the nature and demographics of hate crimes in Bloomington, aiding in understanding and addressing these incidents.

    Dataset Details:

    The dataset includes the following columns:

    Column Name     | Description                   | API Field Name  | Data Type
    case_number     | Case Number                   | case_number     | Text
    date            | Date                          | date            | Floating Timestamp
    weekday         | Day of Week                   | day_of_week     | Text
    victims         | Total Number of Victims       | victims         | Number
    victim_race     | Victim Race                   | victim_race     | Text
    victim_gender   | Victim Gender                 | victim_gender   | Text
    victim_type     | Victim Type                   | victim_type     | Text
    offenders       | Total Number of Offenders     | offenders       | Number
    offender_race   | Offender Race                 | offender_race   | Text
    offender_gender | Offender Gender               | offender_gender | Text
    offense         | Offense / Crime               | offense         | Text
    location_type   | Offense / Crime Location Type | location_type   | Text
    motivation      | Offense / Crime Bias Motivation | motivation    | Text
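A minimal sketch of how the API field names above might be used in an analysis, here counting incidents by bias motivation; the three sample rows are hypothetical and only a subset of the columns is shown.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows using the API field names above; a real export
# from the Bloomington data portal would carry the full column set.
sample = """case_number,date,day_of_week,victims,victim_race,offense,motivation
B21-001,2021-03-04T00:00:00,Thursday,1,White,Vandalism,Anti-Jewish
B21-002,2021-05-12T00:00:00,Wednesday,2,Black,Assault,Anti-Black
B21-003,2021-05-19T00:00:00,Wednesday,1,Asian,Harassment,Anti-Asian
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Tally incidents per bias motivation and total victims across cases.
by_motivation = Counter(row["motivation"] for row in rows)
total_victims = sum(int(row["victims"]) for row in rows)
print(by_motivation.most_common(), total_victims)
```

The same grouping logic extends to `day_of_week` or `location_type` for the temporal and spatial analyses described under Usage.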

    Key Features:

    • Comprehensive Crime Data: Provides detailed information on hate crimes, including demographics of victims and offenders, types of offenses, and bias motivations.
    • Temporal Analysis: Includes timestamps for each incident, allowing for analysis of trends over time.
    • Demographic Insights: Offers data on race and gender of both victims and offenders, helping to identify patterns and target interventions.
    • Location Information: Contains details about the type of location where the offense occurred, useful for spatial analysis and preventive measures.

    Usage:

    This dataset can be used for:

    • Crime Analysis: Analyzing trends and patterns in hate crimes to inform law enforcement strategies and policies.
    • Community Safety: Identifying high-risk areas and times to improve community policing and preventive measures.
    • Research and Advocacy: Supporting academic research and advocacy efforts focused on combating hate crimes and promoting social justice.
    • Policy Development: Assisting policymakers in developing targeted initiatives to reduce hate crimes and support affected communities.

    Data Maintenance:

    • Last Updated: July 7, 2024
    • Source: Bloomington Police Department Data Portal
    • Revisions: The dataset is updated annually to include the latest incidents and maintain data accuracy. Historical data is preserved to support long-term analyses.

    Additional Notes

    • Data Accuracy: The Bloomington Police Department strives for accuracy in open data; however, errors may occur due to the nature of data collection from multiple sources.
    • Data Interpretation: Users should be aware that the dataset may change over time as new information becomes available or corrections are made.
    • Race and District Codes: The dataset uses specific codes for race and reading districts, which are detailed in the accompanying documentation to ensure proper interpretation.
    • License: Open Data Commons Public Domain Dedication and License
  18. Data from: Sampling biases shape our view of the natural world

    • zenodo.org
    • datadryad.org
    csv, txt, xls, zip
    Updated Jun 4, 2022
    Cite
    Alice Hughes; Keping Ma; Mark Costello; John Waller; Pieter Provoost; Qinmin Yang; Chaodong Zhu; Huijie Qiao; Michael Orr (2022). Sampling biases shape our view of the natural world [Dataset]. http://doi.org/10.5061/dryad.zw3r2287z
    Explore at:
    zip, xls, txt, csvAvailable download formats
    Dataset updated
    Jun 4, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alice Hughes; Keping Ma; Mark Costello; John Waller; Pieter Provoost; Qinmin Yang; Chaodong Zhu; Huijie Qiao; Michael Orr
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Spatial patterns of biodiversity are inextricably linked to their collection methods, yet no synthesis of bias patterns or their consequences exists. As such, views of organismal distribution and the ecosystems they make up may be incorrect, undermining countless ecological and evolutionary studies. Using 742 million records of 374,900 species, we explore the global patterns and impacts of biases related to taxonomy, accessibility, ecotype, and data type across terrestrial and marine systems. Pervasive sampling and observation biases exist across animals, with only 6.74% of the globe sampled, and disproportionately poor tropical sampling. High elevations and deep seas are particularly unknown. Over 50% of records in most groups account for under 2% of species, and citizen science only exacerbates biases. Additional data will be needed to overcome many of these biases, but we must increasingly value data publication to bridge this gap and better represent species' distributions from more distant and inaccessible areas, and provide the necessary basis for conservation and management.
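The coverage figure quoted above (the share of the globe holding at least one record) can be approximated by gridding occurrence coordinates; this is an illustrative sketch, not the authors' pipeline, and the clustered records below are simulated to mimic the accessibility bias the abstract describes.

```python
import random

def sampled_fraction(records, cell_deg=1.0):
    """Fraction of a global lat/lon grid with at least one occurrence
    record; a crude stand-in for the paper's coverage metric."""
    cells = {(int(lat // cell_deg), int(lon // cell_deg)) for lat, lon in records}
    n_cells = int(180 / cell_deg) * int(360 / cell_deg)
    return len(cells) / n_cells

# Hypothetical records clustered around one "accessible" region:
# many observations, but very few grid cells touched globally.
random.seed(0)
records = [(random.gauss(45, 2), random.gauss(5, 2)) for _ in range(10_000)]
print(f"{sampled_fraction(records):.4%} of grid cells sampled")
```

Even ten thousand records cover well under 1% of cells when they cluster spatially, which is the mechanism behind the low global coverage reported.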

  19. Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias — An Updated Review

    • figshare.com
    • plos.figshare.com
    doc
    Updated Jan 18, 2016
    Cite
    Kerry Dwan; Carrol Gamble; Paula R. Williamson; Jamie J. Kirkham (2016). Systematic Review of the Empirical Evidence of Study Publication Bias and Outcome Reporting Bias — An Updated Review [Dataset]. http://doi.org/10.1371/journal.pone.0066844
    Explore at:
    docAvailable download formats
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    PLOS ONE
    Authors
    Kerry Dwan; Carrol Gamble; Paula R. Williamson; Jamie J. Kirkham
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The increased use of meta-analysis in systematic reviews of healthcare interventions has highlighted several types of bias that can arise during the completion of a randomised controlled trial. Study publication bias and outcome reporting bias have been recognised as potential threats to the validity of meta-analysis and can make the readily available evidence unreliable for decision making.

    Methodology/Principal Findings: In this update, we review and summarise the evidence from cohort studies that have assessed study publication bias or outcome reporting bias in randomised controlled trials. Twenty studies were eligible, of which four were newly identified in this update. Only two followed the cohort all the way through from protocol approval to information regarding publication of outcomes. Fifteen of the studies investigated study publication bias and five investigated outcome reporting bias. Three studies found that statistically significant outcomes had higher odds of being fully reported than non-significant outcomes (range of odds ratios: 2.2 to 4.7). In comparing trial publications to protocols, we found that 40-62% of studies had at least one primary outcome that was changed, introduced, or omitted. We decided not to undertake meta-analysis due to the differences between studies.

    Conclusions: This update does not change the conclusions of the review in which 16 studies were included. Direct empirical evidence for the existence of study publication bias and outcome reporting bias is shown. There is strong evidence of an association between significant results and publication; studies that report positive or significant results are more likely to be published, and outcomes that are statistically significant have higher odds of being fully reported. Publications have been found to be inconsistent with their protocols. Researchers need to be aware of the problems of both types of bias, and efforts should be concentrated on improving the reporting of trials.
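The odds ratios quoted above (2.2 to 4.7) compare the odds of full reporting for statistically significant versus non-significant outcomes. A sketch of that calculation, using hypothetical counts rather than any cohort's actual data:

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio with a 95% CI from the log-OR normal approximation.
    a/b: significant outcomes fully reported / not fully reported;
    c/d: non-significant outcomes fully reported / not fully reported."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, (lo, hi)

# Hypothetical cohort: 80 of 100 significant outcomes fully reported,
# versus 40 of 100 non-significant outcomes.
or_, ci = odds_ratio(80, 20, 40, 60)
print(f"OR = {or_:.1f}, 95% CI {ci[0]:.1f} to {ci[1]:.1f}")
```

An odds ratio well above 1 with a confidence interval excluding 1, as here, is the pattern the review reports as evidence of outcome reporting bias.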

  20. Data and script for "Detecting synthetic population bias using a spatially-oriented framework and independent validation data"

    • figshare.com
    zip
    Updated May 15, 2024
    Cite
    Jessica Embury; Atsushi Nara; Sergio Rey; Ming-Hsiang Tsou; Sahar Ghanipoor Machiani (2024). Data and script for "Detecting synthetic population bias using a spatially-oriented framework and independent validation data" [Dataset]. http://doi.org/10.6084/m9.figshare.24664647.v1
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 15, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jessica Embury; Atsushi Nara; Sergio Rey; Ming-Hsiang Tsou; Sahar Ghanipoor Machiani
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This folder contains the processed and derived data, and the script, for the manuscript "Detecting synthetic population bias using a spatially-oriented framework and independent validation data".

    Abstract: Models of human mobility can be broadly applied to find solutions addressing diverse topics such as public health policy, transportation management, emergency management, and urban development. However, many mobility models require individual-level data that is limited in availability and accessibility. Synthetic populations are commonly used as the foundation for mobility models because they provide detailed individual-level data representing the different types and characteristics of people in a study area. Thorough evaluation of synthetic populations is required to detect data biases before those biases are transferred to subsequent applications. Although synthetic populations are commonly used for modeling mobility, they are conventionally validated by their sociodemographic characteristics rather than their mobility attributes. Mobility microdata provides an opportunity to independently (externally) validate the mobility attributes of synthetic populations. This study demonstrates a spatially-oriented data validation framework and independent data validation to assess the mobility attributes of two synthetic populations at different spatial granularities. Validation using independent data (SafeGraph) and the validation framework replicated the spatial distribution of errors detected using source data (LODES) and total absolute error. Spatial clusters of error exposed the locations of underrepresented and overrepresented communities. This information can guide bias mitigation efforts to generate a more representative synthetic population.
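The framework's core comparison, synthetic counts versus independent validation counts per zone summarized as total absolute error (TAE), can be sketched as follows; the zone names and counts are hypothetical, not values from the study.

```python
def zone_errors(synthetic, validation):
    """Per-zone signed error and total absolute error (TAE) between
    synthetic population counts and independent validation counts."""
    errors = {z: synthetic.get(z, 0) - validation.get(z, 0)
              for z in set(synthetic) | set(validation)}
    tae = sum(abs(e) for e in errors.values())
    return errors, tae

# Hypothetical counts per census-tract zone: a synthetic population
# versus an independent source such as SafeGraph visit data.
synthetic  = {"tract_A": 120, "tract_B": 95, "tract_C": 40}
validation = {"tract_A": 100, "tract_B": 110, "tract_C": 55}
errors, tae = zone_errors(synthetic, validation)
print(errors, tae)  # positive error: overrepresented; negative: underrepresented
```

Mapping the signed per-zone errors (rather than only reporting TAE) is what exposes the spatial clusters of over- and underrepresentation the abstract describes.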


  3. The pooled data SDM successfully removed the sampling bias from the PO observations even when the presence‐absence data was sparse and patchy, and the PO observations formed the majority of the data. The pooled data SDM was, in general, more accurate and more precise than either the PA SDM or the PO SDM. All SDMs were more precise for the species responses than they were for the covariate coefficients.

  4. The emerging SDM methodology that pools PO and PA data will facilitate more certainty around species’ distribution estimates, which in turn will allow more relevant and concise management and policy decisions to be enacted. This work shows that it is possible to achieve this result even in relatively data‐poor regions.
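The pooling idea in points 1-4 can be sketched as a joint log-likelihood in which PO and PA data share one intensity surface, with the PO component carrying an extra sampling-bias term. This is a minimal illustrative sketch (cloglog link for PA, Poisson cell counts for PO, log-factorial constant dropped), not the authors' implementation, and all values below are hypothetical.

```python
import math

def pooled_loglik(beta, delta, po_cells, pa_data, x, b):
    """Joint log-likelihood pooling PO and PA data over one shared
    log-intensity beta * x_i. PO cell counts get an extra sampling-bias
    term delta * b_i (b_i = accessibility); PA sites use the cloglog
    link, so both components estimate the same beta."""
    ll = 0.0
    for i, count in po_cells.items():           # PO: Poisson counts per cell
        lam = math.exp(beta * x[i] + delta * b[i])
        ll += count * math.log(lam) - lam       # log k! dropped (constant)
    for i, present in pa_data.items():          # PA: presence/absence per site
        p = 1 - math.exp(-math.exp(beta * x[i]))  # cloglog(p) = log intensity
        ll += math.log(p if present else 1 - p)
    return ll

# Tiny hypothetical landscape: environmental covariate x, access bias b.
x = [0.2, 1.0, -0.5, 0.8]
b = [1.0, 0.0, 0.0, 1.0]
po_cells = {0: 3, 3: 5}          # biased PO counts near accessible cells
pa_data = {1: True, 2: False}    # sparse, patchy PA survey
print(pooled_loglik(beta=1.0, delta=0.5, po_cells=po_cells, pa_data=pa_data, x=x, b=b))
```

Maximizing this likelihood jointly over beta and delta is what lets the unbiased PA component identify the species response while the PO component absorbs its sampling bias in delta.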
