100+ datasets found
  1. Why Even More Clinical Research Studies May Be False: Effect of Asymmetrical...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 2, 2023
    Cite
    Matthew James Shun-Shin; Darrel P. Francis (2023). Why Even More Clinical Research Studies May Be False: Effect of Asymmetrical Handling of Clinically Unexpected Values [Dataset]. http://doi.org/10.1371/journal.pone.0065323
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Matthew James Shun-Shin; Darrel P. Francis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    In medical practice, clinically unexpected measurements might quite properly be handled by remeasurement, removal, or reclassification of patients. If these habits are not prevented during clinical research, how much of each is needed to sway an entire study?

    Methods and Results
    Believing there is a difference between groups, a well-intentioned clinician researcher addresses unexpected values. We tested how much removal, remeasurement, or reclassification of patients would be needed in most cases to turn an otherwise-neutral study positive. Remeasurement of 19 patients out of 200 per group was required to make most studies positive. Removal was more powerful: just 9 out of 200 was enough. Reclassification was most powerful, with 5 out of 200 enough. The larger the study, the smaller the proportion of patients needing to be manipulated to make the study positive: the percentages needing to be remeasured, removed, or reclassified fell from 45%, 20%, and 10%, respectively, for a 20-patient-per-group study to 4%, 2%, and 1% for an 800-patient-per-group study. Dot-plots, but not bar-charts, make the perhaps-inadvertent manipulations visible. Detection is possible using statistical methods such as the Tadpole test.

    Conclusions
    Behaviours necessary for clinical practice are destructive to clinical research. Even small amounts of selective remeasurement, removal, or reclassification can produce false positive results. Size matters: larger studies are proportionately more vulnerable. If observational studies permit selective unblinded enrolment, malleable classification, or selective remeasurement, then results are not credible. Clinical research is very vulnerable to “remeasurement, removal, and reclassification”, the 3 evil R's.
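The removal arm of this experiment can be sketched in a few lines; this is an illustrative simulation under assumed conditions (two groups of 200 standard-normal values, a two-sided normal-approximation test), not the authors' code:

```python
import random
import statistics
from statistics import NormalDist

random.seed(1)
n = 200
# two groups drawn from the same distribution: a genuinely neutral study
a = [random.gauss(0, 1) for _ in range(n)]
b = [random.gauss(0, 1) for _ in range(n)]

def p_value(x, y):
    # two-sided two-sample z-test (normal approximation is reasonable at n = 200)
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    z = (statistics.mean(x) - statistics.mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_before = p_value(a, b)

# "removal": a well-intentioned researcher drops the 9 values of group b
# that most contradict the expected difference
if statistics.mean(a) >= statistics.mean(b):
    b_handled = sorted(b)[:-9]   # drop the 9 largest values of b
else:
    b_handled = sorted(b)[9:]    # drop the 9 smallest values of b
p_after = p_value(a, b_handled)
```

Removing the opposing extremes both moves the group mean in the favoured direction and shrinks its variance, so the p-value falls; remeasurement and reclassification can be simulated analogously.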

  2. BBC Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Feb 6, 2024
    Cite
    Bright Data (2024). BBC Datasets [Dataset]. https://brightdata.com/products/datasets/bbc
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Feb 6, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock the full potential of BBC broadcast data with our comprehensive dataset featuring transcripts, program schedules, headlines, topics, and multimedia resources. This all-in-one dataset is designed to empower media analysts, researchers, journalists, and advocacy groups with actionable insights for media analysis, transparency studies, and editorial assessments.

    Dataset Features

    • Transcripts: Access detailed broadcast transcripts, including headlines, content, author details, and publication dates. Perfect for analyzing media framing, topic frequency, and news narratives across various programs.
    • Program Schedules: Explore program schedules with accurate timing, show names, and related metadata to track news coverage patterns and identify trends.
    • Topics and Keywords: Analyze categorized topics and keywords to understand content diversity, editorial focus, and recurring themes in news broadcasts.
    • Multimedia Content: Gain access to videos, images, and related articles linked to each broadcast for a holistic understanding of the news presentation.
    • Metadata: Includes critical data points like publication dates, last updates, content URLs, and unique IDs for easier referencing and cross-analysis.

    Customizable Subsets for Specific Needs

    Our BBC dataset is fully customizable to match your research or analytical goals. Focus on transcripts for in-depth media framing analysis, extract multimedia for content visualization studies, or dive into program schedules for broadcast trend analysis. Tailor the dataset to ensure it aligns with your objectives for maximum efficiency and relevance.

    Popular Use Cases

    • Media Analysis: Evaluate news framing, content diversity, and topic coverage to assess editorial direction and media focus.
    • Transparency Studies: Analyze journalistic standards, corrections, and retractions to assess media integrity and accountability.
    • Audience Engagement: Identify recurring topics and trends in news content to understand audience preferences and behavior.
    • Market Analysis: Track media coverage of key industries, companies, and topics to analyze public sentiment and industry relevance.
    • Journalistic Integrity: Use transcripts and metadata to evaluate adherence to reporting practices, fairness, and transparency in news coverage.
    • Research and Scholarly Studies: Leverage transcripts and multimedia to support academic studies in journalism, media criticism, and political discourse analysis.

    Whether you are evaluating transparency, conducting media criticism, or tracking broadcast trends, our BBC dataset provides you with the tools and insights needed for in-depth research and strategic analysis. Customize your access to focus on the most relevant data points for your unique needs.

  3. Statistics of cricket dataset.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Sep 20, 2024
    + more versions
    Cite
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha (2024). Statistics of cricket dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0308050.t002
    Available download formats: xls
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Shihab Ahmed; Moythry Manir Samia; Maksuda Haider Sayma; Md. Mohsin Kabir; M. F. Mridha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, the surge in reviews and comments on newspapers and social media has made sentiment analysis a focal point of interest for researchers. Sentiment analysis is also gaining popularity in the Bengali language. However, Aspect-Based Sentiment Analysis is considered a difficult task in Bengali due to the shortage of perfectly labeled datasets and the complex variations in the language. This study used two open-source benchmark Bengali-language datasets, Cricket and Restaurant, for our Aspect-Based Sentiment Analysis task. The original work was based on the Random Forest, Support Vector Machine, K-Nearest Neighbors, and Convolutional Neural Network models. In this work, we used the Bidirectional Encoder Representations from Transformers (BERT), the Robustly Optimized BERT Approach (RoBERTa), and our proposed hybrid transformative Random Forest and BERT (tRF-BERT) models to compare the results with the existing work. After comparing the results, we can clearly see that all the models used in our work achieved better results than any of the previous works on the same datasets. Amongst them, our proposed tRF-BERT model achieved the highest F1 score and accuracy. The accuracy and F1 score of aspect detection for the Cricket dataset were 0.89 and 0.85, respectively, and for the Restaurant dataset were 0.92 and 0.89, respectively.

  4. Data from: Sizing the Problem of Improving Discovery and Access to...

    • figshare.com
    xlsx
    Updated Jan 19, 2016
    Cite
    Kevin Read (2016). Sizing the Problem of Improving Discovery and Access to NIH-funded Data: A preliminary study [Dataset]. http://doi.org/10.6084/m9.figshare.1285515.v1
    Available download formats: xlsx
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kevin Read
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To inform efforts to improve the discoverability of and access to biomedical datasets by providing a preliminary estimate of the number and type of datasets generated annually by National Institutes of Health (NIH)-funded researchers. Of particular interest is characterizing those datasets that are not deposited in a known data repository or registry, e.g., those for which a related journal article does not indicate that underlying data have been deposited in a known repository. Such “invisible” datasets comprise the “long tail” of biomedical data and pose significant practical challenges to ongoing efforts to improve discoverability of and access to biomedical research data. This study identified datasets used to support the NIH-funded research reported in articles published in 2011 and cited in PubMed® and deposited in PubMed Central® (PMC). After searching for all articles that acknowledged NIH support, we first identified articles that contained explicit mention of datasets being deposited in recognized repositories. Thirty members of the NIH staff then analyzed a random sample of the remaining articles to estimate how many and what types of datasets were used per article. Two reviewers independently examined each paper. Each dataset is titled Bigdata_randomsample_xxxx_xx. The xxxx refers to the set of articles the annotator looked at, while the xx identifies the annotator that did the analysis. Within each dataset, the author has listed the number of datasets they identified within the articles that they looked at. For every dataset that was found, the annotators were asked to insert a new row into the spreadsheet and then describe the dataset they found (e.g., type of data, subject of study, etc.). Each row in the spreadsheet was always prepended with the PubMed Identifier (PMID) where the dataset was found.
Finally, the files 2013-08-07_Bigdatastudy_dataanalysis, Dataanalysis_ack_si_datasets, and Datasets additional random sample mention vs deposit 20150313 refer to the analysis that was performed based on each annotator's analysis of the publications they were assigned, and the data deposits identified from the analysis.

  5. Data release for solar-sensor angle analysis subset associated with the...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Nov 27, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Data release for solar-sensor angle analysis subset associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-for-solar-sensor-angle-analysis-subset-associated-with-the-journal-article-so
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset provides geospatial location data and scripts used to analyze the relationship between MODIS-derived NDVI and solar and sensor angles in a pinyon-juniper ecosystem in Grand Canyon National Park. The data are provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and scripts allow users to replicate, test, or further explore results. The file GrcaScpnModisCellCenters.csv contains locations (latitude-longitude) of all the 250-m MODIS (MOD09GQ) cell centers associated with the Grand Canyon pinyon-juniper ecosystem that the Southern Colorado Plateau Network (SCPN) is monitoring through its land surface phenology and integrated upland monitoring programs. The file SolarSensorAngles.csv contains MODIS angle measurements for the pixel at the phenocam location plus a random 100 point subset of pixels within the GRCA-PJ ecosystem. The script files (folder: 'Code') consist of 1) a Google Earth Engine (GEE) script used to download MODIS data through the GEE javascript interface, and 2) a script used to calculate derived variables and to test relationships between solar and sensor angles and NDVI using the statistical software package 'R'. The file Fig_8_NdviSolarSensor.JPG shows NDVI dependence on solar and sensor geometry demonstrated for both a single pixel/year and for multiple pixels over time. (Left) MODIS NDVI versus solar-to-sensor angle for the Grand Canyon phenocam location in 2018, the year for which there is corresponding phenocam data. (Right) Modeled r-squared values by year for 100 randomly selected MODIS pixels in the SCPN-monitored Grand Canyon pinyon-juniper ecosystem. The model for forward-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle. The model for back-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle + sensor zenith angle. 
    Boxplots show interquartile ranges; whiskers extend to the 10th and 90th percentiles. The horizontal line marking the average median value for forward-scatter r-squared (0.835) is nearly indistinguishable from the back-scatter line (0.833). The dataset folder also includes supplemental R-project and packrat files that allow the user to apply the workflow by opening a project that uses the same package versions used in this study (e.g., the folders Rproj.user and packrat, and the files .RData and PhenocamPR.Rproj). The empty folder GEE_DataAngles is included so that the user can save the data files from the Google Earth Engine scripts to this location, where they can then be incorporated into the R processing scripts without needing to change folder names. To use the packrat information to replicate the exact processing steps that were used, the user should refer to the packrat documentation available at https://cran.r-project.org/web/packages/packrat/index.html and at https://www.rdocumentation.org/packages/packrat/versions/0.5.0. Alternatively, the user may use the phenopix package documentation and the description/references provided in the associated journal article to process the data and achieve the same results using newer packages or other software.

  6. Summary statistics of variables used in analyses.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1 more
    Updated Jan 10, 2014
    Cite
    Hipp, John R.; Wickes, Rebecca; Li, Tiebei; Corcoran, Jonathan (2014). Summary statistics of variables used in analyses. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001207119
    Dataset updated
    Jan 10, 2014
    Authors
    Hipp, John R.; Wickes, Rebecca; Li, Tiebei; Corcoran, Jonathan
    Description

    Note: Sample size is 4,351 respondents in 146 neighborhoods.

  7. Descriptive statistics.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 18, 2024
    + more versions
    Cite
    Blaschke, Steffen (2024). Descriptive statistics. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001368005
    Dataset updated
    Apr 18, 2024
    Authors
    Blaschke, Steffen
    Description

    Bibliometric studies offer numerous ways of analyzing scientific work. For example, co-citation and bibliographic coupling networks have been widely used since the 1960s to describe the segmentation of research and to track the development of the scientific frontier. In addition, co-authorship and collaboration networks have been employed for more than 30 years to explore the social dimension of scientific work. This paper introduces publication authorship as a complement to these established approaches. Three data sets of academic articles from accounting, astronomy, and gastroenterology are used to illustrate the benefits of publication authorship for bibliometric studies. In comparison to bibliographic coupling, publication authorship produces significantly better intra-cluster cosine similarities across all data sets, which in the end yields a more fine-grained picture of the research field in question. Beyond this finding, publication authorship lends itself to other types of documents, such as corporate reports or meeting minutes, to study organizations, movements, or any other concerted activity.
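The intra-cluster cosine similarity used as the comparison metric above is simple to compute; a minimal sketch with illustrative vectors (not the paper's data):

```python
import math

def cosine_similarity(u, v):
    # cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# e.g. two documents represented by hypothetical author-occurrence vectors
doc1 = [1, 0, 2, 1]
doc2 = [1, 1, 1, 0]
sim = cosine_similarity(doc1, doc2)
```

Averaging this quantity over all document pairs inside a cluster gives the intra-cluster similarity that the paper compares across clustering approaches.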

  8. CT-FAN: A Multilingual dataset for Fake News Detection

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 23, 2022
    Cite
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl; Juliane Köhler; Michael Wiegand; Melanie Siegel (2022). CT-FAN: A Multilingual dataset for Fake News Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4714516
    Dataset updated
    Oct 23, 2022
    Dataset provided by
    University of Applied Sciences Potsdam
    University of Duisburg-Essen
    University of Hildesheim
    Darmstadt University of Applied Sciences
    University of Klagenfurt
    Authors
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl; Juliane Köhler; Michael Wiegand; Melanie Siegel
    Description

    By downloading the data, you agree with the terms & conditions mentioned below:

    Data Access: The data in the research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use them only for research purposes.

    Summaries, analyses and interpretations of the linguistic properties of the information may be derived and published, provided it is impossible to reconstruct the information from these summaries. You may not try identifying the individuals whose texts are included in this dataset. You may not try to identify the original entry on the fact-checking site. You are not permitted to publish any portion of the dataset besides summary statistics or share it with anyone else.

    We grant you the right to access the collection's content as described in this agreement. You may not otherwise make unauthorised commercial use of, reproduce, prepare derivative works, distribute copies, perform, or publicly display the collection or parts of it. You are responsible for keeping and storing the data in a way that others cannot access. The data is provided free of charge.

    Citation

    Please cite our work as

    @InProceedings{clef-checkthat:2022:task3, author = {K{\"o}hler, Juliane and Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Wiegand, Michael and Siegel, Melanie and Mandl, Thomas}, title = "Overview of the {CLEF}-2022 {CheckThat}! Lab Task 3 on Fake News Detection", year = {2022}, booktitle = "Working Notes of CLEF 2022---Conference and Labs of the Evaluation Forum", series = {CLEF~'2022}, address = {Bologna, Italy}}

    @article{shahi2021overview, title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection}, author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas}, journal={Working Notes of CLEF}, year={2021} }

    Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English and German.

    Task 3: Multi-class fake news detection of news articles (English). Sub-task A addresses fake news detection as a four-class classification problem: given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. The training data will be released in batches and comprises roughly 1,264 English-language articles with their respective labels. Our definitions for the categories are as follows:

    False - The main claim made in an article is untrue.

    Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

    True - This rating indicates that the primary elements of the main claim are demonstrably true.

    Other - An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes articles in dispute and unproven articles.

    Cross-Lingual Task (German)

    Along with the multi-class task for the English language, we have introduced a task for a low-resourced language. We will provide test data in the German language. The idea of the task is to use the English data and the concept of transfer learning to build a classification model for the German language.

    Input Data

    The data will be provided in the format Id, title, text, rating, domain; the columns are described as follows:

    ID - Unique identifier of the news article

    Title - Title of the news article

    text - Text mentioned inside the news article

    our rating - Class of the news article as false, partially false, true, other

    Output data format

    public_id - Unique identifier of the news article

    predicted_rating - Predicted class

    Sample File

    public_id, predicted_rating
    1, false
    2, true
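A submission file in the sample format above takes only a few lines to produce; the predictions here are hypothetical placeholders:

```python
import csv
import io

# hypothetical predicted classes keyed by public_id
predictions = {1: "false", 2: "true"}

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["public_id", "predicted_rating"])
for public_id, rating in sorted(predictions.items()):
    writer.writerow([public_id, rating])

submission = buf.getvalue()
```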

    IMPORTANT!

    We have used data from 2010 to 2022, and the fake news content covers several topics, such as elections, COVID-19, etc.

    Baseline: For this task, we have created a baseline system. The baseline system can be found at https://zenodo.org/record/6362498

    Related Work

    Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

    G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

    Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104

    Shahi, G. K., Struß, J. M., & Mandl, T. (2021). Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeno, A., Míguez, R., Shaar, S., ... & Mandl, T. (2021, March). The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news. In European Conference on Information Retrieval (pp. 639-649). Springer, Cham.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Kartal, Y. S. (2021, September). Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 264-291). Springer, Cham.

  9. Data from: HOW TO PERFORM A META-ANALYSIS: A PRACTICAL STEP-BY-STEP GUIDE...

    • datasetcatalog.nlm.nih.gov
    • scielo.figshare.com
    Updated May 27, 2022
    Cite
    Helito, Camilo Partezani; Gonçalves, Romeu Krause; de Lima, Lana Lacerda; Clazzer, Renata; de Lima, Diego Ariel; de Camargo, Olavo Pires (2022). HOW TO PERFORM A META-ANALYSIS: A PRACTICAL STEP-BY-STEP GUIDE USING R SOFTWARE AND RSTUDIO [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000403452
    Dataset updated
    May 27, 2022
    Authors
    Helito, Camilo Partezani; Gonçalves, Romeu Krause; de Lima, Lana Lacerda; Clazzer, Renata; de Lima, Diego Ariel; de Camargo, Olavo Pires
    Description

    ABSTRACT Meta-analysis is an adequate statistical technique to combine results from different studies, and its use has been growing in the medical field. Thus, not only knowing how to interpret a meta-analysis but also knowing how to perform one is fundamental today. Therefore, the objective of this article is to present the basic concepts and serve as a guide for conducting a meta-analysis using the R and RStudio software. To that end, the reader has access to the basic commands in R and RStudio needed for conducting a meta-analysis. An advantage of R is that it is free software. For a better understanding of the commands, two examples are presented in a practical way, in addition to a review of some basic concepts of this statistical technique. It is assumed that the data necessary for the meta-analysis have already been collected; that is, methodologies for systematic review are not discussed. Finally, it is worth remembering that there are many other techniques used in meta-analyses that were not addressed in this work. However, with the two examples given, the article already enables the reader to proceed with good and robust meta-analyses. Level of Evidence V, Expert Opinion.
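As a taste of the kind of calculation the guide covers, pooling study effects by inverse-variance weighting (the fixed-effect estimate) takes only a few lines; the numbers below are made up for illustration, and the article itself works through real examples in R:

```python
# hypothetical effect sizes (e.g. log odds ratios) and standard errors
# from five studies
effects = [0.20, 0.35, 0.15, 0.40, 0.25]
ses     = [0.10, 0.12, 0.08, 0.15, 0.11]

# each study is weighted by the inverse of its variance
weights = [1 / se ** 2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
# the pooled estimate is more precise than any single study
pooled_se = (1 / sum(weights)) ** 0.5
```

Random-effects meta-analysis, which the article also covers, additionally estimates a between-study variance and adds it to each study's variance before weighting.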

  10. Data_Sheet_1_Integrating Multiple Data Types to Connect Ecological Theory...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated Jun 3, 2023
    Cite
    Jian D. L. Yen; Zeb Tonkin; Jarod Lyon; Wayne Koster; Adrian Kitchingman; Kasey Stamation; Peter A. Vesk (2023). Data_Sheet_1_Integrating Multiple Data Types to Connect Ecological Theory and Data Among Levels.pdf [Dataset]. http://doi.org/10.3389/fevo.2019.00095.s001
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers
    Authors
    Jian D. L. Yen; Zeb Tonkin; Jarod Lyon; Wayne Koster; Adrian Kitchingman; Kasey Stamation; Peter A. Vesk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ecological theories often encompass multiple levels of biological organization, such as genes, individuals, populations, and communities. Despite substantial progress toward ecological theory spanning multiple levels, ecological data rarely are connected in this way. This is unfortunate because different types of ecological data often emerge from the same underlying processes and, therefore, are naturally connected among levels. Here, we describe an approach to integrate data collected at multiple levels (e.g., individuals, populations) in a single statistical analysis. The resulting integrated models make full use of existing data and might strengthen links between statistical ecology and ecological models and theories that span multiple levels of organization. Integrated models are increasingly feasible due to recent advances in computational statistics, which allow fast calculations of multiple likelihoods that depend on complex mechanistic models. We discuss recently developed integrated models and outline a simple application using data on freshwater fishes in south-eastern Australia. Available data on freshwater fishes include population survey data, mark-recapture data, and individual growth trajectories. We use these data to estimate age-specific survival and reproduction from size-structured data, accounting for imperfect detection of individuals. Given that such parameter estimates would be infeasible without an integrated model, we argue that integrated models will strengthen ecological theory by connecting theoretical and mathematical models directly to empirical data. Although integrated models remain conceptually and computationally challenging, integrating ecological data among levels is likely to be an important step toward unifying ecology among levels.
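As a toy illustration of the idea (not the authors' models), an integrated analysis combines the likelihoods of two data types that share a parameter; here a hypothetical survival probability is informed jointly by mark-recapture data and population counts:

```python
import math

# hypothetical data: two types informing one shared survival probability s
resighted, marked = 37, 60     # mark-recapture: 37 of 60 marked fish resighted
count_t, count_t1 = 120, 70    # population counts in consecutive years

def log_lik(s):
    # binomial log-likelihood of the mark-recapture data
    ll = resighted * math.log(s) + (marked - resighted) * math.log(1 - s)
    # Poisson log-likelihood of next year's count, expecting s * count_t survivors
    lam = s * count_t
    ll += count_t1 * math.log(lam) - lam
    return ll

# integrated estimate: maximize the joint likelihood over a grid
grid = [i / 1000 for i in range(1, 1000)]
s_hat = max(grid, key=log_lik)
```

The two data types alone suggest 37/60 ≈ 0.62 and 70/120 ≈ 0.58; the integrated estimate sits between them, weighted by the information in each likelihood.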

  11. MVUM Symbology - Motor Vehicle Use Map Roads (Labels)

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +2 more
    Updated Aug 5, 2025
    + more versions
    Cite
    U.S. Forest Service (2025). MVUM Symbology - Motor Vehicle Use Map Roads (Labels) [Dataset]. https://catalog.data.gov/dataset/mvum-symbology-motor-vehicle-use-map-roads-labels
    Dataset updated
    Aug 5, 2025
    Dataset provided by
    U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
    Description

    The feature class indicates the specific types of motorized vehicles allowed on the designated routes and their seasons of use. The feature class is designed to be consistent with the Motor Vehicle Use Map (MVUM). Only roads with a SYMBOL attribute value of 1, 2, 3, 4, 11, and 12 are Forest Service System roads and contain data concerning their availability for OHV (Off Highway Vehicle) use. This data is published and refreshed on a unit by unit basis as needed. Information for each individual unit must be verified as to be consistent with the published MVUMs prior to inclusion in this data. Not every National Forest has data included in this feature class.

  12. Dataset for: Data generating models of dichotomous outcomes: Heterogeneity...

    • wiley.figshare.com
    application/gzip
    Updated May 31, 2023
    Cite
    Konstantinos Pateras; Stavros Nikolakopoulos; Kit CB Roes (2023). Dataset for: Data generating models of dichotomous outcomes: Heterogeneity in simulation studies for a random-effects meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.5588848
    Available download formats: application/gzip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Wiley (https://www.wiley.com/)
    Authors
    Konstantinos Pateras; Stavros Nikolakopoulos; Kit CB Roes
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Simulation studies to evaluate performance of statistical methods require a well specified Data Generating Model. Details of these models are essential to interpret the results and arrive at proper conclusions. A case in point is random-effects meta-analysis of dichotomous outcomes. We reviewed a number of simulation studies that evaluated approximate normal models for meta-analysis of dichotomous outcomes and we assessed the data generating models that were used to generate events for a series of (heterogeneous) trials. We demonstrate that the performance of the statistical methods, as assessed by simulation, differs between these three alternative Data Generating Models, with larger differences apparent in the small population setting. Our findings are relevant to multilevel binomial models in general.
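One such data generating model can be sketched as follows (arbitrary parameter values, and only one of the alternative models the study compares): trial-specific log-odds are drawn from a normal random-effects distribution, then events are drawn binomially:

```python
import math
import random

random.seed(42)

def simulate_trials(k=10, n=50, theta=-1.0, tau=0.5):
    """Generate event counts for k heterogeneous trials of size n:
    trial log-odds ~ Normal(theta, tau^2), events ~ Binomial(n, p)."""
    trials = []
    for _ in range(k):
        log_odds = random.gauss(theta, tau)          # random trial effect
        p = 1 / (1 + math.exp(-log_odds))            # inverse-logit
        events = sum(random.random() < p for _ in range(n))
        trials.append((events, n))
    return trials

data = simulate_trials()
```

The paper's point is that seemingly interchangeable choices at this step (e.g. where the heterogeneity enters) change the measured performance of the meta-analysis methods, especially for small populations.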

  13. Sweetener Market Data Historical Deliveries by Use - Multiple

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Apr 21, 2025
    + more versions
    Farm Service Agency, Department of Agriculture (2025). Sweetener Market Data Historical Deliveries by Use - Multiple [Dataset]. https://catalog.data.gov/dataset/sweetener-market-data-historical-deliveries-by-use-multiple
    Dataset provided by:
    Farm Service Agency (https://www.fsa.usda.gov/)
    United States Department of Agriculture (http://usda.gov/)
    Description

    Sweetener Market Data (SMD) report - beet and cane processors and cane refiners in the U.S. are required by the FAIR Act of 1996, as amended, to report data on physical quantities delivered by use for "Multiple and All Other Food Uses" on a monthly basis. Quantities are reported by region. Regions include: "New England", "Mid Atlantic", "North Central", "South", "West" and "Puerto Rico".

  14. Data from: Dataset used in article "A 2-dimensional guillotine cutting stock...

    • data-staging.niaid.nih.gov
    • produccioncientifica.ucm.es
    Updated Jul 10, 2024
    Terán-Viadero, Paula; Alonso-Ayuso, Antonio; Martín-Campo, F. Javier (2024). Dataset used in article "A 2-dimensional guillotine cutting stock problem with variable-sized stock for the honeycomb cardboard industry" [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_8033003
    Dataset provided by
    Complutense University of Madrid
    Rey Juan Carlos University
    Authors
    Terán-Viadero, Paula; Alonso-Ayuso, Antonio; Martín-Campo, F. Javier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset presented is part of the one used in the article "A 2-dimensional guillotine cutting stock problem with variable-sized stock for the honeycomb cardboard industry" by P. Terán-Viadero, A. Alonso-Ayuso and F. Javier Martín-Campo, published in International Journal of Production Research (2023), doi: 10.1080/00207543.2023.2279129. In the paper mentioned above, two mathematical optimisation models are proposed for the Cutting Stock Problem in the honeycomb cardboard sector. This problem appears in a Spanish company and the models proposed have been tested with real orders received by the company, achieving a reduction of up to 50% in the leftover generated. The dataset presented here includes six of the twenty cases used in the paper (the rest cannot be presented for confidentiality reasons). For each case, the characteristics of the order and the solution obtained by the two models are provided for the different scenarios analysed in the paper.

    Version 1.1 contains the same data, renamed according to the instance names in the final version of the article. Version 1.2 adds a PDF of the accepted version of the article published in International Journal of Production Research (2023), doi: 10.1080/00207543.2023.2279129.

  15. Statistics KolmogorovSmirnov normality.txt

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Dec 30, 2019
    Wahlbom, Anders (2019). Statistics KolmogorovSmirnov normality.txt [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000112997
    Authors
    Wahlbom, Anders
    Description

    Kolmogorov-Smirnov test for normality for the data sets in the article "Focal neocortical lesions impair distant neuronal information processing".
    Test 1: All stimulation patterns pre-stroke decoding ability
    Test 2: All stimulation patterns post-stroke decoding ability
    Test 3: Mean of all stimulation patterns pre-stroke decoding ability
    Test 4: Mean of all stimulation patterns post-stroke decoding ability
    Test 5: Pre-stroke mean firing frequency
    Test 6: Post-stroke mean firing frequency
    Test 7: All stimulation patterns pre-stroke decoding ability, different stroke location, NOT USED in article
    Test 8: All stimulation patterns post-stroke decoding ability, different stroke location, NOT USED in article
    Test 9: Mean of all stimulation patterns pre-stroke decoding ability, different stroke location, NOT USED in article
    Test 10: Mean of all stimulation patterns post-stroke decoding ability, different stroke location, NOT USED in article
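For reference, the one-sample Kolmogorov-Smirnov statistic underlying these tests is the maximum distance between the empirical CDF of the data and a reference (here standard normal) CDF. A minimal self-contained sketch follows; in practice a library routine such as scipy.stats.kstest would be used, and the sample values below are invented for illustration:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf=normal_cdf):
    """One-sample KS statistic: max distance between the empirical CDF
    of `sample` and the reference `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        # The ECDF steps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, fx - i / n, (i + 1) / n - fx)
    return d

sample = [-1.2, -0.4, 0.0, 0.3, 1.1]
print(round(ks_statistic(sample), 3))  # prints 0.182
```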

  16. Additional file 2 of: Gene flow analysis method, the D-statistic, is robust...

    • datasetcatalog.nlm.nih.gov
    • springernature.figshare.com
    Updated Jan 9, 2018
    Zheng, Yichen; Janke, Axel (2018). Additional file 2: of Gene flow analysis method, the D-statistic, is robust in a wide parameter space [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000701696
    Authors
    Zheng, Yichen; Janke, Axel
    Description

    Detailed results from each dataset. Description: Input parameters, sensitivity, significance information and linear regression of the f-statistics, from all datasets in three simulation schemes. (XLSX 214 kb)

  17. Data from: Data release associated with the journal article "Solar and...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 19, 2025
    + more versions
    U.S. Geological Survey (2025). Data release associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-associated-with-the-journal-article-solar-and-sensor-geometry-not-vegetation-
    Dataset provided by: United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset supports the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" (DOI: 10.1016/j.rse.2020.112013). The data release allows users to replicate, test, or further explore results. The dataset consists of 4 separate items based on the analysis approach used in the original publication:
    1) The 'Phenocam' dataset uses images from a phenocam in a pinyon juniper ecosystem in Grand Canyon National Park to determine phenological patterns of multiple plant species. It consists of scripts and tabular data developed while performing analyses and includes the final NDVI values for all areas of interest (AOIs) described in the associated publication.
    2) The 'SolarSensorAnalysis' dataset uses downloaded tabular MODIS data to explore relationships between NDVI and multiple solar and sensor angles. It consists of download and analysis scripts in Google Earth Engine and R. The source MODIS data used in the analysis are too large to include but are provided through MODIS providers and can be accessed through Google Earth Engine using the included script. A csv file includes solar and sensor angle information for the MODIS pixel closest to the phenocam as well as for a sample of 100 randomly selected MODIS pixels within the GRCA-PJ ecosystem.
    3) The 'WinterPeakExtent' dataset includes final geotiffs showing the temporal frequency extent and associated vegetation physiognomic types experiencing winter NDVI peaks in the western US.
    4) The 'SensorComparison' dataset contains the NDVI time series at the phenocam location from 4 other satellites as well as the code used to download these data.

  18. WIC Infant and Toddler Feeding Practices Study-2 (WIC ITFPS-2): Prenatal,...

    • agdatacommons.nal.usda.gov
    txt
    Updated Nov 21, 2025
    + more versions
    USDA FNS Office of Policy Support (2025). WIC Infant and Toddler Feeding Practices Study-2 (WIC ITFPS-2): Prenatal, Infant Year 5 Year Datasets [Dataset]. http://doi.org/10.15482/USDA.ADC/1528196
    Dataset provided by:
    Food and Nutrition Service (https://www.fns.usda.gov/)
    United States Department of Agriculture (http://usda.gov/)
    Authors
    USDA FNS Office of Policy Support
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    The WIC Infant and Toddler Feeding Practices Study–2 (WIC ITFPS-2) (also known as the “Feeding My Baby Study”) is a national, longitudinal study that captures data on caregivers and their children who participated in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) around the time of the child’s birth. The study addresses a series of research questions regarding feeding practices, the effect of WIC services on those practices, and the health and nutrition outcomes of children on WIC. Additionally, the study assesses changes in behaviors and trends that may have occurred over the past 20 years by comparing findings to the WIC Infant Feeding Practices Study–1 (WIC IFPS-1), the last major study of the diets of infants on WIC. This longitudinal cohort study has generated a series of reports. These datasets include data from caregivers and their children during the prenatal period and during the children’s first five years of life (child ages 1 to 60 months). A full description of the study design and data collection methods can be found in Chapter 1 of the Second Year Report (https://www.fns.usda.gov/wic/wic-infant-and-toddler-feeding-practices-st...). A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Processing methods and equipment used: Data in this dataset were primarily collected via telephone interview with caregivers. Children’s length/height and weight data were objectively collected while at the WIC clinic or during visits with healthcare providers. The study team cleaned the raw data to ensure the data were as correct, complete, and consistent as possible.
    Study date(s) and duration: Data collection occurred between 2013 and 2019.
    Study spatial scale (size of replicates and spatial scale of study area): Respondents were primarily the caregivers of children who received WIC services around the time of the child’s birth. Data were collected from 80 WIC sites across 27 State agencies.
    Level of true replication: Unknown.
    Sampling precision (within-replicate sampling or pseudoreplication): This dataset includes sampling weights that can be applied to produce national estimates. A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Level of subsampling (number and repeat or within-replicate sampling): A full description of the sampling and weighting procedures can be found in Appendix B-1 of the Fourth Year Report (https://fns-prod.azureedge.net/sites/default/files/resource-files/WIC-IT...).
    Study design (before–after, control–impacts, time series, before–after-control–impacts): Longitudinal cohort study.
    Description of any data manipulation, modeling, or statistical analysis undertaken: Each entry in the dataset contains caregiver-level responses to telephone interviews. Also available in the dataset are children’s length/height and weight data, which were objectively collected while at the WIC clinic or during visits with healthcare providers. In addition, the file contains derived variables used for analytic purposes. The file also includes weights created to produce national estimates. The dataset does not include any personally-identifiable information for the study children and/or for individuals who completed the telephone interviews.
    Description of any gaps in the data or other limiting factors: Please refer to the series of annual WIC ITFPS-2 reports (https://www.fns.usda.gov/wic/infant-and-toddler-feeding-practices-study-2-fourth-year-report) for detailed explanations of the study’s limitations.
    Outcome measurement methods and equipment used: The majority of outcomes were measured via telephone interviews with children’s caregivers. Dietary intake was assessed using the USDA Automated Multiple Pass Method (https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-h...). Children’s length/height and weight data were objectively collected while at the WIC clinic or during visits with healthcare providers.
    Resources in this dataset:
    - ITFP2 Year 5 Enroll to 60 Months Public Use Data CSV (itfps2_enrollto60m_publicuse.csv)
    - ITFP2 Year 5 Enroll to 60 Months Public Use Data Codebook (ITFPS2_EnrollTo60m_PUF_Codebook.pdf)
    - ITFP2 Year 5 Enroll to 60 Months Public Use Data SAS SPSS STATA R Data (ITFP@_Year5_Enroll60_SAS_SPSS_STATA_R.zip)
    - ITFP2 Year 5 Ana to 60 Months Public Use Data CSV (ampm_1to60_ana_publicuse.csv)
    - ITFP2 Year 5 Tot to 60 Months Public Use Data Codebook (AMPM_1to60_Tot Codebook.pdf)
    - ITFP2 Year 5 Ana to 60 Months Public Use Data Codebook (AMPM_1to60_Ana Codebook.pdf)
    - ITFP2 Year 5 Ana to 60 Months Public Use Data SAS SPSS STATA R Data (ITFP@_Year5_Ana_60_SAS_SPSS_STATA_R.zip)
    - ITFP2 Year 5 Tot to 60 Months Public Use Data CSV (ampm_1to60_tot_publicuse.csv)
    - ITFP2 Year 5 Tot to 60 Months Public Use SAS SPSS STATA R Data (ITFP@_Year5_Tot_60_SAS_SPSS_STATA_R.zip)
    - ITFP2 Year 5 Food Group to 60 Months Public Use Data CSV (ampm_foodgroup_1to60m_publicuse.csv)
    - ITFP2 Year 5 Food Group to 60 Months Public Use Data Codebook (AMPM_FoodGroup_1to60m_Codebook.pdf)
    - ITFP2 Year 5 Food Group to 60 Months Public Use SAS SPSS STATA R Data (ITFP@_Year5_Foodgroup_60_SAS_SPSS_STATA_R.zip)
    - WIC Infant and Toddler Feeding Practices Study-2 Data File Training Manual (WIC_ITFPS-2_DataFileTrainingManual.pdf)

  19. EEG-BCI Dataset for Motor Imagery and Overt Spatial Attention EEG-BCI...

    • kilthub.cmu.edu
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Aug 4, 2023
    Dylan Forenzo; Bin He (2023). EEG-BCI Dataset for Motor Imagery and Overt Spatial Attention EEG-BCI Control [Dataset]. http://doi.org/10.1184/R1/23677098.v1
    Dataset provided by: Carnegie Mellon University
    Authors
    Dylan Forenzo; Bin He
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset consists of EEG recordings and Brain-Computer Interface (BCI) data from 25 different human subjects performing BCI experiments. More information can be found in the corresponding manuscript:

    Dylan Forenzo, Yixuan Liu, Jeehyun Kim, Yidan Ding, Taehyung Yoon, Bin He: “Integrating Simultaneous Motor Imagery and Spatial Attention for EEG-BCI Control”, IEEE Transactions on Biomedical Engineering (10.1109/TBME.2023.3298957).

    Please cite this paper if you use any data included in this dataset.

    The dataset was collected under the support of NIH grants AT009263, EB021027, EB029354, NS096761, NS124564 to Dr. Bin He at Carnegie Mellon University.

    Each file is a MATLAB object (.mat file) which contains data from a single run of BCI control. The MATLAB files are grouped into folders based on the Subject, one for each of the 25 subjects studied. Each subject completed 5 sessions of BCI experiments and each session consisted of either 18 (sessions 1 and 2) or 15 (sessions 3-5) runs, for a total of 81 runs per subject or 2025 total BCI runs.

    Each of the MATLAB files contains a single structure with the following fields:

    data: An array containing the EEG recordings with the size (channels x time points)

    times: A vector containing the timestamps in milliseconds with the size (1 x time points)

    fs: sampling frequency (1000 Hz)

    labels: A cell array containing the label for each channel

    targets: A list of target codes. For LR: 1 is right, 2 is left. For UD: 1 is up, 2 is down. For 2D: 1 is right, 2 is left, 3 is up, and 4 is down

    event: A structure of events from BCI2000. Each index corresponds to the start of a trial and includes the time (latency) of when the trial starts, and how long each trial lasted (duration).

    results: A vector of which target was hit for each trial (0 if the trial was aborted before a target was hit)

    outcome: A vector indicating the outcome of each trial (1: hit, 0: abort, -1: miss)

    subject: The coded subject number

    session: The session number. Please note that the session numbers are for specific tasks, so even though 2D sessions began on the third day of experiments, the 2D runs are listed as session 1, 2, and 3 as they are the first, second, and third 2D sessions.

    axis: The axis of control. Either LR (horizontal only, Left-Right), UD (vertical only, Up-Down), or 2D (both horizontal and vertical control).

    task: The control paradigm used. Options are MI (motor imagery), OSA (overt spatial attention), MIOSA (MI and OSA together), MIOSA1 (MI controls horizontal axis, OSA controls vertical. Referred to as MI/OSA in the paper), or MIOSA2 (MI controls vertical axis, OSA controls horizontal. Referred to as OSA/MI in the paper).

    run: The run number
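As a sketch of how the fields above might be summarized once a run has been loaded (the values below are hypothetical stand-ins for a loaded .mat structure; in Python the files could be read with scipy.io.loadmat, while in MATLAB the same names are struct fields):

```python
# Hypothetical stand-in for one loaded run; field names follow the
# structure described above.
run = {
    "subject": 1,
    "session": 1,
    "outcome": [1, 1, 0, -1, 1, 0, 1, 1],  # 1: hit, 0: abort, -1: miss
}

outcomes = run["outcome"]
hits = outcomes.count(1)
aborts = outcomes.count(0)
misses = outcomes.count(-1)

# One common BCI summary: accuracy over completed (non-aborted) trials.
accuracy = hits / (hits + misses)
print(f"{hits} hits, {aborts} aborts, {misses} misses; accuracy {accuracy:.2f}")
```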

  20. Description of data used in cross-sectional analysis.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 18, 2018
    Stead, Martine; Adamson, Ashley J.; Ejlerskov, Katrine T.; Sharp, Stephen J.; Adams, Jean; White, Martin (2018). Description of data used in cross-sectional analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000620646
    Authors
    Stead, Martine; Adamson, Ashley J.; Ejlerskov, Katrine T.; Sharp, Stephen J.; Adams, Jean; White, Martin
    Description

    Description of data used in cross-sectional analysis.
