100+ datasets found
  1. Statistical Analysis of Individual Participant Data Meta-Analyses: A...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    tiff
    Updated Jun 8, 2023
    Cite
    Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart (2023). Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice [Dataset]. http://doi.org/10.1371/journal.pone.0046042
    Available download formats: tiff
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way.

    Methods and Findings: We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using antiplatelets (relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model.

    Conclusions: For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials.
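
    The “two-stage” idea described above can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance pooling of log relative risks, which is one standard meta-analytical method and not necessarily the model the authors fitted. The trial counts and function names are hypothetical and are not taken from the dataset.

```python
import math


def log_rr_and_variance(events_trt, n_trt, events_ctl, n_ctl):
    """Stage 1: reduce one trial's counts to a log relative risk and its variance."""
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    # Large-sample variance of the log relative risk.
    variance = 1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    return log_rr, variance


def pool_fixed_effect(estimates):
    """Stage 2: inverse-variance weighted average of the per-trial estimates."""
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    standard_error = math.sqrt(1 / sum(weights))
    return pooled, standard_error


# Hypothetical trials: (events treated, n treated, events control, n control).
trials = [(30, 500, 36, 500), (12, 200, 15, 210), (55, 1000, 60, 990)]

stage_one = [log_rr_and_variance(*trial) for trial in trials]
pooled_log_rr, se = pool_fixed_effect(stage_one)

rr = math.exp(pooled_log_rr)
ci = (math.exp(pooled_log_rr - 1.96 * se), math.exp(pooled_log_rr + 1.96 * se))
print(f"Pooled RR {rr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

    A one-stage analysis would instead fit a single model to all participants at once, typically with a trial-level term, which is why it usually needs more statistical support.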

  2. Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1 more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.

    Results: A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.

    Conclusions: The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.

  3. Statistical test of changes between experimental tasks.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 2, 2017
    Cite
    Simpson, David M.; Faes, Luca; Beda, Alessandro (2017). Statistical test of changes between experimental tasks. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001799846
    Dataset updated
    Oct 2, 2017
    Authors
    Simpson, David M.; Faes, Luca; Beda, Alessandro
    Description

    Statistical test of changes between experimental tasks.

  4. YouTube Video and Channel Analysis

    • kaggle.com
    zip
    Updated Dec 19, 2023
    Cite
    The Devastator (2023). YouTube Video and Channel Analysis [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-video-and-channel-analysis/discussion
    Available download formats: zip (85613002 bytes)
    Dataset updated
    Dec 19, 2023
    Authors
    The Devastator
    Area covered
    YouTube
    Description

    YouTube Video and Channel Analysis

    YouTube Video and Channel Statistics

    By VISHWANATH SESHAGIRI [source]

    About this dataset

    This dataset contains valuable information about YouTube videos and channels, including various metrics related to views, likes, dislikes, comments, and other related statistics. The dataset consists of 9 direct features and 13 indirect features. The direct features include the ratio of comments on a video to the number of views on the video (comments/views), the total number of subscribers of the channel (subscriberCount), the ratio of likes on a video to the number of subscribers of the channel (likes/subscriber), the total number of views on the channel (channelViewCount), and several other informative ratios such as views/elapsedtime, totalviews/channelelapsedtime, comments/subscriber, views/subscribers, dislikes/subscriber.

    The dataset also includes indirect features that are derived from YouTube's API. These indirect features provide additional insights into videos and channels by considering factors such as dislikes/views ratio, channelCommentCount (total number of comments on the channel), likes/dislikes ratio, totviews/totsubs ratio (total views on a video to total subscribers of a channel), and more.

    The objective behind analyzing this dataset is to establish statistical relationships between videos and channels within YouTube. Furthermore, this analysis aims to form a topic tree based on these statistical relations.

    For further exploration or utilization purposes beyond this dataset description document itself, you can refer to relevant repositories such as the GitHub repository associated with this dataset where you might find useful resources that complement or expand upon what is available in this dataset.

    Overall, this collection provides YouTube video and channel metadata for statistical analyses of how viewer engagement varies across channels. Its features range from basic counts, such as subscriber counts, to viewership per minute, timing versus viewership rate, and text-related user responses. The dataset can inform decisions about channel optimization, more effective targeting, and the creation of content that appeals to the target audience.

    How to use the dataset

    This dataset provides valuable information about YouTube videos and their corresponding channels. With this data, you can perform statistical analysis to gain insights into various aspects of YouTube video and channel performance. Here is a guide on how to effectively use this dataset for your analysis:

    • Understanding the Columns:
      • totalviews/channelelapsedtime: The ratio of total views of a video to the elapsed time of the channel.
      • channelViewCount: The total number of views on the channel.
      • likes/subscriber: The ratio of likes on a video to the number of subscribers of the channel.
      • views/subscribers: The ratio of views on a video to the number of subscribers of the channel.
      • subscriberCount: The total number of subscribers of the channel.
      • dislikes/views: The ratio
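
    The ratio columns described above are simple quotients of raw counts. A minimal sketch of how such features could be derived is shown here; the function name and the example counts are hypothetical, and the dataset's own preprocessing may differ.

```python
def engagement_ratios(views, likes, comments, subscribers):
    """Derive ratio features named in the description from raw counts."""

    def ratio(numerator, denominator):
        # Guard against channels or videos with a zero denominator.
        return numerator / denominator if denominator else float("nan")

    return {
        "comments/views": ratio(comments, views),
        "likes/subscriber": ratio(likes, subscribers),
        "views/subscribers": ratio(views, subscribers),
        "comments/subscriber": ratio(comments, subscribers),
    }


# Hypothetical counts for one video and its channel.
ratios = engagement_ratios(views=12000, likes=480, comments=60, subscribers=3000)
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```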

    Research Ideas

    • Predicting the popularity of YouTube videos: By analyzing the various ratios and metrics in this dataset, such as comments/views, likes/subscriber, and views/subscribers, one can build predictive models to estimate the popularity or engagement level of YouTube videos. This can help content creators or businesses understand which types of videos are likely to be successful and tailor their content accordingly.
    • Analyzing channel performance: The dataset provides information about the total number of views on a channel (channelViewCount), the number of subscribers (subscriberCount), and other related statistics. By examining metrics like views/elapsedtime and totalviews/channelelapsedtime, one can assess how well a channel is performing over time. This analysis can help content creators identify trends or patterns in their viewership and make informed decisions about their video strategies.
    • Understanding audience engagement: Ratios like comments/subscriber, likes/dislikes, dislikes/subscriber provide insights into how engaged a channel's subscribers are with its content. By examining these ratios across multiple videos or channels, one can identify trends in audience behavior and preferences. For example, a high ratio of comments/subscriber may indicate strong community participation and active discussion around the videos posted by a particular YouTuber or channel

    Acknowledgements

    If you use this dataset in y...

  5. Ad hoc statistical analysis: 2020/21 Quarter 4

    • gov.uk
    • s3.amazonaws.com
    Updated Sep 25, 2024
    Cite
    Department for Digital, Culture, Media & Sport (2024). Ad hoc statistical analysis: 2020/21 Quarter 4 [Dataset]. https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202021-quarter-4
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released during the period January - March 2021. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@dcms.gov.uk.

    January 2021 - Employment in DCMS sectors by socio-economic background: July 2020 to September 2020

    This analysis provides estimates of employment in DCMS sectors based on socio-economic background, using the Labour Force Survey (LFS) for July 2020 to September 2020. The LFS asks respondents the job of main earner at age 14, and then matches this to a socio-economic group.

    Revision note:

    25 September 2024: Employment in DCMS sectors by socio-economic background: July to September 2020 data has been revised and re-published here: DCMS Economic Estimates: Employment, April 2023 to March 2024

    February 2021 - GVA by industries in DCMS clusters, 2019

    This analysis provides the Gross Value Added (GVA) in 2019 for DCMS clusters and for Civil Society. The figures show that in 2019, the DCMS Clusters contributed £291.9 bn to the UK economy, accounting for 14.8% of UK GVA (expressed in current prices). The largest cluster was Digital, which added £116.3 bn in GVA in 2019, and the smallest was Gambling (£8.3 bn).
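
    As a quick arithmetic check on the figures quoted above, the following sketch derives the implied total UK GVA and the share of each named cluster. Only the four input numbers come from the text; the derived values are back-of-the-envelope estimates, not published statistics.

```python
# Figures quoted in the description (GBP billions, current prices, 2019).
dcms_clusters_gva = 291.9
share_of_uk_gva = 0.148
digital_gva = 116.3
gambling_gva = 8.3

# If the clusters are 14.8% of UK GVA, the implied UK total follows by division.
implied_uk_gva = dcms_clusters_gva / share_of_uk_gva

# Share of the DCMS clusters total contributed by the largest and smallest cluster.
digital_share = digital_gva / dcms_clusters_gva
gambling_share = gambling_gva / dcms_clusters_gva

print(f"Implied UK GVA: {implied_uk_gva:.1f} bn")
print(f"Digital share of clusters: {digital_share:.1%}")
print(f"Gambling share of clusters: {gambling_share:.1%}")
```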

    GVA by industries in DCMS clusters, 2019 (MS Excel Spreadsheet, 111 KB): https://assets.publishing.service.gov.uk/media/602d27ebd3bf7f722294d195/DCMS_Clusters_GVA_Tables.xlsx
    

    March 2021 - Provisional monthly Gross Value Added for DCMS sectors in 2019 and 2020

    This analysis provides provisional estimates of Gross Value Added (adjusted for inflation) for DCMS sectors (excluding Civil Society) for every month in 2019 and 2020. These timely estimates should only be used to illustrate general trends, rather than be taken as definitive figures. These figures will not be as accurate as our annual National Statistics release of gross value added for DCMS sectors (which will be published in Winter 2021).

    We estimate that the gross value added of DCMS sectors (excluding Civil Society) shrank by 18% in real terms for March to December 2020 (a loss of £41 billion), compared to the same period in 2019. By sector this varied from -5% (Telecoms) to -37% (Tourism). In comparison, the UK economy as a whole shrank by 11%.

  6. Statistical tests of various types of degree distributions.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Petter Holme; Mikael Huss; Sang Hoon Lee (2023). Statistical tests of various types of degree distributions. [Dataset]. http://doi.org/10.1371/journal.pone.0019759.t001
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Petter Holme; Mikael Huss; Sang Hoon Lee
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics of the reactions in the bipartite representation are omitted since they are not fat-tailed. “Y” (“N”) indicates that the data set is consistent (inconsistent) with the tested hypothesis. “PL” stands for “power-law” (i.e., testing for a power-law hypothesis); “LN” means “log-normal”.

  7. COVID-19 Combined Data-set with Improved Measurement Errors

    • data.mendeley.com
    Updated May 13, 2020
    Cite
    Afshin Ashofteh (2020). COVID-19 Combined Data-set with Improved Measurement Errors [Dataset]. http://doi.org/10.17632/nw5m4hs3jr.3
    Dataset updated
    May 13, 2020
    Authors
    Afshin Ashofteh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public health-related decision-making on policies aimed at controlling the COVID-19 pandemic outbreak depends on complex epidemiological models that are compelled to be robust and use all relevant available data. This data article provides a new combined worldwide COVID-19 dataset obtained from official data sources with improved systematic measurement errors and a dedicated dashboard for online data visualization and summary. The dataset adds new measures and attributes to the normal attributes of official data sources, such as daily mortality, and fatality rates. We used comparative statistical analysis to evaluate the measurement errors of COVID-19 official data collections from the Chinese Center for Disease Control and Prevention (Chinese CDC), World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). The data is collected by using text mining techniques and reviewing pdf reports, metadata, and reference data. The combined dataset includes complete spatial data such as countries area, international number of countries, Alpha-2 code, Alpha-3 code, latitude, longitude, and some additional attributes such as population. The improved dataset benefits from major corrections on the referenced data sets and official reports such as adjustments in the reporting dates, which suffered from a one to two days lag, removing negative values, detecting unreasonable changes in historical data in new reports and corrections on systematic measurement errors, which have been increasing as the pandemic outbreak spreads and more countries contribute data for the official repositories. Additionally, the root mean square error of attributes in the paired comparison of datasets was used to identify the main data problems. The data for China is presented separately and in more detail, and it has been extracted from the attached reports available on the main page of the CCDC website. 
This dataset is a comprehensive and reliable source of worldwide COVID-19 data that can be used in epidemiological models assessing the magnitude and timeline for confirmed cases, long-term predictions of deaths or hospital utilization, the effects of quarantine, stay-at-home orders and other social distancing measures, the pandemic’s turning point or in economic and social impact analysis, helping to inform national and local authorities on how to implement an adaptive response approach to re-opening the economy, re-open schools, alleviate business and social distancing restrictions, design economic programs or allow sports events to resume.
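
    The description mentions using the root mean square error in paired comparisons of datasets and removing negative values. A minimal sketch of both checks follows; the two series are hypothetical daily counts standing in for two official sources, and the function names are illustrative rather than the author's code.

```python
import math


def rmse(series_a, series_b):
    """Root mean square error between two equally long, date-aligned series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must cover the same dates")
    squared = [(a - b) ** 2 for a, b in zip(series_a, series_b)]
    return math.sqrt(sum(squared) / len(squared))


def negative_positions(series):
    """Indices of negative daily values, which cannot be valid counts."""
    return [i for i, value in enumerate(series) if value < 0]


# Hypothetical daily confirmed cases reported by two sources for the same dates.
source_a = [100, 120, 150, 170]
source_b = [100, 118, 155, 170]

disagreement = rmse(source_a, source_b)
suspect_days = negative_positions([5, -2, 7, 0])
print(f"RMSE between sources: {disagreement:.3f}")
print(f"Days with negative values: {suspect_days}")
```

    A large RMSE for one attribute relative to the others would point to where the sources disagree most.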

  8. Pre and Post-Exercise Heart Rate Analysis

    • kaggle.com
    zip
    Updated Sep 29, 2024
    Cite
    Abdullah M Almutairi (2024). Pre and Post-Exercise Heart Rate Analysis [Dataset]. https://www.kaggle.com/datasets/abdullahmalmutairi/pre-and-post-exercise-heart-rate-analysis
    Available download formats: zip (3857 bytes)
    Dataset updated
    Sep 29, 2024
    Authors
    Abdullah M Almutairi
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Overview:

    This dataset contains simulated (hypothetical) but almost realistic (based on AI) data related to sleep, heart rate, and exercise habits of 500 individuals. It includes both pre-exercise and post-exercise resting heart rates, allowing for analyses such as a dependent t-test (Paired Sample t-test) to observe changes in heart rate after an exercise program. The dataset also includes additional health-related variables, such as age, hours of sleep per night, and exercise frequency.

    The data is designed for tasks involving hypothesis testing, health analytics, or even machine learning applications that predict changes in heart rate based on personal attributes and exercise behavior. It can be used to understand the relationships between exercise frequency, sleep, and changes in heart rate.

    File:
      • Filename: heart_rate_data.csv
      • File Format: CSV

    Features (Columns):

    • Age
      • Description: The age of the individual.
      • Type: Integer
      • Range: 18-60 years
      • Relevance: Age is an important factor in determining heart rate and the effects of exercise.
    • Sleep Hours
      • Description: The average number of hours the individual sleeps per night.
      • Type: Float
      • Range: 3.0 - 10.0 hours
      • Relevance: Sleep is a crucial health metric that can impact heart rate and exercise recovery.
    • Exercise Frequency (Days/Week)
      • Description: The number of days per week the individual engages in physical exercise.
      • Type: Integer
      • Range: 1-7 days/week
      • Relevance: More frequent exercise may lead to greater heart rate improvements and better cardiovascular health.
    • Resting Heart Rate Before
      • Description: The individual’s resting heart rate measured before beginning a 6-week exercise program.
      • Type: Integer
      • Range: 50 - 100 bpm (beats per minute)
      • Relevance: This is a key health indicator, providing a baseline measurement for the individual’s heart rate.
    • Resting Heart Rate After
      • Description: The individual’s resting heart rate measured after completing the 6-week exercise program.
      • Type: Integer
      • Range: 45 - 95 bpm (lower than the "Resting Heart Rate Before" due to the effects of exercise).
      • Relevance: This variable is essential for understanding how exercise affects heart rate over time, and it can be used to perform a dependent t-test analysis.
    • Max Heart Rate During Exercise
      • Description: The maximum heart rate the individual reached during exercise sessions.
      • Type: Integer
      • Range: 120 - 190 bpm
      • Relevance: This metric helps in understanding cardiovascular strain during exercise and can be linked to exercise frequency or fitness levels.

    Potential Uses:

    • Dependent T-Test Analysis: The dataset is particularly suited for a dependent (paired) t-test where you compare the resting heart rate before and after the exercise program for each individual.
    • Exploratory Data Analysis (EDA): Investigate relationships between sleep, exercise frequency, and changes in heart rate. Potential analyses include correlations between sleep hours and resting heart rate improvement, or regression analyses to predict heart rate after exercise.
    • Machine Learning: Use the dataset for predictive modeling, and build a beginner regression model to predict post-exercise heart rate using age, sleep, and exercise frequency as features.
    • Health and Fitness Insights: This dataset can be useful for studying how different factors like sleep and age influence heart rate changes and overall cardiovascular health.
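
    The dependent (paired) t-test mentioned above works on the per-person differences between the two measurements. A minimal sketch using only the standard library is given here; the five before/after values are hypothetical and are not rows from heart_rate_data.csv.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return the paired t statistic and degrees of freedom for after - before."""
    if len(before) != len(after):
        raise ValueError("each individual needs both measurements")
    differences = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(differences)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff / standard_error, len(differences) - 1


# Hypothetical resting heart rates (bpm) for five individuals.
resting_before = [72, 80, 65, 90, 77]
resting_after = [68, 75, 66, 84, 73]

t_stat, dof = paired_t_statistic(resting_before, resting_after)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

    A negative statistic indicates a lower average resting heart rate after the program. The p-value would come from a t distribution with the reported degrees of freedom, which a statistics library can supply.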

    License: Choose an appropriate open license, such as:

    CC BY 4.0 (Attribution 4.0 International).

    Inspiration for Kaggle Users:

    • How does exercise frequency influence the reduction in resting heart rate?
    • Is there a relationship between sleep and heart rate improvements post-exercise?
    • Can we predict the post-exercise heart rate using other health variables?
    • How do age and exercise frequency interact to affect heart rate?

    Acknowledgments: This is a simulated dataset for educational purposes, generated to demonstrate statistical and machine learning applications in the field of health analytics.

  9. Comparison of the capabilities between the existing statistical test and the...

    • plos.figshare.com
    xls
    Updated Oct 2, 2024
    Cite
    Bernat Salbanya; Carlos Carrasco-Farré; Jordi Nin (2024). Comparison of the capabilities between the existing statistical test and the expanded methods. [Dataset]. http://doi.org/10.1371/journal.pone.0309005.t002
    Available download formats: xls
    Dataset updated
    Oct 2, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Bernat Salbanya; Carlos Carrasco-Farré; Jordi Nin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of the capabilities between the existing statistical test and the expanded methods.

  10. A tabular summary of power analysis of two-sided t-test for two independent...

    • plos.figshare.com
    txt
    Updated Jun 16, 2023
    Cite
    Wei Zhuang; Luísa Camacho; Camila S. Silva; Michael Thomson; Kevin Snyder (2023). A tabular summary of power analysis of two-sided t-test for two independent groups. [Dataset]. http://doi.org/10.1371/journal.pone.0263070.s004
    Available download formats: txt
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Wei Zhuang; Luísa Camacho; Camila S. Silva; Michael Thomson; Kevin Snyder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The sample size is 5 in each group. The population mean mu1 varies from 20 to 30 in Group 1, while the population mean mu2 varies from 32 to 40 in Group 2. For simplicity, the population standard deviation is fixed to be 1 in each group. The significance levels (alpha) are set to be 0.001, 0.005, 0.05, or 0.2 for analytical illustration. (CSV)
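
    To give a feel for the setting described above (5 per group, standard deviation 1), the sketch below estimates power by Monte Carlo simulation. This is a swapped-in technique, not the authors' tabulation method. It covers only alpha = 0.05, and the critical value 2.306 is the two-sided t-table value for 8 degrees of freedom. The function name, simulation count, and seed are illustrative assumptions.

```python
import math
import random
import statistics

# Two-sided critical value for alpha = 0.05 with 5 + 5 - 2 = 8 degrees of freedom.
T_CRIT_ALPHA_05_DF_8 = 2.306


def simulated_power(mu1, mu2, sd=1.0, n=5, t_crit=T_CRIT_ALPHA_05_DF_8,
                    n_sims=4000, seed=0):
    """Fraction of simulated experiments in which the two-sample t-test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        group1 = [rng.gauss(mu1, sd) for _ in range(n)]
        group2 = [rng.gauss(mu2, sd) for _ in range(n)]
        # With equal group sizes the pooled variance is the mean of the two variances.
        pooled_var = (statistics.variance(group1) + statistics.variance(group2)) / 2
        t_stat = (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(
            pooled_var * 2 / n
        )
        if abs(t_stat) > t_crit:
            rejections += 1
    return rejections / n_sims


# Smallest gap in the described grid: mu1 = 30 versus mu2 = 32.
power_closest = simulated_power(30, 32)
# No true difference: the rejection rate should sit near alpha.
power_null = simulated_power(30, 30)
print(f"Estimated power at mu1=30, mu2=32: {power_closest:.2f}")
print(f"Estimated rejection rate with no difference: {power_null:.2f}")
```

    Larger gaps between mu1 and mu2 in the described grid would push the estimated power towards 1.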

  11. Statistical Analysis of the individual foraging on resource varying in...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 17, 2014
    Cite
    Konzmann, Sabine; Lunau, Klaus (2014). Statistical Analysis of the individual foraging on resource varying in quality and quantity experiment. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001211596
    Dataset updated
    Mar 17, 2014
    Authors
    Konzmann, Sabine; Lunau, Klaus
    Description

    Results of the repeated measures Anova applied to the series of tests of individual foraging on resources varying in quality and quantity experiment shown in Figs. 6 and 7.

  12. Numpy , pandas and matplot lib practice

    • kaggle.com
    zip
    Updated Jul 16, 2023
    Cite
    pratham saraf (2023). Numpy , pandas and matplot lib practice [Dataset]. https://www.kaggle.com/datasets/prathamsaraf1389/numpy-pandas-and-matplot-lib-practise/suggestions
    Available download formats: zip (385020 bytes)
    Dataset updated
    Jul 16, 2023
    Authors
    pratham saraf
    License

    https://cdla.io/permissive-1-0/

    Description

    The dataset has been created specifically for practicing Python, NumPy, Pandas, and Matplotlib. It is designed to provide a hands-on learning experience in data manipulation, analysis, and visualization using these libraries.

    Specifics of the Dataset:

    The dataset consists of 5000 rows and 20 columns, representing various features with different data types and distributions. The features include numerical variables with continuous and discrete distributions, categorical variables with multiple categories, binary variables, and ordinal variables. Each feature has been generated using different probability distributions and parameters to introduce variations and simulate real-world data scenarios. The dataset is synthetic and does not represent any real-world data. It has been created solely for educational purposes.

    One of the defining characteristics of this dataset is the intentional incorporation of various real-world data challenges:

    • Certain columns are randomly selected to be populated with NaN values, simulating the common challenge of missing data.
    • The proportion of these missing values in each column varies randomly between 1% and 70%.
    • Statistical noise has been introduced in the dataset. For numerical values in some features, this noise follows a distribution with mean 0 and standard deviation 0.1.
    • Categorical noise is introduced in some features, with categories randomly altered in about 1% of the rows.
    • Outliers have also been embedded in the dataset, consistent with the Interquartile Range (IQR) rule.
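
    Two of the challenges listed above, missing values and IQR-rule outliers, can be checked with a short sketch. The column values are hypothetical, and the 1.5 multiplier is the conventional IQR-rule threshold rather than a parameter confirmed by the dataset description.

```python
import math
import statistics


def missing_fraction(column):
    """Proportion of entries in the column that are NaN."""
    return sum(1 for value in column if math.isnan(value)) / len(column)


def iqr_outliers(column, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring NaN entries."""
    observed = [value for value in column if not math.isnan(value)]
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [value for value in observed if value < low or value > high]


# Hypothetical numeric column with one missing value and one extreme value.
column = [10, 12, 11, float("nan"), 13, 12, 11, 10, 12, 50]

print(f"Missing fraction: {missing_fraction(column):.0%}")
print(f"IQR-rule outliers: {iqr_outliers(column)}")
```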

    Context of the Dataset:

    The dataset aims to provide a comprehensive playground for practicing Python, NumPy, Pandas, and Matplotlib. It allows learners to explore data manipulation techniques, perform statistical analysis, and create visualizations using the provided features. By working with this dataset, learners can gain hands-on experience in data cleaning, preprocessing, feature engineering, and visualization.

    Sources of the Dataset:

    The dataset has been generated programmatically using Python's random number generation functions and probability distributions. No external sources or real-world data have been used in creating this dataset.

  13. Table S05 Results of statistical test

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jun 14, 2022
    Cite
    Simons, Kai; Spiegel, Aleksandra; Sarov, Mihail; Klose, Christian; Gerl, Mathias J.; Bachmann, Mandy; Heninger, Anne-Kristin; Lauber, Chris (2022). Table S05 Results of statistical test [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000265536
    Dataset updated
    Jun 14, 2022
    Authors
    Simons, Kai; Spiegel, Aleksandra; Sarov, Mihail; Klose, Christian; Gerl, Mathias J.; Bachmann, Mandy; Heninger, Anne-Kristin; Lauber, Chris
    Description

    A set of gene knockouts as a resource for global lipidomic changes Aleksandra Spiegel, Chris Lauber, Mandy Bachmann, Anne-Kristin Heninger, Christian Klose, Kai Simons, Mihail Sarov, Mathias J. Gerl https://doi.org/10.1038/s41598-022-14690-0 Table S05 Results of statistical tests. Tab “Gene names”: Uniprot IDs and Protein names of the genes knocked out in this study. Tab “Overview comparisons”: Overview of all comparisons displayed in the file, including the tab names, IDs used, number of replicates, and number of lipids compared. Tabs on statistical tests contain 4 sets of comparisons: lipid class, fatty acid (FA), fatty acids with a lipid class (class FA), and (sub-)species, with mean value, standard deviation, number of replicates of the feature (n), fold change of means, p-value and the p-value corrected for multiple testing based on the set (BH).

  14. Trending YouTube Statistics Dataset

    • kaggle.com
    zip
    Updated Nov 18, 2021
    + more versions
    Cite
    Sonali Sheth (2021). Trending YouTube Statistics Dataset [Dataset]. https://www.kaggle.com/sonalisheth/trending-youtube-statistics-dataset
    Available download formats: zip (183678686 bytes)
    Dataset updated
    Nov 18, 2021
    Authors
    Sonali Sheth
    Area covered
    YouTube
    Description

    UPDATE: Source code used for collecting this data released here

    Context: YouTube (the world-famous video sharing website) maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”. Top performers on the YouTube trending list are music videos (such as the famously viral “Gangnam Style”), celebrity and/or reality TV performances, and the random dude-with-a-camera viral videos that YouTube is well-known for.

    This dataset is a daily record of the top trending YouTube videos.

    Note that this dataset is a structurally improved version of this dataset.

    Content: This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the US, GB, DE, CA, and FR regions (USA, Great Britain, Germany, Canada, and France, respectively), with up to 200 listed trending videos per day.

    EDIT: Now includes data from RU, MX, KR, JP and IN regions (Russia, Mexico, South Korea, Japan and India respectively) over the same time period.

    Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.

    The data also includes a category_id field, which varies between regions. To retrieve the categories for a specific video, find it in the associated JSON. One such file is included for each of the five regions in the dataset.
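The category lookup described above can be sketched in a few lines. This is an illustration under stated assumptions: it assumes the per-region JSON follows the YouTube Data API `videoCategories` response layout (an `items` list whose entries carry a string `id` and a `snippet.title`), and the helper names and inline sample are hypothetical rather than part of the dataset.

```python
import json


def load_category_names(json_text):
    """Map category id (string) to category title.

    Assumes the YouTube Data API videoCategories layout:
    {"items": [{"id": "...", "snippet": {"title": "..."}}]}
    """
    data = json.loads(json_text)
    return {item["id"]: item["snippet"]["title"] for item in data["items"]}


def category_for(category_id, names):
    """Look up a title; ids in the CSV may be integers, so normalise to str."""
    return names.get(str(category_id), "Unknown")


# Tiny inline sample standing in for one region's JSON file (illustration only).
sample = '{"items": [{"id": "10", "snippet": {"title": "Music"}}]}'
names = load_category_names(sample)
title = category_for(10, names)
```

Because category_id varies between regions, each region's rows should be resolved against that region's own JSON file.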

    For more information on specific columns in the dataset refer to the column metadata.

    Acknowledgements: This dataset was collected using the YouTube API.

    Inspiration: Possible uses for this dataset could include:

    Sentiment analysis in a variety of forms.
    Categorising YouTube videos based on their comments and statistics.
    Training ML algorithms like RNNs to generate their own YouTube comments.
    Analysing what factors affect how popular a YouTube video will be.
    Statistical analysis over time.

    For further inspiration, see the kernels on this dataset!

  15. Data from: Normalization and analysis of DNA microarray data by...

    • data.virginia.gov
    • healthdata.gov
    • +1 more
    html
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Normalization and analysis of DNA microarray data by self-consistency and local regression [Dataset]. https://data.virginia.gov/dataset/normalization-and-analysis-of-dna-microarray-data-by-self-consistency-and-local-regression
    Explore at:
    html Available download formats
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    A robust semi-parametric normalization technique has been developed, based on the assumption that the large majority of genes will not have their relative expression levels changed from one treatment group to the next, and on the assumption that departures of the response from linearity are small and slowly varying. The method was tested using data simulated under various error models and it performs well.

  16. undefined undefined: undefined | undefined (undefined)

    • data.census.gov
    + more versions
    Cite
    United States Census Bureau, undefined undefined: undefined | undefined (undefined) [Dataset]. https://data.census.gov/table/ACSST1Y2013.S0901?q=S0901&g=040XX00US47
    Explore at:
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

    Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.

    Explanation of Symbols:

      An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
      An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
      An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
      An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
      An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
      An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
      An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
      An '(X)' means that the estimate is not applicable or not available.

    Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    While the 2013 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

    Public assistance includes receipt of Supplemental Security Income (SSI), cash public assistance income, or Food Stamps.

    The Census Bureau introduced a new set of disability questions in the 2008 ACS questionnaire. Accordingly, comparisons of disability data from 2008 or later with data from prior years are not recommended. For more information on these questions and their evaluation in the 2006 ACS Content Test, see the Evaluation Report Covering Disability.

    Excludes householders, spouses, and unmarried partners.

    Foreign born excludes people born outside the United States to a parent who is a U.S. citizen.

    In data year 2013, there were a series of changes to data collection operations that could have affected some estimates. These changes include the addition of Internet as a mode of data collection, the end of the content portion of Failed Edit Follow-Up interviewing, and the loss of one monthly panel due to the Federal Government shut down in October 2013. For more information, see: User Notes.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Source: U.S. Census Bureau, 2013 American Community Survey
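The margin-of-error interpretation in this description can be made concrete with a short sketch. It computes the lower and upper confidence bounds as the estimate minus and plus the margin of error, and converts a 90 percent margin of error to a standard error using the z-value 1.645. The function names and the numbers are hypothetical and are not values from this table.

```python
Z_90 = 1.645  # z-value associated with a 90 percent margin of error


def confidence_bounds(estimate, margin_of_error):
    """Lower and upper 90 percent confidence bounds for an estimate."""
    return estimate - margin_of_error, estimate + margin_of_error


def standard_error(margin_of_error, z=Z_90):
    """Recover the standard error implied by a published margin of error."""
    return margin_of_error / z


# Hypothetical estimate and margin of error (illustration only).
lower, upper = confidence_bounds(1000, 50)
se = standard_error(164.5)
```

These bounds capture sampling variability only; as the description notes, nonsampling error is not represented.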

  17. DCMS Economic Estimates: Ad-hoc statistical releases

    • gov.uk
    • s3.amazonaws.com
    Updated Mar 29, 2023
    Cite
    Department for Culture, Media and Sport (2023). DCMS Economic Estimates: Ad-hoc statistical releases [Dataset]. https://www.gov.uk/government/statistical-data-sets/dcms-economic-estimates-ad-hoc-statistical-releases
    Explore at:
    Dataset updated
    Mar 29, 2023
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Culture, Media and Sport
    Description

    The table below lists links to ad hoc statistical analyses on Economic Estimates that have not been included in our standard publications.

    Date | Ad-hoc | Link
    March 2023 | Digital Sector Economic Estimates 2022: Business Demographics | https://gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202223-jan-to-mar-quarter-4
    March 2023 | DCMS Sectors Economic Estimates 2022: Business Demographics | https://gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202223-jan-to-mar-quarter-4
    February 2023 | DCMS Sectors Economic Estimates: Total Employment, January to December, 2011 - 2021 | https://gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202223-jan-to-mar-quarter-4
    June 2022 | DCMS Civil Society sector: Employment (Number of filled jobs) estimates by Local Authority, 2018 to 2021 (pooled data) | https://gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202223-quarter-1
    May 2022 | Employment, Welsh Creative Wales Creative Industries, 2019 and 2020 | https://gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202223-quarter-1
    March 2021 | Provisional monthly GVA for DCMS sectors in 2019 and 2020 |
    February 2021 | GVA by industries in DCMS clusters, 2019 |
    January 2021 | Employment in DCMS sectors by socio-economic background: July 2020 to September 2020 |
    September 2020 | Small and medium enterprises and proportion of employment in DCMS clusters, 2018 |
    August 2020 | Proportion of standard industrial categories accounted for by DCMS Sectors |
    July 2020 | Number and GVA generated by DCMS Sector businesses, by turnover band, 2018 |
    June 2020 | Turnover and resilience of businesses in DCMS subsectors |
    May 2020 | Number of businesses in DCMS clusters by standard industrial section, 2017 |
    May 2020 | Regional GVA by industries in DCMS clusters, 2018 |
    May 2020 | Employment in DCMS clusters by home and work location (NUTS2), 2019 |
    May 2020 | GVA by industries in DCMS clusters, 2018 |
    May 2020 | Employment in DCMS clusters by various demographic characteristics, 2019 |
    May 2020 | Employees in DCMS clusters by industry, 2019, with | https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202021-quarter-1

  18. Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Oct 22, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis and Summary Statistics [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-vector-analysis-and-summary-stati
    Explore at:
    Dataset updated
    Oct 22, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, avoiding overestimation in protected area statistics and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations, and easements). The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ) was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries. 
Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip"). Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allows for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ). Note, the PAD-US inventory is now considered functionally complete with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improving the completeness and accuracy of the spatial data more than actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data. 
While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.

  19. Data from: Statistical analysis of second order effects variation with the...

    • resodate.org
    • scielo.figshare.com
    Updated Jan 1, 2021
    Cite
    D.M. OLIVEIRA; N.A. SILVA; C.C. RIBEIRO; S.E.C. RIBEIRO (2021). Statistical analysis of second order effects variation with the stories height of reinforced concrete buildings [Dataset]. http://doi.org/10.6084/M9.FIGSHARE.14307212
    Explore at:
    Dataset updated
    Jan 1, 2021
    Dataset provided by
    SciELO journals
    Authors
    D.M. OLIVEIRA; N.A. SILVA; C.C. RIBEIRO; S.E.C. RIBEIRO
    Description

    Abstract: In this paper, the simplified method for evaluating final efforts using the γz coefficient is studied, considering the variation of second order effects with the height of the buildings. For this purpose, several reinforced concrete buildings of medium height are analyzed in first and second order using ANSYS software. Initially, it was verified that the γz coefficient should be used as a magnifier of first order moments to evaluate final second order moments. The study is therefore developed considering the ratio (final second order moments / first order moments), calculated for each story of the structures. This ratio is called the magnifier of first order moments, "γ", and in the ideal situation it must coincide with the γz value. However, it is observed that the ratio γ/γz varies with the height of the buildings. Furthermore, a statistical analysis showed that the γ/γz ratio is generally lower than 1.05 and varies significantly with the building considered and with the presence or absence of symmetry in the structure.

  20. Evaluation through follow-up - pupils born in 1953

    • researchdata.se
    Updated Aug 15, 2024
    + more versions
    Cite
    Kjell Härnqvist; Sven-Erik Reuterberg; Allan Svensson; Airi Rovio-Johansson (2024). Evaluation through follow-up - pupils born in 1953 [Dataset]. https://researchdata.se/en/catalogue/dataset/snd0480-2
    Explore at:
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    University of Gothenburg
    Authors
    Kjell Härnqvist; Sven-Erik Reuterberg; Allan Svensson; Airi Rovio-Johansson
    Time period covered
    1966 - 1973
    Area covered
    Sweden
    Description

    Since the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.

    Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.

    For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.

    Survey data from the pupils is collected for the first time in compulsory school year 6 (for most cohorts). The year 6 questionnaire includes questions related to self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 in compulsory school and in upper secondary school.

    Furthermore, results from various intelligence tests and standardized knowledge tests are included in the data collection in year 6. The intelligence tests have been identical for all cohorts (except the cohort born in 1987, from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposite type. The spatial test is a so-called ‘sheet metal folding test’ and the inductive test is made up of series of numbers. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).

    For the first three cohorts (1948, 1953 and 1967), the standardized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that, up to and including the beginning of the 1980s, were offered to all pupils in compulsory school year 6. For the 1972 cohort, specially prepared tests in reading and mathematics were used. The reading test consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered to the fifth cohort (1977), includes 19 assignments. A revised version of the test, introduced because the previously used test was judged to be somewhat too simple, was used for the cohort born in 1982. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the 1992 cohort, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.

    For several of the samples, questionnaires were also collected from the students' parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class's knowledge level, etc., school resources, working methods and parental involvement, and questions about the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.

    The students are followed up even after they have left primary school. Among other things, data are collected while they are in high school. These are school administrative data such as choice of upper secondary school line / program and grades after completing studies. For some of the cohorts, in addition to school administrative data, questionnaire data were also collected from the students.

    The sample consisted of students born on the 5th, 15th and 25th of any month in 1953, a total of 10,723 students.

    The data obtained in 1966 were: 1. School administrative data (school form, class type, year and grades). 2. Information about the parents' profession and education, number of siblings, the distance between home and school, etc.

    This information was collected for 93% of all pupils born on the days in question. The shortfall is due to reduced resources at Statistics Sweden for follow-up work (reminders etc.). Annual data for the 1953 cohort were collected by Statistics Sweden up to and including academic year 1972/73.

    1. Answers to certain questions that shed light on students' school motivation, leisure activities and study and career plans. Some of the questions changed significantly compared to the cohort in 1948 due to the fact that they did not function satisfactorily from a metrological point of view.
    2. Results on three aptitude tests, one verbal, one spatial and one inductive.
    3. Standard test results in reading, writing, mathematics and English, which were offered to the students who belonged to year 6.

    The response rate for test and questionnaire data is 88%. Standard test results were received for just over 85% of those who took the tests.

    The sample included a total of 9955 students, for whom some form of information was obtained.

    Part of the "Individual Statistics Project" together with cohort 1953.

Cite
Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart (2023). Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice [Dataset]. http://doi.org/10.1371/journal.pone.0046042

Statistical Analysis of Individual Participant Data Meta-Analyses: A Comparison of Methods and Recommendations for Practice

Explore at:
108 scholarly articles cite this dataset (View in Google Scholar)
tiff Available download formats
Dataset updated
Jun 8, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Gavin B. Stewart; Douglas G. Altman; Lisa M. Askie; Lelia Duley; Mark C. Simmonds; Lesley A. Stewart
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

Background: Individual participant data (IPD) meta-analyses that obtain “raw” data from studies rather than summary data typically adopt a “two-stage” approach to analysis whereby IPD within trials generate summary measures, which are combined using standard meta-analytical methods. Recently, a range of “one-stage” approaches which combine all individual participant data in a single meta-analysis have been suggested as providing a more powerful and flexible approach. However, they are more complex to implement and require statistical support. This study uses a dataset to compare “two-stage” and “one-stage” models of varying complexity, to ascertain whether results obtained from the approaches differ in a clinically meaningful way.

Methods and Findings: We included data from 24 randomised controlled trials, evaluating antiplatelet agents, for the prevention of pre-eclampsia in pregnancy. We performed two-stage and one-stage IPD meta-analyses to estimate overall treatment effect and to explore potential treatment interactions whereby particular types of women and their babies might benefit differentially from receiving antiplatelets. Two-stage and one-stage approaches gave similar results, showing a benefit of using anti-platelets (Relative risk 0.90, 95% CI 0.84 to 0.97). Neither approach suggested that any particular type of women benefited more or less from antiplatelets. There were no material differences in results between different types of one-stage model.

Conclusions: For these data, two-stage and one-stage approaches to analysis produce similar results. Although one-stage models offer a flexible environment for exploring model structure and are useful where across study patterns relating to types of participant, intervention and outcome mask similar relationships within trials, the additional insights provided by their usage may not outweigh the costs of statistical support for routine application in syntheses of randomised controlled trials. Researchers considering undertaking an IPD meta-analysis should not necessarily be deterred by a perceived need for sophisticated statistical methods when combining information from large randomised trials.
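
The “two-stage” idea described above can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a fixed-effect inverse-variance model on the log relative risk scale, and the trial counts are invented purely for illustration. Stage 1 reduces each trial to a summary measure; stage 2 combines those summaries.

```python
import math


def log_rr_and_se(events_trt, n_trt, events_ctl, n_ctl):
    """Stage 1: per-trial log relative risk and its standard error."""
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return log_rr, se


def pool_fixed_effect(estimates):
    """Stage 2: inverse-variance weighted mean of the log relative risks."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Invented counts (events, total) per arm for two trials; not from the paper.
trials = [(30, 200, 40, 200), (15, 150, 18, 150)]
stage1 = [log_rr_and_se(*t) for t in trials]
pooled, pooled_se = pool_fixed_effect(stage1)

# Back-transform to the relative risk scale with a 95% confidence interval.
rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
```

A one-stage approach would instead fit a single model to all participants' records at once, which is where the extra modelling flexibility and the extra statistical effort both come from.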
