100+ datasets found
  1. Pooling data for Number Needed to Treat: no problems for apples

    • data.virginia.gov
    • catalog.data.gov
    html
    Updated Jul 23, 2025
    Cite
    National Institutes of Health (2025). Pooling data for Number Needed to Treat: no problems for apples [Dataset]. https://data.virginia.gov/dataset/pooling-data-for-number-needed-to-treat-no-problems-for-apples
    Available download formats: html
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Objective: To consider the problem that numbers needed to treat (NNTs) calculated from risk difference, odds ratio, and raw pooled events give different results, using data from a review of nursing interventions for smoking cessation.

       Discussion
       A review of nursing interventions for smoking cessation from the Cochrane Library provided different values for NNT depending on how NNTs were calculated. The Cochrane review was evaluated for clinical heterogeneity using L'Abbé plot and subsequent analysis by secondary and primary care settings.
       Three studies in primary care had low (4%) baseline quit rates, and nursing interventions were without effect. Seven trials in hospital settings with patients after cardiac surgery, or heart attack, or even with cancer, had high baseline quit rates (25%). Nursing intervention to stop smoking in the hospital setting was effective, with an NNT of 14 (95% confidence interval 9 to 26). The assumptions involved in using risk difference and odds ratio scales for calculating NNTs are discussed.
    
    
       Summary
       Clinical common sense and concentration on raw data help to detect clinical heterogeneity. Once robust statistical tests have told us that an intervention works, we then need to know how well it works. The number needed to treat or harm is just one way of showing that, and when used sensibly can be a useful tool.
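
    As a rough illustration of why the scale matters, the sketch below computes an NNT from a risk difference and from an odds ratio applied at a given baseline rate. The 4% and 25% baseline quit rates come from the description above; the odds ratio of 1.5 and the function names are hypothetical, chosen only to show the arithmetic, and are not results from the review.

```python
def nnt_from_risk_difference(control_rate, treated_rate):
    """NNT is the reciprocal of the absolute risk difference."""
    diff = treated_rate - control_rate
    if diff == 0:
        raise ValueError("no risk difference, so the NNT is undefined")
    return 1.0 / abs(diff)


def nnt_from_odds_ratio(control_rate, odds_ratio):
    """Apply an odds ratio at a baseline rate, then invert the risk difference."""
    control_odds = control_rate / (1.0 - control_rate)
    treated_odds = control_odds * odds_ratio
    treated_rate = treated_odds / (1.0 + treated_odds)
    return nnt_from_risk_difference(control_rate, treated_rate)


# Hypothetical odds ratio of 1.5 applied at the two baseline quit rates
# mentioned in the description (25% in hospital, 4% in primary care).
nnt_hospital = nnt_from_odds_ratio(0.25, 1.5)
nnt_primary = nnt_from_odds_ratio(0.04, 1.5)
print(round(nnt_hospital, 3), round(nnt_primary, 3))
```

    Under these assumptions the same odds ratio yields a much larger NNT at the low baseline than at the high one, which is one reason a single pooled NNT across heterogeneous settings can mislead.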
    
  2. data-juicer-t2v-optimal-data-pool

    • huggingface.co
    Updated Jul 23, 2024
    Cite
    Data-Juicer (2024). data-juicer-t2v-optimal-data-pool [Dataset]. https://huggingface.co/datasets/datajuicer/data-juicer-t2v-optimal-data-pool
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    Data-Juicer
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development

      Project description
    

    The emergence of large-scale multi-modal generative models has drastically advanced artificial intelligence, introducing unprecedented levels of performance and functionality. However, optimizing these models remains challenging due to historically isolated paths of model-centric and data-centric developments, leading to suboptimal outcomes and inefficient resource… See the full description on the dataset page: https://huggingface.co/datasets/datajuicer/data-juicer-t2v-optimal-data-pool.

  3. Data from: Estimation of pool construction and technical error

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data from: Estimation of pool construction and technical error [Dataset]. https://catalog.data.gov/dataset/data-from-estimation-of-pool-construction-and-technical-error-61e80
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    Animals were incorporated into pools in different proportions to estimate error and evaluate factors influencing error. Animals were incorporated into 2 types of pools: sub-pools and super pools. Within each phenotype (liver abscess or normal), 16 animals were combined into 4 sub-pools, 4 animals per sub-pool, in parts of 1:2:3:4. Sub-pools were constructed based on crushed frozen liver tissue mass; each part was 0.1 g of pulverized frozen liver tissue. Within each phenotype, the 4 sub-pools were incorporated into 2 super pools, in parts of 1:2:3:4 for super pool 1 and 3:4:1:2 for super pool 2. Super pools were made based on DNA quantity. Errors in DNA quantification would create error in forming super pools from sub-pools, and variation in cell content or DNA content of liver tissue would result in error in combining animals into sub-pools.

    Animal contributions to sub-pools for livers with abscess (1:2:3:4 parts):
    • sub-pool 1A: 15A, 36A, 35A, and 23A
    • sub-pool 2A: 42A, 37A, 12A, and 22A
    • sub-pool 3A: 17A, 1A, 49A, and 48A
    • sub-pool 4A: 3A, 20A, 16A, and 13A

    Animal contributions to sub-pools for livers without abscess (1:2:3:4 parts):
    • sub-pool 1N: 46N, 23N, 17N, and 12N
    • sub-pool 2N: 1N, 31N, 6N, and 48N
    • sub-pool 3N: 36N, 43N, 32N, and 13N
    • sub-pool 4N: 34N, 19N, 41N, and 50N

    Sub-pool contributions to super pools for livers with abscess:
    • super pool 1A: 1:2:3:4 parts of sub-pool 1A, sub-pool 2A, sub-pool 3A, and sub-pool 4A
    • super pool 2A: 3:4:1:2 parts of sub-pool 1A, sub-pool 2A, sub-pool 3A, and sub-pool 4A

    Sub-pool contributions to super pools for livers without abscess:
    • super pool 1N: 1:2:3:4 parts of sub-pool 1N, sub-pool 2N, sub-pool 3N, and sub-pool 4N
    • super pool 2N: 3:4:1:2 parts of sub-pool 1N, sub-pool 2N, sub-pool 3N, and sub-pool 4N

    Funded by the USDA Agricultural Research Service, Developing a Systems Biology Approach to Enhance Efficiency and Sustainability of Beef and Lamb Production / 3040-31000-100-000-D.

    Resources in this dataset:
    • Resource Title: xy data for individual animals. File Name: xyIndividuals.csv.gz. Resource Description: X (red) and Y (green) intensity data for 32 animals. There are 64 columns, an X and Y column for each animal.
    • Resource Title: Genotypes, Number of copies of B allele for BovineHD 770K. File Name: g.csv.gz. Resource Description: Values are 0, 1 and 2 for 32 animals and 777,962 SNP. DNA was extracted from pulverized frozen liver tissue.
    • Resource Title: x and y data for pools. File Name: xyPools.csv.gz. Resource Description: X (red) and Y (green) intensity for 12 pools. There are 2 columns per pool, first X, followed by Y. The first 8 columns are super pools and the next 16 are sub-pools. Examples: superPool.1A.X is super pool 1 for abscess livers, X intensity; sub-pool.1A.Y is sub-pool 1 for abscess livers, Y intensity.
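
    The pooling design above implies a nominal (error-free) share for each animal in each super pool: its share of the sub-pool multiplied by that sub-pool's share of the super pool. The sketch below works this out for the abscess pools. It assumes the stated part ratios are realized exactly, which is precisely the assumption the dataset is meant to test; the variable and function names are illustrative only.

```python
from fractions import Fraction

# Animals listed in order of their 1:2:3:4 parts, as given in the description.
SUB_POOLS_A = {
    "1A": ["15A", "36A", "35A", "23A"],
    "2A": ["42A", "37A", "12A", "22A"],
    "3A": ["17A", "1A", "49A", "48A"],
    "4A": ["3A", "20A", "16A", "13A"],
}
ANIMAL_PARTS = [1, 2, 3, 4]
# Parts of sub-pools 1A..4A in each super pool.
SUPER_POOL_PARTS = {"1A": [1, 2, 3, 4], "2A": [3, 4, 1, 2]}


def nominal_contributions(super_pool):
    """Return each animal's nominal fraction of the named super pool."""
    sub_parts = SUPER_POOL_PARTS[super_pool]
    shares = {}
    for sub_part, animals in zip(sub_parts, SUB_POOLS_A.values()):
        sub_share = Fraction(sub_part, sum(sub_parts))
        for animal_part, animal in zip(ANIMAL_PARTS, animals):
            shares[animal] = sub_share * Fraction(animal_part, sum(ANIMAL_PARTS))
    return shares


shares_1a = nominal_contributions("1A")
shares_2a = nominal_contributions("2A")
print(shares_1a["15A"], shares_1a["13A"], shares_2a["15A"])
```

    Comparing such nominal fractions with allele frequencies estimated from the pool intensities is one way pool construction error could be examined; the dataset description does not specify the authors' exact procedure.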

  4. Stata Do Files for pooling and analysis

    • search.dataone.org
    Updated Nov 12, 2023
    Cite
    Sweeney, Sedona (2023). Stata Do Files for pooling and analysis [Dataset]. http://doi.org/10.7910/DVN/9XZBGE
    Explore at:
    Dataset updated
    Nov 12, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Sweeney, Sedona
    Description

    Standardized do files to facilitate within- and across-country data pooling and analysis

  5. Pooling, meta-analysis, and the evaluation of drug safety

    • odgavaprod.ogopendata.com
    • catalog.data.gov
    html
    Updated Jul 23, 2025
    Cite
    National Institutes of Health (2025). Pooling, meta-analysis, and the evaluation of drug safety [Dataset]. https://odgavaprod.ogopendata.com/dataset/pooling-meta-analysis-and-the-evaluation-of-drug-safety
    Available download formats: html
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: The "integrated safety report" of the drug registration files submitted to health authorities usually summarizes the rates of adverse events observed for a new drug, placebo or active control drugs by pooling the safety data across the trials. Pooling consists of adding the numbers of events observed in a given treatment group across the trials and dividing the results by the total number of patients included in this group. Because it considers treatment groups rather than studies, pooling ignores validity of the comparisons and is subject to a particular kind of bias, termed "Simpson's paradox." In contrast, meta-analysis and other stratified analyses are less susceptible to bias.

       Methods
       We use a hypothetical, but not atypical, application to demonstrate that the results of a meta-analysis can differ greatly from those obtained by pooling the same data. In our hypothetical model, a new drug is compared to 1) a placebo in 4 relatively small trials in patients at high risk for a certain adverse event and 2) an active reference drug in 2 larger trials of patients at low risk for this event.
    
    
       Results
       Using meta-analysis, the relative risk of experiencing the adverse event with the new drug was 1.78 (95% confidence interval [1.02; 3.12]) compared to placebo and 2.20 [0.76; 6.32] compared to active control. By pooling the data, the results were, respectively, 1.00 [0.59; 1.70] and 5.20 [2.07; 13.08].
    
    
       Conclusions
       Because these findings could mislead health authorities and doctors, regulatory agencies should require meta-analyses or stratified analyses of safety data in drug registration files.
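
    The contrast between pooling and a stratified analysis can be sketched with a small example. The counts below are invented for illustration and are simpler than the scenario in the description (two trials, one comparison); the Mantel-Haenszel relative risk is used as one common stratified estimator, not necessarily the method the authors applied.

```python
# Each trial: (events_drug, n_drug, events_control, n_control).
# Hypothetical counts: a balanced high-risk trial and an unbalanced low-risk trial.
TRIALS = [
    (40, 100, 20, 100),   # within-trial relative risk = 2.0
    (20, 1000, 1, 100),   # within-trial relative risk = 2.0
]


def pooled_relative_risk(trials):
    """Naive pooling: add events and patients per arm, ignoring trial membership."""
    events_drug = sum(t[0] for t in trials)
    n_drug = sum(t[1] for t in trials)
    events_control = sum(t[2] for t in trials)
    n_control = sum(t[3] for t in trials)
    return (events_drug / n_drug) / (events_control / n_control)


def mantel_haenszel_relative_risk(trials):
    """Stratified estimate: each trial contributes its own within-trial comparison."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in trials)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in trials)
    return numerator / denominator


print(round(pooled_relative_risk(TRIALS), 3))
print(round(mantel_haenszel_relative_risk(TRIALS), 3))
```

    With these made-up counts, both trials show a doubled risk on the drug, the stratified estimate agrees, and naive pooling suggests the drug is protective, because most drug-arm patients come from the low-risk trial.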
    
  6. Replication Data for: Risk Pooling, Risk Preferences, and Social Networks

    • search.dataone.org
    • openicpsr.org
    • +1 more
    Updated Nov 22, 2023
    + more versions
    Cite
    Attanasio, Orazio; Barr, Abigail; Cardenas, Juan Camilo; Genicot, Garance; Meghir, Costas (2023). Replication Data for: Risk Pooling, Risk Preferences, and Social Networks [Dataset]. http://doi.org/10.7910/DVN/16OAH0
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Attanasio, Orazio; Barr, Abigail; Cardenas, Juan Camilo; Genicot, Garance; Meghir, Costas
    Description

    Using data from an experiment conducted in 70 Colombian communities, we investigate who pools risk with whom when trust is crucial for enforcing risk pooling arrangements. We explore the roles played by risk attitudes and social networks. Both empirically and theoretically, we find that close friends and relatives group assortatively on risk attitudes and are more likely to join the same risk pooling group, while unfamiliar participants group less and rarely assort. These findings indicate that where there are advantages to grouping assortatively on risk attitudes those advantages may be inaccessible when trust is absent or low.

  7. Baseline characteristics of patient populations prior to data pooling.

    • figshare.com
    xls
    Updated May 31, 2023
    Cite
    Irene Kuepfer; Caecilia Schmid; Mpairwe Allan; Andrew Edielu; Emma P. Haary; Abbas Kakembo; Stafford Kibona; Johannes Blum; Christian Burri (2023). Baseline characteristics of patient populations prior to data pooling. [Dataset]. http://doi.org/10.1371/journal.pntd.0001695.t004
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Neglected Tropical Diseases
    Authors
    Irene Kuepfer; Caecilia Schmid; Mpairwe Allan; Andrew Edielu; Emma P. Haary; Abbas Kakembo; Stafford Kibona; Johannes Blum; Christian Burri
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Note: 1, no suramin pre-treatment; 2, body mass index; 3, cerebrospinal fluid; 4, white blood cell.

  8. Annual Population Survey Three-Year Pooled Dataset, January 2020 - December...

    • beta.ukdataservice.ac.uk
    Updated 2024
    + more versions
    Cite
    Office For National Statistics (2024). Annual Population Survey Three-Year Pooled Dataset, January 2020 - December 2022 [Dataset]. http://doi.org/10.5255/ukda-sn-9119-2
    Explore at:
    Dataset updated
    2024
    Dataset provided by
    DataCite (https://www.datacite.org/)
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Office For National Statistics
    Description
    The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at the local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost. The APS aims to provide enhanced annual data for England, covering a target sample of at least 510 economically active persons for each Unitary Authority (UA)/Local Authority District (LAD) and at least 450 in each Greater London Borough. In combination with local LFS boost samples, the survey provides estimates for a range of indicators down to Local Education Authority (LEA) level across the United Kingdom.

    For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation. For variable and value labelling and coding frames that are not included either in the data or in the current APS documentation, users are advised to consult the latest versions of the LFS User Guides, which are available from the ONS Labour Force Survey - User Guidance webpages.

    Occupation data for 2021 and 2022
    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. The affected datasets have now been updated. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022

    APS Well-Being Datasets
    From 2012-2015, the ONS published separate APS datasets aimed at providing initial estimates of subjective well-being, based on the Integrated Household Survey. In 2015 these were discontinued. A separate set of well-being variables and a corresponding weighting variable have been added to the April-March APS person datasets from A11M12 onwards. Further information on the transition can be found in the Personal well-being in the UK: 2015 to 2016 article on the ONS website.

    APS disability variables
    Over time, there have been some updates to disability variables in the APS. An article explaining the quality assurance investigations on these variables that have been conducted so far is available on the ONS Methodology webpage.

    End User Licence and Secure Access APS data
    Users should note that there are two versions of each APS dataset. One is available under the standard End User Licence (EUL) agreement, and the other is a Secure Access version. The EUL version includes Government Office Region geography, banded age, 3-digit SOC and industry sector for main, second and last job. The Secure Access version contains more detailed variables relating to:
    • age: single year of age, year and month of birth, age completed full-time education and age obtained highest qualification, age of oldest dependent child and age of youngest dependent child
    • family unit and household: including a number of variables concerning the number of dependent children in the family according to their ages, relationship to head of household and relationship to head of family
    • nationality and country of origin
    • geography: including county, unitary/local authority, place of work, Nomenclature of Territorial Units for Statistics 2 (NUTS2) and NUTS3 regions, and whether lives and works in same local authority district
    • health: including main health problem, and current and past health problems
    • education and apprenticeship: including numbers and subjects of various qualifications and variables concerning apprenticeships
    • industry: including industry, industry class and industry group for main, second and last job, and industry made redundant from
    • occupation: including 4-digit Standard Occupational Classification (SOC) for main, second and last job and job made redundant from
    • system variables: including week number when interview took place and number of households at address

    The Secure Access data have more restrictive access conditions than those made available under the standard EUL. Prospective users will need to gain ONS Accredited Researcher status, complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables. Users are strongly advised to first obtain the standard EUL version of the data to see if they are sufficient for their research requirements.

    Latest edition information

    For the second edition (January 2024), a new version of the data file was deposited, with smoking variables added.

  9. Pool

    • catalog.data.gov
    • dataworks.siouxfalls.gov
    • +2 more
    Updated May 10, 2025
    Cite
    City of Sioux Falls GIS (2025). Pool [Dataset]. https://catalog.data.gov/dataset/pool
    Explore at:
    Dataset updated
    May 10, 2025
    Dataset provided by
    City of Sioux Falls GIS
    Description

    Feature layer containing Pool information in the City of Sioux Falls, South Dakota.

  10. Data from: Dataset for: Pooling it all together – No Influence of Distractor...

    • b2find.eudat.eu
    Updated Jan 14, 2022
    + more versions
    Cite
    (2022). Dataset for: Pooling it all together – No Influence of Distractor Pool Size on Stimulus-Response Binding [Dataset]. https://b2find.eudat.eu/dataset/22952b52-087f-5a68-b611-01c3f128cf71
    Explore at:
    Dataset updated
    Jan 14, 2022
    Description

    Distractors and responses are integrated in an event file when they occur together. Further, when all or some features repeat, the whole event file is retrieved, affecting later action as observed in so-called binding effects. Previous research used varying distractor pool sizes (ranging from just two to well over 30) to choose distractors from, but it is unclear whether distractor pool size has an effect on the size of distractor-based binding effects. The present study investigates whether and how distractor pool size modulates binding effects. Using an adapted prime-probe design, participants were assigned to large (384 distractors) or small (2 distractors) distractor pool sizes, and distractor-response binding effects were measured. Binding effects were stronger for the large distractor pool condition compared to the small pool condition. We discuss these findings against the background of the negative priming literature and research on novelty.

    Dataset for: Philip Schmalbrock, Christian Frings & Birte Moeller (2022) Pooling it all together – the role of distractor pool size on stimulus-response binding, Journal of Cognitive Psychology, DOI: 10.1080/20445911.2022.2026363

  11. Data from: Variable Selection with Multiply-Imputed Datasets: Choosing...

    • tandf.figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Jiacong Du; Jonathan Boss; Peisong Han; Lauren J. Beesley; Michael Kleinsasser; Stephen A. Goutman; Stuart Batterman; Eva L. Feldman; Bhramar Mukherjee (2023). Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods [Dataset]. http://doi.org/10.6084/m9.figshare.19111441.v2
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Jiacong Du; Jonathan Boss; Peisong Han; Lauren J. Beesley; Michael Kleinsasser; Stephen A. Goutman; Stuart Batterman; Eva L. Feldman; Bhramar Mukherjee
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Penalized regression methods are used in many biomedical applications for variable selection and simultaneous coefficient estimation. However, missing data complicates the implementation of these methods, particularly when missingness is handled using multiple imputation. Applying a variable selection algorithm on each imputed dataset will likely lead to different sets of selected predictors. This article considers a general class of penalized objective functions which, by construction, force selection of the same variables across imputed datasets. By pooling objective functions across imputations, optimization is then performed jointly over all imputed datasets rather than separately for each dataset. We consider two objective function formulations that exist in the literature, which we will refer to as “stacked” and “grouped” objective functions. Building on existing work, we (i) derive and implement efficient cyclic coordinate descent and majorization-minimization optimization algorithms for continuous and binary outcome data, (ii) incorporate adaptive shrinkage penalties, (iii) compare these methods through simulation, and (iv) develop an R package miselect. Simulations demonstrate that the “stacked” approaches are more computationally efficient and have better estimation and selection properties. We apply these methods to data from the University of Michigan ALS Patients Biorepository aiming to identify the association between environmental pollutants and ALS risk. Supplementary materials for this article are available online.

  12. S1 Data -

    • plos.figshare.com
    bin
    Updated Jun 12, 2023
    Cite
    Julian Burtniak; Adam Hedley; Kerry Dust; Paul Van Caeseele; Jared Bullard; Derek R. Stein (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pgph.0001793.s001
    Available download formats: bin
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    PLOS Global Public Health
    Authors
    Julian Burtniak; Adam Hedley; Kerry Dust; Paul Van Caeseele; Jared Bullard; Derek R. Stein
    License

    Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    PCR-based analysis is the gold standard for detection of SARS-CoV-2 and was used broadly throughout the pandemic. However, heightened demand for testing put strain on diagnostic resources, and the amount of PCR-based testing required exceeded existing testing capacity. Pooled testing strategies presented an effective method to increase testing capacity by decreasing the number of tests and resources required for laboratory PCR analysis of SARS-CoV-2. We sought to conduct an analysis of SARS-CoV-2 pooling schemes to determine the sensitivity of various sized Dorfman pooling strategies and evaluate the utility of using such pooling strategies in diagnostic laboratory settings. Overall, a trend of decreasing sensitivity with larger pool sizes was observed, with modest sensitivity losses in the largest pools tested, and high sensitivity in all other pools. Efficiency data were then calculated to determine the optimal Dorfman pool sizes based on test positivity rate. This was correlated with current presumptive test positivity to maximize the number of tests saved, thereby increasing testing capacity and resource efficiency in the community setting. Dorfman pooling methods were evaluated and found to offer a high-throughput solution to SARS-CoV-2 clinical testing that improves resource efficiency in low-resource environments.
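
    The efficiency calculation for two-stage Dorfman pooling can be sketched with the textbook formula for expected tests per person: one pooled test shared by k people, plus k individual retests whenever the pool is positive. The sketch assumes a perfectly sensitive and specific test and independent samples, which the description indicates does not strictly hold for large pools; the function names and the search limit of 32 are illustrative, not taken from the study.

```python
def expected_tests_per_person(positivity, pool_size):
    """Expected tests per person under two-stage Dorfman pooling."""
    if pool_size < 2:
        return 1.0  # no pooling: one test each
    pool_negative = (1.0 - positivity) ** pool_size
    # 1/k for the pooled test, plus one retest each if the pool is positive.
    return 1.0 / pool_size + (1.0 - pool_negative)


def optimal_pool_size(positivity, max_pool_size=32):
    """Pool size (1 = no pooling) that minimizes expected tests per person."""
    sizes = range(1, max_pool_size + 1)
    return min(sizes, key=lambda k: expected_tests_per_person(positivity, k))


for rate in (0.01, 0.05, 0.10):
    k = optimal_pool_size(rate)
    print(rate, k, round(expected_tests_per_person(rate, k), 3))
```

    Under these idealized assumptions the best pool size shrinks as positivity rises, and at high positivity pooling saves nothing, which is why the optimal size is tied to the current test positivity rate.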

  13. Prospective Multiple Issuer Pool Numbers

    • catalog.data.gov
    • datasets.ai
    • +2 more
    Updated Mar 1, 2024
    Cite
    U.S. Department of Housing and Urban Development (2024). Prospective Multiple Issuer Pool Numbers [Dataset]. https://catalog.data.gov/dataset/prospective-multiple-issuer-pool-numbers
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    United States Department of Housing and Urban Development (http://www.hud.gov/)
    Description

    Pool information for available Multiple Issuer Pool for future pooling use

  14. Cold-air pooling characterization and forest composition, New England, USA

    • portal.edirepository.org
    csv
    Updated Mar 1, 2024
    + more versions
    Cite
    Melissa Pastore; Aimée Classen; Anthony D'Amato; Marie English; Karin Rand; Jane Foster; E. Adair (2024). Cold-air pooling characterization and forest composition, New England, USA [Dataset]. http://doi.org/10.6073/pasta/d78f0f01a24c3590d591cc8c32a9c6c3
    Available download formats: csv (2436840 bytes), csv (2104757 bytes), csv (5598 bytes), csv (2038 bytes), csv (2105424 bytes)
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    EDI
    Authors
    Melissa Pastore; Aimée Classen; Anthony D'Amato; Marie English; Karin Rand; Jane Foster; E. Adair
    Time period covered
    2021 - 2022
    Area covered
    Variables measured
    T1, T2, T3, T4, T5, T6, CTI, GWC, SLR, Elev, and 19 more
    Description

    This dataset corresponds to a project investigating whether cold-air pooling influences forest composition and function. The data include hourly sub-canopy air temperatures (measured continuously via ibuttons) and forest composition data for 48 plots along 9 transects in 3 sites across New England, USA. The temperature data also include surface lapse rates and temperature gradients across transects, as well as a designation indicating the presence or absence of a temperature inversion. We found that sites with the most frequent temperature inversions also displayed vegetation inversions across slopes, with more cold-adapted species at low instead of high elevations.

  15. Replication Data for: Scale Effects in Ridesplitting: A Case Study of the...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Liu, Hao; Devunuri, Saipraneeth (2023). Replication Data for: Scale Effects in Ridesplitting: A Case Study of the City of Chicago [Dataset]. http://doi.org/10.7910/DVN/GZMBJG
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Liu, Hao; Devunuri, Saipraneeth
    Area covered
    Chicago
    Description

    This repository contains modified TNC trip data obtained from the Chicago data portal for the year 2019. The raw trip data is first cleaned by removing trivial and erroneous records. This includes short trips with travel times of less than 2 minutes or distances shorter than 0.1 miles. We also exclude entries with a missing pickup or dropoff census tract, i.e., trips originating or ending outside Chicago. Lastly, we remove trips marked as not authorized as shared trips but coded as shared trips. The filtered data is then aggregated by pickup_hour, pickup_date, pickup_day, and pickup_month. For the detour plots, we additionally aggregate by census tract.
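
    The cleaning rules above can be sketched as a simple filter followed by an aggregation. Only pickup_hour is a name taken from the description; every other field name (trip_seconds, trip_miles, pickup_tract, dropoff_tract, shared_authorized, coded_shared) is an assumption made for this illustration and may not match the columns in the repository.

```python
from collections import Counter


def keep_trip(trip):
    """Return True if a trip record passes the cleaning rules described above."""
    # Drop short trips: under 2 minutes or under 0.1 miles.
    if trip["trip_seconds"] < 120 or trip["trip_miles"] < 0.1:
        return False
    # Drop trips with a missing pickup or dropoff census tract.
    if trip["pickup_tract"] is None or trip["dropoff_tract"] is None:
        return False
    # Drop trips coded as shared although sharing was not authorized.
    if trip["coded_shared"] and not trip["shared_authorized"]:
        return False
    return True


def trips_per_pickup_hour(trips):
    """Count retained trips for each pickup hour."""
    return Counter(t["pickup_hour"] for t in trips if keep_trip(t))


# Hypothetical records, for illustration only.
SAMPLE_TRIPS = [
    {"trip_seconds": 600, "trip_miles": 2.5, "pickup_tract": "A", "dropoff_tract": "B",
     "shared_authorized": True, "coded_shared": True, "pickup_hour": 8},
    {"trip_seconds": 90, "trip_miles": 0.4, "pickup_tract": "A", "dropoff_tract": "B",
     "shared_authorized": False, "coded_shared": False, "pickup_hour": 8},
    {"trip_seconds": 900, "trip_miles": 3.0, "pickup_tract": None, "dropoff_tract": "B",
     "shared_authorized": False, "coded_shared": False, "pickup_hour": 9},
    {"trip_seconds": 700, "trip_miles": 1.2, "pickup_tract": "C", "dropoff_tract": "D",
     "shared_authorized": False, "coded_shared": True, "pickup_hour": 9},
    {"trip_seconds": 800, "trip_miles": 1.8, "pickup_tract": "C", "dropoff_tract": "D",
     "shared_authorized": False, "coded_shared": False, "pickup_hour": 9},
]

print(trips_per_pickup_hour(SAMPLE_TRIPS))
```

    Aggregating by pickup_date, pickup_day, pickup_month, or census tract would follow the same pattern with a different grouping key.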

  16. Pooling Shared Information and Knowledge Governance - Dataset - Skeena...

    • data.skeenasalmon.info
    Updated Aug 13, 2020
    Cite
    (2020). Pooling Shared Information and Knowledge Governance - Dataset - Skeena Salmon Data Catalogue [Dataset]. https://data.skeenasalmon.info/dataset/pooling-shared-information-and-knowledge-governance
    Explore at:
    Dataset updated
    Aug 13, 2020
    Description

    This document states the importance of trusted, shared information to create trusted processes, informed decisions and ultimately better outcomes for watersheds in British Columbia. A variety of models for managing and supporting local water and watershed information exist in B.C. They are examples of how Crown and Indigenous governments, local communities, and industry are collaborating to “pool” and integrate various forms of water and watershed data. These initiatives recognize a critical point: that for effective watershed planning, policy/regulation, programs, and decision-making, there needs to be agreement on how a shared foundation of credible information and knowledge is effectively being built and managed. Three initiatives in particular that demonstrate the characteristics of effective knowledge creation and sharing in action are the Columbia Basin Water Monitoring Collaborative, Skeena Knowledge Trust (SKT), and Coast Information Team (CIT).

  17. Outdoor Pools Session Information

    • data.cityofnewyork.us
    • catalog.data.gov
    application/rdfxml +5
    Updated Jan 3, 2025
    Cite
    (2025). Outdoor Pools Session Information [Dataset]. https://data.cityofnewyork.us/Recreation/Outdoor-Pools-Session-Information/82jf-bykm
    Available download formats: csv, application/rdfxml, json, application/rssxml, xml, tsv
    Dataset updated
    Jan 3, 2025
    Description

    The NYC Parks outdoor pool season typically runs from late June to the Sunday after Labor Day. During the season, Parks' staff record data via a mobile app survey at the end of each pool session. The survey includes questions on attendance, staffing, meals, issues, weather conditions, and closures for that specific session.

    NYC Parks operates two sessions at each pool every day of the pool season. First Session is from 11:00am - 3:00pm. Second Session is from 4:00pm - 7:00pm, with the requirement for Olympic / Intermediate pools to stay open for Extended Second Session from 7:00pm - 8:00pm when the City Heat Emergency Plan is activated.

    For each pool season, every pool will have at least two survey submissions per day - one submission for the first session, and one submission for the second session. A pool will have a third submission if it stays open for an extended second session.

    Data Dictionary: https://docs.google.com/spreadsheets/d/15lHSZF76W1cZnjwlWRSn7tzLh6EqVZVeZ2vwDFqXHMM/edit?usp=sharing

    For reference, pool geography from Open Data can be found here: https://data.cityofnewyork.us/City-Government/Pools/3vjv-6tf5

  18. H

    Replication Data for: Scale economies and decline of ride-pooling: A case...

    • dataverse.harvard.edu
    Updated Jul 15, 2025
    Cite
    Ayush Pandey; Lewis Lehe; Vikash Gayah (2025). Replication Data for: Scale economies and decline of ride-pooling: A case study of New York City [Dataset]. http://doi.org/10.7910/DVN/6C2XTQ
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 15, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Ayush Pandey; Lewis Lehe; Vikash Gayah
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    New York
    Description

    This is the replication data for Scale economies and decline of ride-pooling: A case study of New York City. The data include a CSV file with weekly aggregates for all TNC trips in NYC from Feb 2019 to Mar 2023. The TNC trip data are obtained from the NYC TLC High Volume For-Hire Vehicles Trip Records. Additionally, there are hourly aggregates for 2019 and 2023, alongside hourly precipitation obtained from the Open-Meteo dataset.
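As a rough illustration of what a weekly aggregate of trip records involves, the sketch below counts trips per week from pickup timestamps. It is an assumption-laden example, not the authors' method: the function name `weekly_trip_counts`, the Monday week boundary, and the sample timestamps are all hypothetical, and the replication data may define weeks or aggregate other quantities differently.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def weekly_trip_counts(pickup_times):
    """Count trips per week, keyed by the Monday that starts each week."""
    counts = defaultdict(int)
    for ts in pickup_times:
        # weekday() is 0 for Monday, so this steps back to the week's Monday.
        week_start = (ts - timedelta(days=ts.weekday())).date()
        counts[week_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical pickup timestamps, not real trip records.
sample = [
    datetime(2019, 2, 4, 8, 0),
    datetime(2019, 2, 6, 17, 30),
    datetime(2019, 2, 10, 23, 59),
    datetime(2019, 2, 11, 0, 0),
]
for week, n in weekly_trip_counts(sample).items():
    print(week.isoformat(), n)
```

The same grouping key could be truncated to the hour instead of the week to approximate the hourly aggregates mentioned in the description.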

  19. d

    Data to assess silver and bighead carp pool to pool movements from 2012...

    • catalog.data.gov
    • data.usgs.gov
    • +2 more
    Updated Oct 13, 2024
    Cite
    U.S. Geological Survey (2024). Data to assess silver and bighead carp pool to pool movements from 2012 through 2019 in the Illinois River, USA through Bayesian multistate transition models (ver. 2.0, June 2024) [Dataset]. https://catalog.data.gov/dataset/data-to-assess-silver-and-bighead-carp-pool-to-pool-movements-from-2012-through-2019-in-th
    Dataset updated
    Oct 13, 2024
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Illinois River, United States
    Description

    This dataset and its analysis scripts accompany the article "Bayesian multistate models allow incorporation of spatial dynamics to improve invasive species management". The data are summarized detections from acoustic telemetry receivers (69 kHz) for 353 silver carp (Hypophthalmichthys molitrix) and 170 bighead carp (H. nobilis) surgically implanted with transmitters in the Illinois River, USA. The analysis scripts assess probability of detection, probability of monthly movement between navigation pools on the river, probability of apparent survival, and probability of operable transmitter battery through a Bayesian multistate hidden Markov model.

  20. h

    deita-redundant-pool-data

    • huggingface.co
    Updated May 20, 2025
    Cite
    HKUST NLP Group (2025). deita-redundant-pool-data [Dataset]. https://huggingface.co/datasets/hkust-nlp/deita-redundant-pool-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    HKUST NLP Group
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    hkust-nlp/deita-redundant-pool-data dataset hosted on Hugging Face and contributed by the HF Datasets community
