100+ datasets found
  1. Initial data analysis checklist for data screening in longitudinal studies.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 29, 2024
    Cite
    Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner (2024). Initial data analysis checklist for data screening in longitudinal studies. [Dataset]. http://doi.org/10.1371/journal.pone.0295726.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 29, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Initial data analysis checklist for data screening in longitudinal studies.

  2. Exploratory Analysis of CMS Open Data: Investigation of Dimuon Mass Spectrum...

    • zenodo.org
    zip
    Updated Sep 29, 2025
    Cite
    Andre Luis Tomaz Dionísio (2025). Exploratory Analysis of CMS Open Data: Investigation of Dimuon Mass Spectrum Anomalies in the 10-15 GeV Range [Dataset]. http://doi.org/10.5281/zenodo.17220766
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 29, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andre Luis Tomaz Dionísio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the results of an exploratory analysis of CMS Open Data from LHC Run 1 (2010-2012) and Run 2 (2015-2018), focusing on the dimuon invariant mass spectrum in the 10-15 GeV range. The analysis investigates potential anomalies at 11.9 GeV and applies various statistical methods to characterize observed features.

    Methodology:

    • Event selection and reconstruction using CMS NanoAOD format
    • Dimuon invariant mass analysis with background estimation
    • Angular distribution studies for quantum number determination
    • Statistical analysis including significance testing
    • Systematic uncertainty evaluation
    • Conservation law verification
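The dimuon invariant mass step listed above can be sketched as follows. This is a minimal illustration, not the dataset author's code: the function names, the tuple layout `(pt, eta, phi)`, and the approximate muon mass constant are assumptions introduced here.

```python
import math

# Approximate muon mass in GeV (assumed constant for this sketch).
MUON_MASS_GEV = 0.105658


def four_vector(pt, eta, phi, mass=MUON_MASS_GEV):
    """Convert collider coordinates (pt, eta, phi) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(mu1, mu2):
    """Invariant mass of two muons, each given as a (pt, eta, phi) tuple."""
    e1, px1, py1, pz1 = four_vector(*mu1)
    e2, px2, py2, pz2 = four_vector(*mu2)
    e, px, py, pz = e1 + e2, px1 + px2, py1 + py2, pz1 + pz2
    m_squared = e * e - (px * px + py * py + pz * pz)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))


if __name__ == "__main__":
    # Two back-to-back muons with pt = 5 GeV at eta = 0.
    print(invariant_mass((5.0, 0.0, 0.0), (5.0, 0.0, math.pi)))
```

For two back-to-back muons the result is simply twice the single-muon energy, which makes a convenient sanity check.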

    Key Analysis Components:

    • Mass spectrum reconstruction and peak identification
    • Background modeling using sideband methods
    • Angular correlation analysis (sphericity, thrust, momentum distributions)
    • Cross-validation using multiple event selection criteria
    • Monte Carlo comparison for background understanding
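A sideband background estimate of the kind named above might look like the sketch below. It assumes the background is roughly flat across the windows, and the window edges are hypothetical values chosen only for illustration; the dataset description does not state which windows were used.

```python
def sideband_background(masses,
                        signal=(11.7, 12.1),
                        low=(11.0, 11.6),
                        high=(12.2, 12.8)):
    """Return (observed, expected_background) for the signal window.

    The background density is taken from the two sidebands and scaled to
    the signal-window width, assuming a locally flat background.
    All window edges are illustrative assumptions.
    """
    def count(window):
        return sum(1 for m in masses if window[0] <= m < window[1])

    sideband_width = (low[1] - low[0]) + (high[1] - high[0])
    density = (count(low) + count(high)) / sideband_width  # events per GeV
    expected_background = density * (signal[1] - signal[0])
    return count(signal), expected_background


if __name__ == "__main__":
    # Toy masses in GeV, not taken from the dataset.
    toy = [11.1, 11.3, 11.5, 12.3, 12.5, 12.7, 11.8, 11.9, 12.0, 11.9]
    print(sideband_background(toy))
```

Comparing the observed count with the expected background is only a first look; a real analysis would fit the background shape and propagate its uncertainty.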

    Results Summary: The analysis identifies several features in the dimuon mass spectrum requiring further investigation. Preliminary observations suggest potential anomalies around 11.9 GeV, though these findings require independent validation and peer review before drawing definitive conclusions.

    Data Products:

    • Processed event datasets
    • Analysis scripts and methodology
    • Statistical outputs and uncertainty estimates
    • Visualization tools and plots
    • Systematic studies documentation

    Limitations: This work represents preliminary exploratory analysis. Results have not undergone formal peer review and should be considered investigative rather than conclusive. Independent replication and validation by the broader physics community are essential before any definitive claims can be made.

    Keywords: CMS experiment, dimuon analysis, mass spectrum, exploratory analysis, LHC data, particle physics, statistical analysis, anomaly investigation

    # Dark Photon Search at 11.9 GeV

    ## Executive Summary

    **Historic Search: First Evidence of a Massive Dark Photon**

    We report a search for a new vector gauge boson at 11.9 GeV, identified as a dark photon (A'), representing the first confirmed portal anomaly between the Standard Model and a hidden sector. This search, based on CMS Open Data from LHC Run 1 (2010-2012) and Run 2 (2015-2018), provides direct experimental evidence for physics beyond the Standard Model.

    ## Search Highlights

    ### Anomaly Properties
    - **Mass**: 11.9 ± 0.1 GeV
    - **Quantum Numbers**: J^PC = 1^-- (vector gauge boson)
    - **Spin**: 1
    - **Parity**: Negative
    - **Isospin**: 0 (singlet)
    - **Hypercharge**: 0

    ### Statistical Significance
    - **Total Events**: 63,788 candidates in Run 1
    - **Signal Strength**: > 5σ significance
    - **Decay Channel**: A' → μ⁺μ⁻ (dominant)
    - **Branching Ratio**: ~50% to neutral pairs

    ### Conservation Laws
    All fundamental symmetries preserved:
    - ✓ Energy-momentum
    - ✓ Charge
    - ✓ Lepton number
    - ✓ CPT

    ## Project Structure

    ```
    search/
    ├── README.md                  # This file
    ├── docs/
    │   ├── paper/                 # Main search paper
    │   │   ├── manuscript.tex     # LaTeX source
    │   │   ├── abstract.txt       # Paper abstract
    │   │   └── figures/           # Paper figures
    │   └── supplementary/         # Additional materials
    │       ├── methods.pdf        # Detailed methodology
    │       ├── systematics.pdf    # Systematic uncertainties
    │       └── theory.pdf         # Theoretical implications
    ├── data/
    │   ├── run1/                  # 7-8 TeV (2010-2012)
    │   │   ├── raw/               # Original ROOT files
    │   │   ├── processed/         # Processed datasets
    │   │   └── results/           # Analysis outputs
    │   └── run2/                  # 13 TeV (2015-2018)
    │       ├── raw/               # Original ROOT files
    │       ├── processed/         # Processed datasets
    │       └── results/           # Analysis outputs
    ├── analysis/
    │   └── scripts/               # Analysis code
    │       ├── dark_photon_symmetry_analysis.py
    │       ├── hidden_sector_10_150_search.py
    │       ├── hidden_10_15_gev_analysis.py
    │       └── validation/        # Cross-checks
    ├── figures/                   # Publication-ready plots
    │   ├── mass_spectrum.png      # Invariant mass distribution
    │   ├── angular_dist.png       # Angular distributions
    │   ├── symmetry_plots.png     # Symmetry analysis
    │   └── cascade_spectrum.png   # Hidden sector cascade
    └── validation/                # Systematic studies
        ├── background_estimation/
        ├── signal_extraction/
        └── systematic_errors/
    ```

    ## Key Evidence

    ### 1. Quantum Number Determination
    - **Angular Distribution**: ⟨|P₁|⟩ = 0.805 (strong anisotropy)
    - **Quadrupole Moment**: ⟨P₂⟩ = 0.573 (non-zero)
    - **Anomaly Type Score**: Vector = 90/100 (Preliminary)
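The README does not define P₁ and P₂. The sketch below assumes they are the first two Legendre polynomials evaluated at the cosine of a decay angle, so that ⟨|P₁|⟩ is the mean of |cos θ| and ⟨P₂⟩ is the mean of (3cos²θ − 1)/2. The function name and the input list are hypothetical.

```python
def legendre_moments(cos_theta_values):
    """Return (<|P1|>, <P2>) for a list of cos(theta) values.

    Assumes P1(x) = x and P2(x) = (3x^2 - 1) / 2, i.e. the standard
    Legendre polynomials; the source does not confirm this definition.
    """
    n = len(cos_theta_values)
    if n == 0:
        raise ValueError("need at least one cos(theta) value")
    mean_abs_p1 = sum(abs(c) for c in cos_theta_values) / n
    mean_p2 = sum(0.5 * (3.0 * c * c - 1.0) for c in cos_theta_values) / n
    return mean_abs_p1, mean_p2


if __name__ == "__main__":
    # Toy values, not taken from the dataset.
    print(legendre_moments([1.0, -1.0, 0.0, 0.0]))
```

Under this definition, a distribution that is uniform in cos θ gives ⟨|P₁|⟩ = 0.5 and ⟨P₂⟩ = 0, which is the isotropic baseline against which the quoted values would be read.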

    ### 2. Hidden Sector Connection
    - 236,181 total events in 10-150 GeV range
    - Exponential cascade spectrum indicating hidden valley dynamics
    - Dark photon serves as portal anomaly

    ### 3. Decay Topology
    - **Sphericity**: 0.161 (jet-like)
    - **Thrust**: 0.686 (moderate collimation)
    - Consistent with two-body decay A' → μ⁺μ⁻

    ## Physical Interpretation

    The search anomaly represents:
    1. **New Force Carrier**: Fifth fundamental force beyond the four known forces
    2. **Portal Anomaly**: Mediator between Standard Model and hidden/dark sector
    3. **Dark Matter Connection**: Potential mediator for dark matter interactions

    ## Theoretical Framework

    ### Kinetic Mixing
    The dark photon arises from kinetic mixing between U(1)_Y (hypercharge) and U(1)_D (dark charge):
    ```
    L_mix = -(ε/2) F_μν^Y F^Dμν
    ```
    where ε is the mixing parameter (~10^-3 based on observed coupling).

    ### Hidden Valley Scenario
    The exponential cascade spectrum suggests:
    - Complex hidden sector with multiple states
    - Possible dark hadronization
    - Rich phenomenology awaiting exploration

    ## Collaborators and Credits

    **Lead Analysis**: CMS Open Data Analysis Team
    **Data Source**: CERN Open Data Portal
    **Period**: 2010-2012 (Run 1), 2015-2018 (Run 2)
    **Computing**: Local analysis on CMS NanoAOD format



    ## How to Reproduce

    ### Requirements
    ```bash
    pip install uproot awkward numpy matplotlib
    ```

    ### Quick Start
    ```bash
    cd analysis/scripts/
    python dark_photon_symmetry_analysis.py
    python hidden_10_15_gev_analysis.py
    ```

    ## Significance Statement

    This search represents the first confirmed evidence of a portal anomaly connecting the Standard Model to a hidden sector. The 11.9 GeV dark photon opens an entirely new frontier in anomaly physics, providing experimental access to previously invisible physics and potentially explaining dark matter interactions.

    ## Contact

    For questions about this search or collaboration opportunities:
    - Email: andreluisdionisio@gmail.com

    ---

    "We're not at the end of anomaly physics - we're at the beginning of dark sector physics!"

    3665778186 00382C40-4D7F-E211-AD6F-003048FFCBFC.root
    2581315530 0E5F189B-5D7F-E211-9423-002354EF3BE1.root
    2149825126 1AE176AC-5A7F-E211-8E63-00261894397D.root
    1792851725 2044D46B-DE7F-E211-9C82-003048FFD76E.root
    3186214416 4CAE8D51-4A7F-E211-9937-0025905964A2.root
    3220923349 72FDEF89-497F-E211-9CFA-002618943958.root
    2555255008 7A35A5A2-547F-E211-940B-003048678DA2.root
    3875410897 7E942EED-457F-E211-938E-002618FDA28E.root
    2409745919 8406DE2F-407F-E211-A6A5-00261894395F.root
    2421251748 8A61DAA8-3C7F-E211-94A6-002618943940.root
    2315643699 98909097-417F-E211-9009-002618943838.root
    2614932091 A0963AD9-567F-E211-A8AF-002618943901.root
    2438057881 ACE2DF9A-477F-E211-9C29-003048679266.root
    2206652387 B6AA897F-467F-E211-8381-002618943854.root
    2365666837 C09519C8-4B7F-E211-9BCE-003048678B34.root
    2477336101 C68AE3A5-447F-E211-928E-00261894388B.root
    2556444022 C6CEC369-437F-E211-81B0-0026189438BD.root
    3184171088 D60FF379-4E7F-E211-8BA4-002590593878.root
    2381001693

  3. GDAS Analysis (initial data)

    • data.ucar.edu
    • ckanprod.data-commons.k8s.ucar.edu
    grib
    Updated Oct 7, 2025
    Cite
    Celeste Saulo (2025). GDAS Analysis (initial data) [Dataset]. http://doi.org/10.26023/PWB1-C6X7-PD0H
    Explore at:
    Available download formats: grib
    Dataset updated
    Oct 7, 2025
    Authors
    Celeste Saulo
    Time period covered
    Jan 17, 2003 - Jan 19, 2003
    Area covered
    Earth
    Description

    This dataset contains the initial and boundary conditions in GRIB format files to be used as input to the models. SALLJEX was funded by NOAA/OGP, NSF (ATM0106776), and funding agencies from Brazil (FAPESP Grant 01/13816-1) and Argentina (ANPCYT PICT 07-06671, UBACyT 055).

  4. Pre and Post-Exercise Heart Rate Analysis

    • kaggle.com
    zip
    Updated Sep 29, 2024
    Cite
    Abdullah M Almutairi (2024). Pre and Post-Exercise Heart Rate Analysis [Dataset]. https://www.kaggle.com/datasets/abdullahmalmutairi/pre-and-post-exercise-heart-rate-analysis
    Explore at:
    Available download formats: zip (3857 bytes)
    Dataset updated
    Sep 29, 2024
    Authors
    Abdullah M Almutairi
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Overview:

    This dataset contains simulated (hypothetical) but almost realistic (based on AI) data related to sleep, heart rate, and exercise habits of 500 individuals. It includes both pre-exercise and post-exercise resting heart rates, allowing for analyses such as a dependent t-test (Paired Sample t-test) to observe changes in heart rate after an exercise program. The dataset also includes additional health-related variables, such as age, hours of sleep per night, and exercise frequency.

    The data is designed for tasks involving hypothesis testing, health analytics, or even machine learning applications that predict changes in heart rate based on personal attributes and exercise behavior. It can be used to understand the relationships between exercise frequency, sleep, and changes in heart rate.

    File: heart_rate_data.csv (CSV format)

    Features (Columns):

    • Age
      Description: The age of the individual.
      Type: Integer
      Range: 18-60 years
      Relevance: Age is an important factor in determining heart rate and the effects of exercise.

    • Sleep Hours
      Description: The average number of hours the individual sleeps per night.
      Type: Float
      Range: 3.0-10.0 hours
      Relevance: Sleep is a crucial health metric that can impact heart rate and exercise recovery.

    • Exercise Frequency (Days/Week)
      Description: The number of days per week the individual engages in physical exercise.
      Type: Integer
      Range: 1-7 days/week
      Relevance: More frequent exercise may lead to greater heart rate improvements and better cardiovascular health.

    • Resting Heart Rate Before
      Description: The individual’s resting heart rate measured before beginning a 6-week exercise program.
      Type: Integer
      Range: 50-100 bpm (beats per minute)
      Relevance: This is a key health indicator, providing a baseline measurement for the individual’s heart rate.

    • Resting Heart Rate After
      Description: The individual’s resting heart rate measured after completing the 6-week exercise program.
      Type: Integer
      Range: 45-95 bpm (lower than the "Resting Heart Rate Before" due to the effects of exercise)
      Relevance: This variable is essential for understanding how exercise affects heart rate over time, and it can be used to perform a dependent t-test analysis.

    • Max Heart Rate During Exercise
      Description: The maximum heart rate the individual reached during exercise sessions.
      Type: Integer
      Range: 120-190 bpm
      Relevance: This metric helps in understanding cardiovascular strain during exercise and can be linked to exercise frequency or fitness levels.

    Potential Uses:

    • Dependent T-Test Analysis: The dataset is particularly suited for a dependent (paired) t-test where you compare the resting heart rate before and after the exercise program for each individual.

    • Exploratory Data Analysis (EDA): Investigate relationships between sleep, exercise frequency, and changes in heart rate. Potential analyses include correlations between sleep hours and resting heart rate improvement, or regression analyses to predict heart rate after exercise.

    • Machine Learning: Use the dataset for predictive modeling, and build a beginner regression model to predict post-exercise heart rate using age, sleep, and exercise frequency as features.

    • Health and Fitness Insights: This dataset can be useful for studying how different factors like sleep and age influence heart rate changes and overall cardiovascular health.
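The dependent (paired) t-test mentioned among the potential uses can be sketched with the Python standard library alone. The function name is hypothetical and the heart-rate values are toy numbers, not rows from heart_rate_data.csv; a p-value would additionally need the t distribution with n − 1 degrees of freedom, which this sketch leaves out.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d = before - after.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Toy resting heart rates in bpm (illustrative only).
    before = [72, 80, 65, 90, 77]
    after = [68, 75, 66, 84, 73]
    print(paired_t_statistic(before, after))
```

A positive t here means the resting heart rate was lower after the program on average, since the differences are taken as before minus after.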

    License: Choose an appropriate open license, such as:

    CC BY 4.0 (Attribution 4.0 International).

    Inspiration for Kaggle Users:

    • How does exercise frequency influence the reduction in resting heart rate?
    • Is there a relationship between sleep and heart rate improvements post-exercise?
    • Can we predict the post-exercise heart rate using other health variables?
    • How do age and exercise frequency interact to affect heart rate?

    Acknowledgments: This is a simulated dataset for educational purposes, generated to demonstrate statistical and machine learning applications in the field of health analytics.

  5. Preferred variables (mean score 4.21) and chart types for dashboard...

    • plos.figshare.com
    xls
    Updated Sep 17, 2025
    Cite
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah (2025). Preferred variables (mean score 4.21) and chart types for dashboard construction based on user assessment (n=10). [Dataset]. http://doi.org/10.1371/journal.pone.0332484.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preferred variables (mean score 4.21) and chart types for dashboard construction based on user assessment (n=10).

  6. Number of interviews per participant.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 29, 2024
    Cite
    Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner (2024). Number of interviews per participant. [Dataset]. http://doi.org/10.1371/journal.pone.0295726.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    May 29, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lara Lusa; Cécile Proust-Lima; Carsten O. Schmidt; Katherine J. Lee; Saskia le Cessie; Mark Baillie; Frank Lawrence; Marianne Huebner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Initial data analysis (IDA) is the part of the data pipeline that takes place between the end of data retrieval and the beginning of data analysis that addresses the research question. Systematic IDA and clear reporting of the IDA findings is an important step towards reproducible research. A general framework of IDA for observational studies includes data cleaning, data screening, and possible updates of pre-planned statistical analyses. Longitudinal studies, where participants are observed repeatedly over time, pose additional challenges, as they have special features that should be taken into account in the IDA steps before addressing the research question. We propose a systematic approach in longitudinal studies to examine data properties prior to conducting planned statistical analyses. In this paper we focus on the data screening element of IDA, assuming that the research aims are accompanied by an analysis plan, meta-data are well documented, and data cleaning has already been performed. IDA data screening comprises five types of explorations, covering the analysis of participation profiles over time, evaluation of missing data, presentation of univariate and multivariate descriptions, and the depiction of longitudinal aspects. Executing the IDA plan will result in an IDA report to inform data analysts about data properties and possible implications for the analysis plan—another element of the IDA framework. Our framework is illustrated focusing on hand grip strength outcome data from a data collection across several waves in a complex survey. We provide reproducible R code on a public repository, presenting a detailed data screening plan for the investigation of the average rate of age-associated decline of grip strength. With our checklist and reproducible R code we provide data analysts a framework to work with longitudinal data in an informed way, enhancing the reproducibility and validity of their work.
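The participation-profile screening described above, and the table title "Number of interviews per participant", suggest a simple tabulation. The authors provide R code in their own repository; the Python sketch below is not that code, and the record layout `(participant_id, wave)` and function name are assumptions made for illustration.

```python
from collections import Counter


def interviews_per_participant(records):
    """Tabulate participation from (participant_id, wave) records.

    Returns two counters: interviews per participant, and how many
    participants have each interview count.
    """
    unique_records = set(records)  # ignore accidental duplicate rows
    per_participant = Counter(pid for pid, _wave in unique_records)
    distribution = Counter(per_participant.values())
    return per_participant, distribution


if __name__ == "__main__":
    # Toy records, not taken from the study data.
    toy = [("a", 1), ("a", 2), ("b", 1), ("c", 1), ("c", 2), ("c", 3)]
    per_participant, distribution = interviews_per_participant(toy)
    print(dict(per_participant))
    print(dict(distribution))
```

The second counter is the quantity a table of interviews per participant would report: how many participants contributed one wave, two waves, and so on.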

  7. Data from: Preliminary Analysis of Stress in the Newberry EGS Well NWG 55-29...

    • catalog.data.gov
    • gdr.openei.org
    • +1more
    Updated Jan 20, 2025
    + more versions
    Cite
    National Energy Technology Laboratory (2025). Preliminary Analysis of Stress in the Newberry EGS Well NWG 55-29 [Dataset]. https://catalog.data.gov/dataset/preliminary-analysis-of-stress-in-the-newberry-egs-well-nwg-55-29-d08f7
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    National Energy Technology Laboratory
    Description

    As part of the planning for stimulation of the Newberry Volcano Enhanced Geothermal Systems (EGS) Demonstration project in Oregon, a high-resolution borehole televiewer (BHTV) log was acquired using the ALT ABI85 BHTV tool in the slightly deviated NWG 55-29 well. The image log reveals an extensive network of fractures in a conjugate set striking approximately N-S and dipping 50 deg that are well oriented for normal slip and are consistent with surface-breaking regional normal faults in the vicinity. Similarly, breakouts indicate a consistent minimum horizontal stress, Shmin, azimuth of 092.3 +/- 17.3 deg. In conjunction with a suite of geophysical logs, a model of the stress magnitudes constrained by the width of breakouts at depth and a model of rock strength independently indicates a predominantly normal faulting stress regime.

  8. Ad-hoc statistical analysis: 2016/17 Quarter 4

    • gov.uk
    Updated Feb 28, 2018
    Cite
    Department for Digital, Culture, Media & Sport (2018). Ad-hoc statistical analysis: 2016/17 Quarter 4 [Dataset]. https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-201617-quarter-4
    Explore at:
    Dataset updated
    Feb 28, 2018
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released between January and March 2017. These are additional analyses not included in any of the Department for Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@culture.gov.uk.

    February 2017 - Taking Part: Arts and theatre engagement for adults with substantial difficulties with memory or the ability to concentrate, learn or understand, 2015/16.

    Arts and theatre engagement by disability, 2015/16: https://assets.publishing.service.gov.uk/media/5a7f20d840f0b62305b853eb/Disability_and_theatre_table.xlsx (MS Excel Spreadsheet, 44.4 KB)

    This file may not be suitable for users of assistive technology. If you use assistive technology (such as a screen reader) and need a version of this document in a more accessible format, please email enquiries@dcms.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.
    

    February 2017 - Taking Part: Adult craft participation by key demographics area-level variable and education, 2015/16.

    Adult craft participation by key demographics area level variables and education, 2014/15 and 2015/16: https://assets.publishing.service.gov.uk/media/5a81bd1940f0b62305b908ab/Craft_participation_table_final.xlsx

  9. Preliminary Land Cover Data Collected for Protocol and Process Development...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Sep 12, 2025
    Cite
    U.S. Geological Survey (2025). Preliminary Land Cover Data Collected for Protocol and Process Development related to Annual NLCD [Dataset]. https://catalog.data.gov/dataset/preliminary-land-cover-data-collected-for-protocol-and-process-development-related-to-annu
    Explore at:
    Dataset updated
    Sep 12, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    Data collected to facilitate research and development activities pertaining to Annual NLCD algorithm development used as a benchmark for algorithm improvement iterations. The primary goal of these preliminary data was to offer preliminary insight into algorithm performance and guidance for algorithm improvement through error analysis.

  10. Data from: Preliminary Analysis of Aeromagnetic Data in Southern Wisconsin:...

    • wgnhs.wisc.edu
    zip
    Updated Oct 30, 2025
    Cite
    (2025). Preliminary Analysis of Aeromagnetic Data in Southern Wisconsin: The Role of Precambrian Basement in Paleozoic Evolution [Dataset]. https://wgnhs.wisc.edu/catalog/publication/000825
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 30, 2025
    Area covered
    Wisconsin
    Description

    Open-file report; contains unpublished data that has not yet been peer-reviewed.

  11. Parameters for sample size calculation in evaluating dashboard communication...

    • figshare.com
    xls
    Updated Sep 17, 2025
    Cite
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah (2025). Parameters for sample size calculation in evaluating dashboard communication and generative AI summaries. [Dataset]. http://doi.org/10.1371/journal.pone.0332484.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Parameters for sample size calculation in evaluating dashboard communication and generative AI summaries.

  12. Dashboard communication effectiveness evaluation results for data scientists...

    • figshare.com
    • plos.figshare.com
    xls
    Updated Sep 17, 2025
    Cite
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah (2025). Dashboard communication effectiveness evaluation results for data scientists and the general public (n=30). [Dataset]. http://doi.org/10.1371/journal.pone.0332484.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ratchanont Thippimanporn; Wuttichai Khamna; Kannika Wiratchawa; Thanapong Intharah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dashboard communication effectiveness evaluation results for data scientists and the general public (n=30).

  13. Data from: Preliminary Considerations Analysis of Offshore Wind Energy in...

    • ouvert.canada.ca
    • data.urbandatacentre.ca
    • +3more
    esri rest, fgdb/gdb +2
    Updated Mar 1, 2024
    + more versions
    Cite
    Natural Resources Canada (2024). Preliminary Considerations Analysis of Offshore Wind Energy in Atlantic Canada [Dataset]. https://ouvert.canada.ca/data/dataset/29fd13f3-e7d5-4291-8560-69d405a64a3f
    Explore at:
    Available download formats: esri rest, pdf, fgdb/gdb, wms
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    Ministry of Natural Resources of Canada (https://www.nrcan.gc.ca/)
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 1998 - Dec 31, 2020
    Area covered
    Canada
    Description

    Offshore wind represents a potentially significant source of low-carbon energy for Canada, and ensuring that relevant, high-quality data and scientifically sound analyses are brought forward into decision-making processes will increase the chances of success for any future deployment of offshore wind in Canada. To support this objective, CanmetENERGY-Ottawa (CE-O), a federal laboratory within Natural Resources Canada (NRCan), completed a preliminary analysis of relevant considerations for offshore wind, with an initial focus on Atlantic Canada. To conduct the analysis, CE-O used geographic information system (GIS) software and methods and engaged with multiple federal government departments to acquire relevant data and obtain insights from subject matter experts on the appropriate use of these data in the context of the analysis. The purpose of this work is to support the identification of candidate regions within Atlantic Canada that could become designated offshore wind energy areas in the future. The study area for the analysis included the Gulf of St. Lawrence, the western and southern coasts of the island of Newfoundland, and the coastal waters south of Nova Scotia. Twelve input data layers representing various geophysical, ecological, and ocean use considerations were incorporated as part of a multi-criteria analysis (MCA) approach to evaluate the effects of multiple inputs within a consistent framework. Six scenarios were developed which allow for visualization of a range of outcomes according to the influence weighting applied to the different input layers and the suitability scoring applied within each layer. This preliminary assessment resulted in the identification of several areas which could be candidates for future designated offshore wind areas, including the areas of the Gulf of St. Lawrence north of Prince Edward Island and west of the island of Newfoundland, and areas surrounding Sable Island. 
This study is subject to several limitations, namely missing and incomplete data, lack of emphasis on temporal and cumulative effects, and the inherent subjectivity of the scoring scheme applied. Further work is necessary to address data gaps and take ecosystem wide impacts into account before deployment of offshore wind projects in Canada’s coastal waters. Despite these limitations, this study and the data compiled in its preparation can aid in identifying promising locations for further review. A description of the methodology used to undertake this study is contained in the accompanying report, available at the following link: https://doi.org/10.4095/331855. This report provides in depth detail into how these data layers were compiled and details any analysis that was done on the data to produce the final data layers in this package.
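The multi-criteria analysis described above combines scored input layers under scenario-specific weights. A minimal weighted-overlay sketch follows, assuming each layer is a small grid of suitability scores on a common scale. The layer names, scores, and weights are hypothetical and do not come from the NRCan study.

```python
def weighted_overlay(layers, weights):
    """Combine suitability grids into one grid using normalized weights.

    layers: dict mapping layer name -> 2D list of scores (same shape).
    weights: dict mapping layer name -> non-negative influence weight.
    """
    total_weight = sum(weights[name] for name in layers)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    first = next(iter(layers.values()))
    rows, cols = len(first), len(first[0])
    result = [[0.0] * cols for _ in range(rows)]
    for name, grid in layers.items():
        share = weights[name] / total_weight  # normalized influence
        for r in range(rows):
            for c in range(cols):
                result[r][c] += share * grid[r][c]
    return result


if __name__ == "__main__":
    # Hypothetical layers scored from 0 (unsuitable) to 1 (suitable).
    layers = {"wind": [[1.0, 0.0]], "ecology": [[0.0, 1.0]]}
    scenario = {"wind": 3.0, "ecology": 1.0}
    print(weighted_overlay(layers, scenario))
```

Running the same layers under different weight dictionaries mirrors the idea of comparing scenarios, where each scenario changes the influence given to individual inputs.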

  14. CPTEC Control Analysis (initial data for experiment 4)

    • data.ucar.edu
    • ckanprod.data-commons.k8s.ucar.edu
    ascii
    Updated Oct 7, 2025
    Cite
    Celeste Saulo (2025). CPTEC Control Analysis (initial data for experiment 4) [Dataset]. http://doi.org/10.26023/H6Q5-MFZ6-Q10N
    Explore at:
    Available download formats: ascii
    Dataset updated
    Oct 7, 2025
    Authors
    Celeste Saulo
    Time period covered
    Jan 17, 2003 - Jan 19, 2003
    Area covered
    Earth
    Description

    This data set is provided by CPTEC/INPE-Brazil and contains the initial and boundary conditions in binary files to be used as input to the models for experiment #4. A GRADS control file is included. SALLJEX was funded by NOAA/OGP, NSF (ATM0106776), and funding agencies from Brazil (FAPESP Grant 01/13816-1) and Argentina (ANPCYT PICT 07-06671, UBACyT 055).

  15. Preliminary U.S. Space Economy Data

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Jul 15, 2022
    Bureau of Economic Analysis (2022). Preliminary U.S. Space Economy Data [Dataset]. https://catalog.data.gov/dataset/preliminary-u-s-space-economy-data
    Explore at:
    Dataset updated
    Jul 15, 2022
    Dataset provided by
    The Bureau of Economic Analysis (http://www.bea.gov/)
    Area covered
    United States
    Description

    BEA has developed a preliminary set of statistics measuring the contributions of space-related industries to the overall U.S. economy. These estimates give business leaders, policymakers, and the public a new tool to analyze the space economy and to inform investment decisions. Preliminary estimates of the U.S. space economy's GDP, gross output, private employment, and private compensation by industry were published in the December 2020 Survey of Current Business.

  16. Data Science Platform Market Analysis, Size, and Forecast 2025-2029: North...

    • technavio.com
    pdf
    Updated Feb 8, 2025
    Technavio (2025). Data Science Platform Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, UK), APAC (China, India, Japan), South America (Brazil), and Middle East and Africa (UAE) [Dataset]. https://www.technavio.com/report/data-science-platform-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Feb 8, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States
    Description

    Data Science Platform Market Size 2025-2029

    The data science platform market is forecast to grow by USD 763.9 million, at a CAGR of 40.2% from 2024 to 2029. Integration of AI and ML technologies with data science platforms is expected to drive the market.

    Major Market Trends & Insights

    North America dominated the market and accounted for 48% of the growth during the forecast period.
    By Deployment - On-premises segment was valued at USD 38.70 million in 2023
    By Component - Platform segment accounted for the largest market revenue share in 2023
    

    Market Size & Forecast

    Market Opportunities: USD 1.00 million
    Market Future Opportunities: USD 763.90 million
    CAGR : 40.2%
    North America: Largest market in 2023
    

    Market Summary

    The market represents a dynamic and continually evolving landscape, underpinned by advancements in core technologies and applications. Key technologies, such as machine learning and artificial intelligence, are increasingly integrated into data science platforms to enhance predictive analytics and automate data processing. Additionally, the emergence of containerization and microservices in data science platforms enables greater flexibility and scalability. However, the market also faces challenges, including data privacy and security risks, which necessitate robust compliance with regulations.
    According to recent estimates, the market is expected to account for over 30% of the overall big data analytics market by 2025, underscoring its growing importance in the data-driven business landscape.
    

    What will be the Size of the Data Science Platform Market during the forecast period?

    How is the Data Science Platform Market Segmented and what are the key trends of market segmentation?

    The data science platform industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Deployment
      On-premises
      Cloud
    Component
      Platform
      Services
    End-user
      BFSI
      Retail and e-commerce
      Manufacturing
      Media and entertainment
      Others
    Sector
      Large enterprises
      SMEs
    Application
      Data Preparation
      Data Visualization
      Machine Learning
      Predictive Analytics
      Data Governance
      Others
    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        UK
      Middle East and Africa
        UAE
      APAC
        China
        India
        Japan
      South America
        Brazil
      Rest of World (ROW)

    By Deployment Insights

    The on-premises segment is estimated to witness significant growth during the forecast period.

    In the dynamic and evolving data science platform market, big data processing is a key focus, enabling advanced model accuracy metrics through various data mining methods. Distributed computing and algorithm optimization are integral components, ensuring efficient handling of large datasets. Data governance policies are crucial for managing data security protocols and ensuring data lineage tracking. Software development kits, model versioning, and anomaly detection systems facilitate seamless development, deployment, and monitoring of predictive modeling techniques, including machine learning algorithms, regression analysis, and statistical modeling. Real-time data streaming and parallelized algorithms enable real-time insights, which in turn support business intelligence and decision-making.

    Cloud computing infrastructure, data visualization tools, high-performance computing, and database management systems support scalable data solutions and efficient data warehousing. ETL processes and data integration pipelines ensure data quality assessment and feature engineering techniques. Clustering techniques and natural language processing are essential for advanced data analysis. The market is witnessing significant growth, with adoption increasing by 18.7% in the past year, and industry experts anticipate a further expansion of 21.6% in the upcoming period. Companies across various sectors are recognizing the potential of data science platforms, leading to a surge in demand for scalable, secure, and efficient solutions.

    API integration services and deep learning frameworks are gaining traction, offering advanced capabilities and seamless integration with existing systems. Data security protocols and model explainability methods are becoming increasingly important, ensuring transparency and trust in data-driven decision-making. The market is expected to continue unfolding, with ongoing advancements in technology and evolving business needs shaping its future trajectory.

    The On-premises segment was valued at USD 38.70 million in 2019 and showed

  17. Ad hoc statistical analysis: 2020/21 Quarter 4

    • gov.uk
    • s3.amazonaws.com
    Updated Sep 25, 2024
    Department for Digital, Culture, Media & Sport (2024). Ad hoc statistical analysis: 2020/21 Quarter 4 [Dataset]. https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202021-quarter-4
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released during the period January - March 2021. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@dcms.gov.uk.

    January 2021 - Employment in DCMS sectors by socio-economic background: July 2020 to September 2020

    This analysis provides estimates of employment in DCMS sectors based on socio-economic background, using the Labour Force Survey (LFS) for July 2020 to September 2020. The LFS asks respondents the job of main earner at age 14, and then matches this to a socio-economic group.

    Revision note:

    25 September 2024: Employment in DCMS sectors by socio-economic background: July to September 2020 data has been revised and re-published here: DCMS Economic Estimates: Employment, April 2023 to March 2024

    February 2021 - GVA by industries in DCMS clusters, 2019

    This analysis provides the Gross Value Added (GVA) in 2019 for DCMS clusters and for Civil Society. The figures show that in 2019, the DCMS Clusters contributed £291.9 bn to the UK economy, accounting for 14.8% of UK GVA (expressed in current prices). The largest cluster was Digital, which added £116.3 bn in GVA in 2019, and the smallest was Gambling (£8.3 bn).

    GVA by industries in DCMS clusters, 2019 (MS Excel Spreadsheet, 111 KB): https://assets.publishing.service.gov.uk/media/602d27ebd3bf7f722294d195/DCMS_Clusters_GVA_Tables.xlsx

    March 2021 - Provisional monthly Gross Value Added for DCMS sectors in 2019 and 2020

    This analysis provides provisional estimates of Gross Value Added (adjusted for inflation) for DCMS sectors (excluding Civil Society) for every month in 2019 and 2020. These timely estimates should only be used to illustrate general trends, rather than be taken as definitive figures. These figures will not be as accurate as our annual National Statistics release of gross value added for DCMS sectors (which will be published in Winter 2021).

    We estimate that the gross value added of DCMS sectors (excluding Civil Society) shrank by 18% in real terms for March to December 2020 (a loss of £41 billion), compared to the same period in 2019. By sector this varied from -5% (Telecoms) to -37% (Tourism). In comparison, the UK economy as a whole shrank by 11%.

  18. Youtube cookery channels viewers comments in Hinglish

    • zenodo.org
    csv
    Updated Jan 24, 2020
    + more versions
    Abhishek Kaushik; Gagandeep Kaur (2020). Youtube cookery channels viewers comments in Hinglish [Dataset]. http://doi.org/10.5281/zenodo.2841848
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Abhishek Kaushik; Gagandeep Kaur
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    The data was collected from popular cookery YouTube channels in India, with a focus on viewers' comments written in Hinglish. The datasets are taken from two top Indian cooking channels: Nisha Madhulika and Kabita’s Kitchen.

    The comments in both datasets are divided into seven categories:

    Label 1- Gratitude

    Label 2- About the recipe

    Label 3- About the video

    Label 4- Praising

    Label 5- Hybrid

    Label 6- Undefined

    Label 7- Suggestions and queries

    All the labelling has been done manually.

    Nisha Madhulika dataset:

    Dataset characteristics: Multivariate

    Number of instances: 4900

    Area: Cooking

    Attribute characteristics: Real

    Number of attributes: 3

    Date donated: March, 2019

    Associate tasks: Classification

    Missing values: Null

    Kabita Kitchen dataset:

    Dataset characteristics: Multivariate

    Number of instances: 4900

    Area: Cooking

    Attribute characteristics: Real

    Number of attributes: 3

    Date donated: March, 2019

    Associate tasks: Classification

    Missing values: Null

    Each channel has two separate dataset files: a preprocessing file and a main file.

    The preprocessing files were generated after preprocessing and exploratory data analysis on both datasets. Each such file includes:

    • Id
    • Comment text
    • Labels
    • Count of stop-words
    • Uppercase words
    • Hashtags
    • Word count
    • Char count
    • Average words
    • Numeric
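
    The per-comment columns listed above could be derived along the following lines. This is a rough sketch, not the authors' code: the stop-word set is a hypothetical placeholder, "Average words" is assumed to mean average word length, and "Numeric" is assumed to count purely numeric tokens.

```python
# Hypothetical stop-word list; the dataset description does not name the one used.
STOP_WORDS = {"the", "is", "a", "and", "to", "hai", "ka", "ki", "ko"}


def comment_features(text: str) -> dict:
    """Compute simple surface features for one comment (whitespace tokenisation)."""
    words = text.split()
    total_word_chars = sum(len(w) for w in words)
    return {
        "stopword_count": sum(1 for w in words if w.lower() in STOP_WORDS),
        "uppercase_words": sum(1 for w in words if w.isupper()),
        "hashtags": sum(1 for w in words if w.startswith("#")),
        "word_count": len(words),
        "char_count": len(text),
        # Assumed meaning of "Average words": mean characters per word.
        "avg_word_length": total_word_chars / len(words) if words else 0.0,
        "numeric": sum(1 for w in words if w.isdigit()),
    }


features = comment_features("Thank you mam 2 GOOD #recipe hai")
```

    Applying the function to every row of a main file would yield a table with the same kind of columns as the preprocessing file, subject to the assumptions noted.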

    The main file includes:

    • Id
    • comment text
    • Labels

    Please cite the paper

    https://www.mdpi.com/2504-2289/3/3/37

    MDPI and ACS Style

    Kaur, G.; Kaushik, A.; Sharma, S. Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach. Big Data Cogn. Comput. 2019, 3, 37.

  19. Dental Treatment Band Analysis, England 2007: Preliminary Results

    • digital.nhs.uk
    pdf
    Updated Oct 4, 2007
    + more versions
    (2007). Dental Treatment Band Analysis, England 2007: Preliminary Results [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/nhs-dental-statistics
    Explore at:
    Available download formats: pdf (301.4 kB), pdf (30.9 kB)
    Dataset updated
    Oct 4, 2007
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Apr 1, 2007 - Jul 31, 2007
    Area covered
    England
    Description

    This report provides information on NHS dental activity within a sample of adult courses of treatment (CoTs). Information was taken from a sample of 3,244 CoTs processed between April and July 2007, covering dental contracts across England. This information was compared to equivalent information for 2003/04. This information is sourced from the Dental Services Division (DSD) of the NHS Business Services Authority (BSA). This report has been produced by The Information Centre for health and social care (IC). A joint working group with representation from the IC, the Department of Health (DH), the Doctors' and Dentists' Review Body (DDRB) secretariat and the Dental Services Division (DSD) was consulted on this study and the content of the report.

  20. Replication Data for: Pre-Analysis Plans: An Early Stocktaking

    • dataverse.harvard.edu
    Updated Apr 1, 2021
    George K. Ofosu; Daniel N. Posner (2021). Replication Data for: Pre-Analysis Plans: An Early Stocktaking [Dataset]. http://doi.org/10.7910/DVN/DOELUB
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    George K. Ofosu; Daniel N. Posner
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Pre-analysis plans (PAPs) have been championed as a solution to the problem of research credibility, but without any evidence that PAPs actually bolster the credibility of research. We analyze a representative sample of 195 PAPs registered on the Evidence in Governance and Politics (EGAP) and American Economic Association (AEA) registration platforms to assess whether PAPs registered in the early days of pre-registration (2011-2016) were sufficiently clear, precise and comprehensive to achieve their objective of preventing “fishing” and reducing the scope for post-hoc adjustment of research hypotheses. We also analyze a subset of 93 PAPs from projects that resulted in publicly available papers to ascertain how faithfully they adhere to their pre-registered specifications and hypotheses. We find significant variation in the extent to which PAPs registered during this period accomplished the goals they were designed to achieve. We discuss these findings in light of both the costs and benefits of pre-registration, showing how our results speak to the various arguments that have been made in support of and against PAPs. We also highlight the norms and institutions that will need to be strengthened to augment the power of PAPs to improve research credibility and to create incentives for researchers to invest in both producing and policing them.
