7 datasets found
  1. Insurance(HealthCare)

    • kaggle.com
    zip
    Updated Jul 27, 2020
    Cite
    Damini Tiwari (2020). Insurance(HealthCare) [Dataset]. https://www.kaggle.com/daminitiwari/insurance
    Available download formats: zip (16433 bytes)
    Dataset updated
    Jul 27, 2020
    Authors
    Damini Tiwari
    Description

    Dataset

    This dataset was created by Damini Tiwari

    Contents

  2. Data_Sheet_1_ImputEHR: A Visualization Tool of Imputation for the Prediction...

    • figshare.com
    • frontiersin.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Yi-Hui Zhou; Ehsan Saghapour (2023). Data_Sheet_1_ImputEHR: A Visualization Tool of Imputation for the Prediction of Biomedical Data.PDF [Dataset]. http://doi.org/10.3389/fgene.2021.691274.s001
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Yi-Hui Zhou; Ehsan Saghapour
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Electronic health records (EHRs) have been widely adopted in recent years, but often include a high proportion of missing data, which can create difficulties in implementing machine learning and other tools of personalized medicine. Completed datasets are preferred for a number of analysis methods, and successful imputation of missing EHR data can improve interpretation and increase our power to predict health outcomes. However, the most popular imputation methods mostly require scripting skills and are implemented across various packages with differing syntax. Thus, the implementation of a full suite of methods is generally out of reach to all except experienced data scientists. Moreover, imputation is often treated as a separate exercise from exploratory data analysis, but should be considered part of the data exploration process. We have created a new graphical tool, ImputEHR, that is built on Python and allows implementation of a range of simple and sophisticated (e.g., gradient-boosted tree-based and neural network) data imputation approaches. In addition to imputation, the tool enables data exploration for informed decision-making, as well as implementing machine learning prediction tools for response data selected by the user. Although the approach works for any missing data problem, the tool is primarily motivated by problems encountered for EHR and other biomedical data. We illustrate the tool using multiple real datasets, providing performance measures of imputation and downstream predictive analysis.
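
    As a rough illustration of the "simple" end of the imputation range mentioned above, the sketch below fills each missing value with the mean of the observed values in its column. It is not ImputEHR's implementation; the function name `impute_column_means` and the toy records are hypothetical and chosen only for this example.

```python
from statistics import mean

def impute_column_means(rows):
    """Fill None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: a column with no observed values is left as None.
        col_means.append(mean(observed) if observed else None)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical toy records (age, systolic blood pressure); None marks a missing entry.
records = [
    [34, 120],
    [None, 135],
    [58, None],
]
completed = impute_column_means(records)
```

    Mean imputation is easy to reason about but shrinks the variance of the filled column, which is one reason more sophisticated approaches such as the tree-based and neural network methods named in the description are often preferred.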

  3. Squid Game Netflix Twitter Data

    • kaggle.com
    zip
    Updated Oct 16, 2021
    Cite
    Deep Contractor (2021). Squid Game Netflix Twitter Data [Dataset]. https://www.kaggle.com/datasets/deepcontractor/squid-game-netflix-twitter-data/versions/6
    Available download formats: zip (6803403 bytes)
    Dataset updated
    Oct 16, 2021
    Authors
    Deep Contractor
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    • The dataset contains the recent tweets about the record-breaking Netflix show "Squid Game"

    • The data was collected using the tweepy Python package to access the Twitter API.

  4. Data Bandwidth Measurement for Conference Call Data Usage

    • ieee-dataport.org
    Updated Sep 18, 2020
    + more versions
    Cite
    Dikamsiyochi Young UDOCHI (2020). Data Bandwidth Measurement for Conference Call Data Usage [Dataset]. http://doi.org/10.21227/g9v0-hm89
    Dataset updated
    Sep 18, 2020
    Dataset provided by
    IEEE Dataport
    Authors
    Dikamsiyochi Young UDOCHI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    During the first half of 2020, the COVID-19 pandemic shifted social gatherings to online business and social interaction. Worldwide travel bans and national lockdowns prevented in-person gatherings, forcing learning institutions and businesses to adopt online platforms for learning and business transactions. This development led to the incorporation of video conferencing into daily activities. This data article presents broadband data usage measurements collected using Glasswire software on various conference calls made between July and August. The services considered in this work are Google Meet, Zoom, Mixir, and Hangout. The data were recorded in Microsoft Excel 2016, running on a personal computer. The data were cleaned and processed using Google Colaboratory, which runs Python scripts in the browser. Exploratory data analysis was conducted on the data set, using linear regression to build a predictive model and to assess which service offers the best quality of service for online video and voice conferencing. The data are relevant to learning institutions using online programs and to learners accessing online programs in smart cities and developing countries. The data are presented in tables and graphs.
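
    The description mentions linear regression without giving details, so the following is only a minimal sketch of an ordinary least squares fit of data usage against call duration. The function `fit_line`, the variable names, and the numbers are hypothetical and are not taken from the dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical measurements: call duration (minutes) vs. data used (MB).
minutes = [10, 20, 30, 40]
megabytes = [55, 105, 155, 205]
slope, intercept = fit_line(minutes, megabytes)
```

    Fitting one such line per conferencing service would give a per-minute data cost (the slope) that can be compared across services.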

  5. Twitter Dataset: Replies to President Trump’s Tweet about contracting the...

    • borealisdata.ca
    Updated May 20, 2021
    Cite
    Anatoliy Gruzd (2021). Twitter Dataset: Replies to President Trump’s Tweet about contracting the Novel Coronavirus [Dataset]. http://doi.org/10.5683/SP2/BMCGI3
    Dataset updated
    May 20, 2021
    Dataset provided by
    Borealis
    Authors
    Anatoliy Gruzd
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    On October 2nd, 2020, U.S. President Donald Trump tweeted that he and the First Lady of the United States (FLOTUS), Melania Trump, both tested positive for COVID-19 (https://twitter.com/realDonaldTrump/status/1311892190680014849). Within seconds, his tweet received thousands of replies. The dataset consists of 298,172 replies to Donald Trump’s tweet announcing his COVID-19 diagnosis, posted on October 2nd between 6am and 12:30pm (ET). It contains tweet ids, the toxicity scores (as calculated by Google's Perspective API via https://Communalytic.com) and tweet availability status values (via twarc library). Following Twitter’s API policy, we stripped metadata associated with each tweet. So, if you’d like to examine potential relationships between other metadata elements, you would need to recollect original tweets using tools like DocNow’s Hydrator first. The only downside of this approach is that tweets that have been blocked or deleted will not be recollected. To help you get started, we also shared our Exploratory Data Analysis (EDA) Python script at https://github.com/RUSocialMediaLab/toxicityanalysis

  6. Data and Code for the paper "An Empirical Study on Exploratory Crowdtesting...

    • zenodo.org
    zip
    Updated Sep 25, 2023
    + more versions
    Cite
    Sergio Di Martino; Anna Rita Fasolino; Luigi Libero Lucio Starace; Porfirio Tramontana (2023). Data and Code for the paper "An Empirical Study on Exploratory Crowdtesting of Android Applications" [Dataset]. http://doi.org/10.5281/zenodo.8043855
    Available download formats: zip
    Dataset updated
    Sep 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sergio Di Martino; Anna Rita Fasolino; Luigi Libero Lucio Starace; Porfirio Tramontana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This package contains data and code to replicate the findings presented in our paper titled "An Empirical Study on Exploratory Crowdtesting of Android Applications".

    Abstract

    Crowdtesting is an emerging paradigm in which a "crowd" of people is recruited to perform testing tasks on demand. It has proved especially promising in the mobile apps domain and in combination with exploratory testing strategies, in which individual testers pursue a creative, experience-based approach to designing tests.
    Managing the crowdtesting process, however, is still a challenging task that can easily result either in wasteful spending or in inadequate software quality, due to the unpredictability of remote testing activities.
    A number of works in the literature have investigated the application of crowdtesting in the mobile apps domain. These works, however, investigated crowdtesting effectiveness in finding bugs, and not in scenarios in which the goal is also to generate a re-executable test suite. Moreover, less work has been conducted on the impact of different exploratory testing strategies in the crowdtesting process.
    As a first step towards filling this gap in the literature, in this work we conduct an empirical evaluation involving four open-source Android apps and twenty master's students, who we believe can be representative of practitioners partaking in crowdtesting activities. The students were asked to generate test suites for the apps using a Capture and Replay tool and different exploratory testing strategies. We then compare the effectiveness, in terms of aggregate code coverage, that different-sized crowds of students using different exploratory testing strategies may achieve.

    Results suggest that exploratory crowdtesting can be a valuable approach for generating GUI test suites for mobile apps. They also give project managers interested in using crowdtesting to test simple apps deeper insight into code coverage dynamics, on which they can base more informed decisions.

    Contents and Instructions

    This package contains:

    • apps-under-test.zip A zip archive containing the source code of the four Android applications we considered in our study, namely MunchLife, TippyTipper, Trolly, and SimplyDo.
    • InstrumentedSourceCode.zip A zip archive containing the instrumented source code of the four Android applications we used to compute branch coverage.
    • students-test-suites.zip A zip archive containing the test suites developed by the students using Uninformed Exploratory Testing (referred to as "Black Box" in the subdirectories) and Informed Exploratory Testing (referred to as "White Box" in the subdirectories). This also includes coverage reports.
    • compute-coverage-unions.zip A zip archive containing Python scripts we developed to compute the aggregate LOC coverage of all possible subsets of students. The scripts have been tested on MS Windows. To compute the LOC coverage achieved by any possible subsets of testers using IET and UET strategies, run the analysisAndReport.py script. To compute the LOC coverage achieved by mixed crowds in which some testers use a U+IET approach and others use a UET approach, run the analysisAndReport_UET_IET_combinations_emma.py script.
    • branch-coverage-computation.zip A zip archive containing Python scripts we developed to compute the aggregate branch coverage of all considered subsets of students. The scripts have been tested on MS Windows. To compute the branch coverage achieved by any possible subsets of testers using UET and I+UET strategies, run the branch_coverage_analysis.py script. To compute the code coverage achieved by mixed crowds in which some testers use a U+IET approach and others use a UET approach, run the mixed_branch_coverage_analysis.py script.
    • data-analysis-scripts.zip A zip archive containing R scripts to merge and manipulate coverage data, to carry out statistical analysis and draw plots. All data concerning RQ1 and RQ2 is available as a ready-to-use R data frame in the ./data/all_coverage_data.rds file. All data concerning RQ3 is available in the ./data/all_mixed_coverage_data.rds file.
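
    The idea of computing the aggregate coverage of all possible subsets of testers, described for the Python scripts above, can be sketched as follows. This is not the code shipped in the archives; the function `aggregate_coverage`, the tester labels, and the covered line numbers are hypothetical.

```python
from itertools import combinations

def aggregate_coverage(covered_by_tester, total_lines):
    """Map each non-empty subset of testers to the fraction of lines their union covers."""
    testers = sorted(covered_by_tester)
    results = {}
    for size in range(1, len(testers) + 1):
        for crowd in combinations(testers, size):
            # A line counts as covered if at least one tester in the crowd covered it.
            union = set().union(*(covered_by_tester[t] for t in crowd))
            results[crowd] = len(union) / total_lines
    return results

# Hypothetical data: line numbers covered by each tester in a 10-line app.
coverage = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {8, 9, 10},
}
by_crowd = aggregate_coverage(coverage, total_lines=10)
```

    The number of subsets grows as 2^n - 1 with the number of testers, so enumerating every crowd is feasible for a study with twenty students but would need sampling for much larger crowds.
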
  7. Deep Uncertainty in Humanitarian Logistics - Model, Data and Analyses

    • narcis.nl
    • data.4tu.nl
    • +1more
    Updated Nov 23, 2018
    Cite
    Tim Romijn (2018). Deep Uncertainty in Humanitarian Logistics - Model, Data and Analyses [Dataset]. http://doi.org/10.4121/uuid:f056f50d-0546-4ec3-9d5e-8c85aa9296b1
    Available download formats (media types): application/json, application/octet-stream, application/x-gzip, application/zip, image/png, text/csv, text/markdown, text/plain, text/x-python
    Dataset updated
    Nov 23, 2018
    Dataset provided by
    4TU (https://www.4tu.nl/)
    Authors
    Tim Romijn
    Description

    This archive contains the model, the data and the analyses of my MSc Thesis. The model and analyses are coded in Python and documented in Jupyter Notebooks.

    Thesis Title: Deep Uncertainty in Humanitarian Logistics: Simulation and Analysis of the Interplay between Decisions and Uncertainty for Post-Disaster Facility Location Decisions

    Link to Thesis: http://resolver.tudelft.nl/uuid:580c5a9c-73dc-4195-930e-97c524a4b7c7

Insurance(HealthCare) tags: Exploratory Data Analysis; Practicing statistics using Python; Hypothesis testing
