7 datasets found

Insurance(HealthCare)
kaggle.com
zip
Updated Jul 27, 2020
Cite
Damini Tiwari (2020). Insurance(HealthCare) [Dataset]. https://www.kaggle.com/daminitiwari/insurance
Explore at:
Available download formats: zip (16433 bytes)
Dataset updated
Jul 27, 2020
Authors
Damini Tiwari
Description
Dataset

This dataset was created by Damini Tiwari

Contents
Data_Sheet_1_ImputEHR: A Visualization Tool of Imputation for the Prediction...
figshare.com
frontiersin.figshare.com
pdf
Updated Jun 1, 2023
Cite
Yi-Hui Zhou; Ehsan Saghapour (2023). Data_Sheet_1_ImputEHR: A Visualization Tool of Imputation for the Prediction of Biomedical Data.PDF [Dataset]. http://doi.org/10.3389/fgene.2021.691274.s001
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fgene.2021.691274.s001
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers
Authors
Yi-Hui Zhou; Ehsan Saghapour
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Electronic health records (EHRs) have been widely adopted in recent years, but often include a high proportion of missing data, which can create difficulties in implementing machine learning and other tools of personalized medicine. Completed datasets are preferred for a number of analysis methods, and successful imputation of missing EHR data can improve interpretation and increase our power to predict health outcomes. However, use of the most popular imputation methods mainly requires scripting skills, and the methods are implemented using various packages and syntax. Thus, the implementation of a full suite of methods is generally out of reach to all except experienced data scientists. Moreover, imputation is often considered a separate exercise from exploratory data analysis, but should be considered part of the data exploration process. We have created a new graphical tool, ImputEHR, that is built on Python and allows implementation of a range of simple and sophisticated (e.g., gradient-boosted tree-based and neural network) data imputation approaches. In addition to imputation, the tool enables data exploration for informed decision-making, as well as implementing machine learning prediction tools for response data selected by the user. Although the approach works for any missing data problem, the tool is primarily motivated by problems encountered for EHR and other biomedical data. We illustrate the tool using multiple real datasets, providing performance measures of imputation and downstream predictive analysis.
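As an illustration of what a "simple" imputation approach can look like, the following minimal sketch fills each missing value with the mean of the observed values in the same column. It is not taken from ImputEHR; the function name `impute_column_means` and the toy records are assumptions made for this example only.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column.

    Hypothetical helper for illustration; not part of ImputEHR.
    """
    columns = list(zip(*rows))
    fills = []
    for col in columns:
        observed = [v for v in col if v is not None]
        # A column with no observed values cannot be imputed; keep None.
        fills.append(mean(observed) if observed else None)
    return [
        [fills[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy records: (age, systolic blood pressure)
records = [[50, 120], [None, 130], [70, None]]
completed = impute_column_means(records)
print(completed)
```

Mean imputation is only a baseline: it ignores relationships between columns, which is the gap that the tree-based and neural network approaches mentioned in the description aim to address.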
Squid Game Netflix Twitter Data
kaggle.com
zip
Updated Oct 16, 2021
Cite
Deep Contractor (2021). Squid Game Netflix Twitter Data [Dataset]. https://www.kaggle.com/datasets/deepcontractor/squid-game-netflix-twitter-data/versions/6
Explore at:
Available download formats: zip (6803403 bytes)
Dataset updated
Oct 16, 2021
Authors
Deep Contractor
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset contains recent tweets about the record-breaking Netflix show "Squid Game".

The data was collected using the tweepy Python package to access the Twitter API.
Data Bandwidth Measurement for Conference Call Data Usage
ieee-dataport.org
Updated Sep 18, 2020
+ more versions
Cite
Dikamsiyochi Young UDOCHI (2020). Data Bandwidth Measurement for Conference Call Data Usage [Dataset]. http://doi.org/10.21227/g9v0-hm89
Explore at:
Unique identifier
https://doi.org/10.21227/g9v0-hm89
Dataset updated
Sep 18, 2020
Dataset provided by
IEEE Dataport
Authors
Dikamsiyochi Young UDOCHI
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
During the first half of 2020, the COVID-19 pandemic shifted business and social interaction online. Worldwide travel bans and national lockdowns prevented social gatherings, forcing learning institutions and businesses to adopt online platforms for learning and business transactions. This development led to the incorporation of video conferencing into daily activities. This data article presents broadband data usage measurements collected using Glasswire software on various conference calls made between July and August. The services considered in this work are Google Meet, Zoom, Mixir, and Hangout. The data were recorded in Microsoft Excel 2016, running on a personal computer. The data were cleaned and processed using Google Colaboratory, which runs Python scripts in the browser. Exploratory data analysis is conducted on the data set, and linear regression is used to build a predictive model for assessing which service offers the best quality of service for online video and voice conferencing. The data are relevant to learning institutions running online programs and to learners accessing online programs in smart cities and developing countries. The data are presented in tables and graphs.
Twitter Dataset: Replies to President Trump’s Tweet about contracting the...
borealisdata.ca
Updated May 20, 2021
Cite
Anatoliy Gruzd (2021). Twitter Dataset: Replies to President Trump’s Tweet about contracting the Novel Coronavirus [Dataset]. http://doi.org/10.5683/SP2/BMCGI3
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.5683/SP2/BMCGI3
Dataset updated
May 20, 2021
Dataset provided by
Borealis
Authors
Anatoliy Gruzd
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
On October 2nd, 2020, U.S. President Donald Trump tweeted that he and the First Lady of the United States (FLOTUS), Melania Trump, had both tested positive for COVID-19 (https://twitter.com/realDonaldTrump/status/1311892190680014849). Within seconds, his tweet received thousands of replies. The dataset consists of 298,172 replies to Donald Trump’s tweet announcing his COVID-19 diagnosis, posted on October 2nd between 6am and 12:30pm (ET). It contains tweet IDs, toxicity scores (as calculated by Google's Perspective API via https://Communalytic.com) and tweet availability status values (via the twarc library). Following Twitter’s API policy, we stripped the metadata associated with each tweet. So, if you’d like to examine potential relationships between other metadata elements, you would first need to recollect the original tweets using tools like DocNow’s Hydrator. The only downside of this approach is that tweets that have been blocked or deleted will not be recollected. To help you get started, we also shared our Exploratory Data Analysis (EDA) Python script at https://github.com/RUSocialMediaLab/toxicityanalysis
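A first look at data of this shape could compare average toxicity across availability statuses. The sketch below is not the authors' EDA script; the column names `tweet_id`, `toxicity`, and `status`, the status values, and the three sample rows are all hypothetical placeholders, since the actual file layout is not described here.

```python
import csv
import io
from collections import defaultdict


def mean_toxicity_by_status(csv_file, score_col="toxicity", status_col="status"):
    """Average the toxicity score per availability status.

    Column names are assumptions; adjust them to the real file header.
    """
    totals = defaultdict(lambda: [0.0, 0])  # status -> [sum of scores, row count]
    for row in csv.DictReader(csv_file):
        bucket = totals[row[status_col]]
        bucket[0] += float(row[score_col])
        bucket[1] += 1
    return {status: total / count for status, (total, count) in totals.items()}


# Hypothetical rows standing in for the real dataset file.
sample = io.StringIO(
    "tweet_id,toxicity,status\n"
    "1,0.25,available\n"
    "2,0.75,available\n"
    "3,0.5,deleted\n"
)
summary = mean_toxicity_by_status(sample)
print(summary)
```

With the real file, the `io.StringIO` object would be replaced by an opened CSV file handle.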
Data and Code for the paper "An Empirical Study on Exploratory Crowdtesting...
zenodo.org
zip
Updated Sep 25, 2023
+ more versions
Cite
Sergio Di Martino; Anna Rita Fasolino; Luigi Libero Lucio Starace; Porfirio Tramontana (2023). Data and Code for the paper "An Empirical Study on Exploratory Crowdtesting of Android Applications" [Dataset]. http://doi.org/10.5281/zenodo.8043855
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8043855
Dataset updated
Sep 25, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sergio Di Martino; Anna Rita Fasolino; Luigi Libero Lucio Starace; Porfirio Tramontana
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This package contains data and code to replicate the findings presented in our paper titled "An Empirical Study on Exploratory Crowdtesting of Android Applications".

Abstract

Crowdtesting is an emerging paradigm in which a "crowd" of people is recruited to perform testing tasks on demand. It has proved especially promising in the mobile apps domain and in combination with exploratory testing strategies, in which individual testers pursue a creative, experience-based approach to design tests.
Managing the crowdtesting process, however, is still a challenging task that can easily result in either wasteful spending or inadequate software quality, due to the unpredictability of remote testing activities.
A number of works in the literature have investigated the application of crowdtesting in the mobile apps domain. These works, however, investigated crowdtesting effectiveness in finding bugs, and not in scenarios in which the goal is also to generate a re-executable test suite. Moreover, less work has been conducted on the impact of different exploratory testing strategies in the crowdtesting process.
As a first step towards filling this gap in the literature, in this work we conduct an empirical evaluation involving four open-source Android apps and twenty master's students, who we believe can be representative of practitioners partaking in crowdtesting activities. The students were asked to generate test suites for the apps using a Capture and Replay tool and different exploratory testing strategies. We then compare the effectiveness, in terms of aggregate code coverage, that different-sized crowds of students using different exploratory testing strategies may achieve.

Results suggest that exploratory crowdtesting can be a valuable approach for generating GUI test suites for mobile apps, and provide deeper insight into code coverage dynamics, on which project managers interested in using crowdtesting to test simple apps can base more informed decisions.

Contents and Instructions

This package contains:

apps-under-test.zip: A zip archive containing the source code of the four Android applications we considered in our study, namely MunchLife, TippyTipper, Trolly, and SimplyDo.

InstrumentedSourceCode.zip: A zip archive containing the instrumented source code of the four Android applications we used to compute branch coverage.

students-test-suites.zip: A zip archive containing the test suites developed by the students using Uninformed Exploratory Testing (referred to as "Black Box" in the subdirectories) and Informed Exploratory Testing (referred to as "White Box" in the subdirectories). This also includes coverage reports.

compute-coverage-unions.zip: A zip archive containing Python scripts we developed to compute the aggregate LOC coverage of all possible subsets of students. The scripts have been tested on MS Windows. To compute the LOC coverage achieved by any possible subsets of testers using IET and UET strategies, run the analysisAndReport.py script. To compute the LOC coverage achieved by mixed crowds in which some testers use a U+IET approach and others use a UET approach, run the analysisAndReport_UET_IET_combinations_emma.py script.

branch-coverage-computation.zip: A zip archive containing Python scripts we developed to compute the aggregate branch coverage of all considered subsets of students. The scripts have been tested on MS Windows. To compute the branch coverage achieved by any possible subsets of testers using UET and I+UET strategies, run the branch_coverage_analysis.py script. To compute the code coverage achieved by mixed crowds in which some testers use a U+IET approach and others use a UET approach, run the mixed_branch_coverage_analysis.py script.

data-analysis-scripts.zip: A zip archive containing R scripts to merge and manipulate coverage data, to carry out statistical analysis and draw plots. All data concerning RQ1 and RQ2 is available as a ready-to-use R data frame in the ./data/all_coverage_data.rds file. All data concerning RQ3 is available in the ./data/all_mixed_coverage_data.rds file.
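
The idea of aggregate coverage over subsets of testers, as described for the coverage scripts above, can be sketched as a set union over every crowd of a given size. This is not the code shipped in the archives; the function `aggregate_coverage`, the tester names, and the covered line numbers are hypothetical and serve only to illustrate the computation.

```python
from itertools import combinations


def aggregate_coverage(covered_by_tester, total_lines, crowd_size):
    """Return {crowd: coverage fraction} for every crowd of the given size.

    covered_by_tester maps a tester name to the set of line numbers
    that tester's test suite executed (hypothetical input format).
    """
    results = {}
    for crowd in combinations(sorted(covered_by_tester), crowd_size):
        # A crowd covers a line if at least one of its members covers it.
        union = set().union(*(covered_by_tester[t] for t in crowd))
        results[crowd] = len(union) / total_lines
    return results


# Hypothetical coverage data for three testers on a 10-line app.
covered = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {5}}
pairs = aggregate_coverage(covered, total_lines=10, crowd_size=2)
print(pairs)
```

Because the number of subsets grows combinatorially with the number of testers, enumerating every crowd is practical only for small groups such as the twenty students in the study.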
Deep Uncertainty in Humanitarian Logistics - Model, Data and Analyses
narcis.nl
data.4tu.nl
+1 more
Updated Nov 23, 2018
Cite
Tim Romijn (2018). Deep Uncertainty in Humanitarian Logistics - Model, Data and Analyses [Dataset]. http://doi.org/10.4121/uuid:f056f50d-0546-4ec3-9d5e-8c85aa9296b1
Explore at:
Available download formats (media types): application/json, application/octet-stream, application/x-gzip, application/zip, image/png, text/csv, text/markdown, text/plain, text/x-python
Unique identifier
https://doi.org/10.4121/uuid:f056f50d-0546-4ec3-9d5e-8c85aa9296b1
Dataset updated
Nov 23, 2018
Dataset provided by
4TU (https://www.4tu.nl/)
Authors
Tim Romijn
Description
This archive contains the model, the data and the analyses of my MSc Thesis. The model and analyses are coded in Python and documented in Jupyter Notebooks.

Thesis Title: Deep Uncertainty in Humanitarian Logistics: Simulation and Analysis of the Interplay between Decisions and Uncertainty for Post-Disaster Facility Location Decisions

Link to Thesis: http://resolver.tudelft.nl/uuid:580c5a9c-73dc-4195-930e-97c524a4b7c7