100+ datasets found

Data from: Problems in dealing with missing data and informative censoring...
catalog.data.gov
data.virginia.gov
Updated Sep 7, 2025
National Institutes of Health (2025). Problems in dealing with missing data and informative censoring in clinical trials [Dataset]. https://catalog.data.gov/dataset/problems-in-dealing-with-missing-data-and-informative-censoring-in-clinical-trials
Dataset updated
Sep 7, 2025
Dataset provided by
National Institutes of Health
Description
A common problem in clinical trials is the missing data that occurs when patients do not complete the study and drop out without further measurements. Missing data cause the usual statistical analysis of complete or all available data to be subject to bias. There are no universally applicable methods for handling missing data. We recommend the following: (1) Report reasons for dropouts and proportions for each treatment group; (2) Conduct sensitivity analyses to encompass different scenarios of assumptions and discuss consistency or discrepancy among them; (3) Pay attention to minimize the chance of dropouts at the design stage and during trial monitoring; (4) Collect post-dropout data on the primary endpoints, if at all possible; and (5) Consider the dropout event itself an important endpoint in studies with many.
Data from: Problem Dataset
universe.roboflow.com
zip
Updated Dec 23, 2024
+ more versions
jin (2024). Problem Dataset [Dataset]. https://universe.roboflow.com/jin-cthqm/problem-tqqcx
Available download formats: zip
Dataset updated
Dec 23, 2024
Dataset authored and provided by
jin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Problem Bounding Boxes
Description
Problem

## Overview
Problem is a dataset for object detection tasks. It contains Problem annotations for 2,923 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
Italy: privacy concerns regarding personal data on the internet, by issue
statista.com
Updated Sep 14, 2018
Statista (2018). Italy: privacy concerns regarding personal data on the internet, by issue [Dataset]. https://www.statista.com/statistics/830088/privacy-concerns-regarding-personal-data-on-the-internet-in-italy/
Dataset updated
Sep 14, 2018
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jul 2016 - Aug 2016
Area covered
Italy
Description
This statistic displays the results of a survey on the share of individuals expressing privacy concerns regarding their personal data on the internet in Italy in 2016. During the survey period, **** percent of respondents reported that using the internet exposes everyone to being tracked and followed, while **** percent stated that privacy was not a real problem.
Room Assignment problem
kaggle.com
zip
Updated Oct 19, 2022
Daniel Sepulveda (2022). Room Assignment problem [Dataset]. https://www.kaggle.com/datasets/kathuman/room-assignment-problem
Available download formats: zip (6030 bytes)
Dataset updated
Oct 19, 2022
Authors
Daniel Sepulveda
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset contains information about a number of participants (participants.csv) to a workshop that need to be assigned to a number of rooms (rooms.csv).

Restrictions:
1.- The workshop has 5 different activities.
2.- Each participant has indicated their first, second and third preferences for the activities available (Priority1, Priority2 and Priority3 columns in participants.csv).
3.- Participants are part of teams (Team column in participants.csv) and should be assigned together.
4.- Each activity lasts for half a day, and each participant will take part in one activity in the morning and one activity in the afternoon.
5.- Each room must contain the SAME activity in the morning and in the afternoon.

Requirements:
A.- Define the way in which each participant should be assigned through a csv file in the format Name;ActivityAM;RoomAM, ActivityPM;RoomPM
B.- Maximize the number of people getting their 1st and 2nd preferences.
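
As a rough illustration of how a candidate solution might be checked, the sketch below scores an assignment against requirement B and tests the team restriction. It is not the dataset author's method. The `Name` column, the sample rows, and the function names `preference_score` and `teams_together` are assumptions made for this example; only `Priority1`, `Priority2`, `Priority3` and `Team` are named in the description. It also reads requirement B as "both the 1st and the 2nd preference are among the two activities attended", which is one possible interpretation, and it ignores rooms.

```python
def preference_score(participants, assignment):
    """Count participants whose two activities cover both Priority1 and Priority2.

    participants: list of dicts with keys Name, Team, Priority1, Priority2 (hypothetical rows)
    assignment: dict mapping Name -> (ActivityAM, ActivityPM)
    """
    satisfied = 0
    for p in participants:
        attended = set(assignment.get(p["Name"], ()))
        if {p["Priority1"], p["Priority2"]} <= attended:
            satisfied += 1
    return satisfied


def teams_together(participants, assignment):
    """Return True if all members of each team share the same (AM, PM) activities."""
    first_seen = {}
    for p in participants:
        slot = assignment.get(p["Name"])
        # setdefault stores the first member's slot and returns it for later members
        if first_seen.setdefault(p["Team"], slot) != slot:
            return False
    return True


# Hypothetical sample data, not taken from participants.csv
participants = [
    {"Name": "Ana", "Team": "T1", "Priority1": "A", "Priority2": "B", "Priority3": "C"},
    {"Name": "Ben", "Team": "T1", "Priority1": "A", "Priority2": "C", "Priority3": "B"},
    {"Name": "Cy", "Team": "T2", "Priority1": "D", "Priority2": "E", "Priority3": "A"},
]
assignment = {"Ana": ("A", "B"), "Ben": ("A", "B"), "Cy": ("E", "D")}

print(preference_score(participants, assignment))
print(teams_together(participants, assignment))
```

A full solution would also need to assign rooms and enforce restriction 5 (the same activity per room in both half-days), which this check leaves out.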
Countries where companies face challenges with international data issues...
statista.com
Updated Nov 24, 2025
Statista (2025). Countries where companies face challenges with international data issues 2019 [Dataset]. https://www.statista.com/statistics/997950/cross-border-data-issues-country/
Dataset updated
Nov 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Oct 9, 2018 - Nov 5, 2018
Area covered
Worldwide
Description
This statistic shows the countries where American and European organizations face regulatory challenges involving cross-border data issues in 2019. During the survey, ** percent of respondents mentioned they faced a challenge involving cross-border data issues in the United States.
Social Media and Mental Health
kaggle.com
zip
Updated Jul 18, 2023
SouvikAhmed071 (2023). Social Media and Mental Health [Dataset]. https://www.kaggle.com/datasets/souvikahmed071/social-media-and-mental-health
Available download formats: zip (10944 bytes)
Dataset updated
Jul 18, 2023
Authors
SouvikAhmed071
License
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
This dataset was originally collected for a data science and machine learning project that aimed at investigating the potential correlation between the amount of time an individual spends on social media and the impact it has on their mental health.

The project involves conducting a survey to collect data, organizing the data, and using machine learning techniques to create a predictive model that can determine whether a person should seek professional help based on their answers to the survey questions.

This project was completed as part of a Statistics course at a university, and the team is currently in the process of writing a report and completing a paper that summarizes and discusses the findings in relation to other research on the topic.

The following is the Google Colab link to the project, done on Jupyter Notebook -

https://colab.research.google.com/drive/1p7P6lL1QUw1TtyUD1odNR4M6TVJK7IYN

The following is the GitHub Repository of the project -

https://github.com/daerkns/social-media-and-mental-health

Libraries used for the Project -

Pandas, NumPy, Matplotlib, Seaborn, scikit-learn
Shady Dataset (find the error)
kaggle.com
zip
Updated Feb 18, 2022
Computing School (2022). Shady Dataset (find the error) [Dataset]. https://www.kaggle.com/datasets/computingschool/shady-dataset
Available download formats: zip (242680 bytes)
Dataset updated
Feb 18, 2022
Authors
Computing School
Description
This dataset is inspired by a real story of something funny that happened to a data science team in a company. They had built a classifier that was suspiciously good, even when analyzing its performance on unseen validation data. It was too good to be true.

Your job is to find the error.

The following notebook explains the task:

https://www.kaggle.com/computingschool/shady-dataset-find-the-error
The Issue Correlates of War (ICOW) Project Issue Data Set: Territorial...
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Paul R. Hensel; Sara M. Mitchell (2023). The Issue Correlates of War (ICOW) Project Issue Data Set: Territorial Claims Data [Dataset]. http://doi.org/10.7910/DVN/E6PSGZ
Unique identifier
https://doi.org/10.7910/DVN/E6PSGZ
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Paul R. Hensel; Sara M. Mitchell
Description
The ICOW project is currently collecting data on territorial issues in all regions of the world since 1816, compiled into several related data files. The first file identifies territorial claims, or explicit claims by official government representatives of at least two sovereign nation-states to the same piece of territory, and includes basic claim-level information such as the overall beginning and ending of the claim and the form of claim termination. The second file is organized by claim-dyad-years and includes one data point for each year of each claimant dyad, with information on details such as the characteristics of the claimed territory. The third and final data file covers attempts to settle these territorial claims through bilateral negotiations or with third party assistance (good offices, mediation, inquiry, conciliation, arbitration, or adjudication), and includes details such as the beginning, ending, and effectiveness of each settlement attempt.
GitHub Issues
kaggle.com
zip
Updated Jan 17, 2018
David Shinn (2018). GitHub Issues [Dataset]. https://www.kaggle.com/davidshinn/github-issues
Available download formats: zip (1027424218 bytes)
Dataset updated
Jan 17, 2018
Authors
David Shinn
Description
Description

Over 8 million GitHub issue titles and descriptions from 2017. Prepared from instructions at How To Create Data Products That Are Magical Using Sequence-to-Sequence Models.

Original Source

The data was adapted from GitHub data accessible from GitHub Archive. The constructocat image is from https://octodex.github.com/constructocat-v2.

License

MIT License

Copyright (c) 2018 David Shinn

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Data from: Location Problem
kaggle.com
zip
Updated Apr 4, 2022
Ramon (2022). Location Problem [Dataset]. https://www.kaggle.com/datasets/ramondiaz7/location-problem
Available download formats: zip (26653 bytes)
Dataset updated
Apr 4, 2022
Authors
Ramon
Description
Dataset

This dataset was created by Ramon

Contents
TS Note Vol 2 Issue 8
catalog.data.gov
data.va.gov
+2 more
Updated Jul 2, 2025
Department of Veterans Affairs (2025). TS Note Vol 2 Issue 8 [Dataset]. https://catalog.data.gov/dataset/ts-note-vol-2-issue-8
Dataset updated
Jul 2, 2025
Dataset provided by
United States Department of Veterans Affairs (http://va.gov/)
Description
VA Executive's Guide to Configuration Management
New Security Issues, State and Local Governments
catalog.data.gov
s.cnmilf.com
Updated Dec 18, 2024
Board of Governors of the Federal Reserve System (2024). New Security Issues, State and Local Governments [Dataset]. https://catalog.data.gov/dataset/new-security-issues-state-and-local-governments
Dataset updated
Dec 18, 2024
Dataset provided by
Federal Reserve System (http://www.federalreserve.gov/)
Federal Reserve Board of Governors
Description
The New Security Issues, State and Local Governments tables (1.45) are updated monthly. Data were previously published in the Supplement to the Federal Reserve Bulletin, which ceased publication in December 2008. Data sources have included: Mergent, beginning November 2011; Securities Data Company, from January 1990 to October 2011; and Investment Dealers Digest before then.
The Public Jira Dataset
zenodo.org
Updated May 13, 2025
+ more versions
Lloyd Montgomery; Clara Lüders; Prof. Dr. Walid Maalej (2025). The Public Jira Dataset [Dataset]. http://doi.org/10.5281/zenodo.5882882
Unique identifier
https://doi.org/10.5281/zenodo.5882882
Dataset updated
May 13, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lloyd Montgomery; Clara Lüders; Prof. Dr. Walid Maalej
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Jira is an issue tracking system that supports software companies (among other types of companies) with managing their projects, community, and processes. This dataset is a collection of public Jira repositories downloaded from the internet using the Jira API V2. We collected data from 16 public Jira repositories containing 1822 projects and 2.7 million issues. Included in this data are historical records of 32 million changes, 8 million comments, and 1 million issue links that connect the issues in complex ways. This artefact repository contains the data as a MongoDB dump, the scripts used to download the data, the scripts used to interpret the data, and qualitative work conducted to make the data more approachable.
Regression fake data
kaggle.com
zip
Updated Apr 9, 2024
A.Sehwag (2024). Regression fake data [Dataset]. https://www.kaggle.com/datasets/atulsehwag00/regression-fake-data
Available download formats: zip (26398 bytes)
Dataset updated
Apr 9, 2024
Authors
A.Sehwag
Description
Simple data created for practicing regression problems. It consists of three columns: Price, Feature 1, and Feature 2. Try to predict Price using Feature 1 and Feature 2. The data is clean, so no data cleaning is required.
Data from: Topic Modeling for OLAP on Multidimensional Text Databases: Topic...
catalog.data.gov
Updated Apr 10, 2025
+ more versions
Dashlink (2025). Topic Modeling for OLAP on Multidimensional Text Databases: Topic Cube and its Applications [Dataset]. https://catalog.data.gov/dataset/topic-modeling-for-olap-on-multidimensional-text-databases-topic-cube-and-its-applications
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
As the amount of textual information grows explosively in various kinds of business systems, it becomes more and more desirable to analyze both structured data records and unstructured text data simultaneously. Although online analytical processing (OLAP) techniques have been proven very useful for analyzing and mining structured data, they face challenges in handling text data. On the other hand, probabilistic topic models are among the most effective approaches to latent topic analysis and mining on text data. In this paper, we study a new data model called topic cube to combine OLAP with probabilistic topic modeling and enable OLAP on the dimension of text data in a multidimensional text database. Topic cube extends the traditional data cube to cope with a topic hierarchy and stores probabilistic content measures of text documents learned through a probabilistic topic model. To materialize topic cubes efficiently, we propose two heuristic aggregations to speed up the iterative Expectation-Maximization (EM) algorithm for estimating topic models by leveraging the models learned on component data cells to choose a good starting point for iteration. Experimental results show that these heuristic aggregations are much faster than the baseline method of computing each topic cube from scratch. We also discuss some potential uses of topic cube and show sample experimental results.
EMS - Top Ten Dispatch Problems by Fiscal Year
catalog.data.gov
data.austintexas.gov
Updated Oct 25, 2025
+ more versions
data.austintexas.gov (2025). EMS - Top Ten Dispatch Problems by Fiscal Year [Dataset]. https://catalog.data.gov/dataset/ems-top-ten-dispatch-problems-by-fiscal-year
Dataset updated
Oct 25, 2025
Dataset provided by
data.austintexas.gov
Description
This table shows the 10 most frequently recorded incident problem types as recorded by communications personnel for each fiscal year presented.
Data from: Medical errors: how the US Government is addressing the problem
data.virginia.gov
healthdata.gov
+1 more
html
Updated Sep 6, 2025
National Institutes of Health (2025). Medical errors: how the US Government is addressing the problem [Dataset]. https://data.virginia.gov/dataset/medical-errors-how-the-us-government-is-addressing-the-problem
Available download formats: html
Dataset updated
Sep 6, 2025
Dataset provided by
National Institutes of Health
Description
November's Institute of Medicine (IOM) report on medical errors has sparked debate among US health policy makers as to the appropriate response to the problem. Proposals range from the implementation of nationwide mandatory reporting with public release of performance data to voluntary reporting and quality-assurance efforts that protect the confidentiality of error-related data. Any successful safety program will require a national effort to make significant investments in information technology infrastructure, and to provide an environment and education that enables providers to contribute to an active quality-improvement process.
Housing Problems by Type of Issue and Community, 2019 - Dataset - Open Data
opendata.gov.nt.ca
Updated Jan 1, 2019
+ more versions
(2019). Housing Problems by Type of Issue and Community, 2019 - Dataset - Open Data [Dataset]. https://opendata.gov.nt.ca/dataset/housing-problems-by-type-of-issue-and-community-2019
Dataset updated
Jan 1, 2019
License
Description
Housing Problems by Type of Issue and Community, 2019
Length of time to resolve data breach issues U.S. and Canada 2020-2021
statista.com
Statista, Length of time to resolve data breach issues U.S. and Canada 2020-2021 [Dataset]. https://www.statista.com/statistics/1280042/time-to-resolve-data-breach-issues-can-us/
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada, United States
Description
In 2021, ** percent of respondents from Canada indicated that it takes their company less than a week to resolve issues created by a data breach. Because data breaches are expensive, companies need to be able to resolve these issues quickly. For this reason, many companies have an incident response plan.
Change Asset Problem Recording System
catalog.data.gov
s.cnmilf.com
+2 more
Updated Nov 27, 2025
+ more versions
Social Security Administration (2025). Change Asset Problem Recording System [Dataset]. https://catalog.data.gov/dataset/change-asset-problem-recording-system
Dataset updated
Nov 27, 2025
Dataset provided by
Social Security Administration (http://ssa.gov/)
Description
Provide information on problem tracking, assignment and notification, alerts and escalation, problem resolution, and problem closure.
