The statistic displays the results of a 2019 survey on respondents' actions against data misuse in Sweden. Fifty percent of the participants did not take any action. The most common measure, taken by 32 percent of respondents, was to delete an app from their phone. The least common measure, taken by one percent of respondents, was to report to the Swedish Data Protection Authority.
This dataset contains water column and sediment geochemical data taken by bottle and bottom samples from research vessel Atlantis (cruise AT41) in the southeastern U.S. Atlantic margin in August 2018. Data include nutrients, nitrate, phosphorus, and other chemical characteristics. There are CTD data for this expedition archived at NCEI under accession number 0177873. This cruise is part of the DEEP Sea Exploration to Advance Research on Coral/Canyon/Cold seep Habitats (DEEP SEARCH) project, the goal of which is to explore a variety of deep sea ecosystems off the southeastern U.S. Atlantic margin. Data are in CSV format.
In 2024, ** percent of respondents in a global survey had implemented multifactor authentication as their main data protection measure, both in the cloud and on-premises. Furthermore, over ** percent of respondents stated that their company had already implemented backups in the cloud.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Overview
This dataset contains 45,000 records of loan applicants, with various attributes related to personal demographics, financial status, and loan details. The dataset can be used for predictive modeling, particularly in credit risk assessment and loan default prediction.
Dataset Content
The dataset includes 14 columns representing different factors influencing loan approvals and defaults:
Personal Information
person_age: Age of the applicant (in years).
person_gender: Gender of the applicant (male, female).
person_education: Educational background (High School, Bachelor, Master, etc.).
person_income: Annual income of the applicant (in USD).
person_emp_exp: Years of employment experience.
person_home_ownership: Type of home ownership (RENT, OWN, MORTGAGE).
Loan Details
loan_amnt: Loan amount requested (in USD).
loan_intent: Purpose of the loan (PERSONAL, EDUCATION, MEDICAL, etc.).
loan_int_rate: Interest rate on the loan (percentage).
loan_percent_income: Ratio of loan amount to income.
Credit & Loan History
cb_person_cred_hist_length: Length of the applicant's credit history (in years).
credit_score: Credit score of the applicant.
previous_loan_defaults_on_file: Whether the applicant has previous loan defaults (Yes or No).
Target Variable
loan_status: 1 if the loan was repaid successfully, 0 if the applicant defaulted.
Use Cases
Loan Default Prediction: Build a classification model to predict loan repayment.
Credit Risk Analysis: Analyze the relationship between income, credit score, and loan defaults.
Feature Engineering: Extract new insights from employment history, home ownership, and loan amounts.
Acknowledgments
This dataset is synthetic and designed for machine learning and financial risk analysis.
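As a rough illustration of the schema above, the sketch below parses a few columns with Python's standard csv module and recomputes loan_percent_income as loan_amnt divided by person_income. The two sample rows are invented for this example and are not records from the dataset; only the column names come from the description.

```python
import csv
import io

# Two invented rows using a subset of the documented columns (not real records).
SAMPLE = """person_age,person_income,loan_amnt,loan_percent_income,loan_status
25,50000,10000,0.20,1
40,80000,4000,0.05,0
"""

def load_rows(text):
    """Parse CSV text into dicts with the numeric fields converted."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "person_age": float(raw["person_age"]),
            "person_income": float(raw["person_income"]),
            "loan_amnt": float(raw["loan_amnt"]),
            "loan_percent_income": float(raw["loan_percent_income"]),
            "loan_status": int(raw["loan_status"]),
        })
    return rows

def recomputed_ratio(row):
    """Loan amount divided by annual income, as the column description states."""
    return row["loan_amnt"] / row["person_income"]

rows = load_rows(SAMPLE)
for row in rows:
    # Show the stored ratio next to the recomputed one (rounded to 2 decimals).
    print(row["loan_percent_income"], round(recomputed_ratio(row), 2))
```

A check like this can be useful before modeling, since a derived column that disagrees with its inputs usually points to a parsing or units problem.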
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This data set contains Raw data taken by the New Horizons (NH) Multispectral Visible Imaging Camera (MVIC) instrument during the Kuiper Belt Extended Mission 1 (KEM1) Encounter mission phase. This data was acquired by the spacecraft between 14 August 2018 and 30 April 2022. It only includes data downlinked before 01 May 2022. Future data releases may include more data acquired by the spacecraft after 13 August 2018 but downlinked after 30 April 2022. The data includes functional tests and images during the approach and departure of 486958 Arrokoth (2014 MU69). A look back at Pluto was also performed after the Arrokoth flyby. A Color Scan of Neptune and Uranus was done along with a Solar Star Calibration and Radiometric Calibration. These data were migrated from the Planetary Data System (PDS) PDS3 format dataset: NH-A-MVIC-2-KEM1-V6.0. Labels were redesigned during migration using the Flexible Image Transport (FIT) header and PDS3 label (LBL) files, but the data files are unchanged from their PDS3 version.
This dataset contains water column and sediment geochemical data collected by bottle and bottom samples from NOAA Ship Ronald H. Brown (cruise RB19-03) in the southeastern U.S. Atlantic margin in April 2019. Data include nutrients, nitrate, phosphorus, and other chemical characteristics. CTD data for this expedition are already archived at NCEI under accession number 0207828. This cruise is part of the DEEP SEARCH project, the goal of which is to explore a variety of deep sea ecosystems off the southeastern U.S. Atlantic margin. Data are in CSV format.
According to a survey on the state of digital literacy in 2022, around 50.6 percent of Indonesian respondents stated that they used letter, number, or pattern passwords to unlock their phones. Furthermore, around 22.4 percent of participants claimed to use fingerprint authentication.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset aggregates information about 191 research data repositories that were shut down. The data collection was based on the registry of research data repositories re3data and a comprehensive content analysis of repository websites and related materials. The dataset documents the period in which each repository was active, the risks resulting in its shutdown, and the repositories that took over custody of the data afterwards.
This dataset contains preliminary and final cruise reports describing physical and chemical ocean data taken by CTD, Niskin bottles, and other instrumentation from the research vessel Sonne, cruise SO298, along a cross-equatorial Pacific Ocean transect. This cruise is U.S. State Department MSR U2022-049 as part of the World Data Service for Geophysics and Oceanography. Reports are in PDF.
A survey conducted from April to May 2022 found that six in 10 organizations in the United States designated an internal project manager or owner to manage compliance with state-level privacy laws. Around half of the organizations conducted data mapping and had an understanding of data practices across the organization. A further 41 percent said they updated privacy policies, while 40 percent said they were in the process of doing so.
Nearly 40 percent of Russians restricted or denied access to their location while browsing the Internet as of November 2020. That was the most common action with regard to personal data protection online. Furthermore, one quarter of respondents restricted access to their social media profiles, posts, or cloud services.
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
After May 3, 2024, this dataset and webpage will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, and hospital capacity and occupancy data, to HHS through CDC’s National Healthcare Safety Network. Data voluntarily reported to NHSN after May 1, 2024, will be available starting May 10, 2024, at COVID Data Tracker Hospitalizations.
This time series dataset includes viral COVID-19 laboratory test [Polymerase chain reaction (PCR)] results from over 1,000 U.S. laboratories and testing locations including commercial and reference laboratories, public health laboratories, hospital laboratories, and other testing locations. Data are reported to state and jurisdictional health departments in accordance with applicable state or local law and in accordance with the Coronavirus Aid, Relief, and Economic Security (CARES) Act (CARES Act Section 18115).
Data are provisional and subject to change.
Data presented here is representative of diagnostic specimens being tested - not individual people - and excludes serology tests where possible. Data presented might not represent the most current counts for the most recent 3 days due to the time it takes to report testing information. The data may also not include results from all potential testing sites within the jurisdiction (e.g., non-laboratory or point of care test sites) and therefore reflect the majority, but not all, of COVID-19 testing being conducted in the United States.
Sources: CDC COVID-19 Electronic Laboratory Reporting (CELR), Commercial Laboratories, State Public Health Labs, In-House Hospital Labs
Data for each state is sourced from either data submitted directly by the state health department via COVID-19 electronic laboratory reporting (CELR), or a combination of commercial labs, public health labs, and in-house hospital labs. Data is taken from CELR for states that either submit line level data or submit aggregate counts which do not include serology tests.
Objectives: Demonstrate the application of decision trees—classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)—to understand structure in missing data. Setting: Data taken from employees at 3 different industrial sites in Australia. Participants: 7915 observations were included. Materials and methods: The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions: Researchers are encouraged to use CART and BRT models to explore and understand missing data.
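To make the idea of using a tree to describe missingness more concrete, here is a minimal sketch of the single-split step that CART applies recursively, with a 0/1 missingness indicator as the target. This is not the study's method (which used the R packages rpart and gbm); the pure-Python functions, the "visits" variable, and the toy values are all assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(values, is_missing):
    """Find the threshold on one predictor that best separates rows with
    missing data (1) from rows without (0).

    Returns (threshold, weighted_gini); threshold is None if no split
    improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, is_missing))
    n = len(pairs)
    best = (None, gini([m for _, m in pairs]))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical values
        threshold = (lo + hi) / 2
        left = [m for _, m in pairs[:i]]
        right = [m for _, m in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Invented toy data: another variable is missing whenever visits <= 3.
visits = [1, 2, 3, 4, 5, 6]
missing = [1, 1, 1, 0, 0, 0]
print(best_split(visits, missing))  # (3.5, 0.0)
```

A full CART model would repeat this search over every predictor and recurse into each side of the chosen split; the sketch only shows how one structured pattern of missingness surfaces as a clean threshold.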
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
This data set includes all data associated with the study, "Taking aversive action: an experimental investigation into the negotiation of bodily limits." The data include Gender, Age, BMI and Activity levels (IPAQ). In addition they include all data associated with the main variables of the study, including Interoceptive Accuracy, Heart Rate Variability, Anxiety Sensitivity, Porges Body awareness, and Mean Power Output. Absolute difference scores are provided for time intervals 1, 2, and 3.
https://creativecommons.org/publicdomain/zero/1.0/
There are a number of Kaggle datasets that provide spatial data around New York City. For many of these, it may be quite interesting to relate the data to the demographic and economic characteristics of nearby neighborhoods. I hope this data set will allow for making these comparisons without too much difficulty.
Exploring the data and making maps could be quite interesting as well.
This dataset contains two CSV files:
nyc_census_tracts.csv
This file contains a selection of census data taken from the ACS DP03 and DP05 tables. Things like total population, racial/ethnic demographic information, employment and commuting characteristics, and more are contained here. There is a great deal of additional data in the raw tables retrieved from the US Census Bureau website, so I could easily add more fields if there is enough interest.
I obtained data for individual census tracts, which typically contain several thousand residents.
census_block_loc.csv
For this file, I used an online FCC census block lookup tool to retrieve the census block code for a 200 x 200 grid containing New York City and a bit of the surrounding area. This file contains the coordinates and associated census block codes along with the state and county names to make things a bit more readable to users.
Each census tract is split into a number of blocks, so one must extract the census tract code from the block code.
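A minimal sketch of that extraction, assuming the block codes are standard 15-digit Census block GEOIDs (2-digit state, 3-digit county, 6-digit tract, 4-digit block), so that the tract identifier is the first 11 digits. The function name and the sample code are illustrative only; it is worth checking how the codes are actually stored in census_block_loc.csv before relying on this.

```python
def block_to_tract(block_code):
    """Return the 11-digit tract GEOID for a 15-digit block GEOID."""
    # Block codes are often stored as integers; restore any dropped leading zero.
    digits = str(block_code).zfill(15)
    if len(digits) != 15 or not digits.isdigit():
        raise ValueError(f"expected a 15-digit block code, got {block_code!r}")
    return digits[:11]

# Illustrative value: state 36 (New York), county 061, tract 000100, block 1000.
print(block_to_tract(360610001001000))  # 36061000100
```

The result is returned as a string so that leading zeros survive; if the tract column in nyc_census_tracts.csv is numeric, convert one side before joining the two files.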
The data here was taken from the American Community Survey 2015 5-year estimates (https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml).
The census block coordinate data was taken from the FCC Census Block Conversions API (https://www.fcc.gov/general/census-block-conversions-api)
As public data from the US government, this is not subject to copyright within the US and should be considered public domain.
https://www.usa.gov/government-works
This data set contains Raw data taken by New Horizons Long Range Reconnaissance Imager instrument (LORRI) during the post-launch checkout mission phase. This is VERSION 3.0 of this data set. During the Launch Phase, LORRI conducted a variety of instrument commissioning activities in preparation for science operations. An extensive series of bias images was collected during the spring and summer of 2006, before the LORRI aperture door was opened. Images were also collected of LORRI's internal lamps. The LORRI aperture door was opened on 2006 August 29, and an extensive series of calibration observations of stars in the open cluster M7 was performed during the next several days. Commissioning test observations were also performed on the planetary targets Jupiter, Uranus, Neptune, and Pluto. The Jovian observations were conducted specifically to test LORRI's ability to perform imaging using exposure times that were short compared to the frame transfer time (i.e., shorter than 13 ms). Some histogram-only solar scattered light observations were performed during a Ralph commissioning test.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset is the result of a study in which we analyzed parallels and differences between clicks and eye movements on two different digital library homepages. For this analysis we used diverse tracking tools for mouse clicks and eye tracking data, which were further studied with respect to specific areas of interest (AOI).
The dataset contains two screenshots indicating the areas of interest (AOIs; entitled “AreasOfInterest_Kartenportal.jpg” and “AreasOfInterest_Webportal.jpg”), which separate the homepages into analyzable parts. It also contains eight screenshots of the homepages containing the total amount of collected clicks (each name starting with “clicks”) and two screenshots with the eye tracking heat maps (starting with “Heatmap”). The screenshots were extracted directly from the click and eye tracking tools and matched with the aforementioned AOIs in order to obtain the total count of clicks and views as well as the view duration for the area concerned.
All data are synthesized in a document containing three sheets with different tables: a first one with the initial data compilation for all AOIs of the two analyzed homepages (entitled “Data”), a second one with a more visual compiled data analysis for both homepages and all AOIs (entitled “Data2”), and a last one with the duration of the view as well as the duration of the fixation and the compiled click data (entitled “Eye tracking study data”).
This data set contains Raw data taken by the New Horizons Student Dust Counter instrument during the KEM1 ENCOUNTER mission phase. This is VERSION 1.0 of this data set. This data set contains data acquired by the spacecraft between 08/14/2018 and 12/31/2018. It only includes data downlinked before 01/01/2019. Future datasets may include more data acquired by the spacecraft after 08/13/2018 but downlinked after 12/31/2018.
Jiang Wang and Xiaohan Nie captured the Multiview 3D event dataset at UCLA. It contains RGB, depth, and human skeleton data captured simultaneously by three Kinect cameras. This dataset includes 10 action categories: pick up with one hand, pick up with two hands, drop trash, walk around, sit down, stand up, donning, doffing, throw, and carry. Ten actors perform each action. This dataset contains data taken from a variety of viewpoints.
This is the version of the dataset that contains complete data.
NOTE: The dataset has been uploaded to Kaggle with permission from one of the original authors for others to use, so feel free to use it in your projects and research.
https://www.icpsr.umich.edu/web/ICPSR/studies/8896/terms
This data collection represents the fourth wave of the High School and Beyond series. The base-year data (ICPSR 7896) were collected in 1980, and the first and second follow-ups (ICPSR 8297 and ICPSR 8443) were conducted in 1982 and 1984. The High School and Beyond series is a longitudinal study of students who were high school sophomores and seniors in 1980. As with the first and second follow-ups, the structure and documentation of High School and Beyond Third Follow-Up data files represent a departure from base-year (1980) practices. While the base-year student file contains data from both the senior and sophomore cohorts, the three follow-up surveys provide separate student files for the two cohorts. Each of the cohort files for this collection merges the base year and first and second follow-up data with the third follow-up data. To maintain comparability with prior waves, many questions from previous follow-up surveys were repeated on the third follow-up questionnaire. Respondents were asked to update background information and to provide information about their work experience, unemployment history, education and other training, family information, income, and other experiences and opinions. Event history formats were used for obtaining responses about jobs held, schools attended, periods of unemployment, and marriage patterns. New items were added on respondents' interest in graduate degree programs and on alcohol consumption habits. The transcript files, which present data taken from official records of academic and vocational schools, include information on program enrollments, periods of study, fields of study pursued, specific courses taken, and credentials earned.