100+ datasets found

Requirements data sets (user stories)
data.mendeley.com
zenodo.org
Updated Jul 28, 2018
Cite
Fabiano Dalpiaz (2018). Requirements data sets (user stories) [Dataset]. http://doi.org/10.17632/7zbk8zsd8y.1
Unique identifier
https://doi.org/10.17632/7zbk8zsd8y.1
Dataset updated
Jul 28, 2018
Authors
Fabiano Dalpiaz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A collection of 22 data sets, each containing 50+ requirements expressed as user stories. All were found online or obtained from software companies with permission to disclose.

The data sets were originally used to conduct experiments on ambiguity detection with the REVV-Light tool: https://github.com/RELabUU/revv-light
Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for...
data.mendeley.com
narcis.nl
Updated Jan 6, 2018
+ more versions
Cite
Daniel Kermany (2018). Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification [Dataset]. http://doi.org/10.17632/rscbjbr9sj.2
Unique identifier
https://doi.org/10.17632/rscbjbr9sj.2
Dataset updated
Jan 6, 2018
Authors
Daniel Kermany
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset of validated OCT and Chest X-Ray images described and analyzed in "Deep learning-based classification and referral of treatable human diseases". The OCT Images are split into a training set and a testing set of independent patients. OCT Images are labeled as (disease)-(randomized patient ID)-(image number by this patient) and split into 4 directories: CNV, DME, DRUSEN, and NORMAL.
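The naming scheme above, (disease)-(randomized patient ID)-(image number by this patient), can be unpacked with a few lines of Python. This is a minimal sketch, not the authors' tooling: the helper name parse_oct_filename, the OctLabel type, and the sample file name are hypothetical, and it assumes that the three fields are separated by hyphens and that the patient ID itself contains no hyphen.

```python
from pathlib import Path
from typing import NamedTuple

# The four class directories named in the dataset description.
KNOWN_CLASSES = {"CNV", "DME", "DRUSEN", "NORMAL"}


class OctLabel(NamedTuple):
    disease: str
    patient_id: str
    image_number: int


def parse_oct_filename(filename: str) -> OctLabel:
    """Split '(disease)-(patient ID)-(image number)' into its three parts."""
    stem = Path(filename).stem  # drop any directory and the file extension
    parts = stem.split("-")
    if len(parts) != 3:
        raise ValueError(f"unexpected file name layout: {filename!r}")
    disease, patient_id, image_number = parts
    if disease not in KNOWN_CLASSES:
        raise ValueError(f"unknown disease label: {disease!r}")
    return OctLabel(disease, patient_id, int(image_number))


# Hypothetical file name, used only to illustrate the scheme.
label = parse_oct_filename("train/DME/DME-0000001-3.jpeg")
print(label.disease, label.patient_id, label.image_number)
```

Grouping images by patient_id rather than by file is one way to keep patients separate when carving a validation set out of the training data.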
United States Surface Urban Heat Island database
data.mendeley.com
Updated Jul 2, 2020
+ more versions
Cite
TC Chakraborty (2020). United States Surface Urban Heat Island database [Dataset]. http://doi.org/10.17632/x9mv4krnm2.2
Unique identifier
https://doi.org/10.17632/x9mv4krnm2.2
Dataset updated
Jul 2, 2020
Authors
TC Chakraborty
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
This dataset contains urban and rural LST, DEM, and NDVI data for annual, summer, and winter daytime and nighttime for all census tracts in US urbanized areas, as well as the mean values for the entire urbanized area.

METADATA
DEM: Digital Elevation Model
NDVI: Normalized Difference Vegetation Index
LST: Land Surface Temperature
_urb: Urban values (all urban pixels within urbanized areas)
_rur: Rural reference (spatial mean of the non-urban, non-water pixels within the region of interest)

Regions of Interest:
_CT: Spatial mean of pixels intersecting the Census Tract clipped to the urbanized area (one value per census tract). This should be equal to the _CT for census tracts that are completely within the urbanized areas (the census tracts with the green dots in the image below)
_all: Spatial mean of all pixels intersecting the urbanized area, as defined by the US census (one value for one urbanized area)
_CT_act: Spatial mean of all available pixels intersecting the Census Tract (one value per census tract) [This should be equal to the previous values I calculated]

For the UHI: The ideal configuration is LST_urb_all - LST_rur_all for the entire urbanized area (from the US_Urbanized file) and LST_urb_CT_act - LST_rur_all for individual census tracts within the urbanized areas (from the census file).

For the equity analysis: Either _CT or _CT_act can be used if we are only concerned with spatial variation. Using _CT_act leads to a mismatch between census data for the tracts crossing the urban boundary and the remotely sensed data. Using _CT leads to a mismatch between the UHI analysis and the equity analysis.
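The two UHI configurations described above are plain differences between an urban LST value and the rural reference. The sketch below illustrates the arithmetic only: the temperature values are made up, the dictionary keys simply reuse the suffixes from the description and may not match the actual column names in the files, and the unit (degrees Celsius) is an assumption.

```python
def surface_uhi(lst_urban: float, lst_rural: float) -> float:
    """Surface urban heat island intensity: urban LST minus rural reference LST."""
    return lst_urban - lst_rural


# Hypothetical values (assumed to be in degrees Celsius), for illustration only.
urbanized_area = {"LST_urb_all": 31.5, "LST_rur_all": 29.0}
census_tract = {"LST_urb_CT_act": 33.0}

# Whole urbanized area: LST_urb_all - LST_rur_all
uhi_area = surface_uhi(urbanized_area["LST_urb_all"], urbanized_area["LST_rur_all"])

# Individual census tract: LST_urb_CT_act - LST_rur_all
uhi_tract = surface_uhi(census_tract["LST_urb_CT_act"], urbanized_area["LST_rur_all"])

print(uhi_area, uhi_tract)  # prints "2.5 4.0"
```

Note that both configurations subtract the same area-wide rural reference (LST_rur_all), so tract-level values stay comparable within one urbanized area.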
Diabetes Dataset
data.mendeley.com
Updated Jul 18, 2020
Cite
Ahlam Rashid (2020). Diabetes Dataset [Dataset]. http://doi.org/10.17632/wj9rwkp9c2.1
Unique identifier
https://doi.org/10.17632/wj9rwkp9c2.1
Dataset updated
Jul 18, 2020
Authors
Ahlam Rashid
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This entry describes the construction of the diabetes dataset. The data were collected from the Iraqi society, acquired from the laboratory of Medical City Hospital and the Specialized Center for Endocrinology and Diabetes at Al-Kindy Teaching Hospital. Patients' files were taken, and data were extracted from them and entered into the database to construct the diabetes dataset. The data consist of medical information, laboratory analyses, etc. The attributes initially entered into the system are: No. of Patient, Sugar Level Blood, Age, Gender, Creatinine ratio (Cr), Body Mass Index (BMI), Urea, Cholesterol (Chol), fasting lipid profile (including total, LDL, VLDL, Triglycerides (TG), and HDL cholesterol), HBA1C, and Class (the patient's diabetes disease class may be Diabetic, Non-Diabetic, or Predict-Diabetic).
Phishing Websites Dataset
data.mendeley.com
Updated Sep 24, 2020
+ more versions
Cite
Grega Vrbančič (2020). Phishing Websites Dataset [Dataset]. http://doi.org/10.17632/72ptz43s9v.1
Unique identifier
https://doi.org/10.17632/72ptz43s9v.1
Dataset updated
Sep 24, 2020
Authors
Grega Vrbančič
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These data consist of a collection of legitimate and phishing website instances. Each website is represented by a set of features that denote whether the website is legitimate or not. The data can serve as input for a machine learning process.

Two variants of the Phishing Dataset are presented in this repository.

Full variant - dataset_full.csv
Total number of instances: 88,647
Number of legitimate website instances (labeled as 0): 58,000
Number of phishing website instances (labeled as 1): 30,647
Total number of features: 111

Small variant - dataset_small.csv
Total number of instances: 58,645
Number of legitimate website instances (labeled as 0): 27,998
Number of phishing website instances (labeled as 1): 30,647
Total number of features: 111
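The instance counts quoted for the two variants can be cross-checked, and the class balance read off, with a short calculation. This sketch uses only the numbers stated in the description; the VARIANTS dictionary and the phishing_share helper are illustrative names, and the CSV files themselves are not read.

```python
# Counts copied from the dataset description (label 0 = legitimate, 1 = phishing).
VARIANTS = {
    "dataset_full.csv": {"legitimate": 58_000, "phishing": 30_647, "total": 88_647},
    "dataset_small.csv": {"legitimate": 27_998, "phishing": 30_647, "total": 58_645},
}


def phishing_share(counts: dict) -> float:
    """Fraction of instances labeled as phishing."""
    return counts["phishing"] / (counts["legitimate"] + counts["phishing"])


for name, counts in VARIANTS.items():
    # The per-class counts should add up to the stated total.
    assert counts["legitimate"] + counts["phishing"] == counts["total"]
    print(f"{name}: {phishing_share(counts):.1%} phishing")
```

By these figures the full variant is roughly one-third phishing, while the small variant keeps the same phishing instances with fewer legitimate ones and is close to balanced, which may matter when choosing evaluation metrics.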
Data from: Single-cell RNA-Seq of human primary lung and bronchial...
data.mendeley.com
figshare.com
Updated Mar 13, 2020
Cite
Soeren Lukassen (2020). Single-cell RNA-Seq of human primary lung and bronchial epithelium cells [Dataset]. http://doi.org/10.17632/7r2cwbw44m.1
Unique identifier
https://doi.org/10.17632/7r2cwbw44m.1
Dataset updated
Mar 13, 2020
Authors
Soeren Lukassen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains count matrices and per-cell metadata tables for RNA sequencing of 39778 single nuclei from healthy primary lung samples of 12 lung adenocarcinoma patients as well as 17451 single human bronchiole epithelial cells from 4 donors. All samples were processed using the 10X Genomics Chromium platform with v2 chemistry and sequenced with one sample per lane on an Illumina HiSeq4000. Reads were aligned to the hg19 reference genome version 1.2.0 obtained from 10X Genomics. Data processing was performed using Seurat3. The metadata table includes patient ID, sex, age, smoking status, and cell type, as well as QC statistics (number of genes, number of cells, ratio of mitochondrial reads).
SpanishTweetsCOVID-19: A Social Media Enriched Covid-19 Twitter Spanish...
data.mendeley.com
Updated Jul 15, 2020
+ more versions
Cite
Antonela Tommasel (2020). SpanishTweetsCOVID-19: A Social Media Enriched Covid-19 Twitter Spanish Dataset [Dataset]. http://doi.org/10.17632/nv8k69y59d.1
Unique identifier
https://doi.org/10.17632/nv8k69y59d.1
Dataset updated
Jul 15, 2020
Authors
Antonela Tommasel
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset presents a large-scale collection of millions of Twitter posts in Spanish related to the coronavirus pandemic. The collection was built by monitoring public posts written in Spanish containing a diverse set of hashtags related to COVID-19, as well as tweets shared by official Argentinian government offices, such as ministries and secretaries at different levels. Data was collected between March and June 2020 using the Twitter API, and will be periodically updated.

In addition to tweet IDs, the dataset includes information about mentions, retweets, media, URLs, hashtags, replies, users and content-based user relations, allowing the observation of the dynamics of the shared information. Data is presented in different tables that can be analysed separately or combined.

The dataset aims to serve as a source for studying several effects of the coronavirus on people through social media, including the impact of public policies, the perception of risk and related disease consequences, the adoption of guidelines, the emergence, dynamics and propagation of disinformation and rumours, the formation of communities and other social phenomena, and the evolution of health-related indicators (such as fear, stress, sleep disorders, or changes in children's behaviour), among other possibilities. In this sense, the dataset can be useful for multi-disciplinary researchers in fields such as data science, social network analysis, social computing, medical informatics, and the social sciences, among others.
Panoramic Dental X-rays With Segmented Mandibles
data.mendeley.com
Updated Nov 12, 2017
+ more versions
Cite
Amir Abdi (2017). Panoramic Dental X-rays With Segmented Mandibles [Dataset]. http://doi.org/10.17632/hxt48yk462.1
Unique identifier
https://doi.org/10.17632/hxt48yk462.1
Dataset updated
Nov 12, 2017
Authors
Amir Abdi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of anonymized and deidentified panoramic dental X-rays of 116 patients, taken at Noor Medical Imaging Center, Qom, Iran. The subjects cover a wide range of dental conditions, from healthy to partially and completely edentulous cases. The mandibles of all cases were manually segmented by two dentists. This dataset is the basis for the article by Abdi et al. [1].

[1] A. H. Abdi, S. Kasaei, and M. Mehdizadeh, “Automatic segmentation of mandible in panoramic x-ray,” J. Med. Imaging, vol. 2, no. 4, p. 44003, 2015.
Genome-wide association studies and Mendelian randomization analyses for...
data.mendeley.com
search.datacite.org
Updated Mar 10, 2020
Cite
Yordi van de Vegte (2020). Genome-wide association studies and Mendelian randomization analyses for leisure sedentary behaviours - Supplementary Data Files [Dataset]. http://doi.org/10.17632/mxjj6czsrd.1
Unique identifier
https://doi.org/10.17632/mxjj6czsrd.1
Dataset updated
Mar 10, 2020
Authors
Yordi van de Vegte
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary data files related to the article "Genome-wide association studies and Mendelian randomization analyses for leisure sedentary behaviours" by van de Vegte et al.
Annotated Terms of Service of 100 Online Platforms
data.mendeley.com
Updated Dec 12, 2023
+ more versions
Cite
Przemyslaw Palka (2023). Annotated Terms of Service of 100 Online Platforms [Dataset]. http://doi.org/10.17632/dtbj87j937.3
Unique identifier
https://doi.org/10.17632/dtbj87j937.3
Dataset updated
Dec 12, 2023
Authors
Przemyslaw Palka
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains information about the contents of 100 Terms of Service (ToS) of online platforms. The documents were analyzed and evaluated from the point of view of the European Union consumer law. The main results are presented in the table titled "Terms of Service Analysis and Evaluation_RESULTS." This table is accompanied by the instructions followed by the annotators, titled "Variables Definitions," allowing for the interpretation of the assigned values. In addition, we provide the raw data (analyzed ToS, in the folder "Clear ToS") and the annotated documents (in the folder "Annotated ToS," further subdivided).

SAMPLE: The sample contains 100 contracts of digital platforms operating in sixteen market sectors: Cloud storage, Communication, Dating, Finance, Food, Gaming, Health, Music, Shopping, Social, Sports, Transportation, Travel, Video, Work, and Various. The selected companies' main headquarters span four legal surroundings: the US, the EU, Poland specifically, and Other jurisdictions. The chosen platforms are both privately held and publicly listed and offer both fee-based and free services. Although the sample cannot be treated as representative of all online platforms, it nevertheless accounts for the most popular consumer services in the analyzed sectors and contains a diverse and heterogeneous set.

CONTENT: Each ToS has been assigned the following information:
1. Metadata: 1.1. the name of the service; 1.2. the URL; 1.3. the effective date; 1.4. the language of ToS; 1.5. the sector; 1.6. the number of words in ToS; 1.7–1.8. the jurisdiction of the main headquarters; 1.9. if the company is public or private; 1.10. if the service is paid or free.
2. Evaluative Variables: remedy clauses (2.1–2.5); dispute resolution clauses (2.6–2.10); unilateral alteration clauses (2.11–2.15); rights to police the behavior of users (2.16–2.17); regulatory requirements (2.18–2.20); and various (2.21–2.25).
3. Count Variables: the number of clauses seen as unclear (3.1) and the number of other documents referred to by the ToS (3.2).
4. Pull-out Text Variables: rights and obligations of the parties (4.1) and descriptions of the service (4.2).
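When working with the results table, it can help to map an evaluative variable number back to its clause group. The following is a hypothetical helper, assuming variables are referred to by identifiers such as "2.7"; the actual column headers in the results table are not given here and may differ. The ranges are copied from the numbering above.

```python
# (first, last, group name) for evaluative variables 2.1 to 2.25.
EVALUATIVE_GROUPS = [
    (1, 5, "remedy clauses"),
    (6, 10, "dispute resolution clauses"),
    (11, 15, "unilateral alteration clauses"),
    (16, 17, "rights to police the behavior of users"),
    (18, 20, "regulatory requirements"),
    (21, 25, "various"),
]


def evaluative_group(variable_id: str) -> str:
    """Return the clause group for an identifier such as '2.7'."""
    section, _, index = variable_id.partition(".")
    if section != "2" or not index.isdigit():
        raise ValueError(f"not an evaluative variable: {variable_id!r}")
    number = int(index)
    for first, last, name in EVALUATIVE_GROUPS:
        if first <= number <= last:
            return name
    raise ValueError(f"evaluative variable out of range: {variable_id!r}")


print(evaluative_group("2.10"))  # prints "dispute resolution clauses"
```

The identifier is parsed as text rather than as a number because "2.10" and "2.1" would collapse to the same floating-point value.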

ACKNOWLEDGEMENT: The research leading to these results has received funding from the Norwegian Financial Mechanism 2014-2021, project no. 2020/37/K/HS5/02769, titled “Private Law of Data: Concepts, Practices, Principles & Politics.”
Web page phishing detection
data.mendeley.com
Updated Jun 25, 2021
+ more versions
Cite
Abdelhakim Hannousse (2021). Web page phishing detection [Dataset]. http://doi.org/10.17632/c2gw7fy2j4.3
Unique identifier
https://doi.org/10.17632/c2gw7fy2j4.3
Dataset updated
Jun 25, 2021
Authors
Abdelhakim Hannousse
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The provided dataset includes 11430 URLs with 87 extracted features. The dataset is designed to be used as a benchmark for machine learning-based phishing detection systems. Features are from three different classes: 56 extracted from the structure and syntax of URLs, 24 extracted from the content of their corresponding pages, and 7 extracted by querying external services. The dataset is balanced: it contains exactly 50% phishing and 50% legitimate URLs. Along with the dataset, we provide the Python scripts used for feature extraction, for potential replication or extension. The datasets were constructed in May 2020.

dataset_A: contains a list of URLs together with their DOM tree objects, which can be used for replication and for experimenting with new URL and content-based features, overcoming the short lifetime of phishing web pages.

dataset_B: contains the extracted feature values, which can be used directly as input to classifiers for examination. Note that the data in this dataset are indexed by URL, so the index needs to be removed before experimentation.
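Removing the URL index before experimentation could look like the sketch below, which uses only the Python standard library. Everything specific in it is an assumption made for illustration: the column names url and status, the two feature columns, and the two sample rows are invented and are not taken from the actual files.

```python
import csv
import io

# Invented sample in the assumed layout: URL index, numeric features, class label.
SAMPLE = """url,length_url,nb_dots,status
http://example.com/login,24,1,legitimate
http://example.org/verify,25,1,phishing
"""


def split_features_and_labels(csv_text, index_column="url", label_column="status"):
    """Drop the URL index and separate numeric features from class labels."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row.pop(index_column)                 # the URL is an identifier, not a feature
        labels.append(row.pop(label_column))  # keep the class label separately
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


features, labels = split_features_and_labels(SAMPLE)
print(features[0], labels)
```

Leaving the raw URL in the feature matrix would let a classifier memorize individual addresses, which is presumably why the description asks for the index to be dropped first.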
Oral Images Dataset
data.mendeley.com
Updated Feb 5, 2021
Cite
Chandrashekar H S (2021). Oral Images Dataset [Dataset]. http://doi.org/10.17632/mhjyrn35p4.2
Unique identifier
https://doi.org/10.17632/mhjyrn35p4.2
Dataset updated
Feb 5, 2021
Authors
Chandrashekar H S
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset includes color images of oral lesions captured using mobile cameras and intraoral cameras. These images can be used for identifying potential oral malignancies by image analysis. These images have been collected in consultation with doctors from different hospitals and colleges in Karnataka, India. This dataset contains two folders - original_data and augmented_data. The first folder contains images of 165 benign lesions and 158 malignant lesions. The second folder contains images created by augmenting the original images. The augmentation techniques used are flipping, rotation and resizing.
Data from: Predicting Long-term Dynamics of Soil Salinity and Sodicity on a...
data.mendeley.com
Updated Nov 26, 2020
+ more versions
Cite
Amirhossein Hassani (2020). Predicting Long-term Dynamics of Soil Salinity and Sodicity on a Global Scale [Dataset]. http://doi.org/10.17632/v9mgbmtnf2.1
Unique identifier
https://doi.org/10.17632/v9mgbmtnf2.1
Dataset updated
Nov 26, 2020
Authors
Amirhossein Hassani
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset globally (excluding frigid/polar zones) quantifies the different facets of variability in surface soil (0 – 30 cm) salinity and sodicity for the period between 1980 and 2018. This is realised by developing 4-D predictive models of Electrical Conductivity of saturated soil Extract (ECe) and soil Exchangeable Sodium Percentage (ESP) as indicators of soil salinity and sodicity. These machine learning-based models make predictions for ECe and ESP at different times, locations, and depths. By extracting meaningful statistics from those predictions, different facets of variability in the surface soil salinity and sodicity are quantified. The dataset includes 10 maps documenting different aspects of soil salinity and sodicity variations, and the auxiliary data required to generate those maps. Users are referred to the corresponding "READ_ME" file for more information about this dataset.
Benchmark data sets
data.mendeley.com
narcis.nl
Updated Dec 27, 2017
Cite
Haonan Tong (2017). Benchmark data sets [Dataset]. http://doi.org/10.17632/923xvkk5mm.1
Unique identifier
https://doi.org/10.17632/923xvkk5mm.1
Dataset updated
Dec 27, 2017
Authors
Haonan Tong
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
A total of 12 software defect data sets from NASA were used in this study. Five data sets (part I), namely CM1, JM1, KC1, KC2, and PC1, were obtained from the PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/); the other seven data sets (part II) were obtained from the tera-PROMISE Repository (http://openscience.us/repo/defect/mccabehalsted/).
DAWN
data.mendeley.com
opendatalab.com
+1 more
Updated Mar 6, 2020
+ more versions
Cite
Mourad KENK (2020). DAWN [Dataset]. http://doi.org/10.17632/766ygrbt8y.3
Unique identifier
https://doi.org/10.17632/766ygrbt8y.3
Dataset updated
Mar 6, 2020
Authors
Mourad KENK
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
The DAWN (Detection in Adverse Weather Nature) dataset consists of real-world images collected under various adverse weather conditions. This dataset emphasizes a diverse traffic environment (urban, highway and freeway) as well as a rich variety of traffic flow. The DAWN dataset comprises a collection of 1000 images from real-traffic environments, which are divided into four sets of weather conditions: fog, snow, rain and sandstorms. The dataset is annotated with object bounding boxes for autonomous driving and video surveillance scenarios. This data helps in interpreting the effects of adverse weather conditions on the performance of vehicle detection systems. It is also needed by researchers working in the fields of autonomous vehicles and intelligent visual traffic surveillance systems. All rights to the DAWN dataset are reserved, and commercial use/distribution of this database is strictly prohibited.
Data for: MACHINE LEARNING IN MEDICINE: CLASSIFICATION AND PREDICTION OF...
data.mendeley.com
Updated Jul 2, 2019
Cite
Gopi Battineni (2019). Data for: MACHINE LEARNING IN MEDICINE: CLASSIFICATION AND PREDICTION OF DEMENTIA BY SUPPORT VECTOR MACHINES (SVM) [Dataset]. http://doi.org/10.17632/tsy6rbc5d4.1
Unique identifier
https://doi.org/10.17632/tsy6rbc5d4.1
Dataset updated
Jul 2, 2019
Authors
Gopi Battineni
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
This set consists of a longitudinal collection of 150 subjects aged 60 to 96. Each subject was scanned on two or more visits, separated by at least one year for a total of 373 imaging sessions. For each subject, 3 or 4 individual T1-weighted MRI scans obtained in single scan sessions are included. The subjects are all right-handed and include both men and women. 72 of the subjects were characterized as nondemented throughout the study. 64 of the included subjects were characterized as demented at the time of their initial visits and remained so for subsequent scans, including 51 individuals with mild to moderate Alzheimer’s disease. Another 14 subjects were characterized as nondemented at the time of their initial visit and were subsequently characterized as demented at a later visit.
LG 18650HG2 Li-ion Battery Data and Example Deep Neural Network xEV SOC...
data.mendeley.com
Updated Mar 5, 2020
+ more versions
Cite
Philip Kollmeyer (2020). LG 18650HG2 Li-ion Battery Data and Example Deep Neural Network xEV SOC Estimator Script [Dataset]. http://doi.org/10.17632/cp3473x7xv.3
Unique identifier
https://doi.org/10.17632/cp3473x7xv.3
Dataset updated
Mar 5, 2020
Authors
Philip Kollmeyer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The included tests were performed at McMaster University in Hamilton, Ontario, Canada by Dr. Phillip Kollmeyer (phillip.kollmeyer@gmail.com). If this data is utilized for any purpose, it should be appropriately referenced.
-A brand new 3Ah LG HG2 cell was tested in an 8 cu.ft. thermal chamber with a 75 amp, 5 volt Digatron Firing Circuits Universal Battery Tester channel with a voltage and current accuracy of 0.1% of full scale. These data are used in the design process of an SOC estimator using a deep feedforward neural network (FNN) approach. The data also include a description of data acquisition, data preparation, and development of an FNN example script.

-Instructions for Downloading and Running the Script:
1-Select download all files from the Mendeley Data page (https://data.mendeley.com/datasets/cp3473x7xv/2).
2-The files will be downloaded as a zip file. Unzip the file to a folder; do not modify the folder structure.
3-Navigate to the folder with "FNN_xEV_Li_ion_SOC_EstimatorScript_March_2020.mlx".
4-Open and run "FNN_xEV_Li_ion_SOC_EstimatorScript_March_2020.mlx".
5-The MATLAB script should run without any modification; if there is an issue, it is likely due to the testing and training data not being in the expected place.
6-The script is set by default to train for 50 epochs and to repeat the training 3 times. This should take 5-10 minutes to execute.
7-To recreate the results in the paper, set the number of epochs to 5500 and the number of repetitions to 10.

-The test data, or similar data, has been used for some publications, including:
[1] C. Vidal, P. Kollmeyer, M. Naguib, P. Malysz, O. Gross, and A. Emadi, “Robust xEV Battery State-of-Charge Estimator Design using Deep Neural Networks,” in Proc WCX SAE World Congress Experience, Detroit, MI, Apr 2020.
[2] C. Vidal, P. Kollmeyer, E. Chemali and A. Emadi, "Li-ion Battery State of Charge Estimation Using Long Short-Term Memory Recurrent Neural Network with Transfer Learning," 2019 IEEE Transportation Electrification Conference and Expo (ITEC), Detroit, MI, USA, 2019, pp. 1-6.
Concrete Crack Segmentation Dataset
data.mendeley.com
datasetninja.com
Updated Apr 3, 2019
Cite
Çağlar Fırat Özgenel (2019). Concrete Crack Segmentation Dataset [Dataset]. http://doi.org/10.17632/jwsn7tfbrp.1
Unique identifier
https://doi.org/10.17632/jwsn7tfbrp.1
Dataset updated
Apr 3, 2019
Authors
Çağlar Fırat Özgenel
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset includes 458 hi-res images together with their alpha maps (BW) indicating the crack presence. The ground truth for semantic segmentation has two classes to conduct binary pixelwise classification. The photos were captured in various buildings located at Middle East Technical University.

A larger dataset (http://dx.doi.org/10.17632/5y9wdsg2zt.1) contains 227x227 px images for classification that were produced from this dataset.
Dataset for Crop Pest and Disease Detection
data.mendeley.com
Updated Apr 26, 2023
Cite
Patrick Mensah Kwabena (2023). Dataset for Crop Pest and Disease Detection [Dataset]. http://doi.org/10.17632/bwh3zbpkpv.1
Unique identifier
https://doi.org/10.17632/bwh3zbpkpv.1
Dataset updated
Apr 26, 2023
Authors
Patrick Mensah Kwabena
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The application of Artificial Intelligence (AI) has been evident in the agricultural sector recently. The main goal of AI in agriculture is to improve crop yield, control crop pests/diseases, and reduce cost. The agricultural sector in developing countries faces severe challenges in the form of disease and pest infestation, the knowledge gap between farmers and technology, and a lack of storage facilities, among others. To help address some of these challenges, this work presents crop pest/disease datasets sourced from local farms in Ghana. The dataset is presented in two parts: the raw images, consisting of 24,881 images (6,549 Cashew, 7,508 Cassava, 5,389 Maize, and 5,435 Tomato), and the augmented images, further split into train and test sets and consisting of 102,976 images (25,811 Cashew, 26,330 Cassava, 23,657 Maize, and 27,178 Tomato), categorized into 22 classes. All images are de-identified, validated by expert plant virologists, and freely available for use by the research community.
Malhari Dataset
data.mendeley.com
kaggle.com
Updated May 22, 2023
Cite
Madhura Kalbhor (2023). Malhari Dataset [Dataset]. http://doi.org/10.17632/m5kxdj7m36.1
Unique identifier
https://doi.org/10.17632/m5kxdj7m36.1
Dataset updated
May 22, 2023
Authors
Madhura Kalbhor
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Real-world clinical data collected from hospital settings plays a crucial role in medical research. This dataset aims to support research and development in the field of cervical cancer screening and diagnosis. Developing automated systems requires a large amount of input data. This cervical cancer image dataset is intended as a contribution toward developing algorithms for automated cervical cancer diagnostic systems.


Requirements data sets (user stories)

44 scholarly articles cite this dataset (View in Google Scholar)
