Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset used to train and evaluate the CNN and KNN machine learning techniques for the ReDraw paper, published in IEEE Transactions on Software Engineering in 2018.
Link to ReDraw Paper: https://arxiv.org/abs/1802.02312
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GLIB: image dataset. 132 screenshots of game1 & game2 with UI display issues from 466 test reports.
data/images/Code: 9,412 screenshots of game1 & game2 with UI display issues generated by our Code augmentation method.
data/images/Normal: 7,750 screenshots of game1 & game2 without UI display issues, collected by randomly traversing the game scenes.
data/images/Rule(F): 7,750 screenshots of game1 & game2 with UI display issues generated by our Rule(F) augmentation method.
data/images/Rule(R): 7,750 screenshots of game1 & game2 with UI display issues generated by our Rule(R) augmentation method.
data/images/testDataSet: 192 screenshots with UI display issues from the 466 test reports (excluding game1 & game2).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The EGFE dataset is a collection of high-quality UI design prototypes with fragmented layered data. It includes high-fidelity UI screenshots and JSON files containing metadata of the design prototypes. This dataset aims to assist in merging fragmented elements within design prototypes, thereby alleviating the burden on developers of understanding the designer's intent and aiding automated code generation tools in producing high-quality frontend code. What sets this dataset apart from others is the methodology employed to obtain UI screenshots and the hierarchical structure of views, which involves parsing UI design prototypes created with design tools such as Sketch and Figma. It is important to note that a significant portion of the UI design drafts used in this dataset was generously provided by Alibaba Group, and their usage requires consent from Alibaba Group. To facilitate model testing, we have released a partial dataset here, adhering to the terms of the MIT license. If you require access to the complete dataset, please contact the authors of the paper.
In this repo, we release a total of 300 samples and pre-trained model checkpoints. It includes the following:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Excel file contains meta-data about collected white and gray literature, applied inclusion/exclusion criteria, the code system, and a list of identified guidelines.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Automated Web Design Analysis: By identifying various UI elements, dataset-v2 can help designers and developers analyze existing web designs for improvements or optimization, providing insights on the UI structure, accessibility, and user-friendliness.
Content Management System (CMS) Auto-tagging: Integrate dataset-v2 with a CMS to automatically scan and tag visual elements within web pages, simplifying asset management and organization for website developers and content creators.
Accessibility Compliance: Dataset-v2 can analyze websites to ensure proper UI elements usage, helping organizations adhere to accessibility guidelines and standards, such as the Web Content Accessibility Guidelines (WCAG).
Prototype Testing and Feedback: Dataset-v2 can help UX/UI designers evaluate prototypes by identifying UI components and their placement, offering objective feedback and highlighting areas for improvement in the design process.
Competitive Analysis and Web Scraping: Dataset-v2 can identify UI elements across multiple websites, empowering businesses to analyze competitor websites and extract valuable design patterns, best practices, and trends for UI/UX applications.
The WebUI dataset contains 400K web UIs captured over a period of 3 months at a crawling cost of about $500. We grouped web pages by their domain name and then generated training (70%), validation (10%), and testing (20%) splits, which ensures that similar pages from the same website appear in the same split. We created four versions of the training dataset. Three were generated by randomly sampling a subset of the training split: Web-7k, Web-70k, and Web-350k. We chose 70k as a baseline size, since it is approximately the size of existing UI datasets. We also generated an additional split (Web-7k-Resampled) to provide a small, higher-quality split for experimentation. Web-7k-Resampled was generated using a class-balancing sampling technique, and we removed screens with possible visual defects (e.g., very small, occluded, or invisible elements). The validation and test splits were always kept the same.
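The domain-grouped splitting described above can be reproduced along the following lines with pandas and scikit-learn. This is a minimal sketch, not the authors' pipeline; the file name pages.csv and its url column are assumptions.

```python
# Minimal sketch of domain-grouped 70/10/20 splitting (not the authors' code).
# Assumes a hypothetical pages.csv with a "url" column, one row per captured page.
from urllib.parse import urlparse

import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

pages = pd.read_csv("pages.csv")
pages["domain"] = pages["url"].map(lambda u: urlparse(u).netloc)

# First split off the 20% test set, grouping by domain so that all pages
# from one website land in the same split.
gss = GroupShuffleSplit(n_splits=1, test_size=0.20, random_state=0)
trainval_idx, test_idx = next(gss.split(pages, groups=pages["domain"]))
trainval, test = pages.iloc[trainval_idx], pages.iloc[test_idx]

# Then carve out 10% of the full dataset (1/8 of the remaining 80%) for validation.
gss_val = GroupShuffleSplit(n_splits=1, test_size=0.125, random_state=0)
train_idx, val_idx = next(gss_val.split(trainval, groups=trainval["domain"]))
train, val = trainval.iloc[train_idx], trainval.iloc[val_idx]

print(len(train), len(val), len(test))
```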
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Mobile Icon | Mobile Screenshot Dataset is a meticulously curated collection of 9,000+ high-quality mobile screenshots, categorized across 13 diverse application types. This dataset is designed to support AI/ML researchers, UI/UX analysts, and developers in advancing mobile interface understanding, image classification, and content recognition.
Each image has been manually reviewed and verified by computer vision professionals at DataCluster Labs, ensuring high-quality and reliable data for research and development purposes.
The images in this dataset are exclusively owned by DataCluster Labs and were not downloaded from the internet. To access a larger portion of the training dataset for research and commercial purposes, a license can be purchased. Contact us at sales@datacluster.ai or visit www.datacluster.ai to learn more.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
It includes five different datasets. The first four datasets contain student projects collected from different offerings of two undergraduate-level courses – Object-Oriented Analysis and Design (OOAD) and Software Engineering (SE) – taught at a renowned private university in Lahore over a period of six years. The fifth dataset contains real-life industry projects collected from a renowned software house (i.e., a member of the Pakistan Software Houses Association for IT and ITeS (P@SHA)) in Lahore.
Dataset #1 consists of 31 C++ GUI-based desktop applications. Dataset #2 consists of 19 Java GUI-based desktop applications. Dataset #3 consists of 12 Java web applications. Dataset #4 consists of 31 Java applications spanning both of the preceding categories (GUI-based desktop and web). Dataset #5 consists of 11 VB.NET GUI-based desktop applications.
Attributes are used as follows:
Project Code – Project ID for identification purposes
NOC – The total number of classes in a class diagram
NOA – The total number of attributes in a class diagram
NOM – The total number of methods/operations in a class diagram
NODep – The total number of dependency relationships in a class diagram
NOAss – The total number of association relationships in a class diagram
NOComp – The total number of composition relationships in a class diagram
NOAgg – The total number of aggregation relationships in a class diagram
NOGen – The total number of generalization relationships in a class diagram
NORR – The total number of realization relationships in a class diagram
NOOM – The total number of one-to-one multiplicity relationships in a class diagram
NOMM – The total number of one-to-many multiplicity relationships in a class diagram
NMMM – The total number of many-to-many multiplicity relationships in a class diagram
OCP – Objective class points
EOCP – Enhanced objective class points
WEOCP – Weighted enhanced objective class points
SLOC – Software size measured in source lines of code
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
# Data Code Repository
This repository contains open-source data and code that provide utilities for the paper "Here comes trouble! Distinguishing GUI Component States for Blind Users using Large Language Models". The code is designed to facilitate data-related tasks and promote reproducibility in research and data analysis projects.
## Features
- Attribute identification and extraction: Real-time recognition of GUI components and extraction of their four attributes (view type, resource-id, color, and action); a minimal extraction sketch follows this list.
- Component state distinction: Provides the prompts needed by large language models, covering the specific prompt design scheme, the chain-of-thought reasoning process, and the in-context learning content.
- Implementation: Offers the concrete code to realize the pipeline, including the relevant parameter settings and function usage.
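As a rough illustration of the attribute-extraction step (not the released code itself), the sketch below parses an Android uiautomator view-hierarchy dump and pulls out view type, resource-id, and action flags; the file name, the XML source, and the color handling are assumptions.

```python
# Illustrative sketch only: extract view type, resource-id and action flags from a
# uiautomator XML dump ("window_dump.xml" is an assumed file name).
import xml.etree.ElementTree as ET

def extract_components(xml_path):
    components = []
    for node in ET.parse(xml_path).getroot().iter("node"):
        components.append({
            "view_type": node.get("class", ""),        # e.g. android.widget.Button
            "resource_id": node.get("resource-id", ""),
            # uiautomator dumps carry no color; in practice the color would be
            # sampled from a screenshot crop of the node's "bounds" region.
            "bounds": node.get("bounds", ""),
            "actions": [a for a in ("clickable", "checkable", "scrollable")
                        if node.get(a) == "true"],
        })
    return components

if __name__ == "__main__":
    for comp in extract_components("window_dump.xml"):
        print(comp)
```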
## Installation
To use the data and code, download or clone the repository.
Before running the code, make sure the necessary environment is configured.
## Dependencies
The data code has the following dependencies:
Python (version 3.6 or higher)
NumPy
Pandas
Seaborn
Scikit-learn
OpenAI Python package (openai)
Android Studio (version 4.0)
Install the required dependencies using pip:
pip install numpy pandas seaborn scikit-learn openai
## License
This data code is distributed under the MIT License. See LICENSE for more information.
## Copyright
All copyright of the tool is owned by the author of the paper.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our dataset contains 2 weeks of approx. 8-9 hours of acceleration data per day from 11 participants wearing a Bangle.js Version 1 smartwatch with our firmware installed.
The dataset contains annotations from 4 commonly used annotation methods employed in user studies that focus on in-the-wild data. These methods can be grouped into user-driven, in situ annotations, which are performed before or while the activity is recorded, and recall methods, in which participants annotate their data in hindsight at the end of the day.
The participants were tasked with labelling their activities using (1) a button located on the smartwatch, (2) the activity tracking app Strava, (3) a (hand)written diary, and (4) a tool to visually inspect and label activity data, called MAD-GUI. Methods (1)-(3) were used in both weeks, whereas method (4) was introduced at the beginning of the second study week.
The accelerometer data is recorded at 25 Hz with a sensitivity of ±8g and is stored in CSV format. Labels and raw data are not yet combined: you can either write your own script to label the data or follow the instructions in our corresponding GitHub repository.
The following unique classes are included in our dataset:
laying, sitting, walking, running, cycling, bus_driving, car_driving, vacuum_cleaning, laundry, cooking, eating, shopping, showering, yoga, sport, playing_games, desk_work, guitar_playing, gardening, table_tennis, badminton, horse_riding.
However, many activities are very participant-specific and are therefore performed by only one of the participants.
The labels are also stored as a .csv file and have the following columns:
week_day, start, stop, activity, layer
Example:
week2_day2,10:30:00,11:00:00,vacuum_cleaning,d
The layer column specifies which annotation method was used to set this label; a minimal merging sketch follows the identifier list below.
The following identifiers can be found in the column:
b: in situ button
a: in situ app
d: self-recall diary
g: time-series recall labelled with the MAD-GUI
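To attach these labels to the raw accelerometer stream, one option is a short pandas script like the sketch below. The per-day accelerometer file name and its time column are assumptions, so adapt them to the repository's actual layout.

```python
# Hedged sketch: join activity labels onto raw accelerometer rows by time window.
# File names and the accelerometer "time" column are assumptions.
import pandas as pd

acc = pd.read_csv("week2_day2_acceleration.csv")   # assumed per-day file
labels = pd.read_csv("labels.csv")                 # week_day,start,stop,activity,layer

acc["time"] = pd.to_datetime(acc["time"], format="%H:%M:%S")
labels["start"] = pd.to_datetime(labels["start"], format="%H:%M:%S")
labels["stop"] = pd.to_datetime(labels["stop"], format="%H:%M:%S")

# Assign each accelerometer sample the activity whose window contains it.
acc["activity"] = None
day_labels = labels[labels["week_day"] == "week2_day2"]
for row in day_labels.itertuples():
    mask = (acc["time"] >= row.start) & (acc["time"] <= row.stop)
    acc.loc[mask, "activity"] = row.activity

print(acc["activity"].value_counts(dropna=False))
```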
The corresponding publication is currently under review.
UI-PRMD is a data set of movements related to common exercises performed by patients in physical therapy and rehabilitation programs. The data set consists of 10 rehabilitation exercises. A sample of 10 healthy individuals repeated each exercise 10 times in front of two sensory systems for motion capturing: a Vicon optical tracker, and a Kinect camera. The data is presented as positions and angles of the body joints in the skeletal models provided by the Vicon and Kinect mocap systems.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
OpenSim is an open-source biomechanical package with a variety of applications. It is available to many users through bindings in MATLAB, Python, and Java via its application programming interfaces (APIs). Although the developers have documented OpenSim installation on different operating systems (Windows, Mac, and Linux) well, installation is time-consuming and complex since each operating system requires a different configuration. This project aims to demystify the development of neuro-musculoskeletal modeling in OpenSim by requiring zero installation configuration on any operating system (and thus being cross-platform), making models easy to share, and providing access to free graphics processing units (GPUs) on the web-based Google Colab platform. To achieve this, OpenColab was developed: the OpenSim source code was used to build a Conda package that can be installed on Google Colab with only one block of code in less than 7 minutes. To use OpenColab, one only needs an internet connection and a Gmail account. Moreover, OpenColab can access the vast libraries of machine learning methods available within free Google products, e.g. TensorFlow. Next, we performed an inverse problem in biomechanics and compared OpenColab results with the OpenSim graphical user interface (GUI) for validation. The outcomes of OpenColab and the GUI matched well (r≥0.82). OpenColab takes advantage of the zero configuration of cloud-based platforms, accesses GPUs, and enables users to share and reproduce modeling approaches for further validation, innovative online training, and research applications. Step-by-step installation processes and examples are available at: https://simtk.org/projects/opencolab.
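For orientation, a Colab workflow along the following lines is typically enough to get a conda-based OpenSim build running. This is a hedged sketch, not the authors' published notebook; the conda channel and package names are assumptions, so follow the SimTK project page above for the verified recipe.

```python
# Hedged sketch: bootstrapping a conda environment in Colab and installing OpenSim.
# Channel/package names are assumptions; see the SimTK page for the official steps.

# --- Cell 1: install conda into the Colab runtime (this restarts the kernel) ---
!pip install -q condacolab
import condacolab
condacolab.install()

# --- Cell 2 (run after the automatic restart): install and verify OpenSim ---
!conda install -y -c opensim-org opensim
import opensim
print(opensim.GetVersion())
```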
The Highway-Runoff Database (HRDB) was developed by the U.S. Geological Survey (USGS), in cooperation with the Federal Highway Administration (FHWA) Office of Project Delivery and Environmental Review to provide planning-level information for decision makers, planners, and highway engineers to assess and mitigate possible adverse effects of highway runoff on the Nation's receiving waters (Granato and Cazenas, 2009; Granato, 2013; 2019; Granato and others, 2018; Granato and Friesz, 2021). The HRDB was assembled by using a Microsoft Access database application to facilitate use of the data and to calculate runoff-quality statistics with methods that properly handle censored-concentration data. The HRDB was first published as version 1.0 in cooperation with the FHWA in 2009 (Granato and Cazenas, 2009). The second version (1.0.0a) was published in cooperation with the Massachusetts Department of Transportation Highway Division to include data from Ohio and Massachusetts (Smith and Granato, 2010). The third version (1.0.0b) was published in cooperation with FHWA to include a substantial amount of additional data (Granato and others, 2018; Granato and Jones, 2019). The fourth version (1.1.0) was updated with additional data and modified to provide data-quality information within the Graphical User Interface (GUI), calculate statistics for multiple sites in batch mode, and output additional statistics. The fifth version (1.1.0a) was published in cooperation with the California Department of Transportation to add highway-runoff data collected in California. The sixth version published in this release (1.2.0) has been updated to include additional data, correct data-transfer errors in previous versions, add new parameter information, and modify the statistical output. This version includes data from 270 highway sites across the country (26 states); data from 8,108 storm events; and 119,224 concentration values with data for 418 different water-quality constituents or parameters.
SECOND is a well-annotated semantic change detection dataset. To ensure data diversity, we first collected 4,662 pairs of aerial images from several platforms and sensors. These pairs of images are distributed over cities such as Hangzhou, Chengdu, and Shanghai. Each image is 512 x 512 pixels and is annotated at the pixel level. The annotation of SECOND was carried out by an expert group in earth-vision applications, which guarantees high label accuracy. For the change categories in the SECOND dataset, we focus on 6 main land-cover classes, i.e., non-vegetated ground surface, tree, low vegetation, water, buildings, and playgrounds, which are frequently involved in natural and man-made geographical changes. It is worth noting that, in the new dataset, non-vegetated ground surface (n.v.g. surface for short) mainly corresponds to impervious surfaces and bare land. In summary, these 6 selected land-cover categories result in 30 common change categories (including non-change). Through the random selection of image pairs, SECOND reflects the real distribution of land-cover categories when changes occur.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Overview image: https://i.imgur.com/ZUX61cD.png
The method of grouping similar data points is called clustering. You can create dummy data for clustering with methods from the sklearn package, but that takes some effort.
For users building hard test cases for clustering, I think this dataset will help.
Try selecting a meaningful number of clusters and dividing the data into those clusters. Here are exercises for you; a short example follows below.
All CSV files contain lots of x, y, and color values, as you can see in the figures above. If you want to use the positions as integers, scale and round them, e.g. x = round(x * 100).
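As one way to work through the exercise, the sketch below loads one of the CSV files and scans candidate cluster counts with k-means and the silhouette score; the file name clusters.csv is an assumption, while the column names follow the description above.

```python
# Hedged sketch: pick a cluster count for one of the CSVs (file name assumed).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

points = pd.read_csv("clusters.csv")          # columns: x, y, color
X = points[["x", "y"]].to_numpy()

for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))

# The "color" column holds the intended cluster, so you can also compare your
# labels against it, e.g. with sklearn.metrics.adjusted_rand_score.
```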
Furthermore, here is a GUI tool to generate 2D points for clustering; you can make your own dataset with it: https://www.joonas.io/cluster-paint
Stay tuned for further updates! If you have any ideas, feel free to leave a comment.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This experimental dataset supports UI research based on video-oculography, or eye tracking.
Oculography is a non-contact optical method for detecting eye movement, based on registering (with an optical sensor) infrared illumination reflected by the eye's pupil. This method is widely applied in usability work to assess the attractiveness of interface fragments, search efficiency, ease of transition between sections, and other aspects of a software product's interface design.
We used an Eye Tribe eye tracker and OGAMA software version 5.0.5754 (http://www.ogama.net/).
During the experiment, five users independently solved the problem of finding and obtaining information about the selected object on a virtual map. This task implied the following steps:
- find and press the "Map" button (Slide0);
- find on the map and select a natural object called “Big Round Estuary” (Slide2);
- read all the information about this object and go to the main menu (Slide3).
Procedure for preparing and conducting the experiment: launch the OGAMA application; calibrate the device taking into account the individual characteristics of the user; record eye tracking while the user completes the task of finding information on the screen; stop recording.
The work was carried out in 2019 by Valery Grigoryev, a graduate student at Southern Federal University, as part of the dissertation “User interface of a mobile guide application for the railway traveler”, supervised by Assoc. Prof. Vitaly Kompaniets within the Master's program “Ergodesign of the user interface”.
List of variables in the dataset:
SubjectName
SubjectCategory
Age
Sex
Handedness
Comments
TrialID
TrialName
TrialSequence
TrialCategory
TrialStartTime
Duration
EliminateData
EventEventID
EventSequence
EventTime
EventType
EventTask
EventParam
Time
PupilDiaX
PupilDiaY
GazePosX
GazePosY
MousePosX
MousePosY
EventID
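As a quick sanity check of an exported trial, one might plot the gaze trace with pandas and matplotlib as sketched below; the export file name is an assumption, while the column names follow the variable list above.

```python
# Hedged sketch: plot the gaze path of one exported trial (file name assumed;
# column names taken from the variable list above).
import matplotlib.pyplot as plt
import pandas as pd

samples = pd.read_csv("ogama_export.csv")
trial = samples[samples["TrialID"] == samples["TrialID"].iloc[0]]

plt.plot(trial["GazePosX"], trial["GazePosY"], linewidth=0.5)
plt.gca().invert_yaxis()          # screen coordinates grow downward
plt.xlabel("GazePosX (px)")
plt.ylabel("GazePosY (px)")
plt.title("Gaze path for one trial")
plt.show()
```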
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
These guidelines interpret the requirements for good manufacturing practices (GMP) in Part C, Division 2 of the Regulations. Guidance documents like this one are meant to help industry and health care professionals understand how to comply with regulations. They also provide guidance to Health Canada staff, so that the rules are enforced in a fair, consistent and effective way across Canada.
CC0-1.0: https://spdx.org/licenses/CC0-1.0.html
Animals move their head and eyes as they explore and sample the visual scene. Previous studies have demonstrated neural correlates of head and eye movements in rodent primary visual cortex (V1), but the sources and computational roles of these signals are unclear. We addressed this by combining measurement of head and eye movements with high density neural recordings in freely moving mice. V1 neurons responded primarily to gaze shifts, where head movements are accompanied by saccadic eye movements, rather than to head movements where compensatory eye movements stabilize gaze. A variety of activity patterns immediately followed gaze shifts, including units with positive, biphasic, or negative responses, and together these responses formed a temporal sequence following the gaze shift. These responses were greatly diminished in the dark for the vast majority of units, replaced by a uniform suppression of activity, and were similar to those evoked by sequentially flashed stimuli in head-fixed conditions, suggesting that gaze shift transients represent the temporal response to the rapid onset of new visual input. Notably, neurons responded in a sequence that matches their spatial frequency preference, from low to high spatial frequency tuning, consistent with coarse-to-fine processing of the visual scene following each gaze shift. Recordings in foveal V1 of freely gazing head-fixed marmosets revealed a similar sequence of temporal response following a saccade, as well as the progression of spatial frequency tuning. Together, our results demonstrate that active vision in both mice and marmosets consists of a dynamic temporal sequence of neural activity associated with visual sampling. Methods Mouse data : Mice were initially implanted with a titanium headplate over primary visual cortex to allow for head-fixation and attachment of head-mounted experimental hardware. After three days of recovery, widefield imaging53 was performed to help target the electrophysiology implant to the approximate center of left monocular V1. A miniature connector (Mill-Max 853-93-100-10-001000) was secured to the headplate to allow repeated, reversible attachment of a camera arm, eye/world cameras and IMU21,22. In order to simulate the weight of the real electrophysiology drive for habituation, a ‘dummy’ electrophysiology drive was glued to the headplate. Animals were handled by the experimenter for several days before surgical procedures, and subsequently habituated (~45 min total) to the spherical treadmill and freely moving arena with hardware tethering attached for several days before experiments. The electrophysiology implant was performed once animals moved comfortably in the arena. A craniotomy was performed over V1, and a linear silicon probe (64 or 128 channels, Diagnostic Biochips P64-3 or P128-6) mounted in a custom 3D-printed drive (Yuta Senzai, UCSF) was lowered into the brain using a stereotax to an approximate tip depth of 750 µm from the pial surface. The surface of the craniotomy was coated in artificial dura (Dow DOWSIL 3-4680) and the drive was secured to the headplate using light-curable dental acrylic (Unifast LC). A second craniotomy was performed above left frontal cortex, and a reference wire was inserted into the brain. The opening was coated with a small amount of sterile ophthalmic ointment before the wire was glued in place with cyanoacrylate. Animals recovered overnight and experiments began the following day. 
The camera arm was oriented approximately 90 deg to the right of the nose and included an eye-facing camera (iSecurity101 1000TVL NTSC, 30 fps interlaced), an infrared-LED to illuminate the eye (Chanzon, 3 mm diameter, 940 nm wavelength), a wide-angle camera oriented toward the mouse’s point of view (BETAFPV C01, 30 fps interlaced) and an inertial measurement unit acquiring three-axis gyroscope and accelerometer signals (Rosco Technologies; acquired 30 kHz, downsampled to 300 Hz and interpolated to camera data). Fine gauge wire (Cooner, 36 AWG, #CZ1174CLR) connected the IMU to its acquisition box, and each of the cameras to a USB video capture device (Pinnacle Dazzle or StarTech USB3HDCAP). A top-down camera (FLIR Blackfly USB3, 60 fps) recorded the mouse in the arena. The electrophysiology headstage (built into the silicon probe package) was connected to an Open Ephys acquisition system via an ultra thin cable (Intan #C3216). Electrophysiology data were acquired at 30 kHz and bandpass filtered between 0.01 Hz and 7.5 kHz. We first used the Open Ephys GUI (https://github.com/open-ephys/plugin-GUI) to assess the quality of the electrophysiology data, then recordings were performed in Bonsai54 using custom workflows (https://github.com/nielllab/FreelyMovingEphys). System timestamps were collected for all hardware devices and later used to align data streams through interpolation. Marmoset data: Electrophysiological recordings were performed using 2x32 channel silicon electrode arrays (http://www.neuronexus.com). Probes included 2 sharpened tip shanks of 50µm width spaced 200 µm apart, each containing 32 channels separated by 35 µm. In one animal we used a semi-chronic microdrive (EDDS Microdrive system, https://microprobes.com) to place electrodes in cortex for 1-2 weeks over which we made 3-6 recordings. In the second animal we used a custom micro-drive (https://marmolab.bcs.rochester.edu/resources/) to place and remove electrodes daily. Arrays were lowered slowly through silastic into cortex using a thumb screw. Data were amplified and digitized at 30 kHz with Intan headstages (Intan) using the Open Ephys GUI (https://github.com/open-ephys/plugin-GUI). The wideband signal was high-pass filtered by the headstage at 0.1 Hz, preprocessed by common-average referencing across all channels, and then high-pass filtered at 300 Hz. The resulting traces were spike sorted using Kilosort2. Outputs from the spike sorting algorithms were manually labeled using ’phy’ GUI (https://github.com/kwikteam/phy). Any units that were either physiologically implausible based on the lack of a waveform with a trough followed by a peak or with an inter-spike interval (ISI) distribution with more than 1% of the spikes under 1 ms were excluded from analyses. Gaze position was monitored using infra-red eye tracking methods described previously63. Briefly, the 1st and 4th Purkinje images (P1 and P4) were visualized using a collimated IR light source and tracked at 593 frames per second to estimate the 2D eye angle. The eye tracker was manually calibrated to adjust the offset and gain (horizontal and vertical) by showing marmoset monkeys small windowed face images at different screen positions to obtain their fixation as described previously60,61. Saccadic eye movements were identified automatically using a combination of velocity and acceleration thresholds64.
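The velocity-and-acceleration-threshold style of saccade detection mentioned above can be sketched roughly as follows. This is a generic illustration, not the study's implementation; the threshold values are placeholders.

```python
# Hedged sketch of velocity/acceleration-threshold saccade detection
# (thresholds are placeholders, not the study's parameters).
import numpy as np

def detect_saccades(eye_deg, fs=593.0, vel_thresh=30.0, acc_thresh=1000.0):
    """Return onset sample indices of detected saccades.

    eye_deg: 1-D array of horizontal (or vertical) eye position in degrees.
    fs: sampling rate in Hz (593 Hz matches the tracker described above).
    """
    velocity = np.gradient(eye_deg) * fs              # deg/s
    acceleration = np.gradient(velocity) * fs         # deg/s^2
    candidates = (np.abs(velocity) > vel_thresh) & (np.abs(acceleration) > acc_thresh)
    # Keep only the first sample of each suprathreshold run.
    onsets = np.flatnonzero(candidates & ~np.roll(candidates, 1))
    return onsets
```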
This software tool generates simulated radar signals and creates RF datasets. The datasets can be used to develop and test detection algorithms by utilizing machine learning/deep learning techniques for the 3.5 GHz Citizens Broadband Radio Service (CBRS) or similar bands. In these bands, the primary users of the band are federal incumbent radar systems. The software tool generates radar waveforms and randomizes the radar waveform parameters. The pulse modulation types for the radar signals and their parameters are selected based on NTIA testing procedures for ESC certification, available at http://www.its.bldrdoc.gov/publications/3184.aspx. Furthermore, the tool mixes the waveforms with interference and packages them into one RF dataset file. The tool utilizes a graphical user interface (GUI) to simplify the selection of parameters and the mixing process. A reference RF dataset was generated using this software. The RF dataset is published at https://doi.org/10.18434/M32116.
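To illustrate the kind of signal such a tool synthesizes, the sketch below generates a single linear-FM radar pulse with randomized parameters and mixes it with Gaussian noise. It is not the NIST tool's code, and the parameter ranges are placeholders rather than the NTIA/ESC test values.

```python
# Hedged illustration only: one linear-FM (chirp) radar pulse with randomized
# parameters, mixed with Gaussian noise. Parameter ranges are placeholders.
import numpy as np

rng = np.random.default_rng(0)
fs = 10e6                                   # sample rate, Hz
pulse_width = rng.uniform(0.5e-6, 5e-6)     # seconds
chirp_bw = rng.uniform(0.5e6, 2e6)          # Hz
snr_db = rng.uniform(-5, 20)

t = np.arange(0, pulse_width, 1 / fs)
# Linear frequency sweep from -chirp_bw/2 to +chirp_bw/2 across the pulse.
phase = np.pi * chirp_bw / pulse_width * (t - pulse_width / 2) ** 2
pulse = np.exp(1j * phase)

noise_power = 10 ** (-snr_db / 10)
noise = np.sqrt(noise_power / 2) * (rng.standard_normal(t.size)
                                    + 1j * rng.standard_normal(t.size))
iq = pulse + noise
print(f"samples: {iq.size}, pulse width: {pulse_width*1e6:.2f} us, SNR: {snr_db:.1f} dB")
```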
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This guide is for people who work with Active Pharmaceutical Ingredients (APIs) and their intermediates to understand and comply with Part C, Division 2 of the Food and Drug Regulations (the Regulations), which is about Good Manufacturing Practices (GMP).