Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
OpenSim is an open-source biomechanical package with a variety of applications. It is available to many users with bindings in MATLAB, Python, and Java via its application programming interfaces (APIs). Although the developers have documented OpenSim installation well for the different operating systems (Windows, Mac, and Linux), installation remains time-consuming and complex since each operating system requires a different configuration. This project aims to demystify neuro-musculoskeletal modeling in OpenSim by requiring zero installation configuration on any operating system (thus cross-platform) and making models easy to share, while accessing free graphics processing units (GPUs) on the web-based Google Colab platform. To achieve this, OpenColab was developed: the OpenSim source code was used to build a Conda package that can be installed on Google Colab with a single block of code in less than 7 min. To use OpenColab, one requires only an internet connection and a Gmail account. Moreover, OpenColab accesses the vast libraries of machine-learning methods available within free Google products, e.g. TensorFlow. We then performed an inverse problem in biomechanics and compared OpenColab results with the OpenSim graphical user interface (GUI) for validation. The outcomes of OpenColab and the GUI matched well (r ≥ 0.82). OpenColab takes advantage of the zero configuration of cloud-based platforms, accesses GPUs, and enables users to share and reproduce modeling approaches for further validation, innovative online training, and research applications. Step-by-step installation processes and examples are available at: https://simtk.org/projects/opencolab.
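The abstract does not reproduce the install block itself; the following is a minimal sketch of what a one-cell Conda-based OpenSim install on Colab could look like, using the condacolab helper and the opensim-org conda channel as assumptions — the maintained version lives in the OpenColab notebooks at the SimTK link above.

```python
# Hypothetical one-block OpenSim install for a Colab notebook (sketch).
# NOTE: condacolab restarts the kernel once after install(); in practice
# the conda install below goes in the next cell.
!pip install -q condacolab
import condacolab
condacolab.install()                      # sets up a Miniconda runtime

# --- next cell, after the automatic kernel restart ---
!conda install -y -c opensim-org opensim  # assumed channel/package name
import opensim
print(opensim.GetVersionAndDate())        # sanity check that the binding works
```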
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains a Python script for classifying apple leaf diseases using a Vision Transformer (ViT) model. The dataset used is the Plant Village dataset, which contains images of apple leaves with four classes: Healthy, Apple Scab, Black Rot, and Cedar Apple Rust. The script includes data preprocessing, model training, and evaluation steps.
The script requires the following libraries: matplotlib, seaborn, numpy, pandas, tensorflow, and sklearn. These libraries are used for data visualization, data manipulation, and building/training the deep learning model. A walk_through_dir function is used to explore the dataset directory structure and count the number of images in each class. The dataset is organized into Train, Val, and Test directories, each containing subdirectories for the four classes. The script uses ImageDataGenerator from Keras to apply data augmentation techniques such as rotation, horizontal flipping, and rescaling to the training data, which helps in improving the model's generalization ability (a sketch of such a pipeline is shown below).
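The catalog entry does not include the preprocessing code; the following is a minimal sketch of the augmentation step as described, assuming a 224×224 input size and illustrative augmentation parameters.

```python
# Data augmentation for the training split, as described above.
# target_size, batch_size, and parameter values are assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,      # rescaling
    rotation_range=20,      # random rotation
    horizontal_flip=True,   # horizontal flipping
)
train_gen = train_datagen.flow_from_directory(
    "Train",                # directory with one subfolder per class
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
```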
The model defines a custom Patches layer that extracts patches from the images. This is a crucial step in Vision Transformers, where images are divided into smaller patches that are then processed by the transformer. Finally, a confusion matrix is visualized with seaborn to provide a clear understanding of the model's predictions.
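The Patches layer itself is not shown in the entry; a minimal sketch following the common Keras ViT pattern might look like this, with patch_size as an assumed hyperparameter.

```python
import tensorflow as tf

class Patches(tf.keras.layers.Layer):
    """Split a batch of images into flattened square patches."""
    def __init__(self, patch_size):
        super().__init__()
        self.patch_size = patch_size

    def call(self, images):
        batch_size = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding="VALID",
        )
        # Flatten the spatial grid of patches into a sequence of vectors.
        return tf.reshape(patches, [batch_size, -1, patches.shape[-1]])
```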
Dataset Preparation: organize the dataset into Train, Val, and Test directories, with each directory containing subdirectories for each class (Healthy, Apple Scab, Black Rot, Cedar Apple Rust).
Install Required Libraries: pip install tensorflow matplotlib seaborn numpy pandas scikit-learn
Run the Script
Analyze Results
Fine-Tuning
This dataset contains the code to reproduce the results of "Time resolved micro-XRCT dataset of Enzymatically Induced Calcite Precipitation (EICP) in sintered glass bead columns", cf. https://doi.org/10.18419/darus-2227. The code takes "low-dose" images as input; these images contain many artifacts and noise as a trade-off of fast data acquisition (6 min per dataset, versus 3 hours per dataset for the "high-dose" normal configuration). These low-quality images can be improved with the help of a pre-trained model. The pre-trained model provided here was trained on pairs of "high-dose" and "low-dose" data from the above-mentioned EICP application. Examples of the training, input, and output data can also be found in this dataset. Although only limited examples are shown here, we would like to emphasize that the workflow and code can be extended to general image-enhancement applications. The code requires a Python version above 3.7.7 with packages such as the tensorflow, keras, pandas, scipy, scikit, numpy, and patchify libraries. For further details of operation, please refer to the readme.txt file.
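The enhancement workflow itself lives in the dataset's code; below is a hypothetical sketch of the patch-based inference it describes, with file names, patch size, and model format as assumptions.

```python
# Tile a noisy low-dose slice, enhance each patch with the pre-trained
# model, and stitch the result back together (illustrative sketch).
import numpy as np
from patchify import patchify, unpatchify
from tensorflow import keras

model = keras.models.load_model("pretrained_enhancement_model.h5")  # assumed name

low_dose = np.load("low_dose_slice.npy")          # 2D slice; dims divisible by 64
patches = patchify(low_dose, (64, 64), step=64)   # non-overlapping 64x64 tiles
flat = patches.reshape(-1, 64, 64, 1).astype("float32")

enhanced = model.predict(flat).reshape(patches.shape)
restored = unpatchify(enhanced, low_dose.shape)   # reassembled enhanced slice
```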
Dataset Details
Dataset Description
TP4 is a comprehensive dataset containing a curated collection of questions and answers from Stack Overflow. Focused on the realms of Python programming, NumPy, Pandas, TensorFlow, and PyTorch, TP4 includes essential attributes such as question ID, title, question body, answer body, associated tags, and score. This dataset is designed to facilitate research, analysis, and exploration of inquiries and solutions within the Python and… See the full description on the dataset page: https://huggingface.co/datasets/Syed-Hasan-8503/StackOverflow-TP4-1M.
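Given the Hugging Face hosting, one plausible way to load TP4 is with the datasets library; the split and column names below are assumptions taken from the description, not verified against the dataset card.

```python
from datasets import load_dataset

# Load the dataset from the Hugging Face Hub.
tp4 = load_dataset("Syed-Hasan-8503/StackOverflow-TP4-1M")
print(tp4)  # inspect available splits and columns

# Example: keep only questions whose tags mention pandas
# ("tags" column name assumed from the description above).
pandas_qs = tp4["train"].filter(lambda row: "pandas" in row["tags"])
```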
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PROGRAM SUMMARY
No. of lines in distributed program, including test data, etc.: 481
No. of bytes in distributed program, including test data, etc.: 14540.8
Distribution format: .py, .csv
Programming language: Python
Computer: Any workstation or laptop computer running TensorFlow, Google Colab, Anaconda, Jupyter, pandas, NumPy, Microsoft Azure, and Alteryx.
Operating system: Windows, Mac OS, and Linux.
Nature of problem: The Navier-Stokes equations are solved numerically in ANSYS Fluent using the Reynolds stress model for turbulence. The simulated values of the friction factor are validated against theoretical and experimental data from the literature. Artificial neural networks are then used for a prediction-based augmentation of the friction factor. The capabilities of the neural networks are discussed with regard to computational cost and domain limitations.
Solution method: The simulation data are obtained through Reynolds stress modelling of fluid flow through a pipe. These data are augmented using an artificial neural network model that predicts both within and outside the data domain.
Restrictions: The code used in this research is limited to smooth pipe bends, in which the friction factor is analysed for steady-state incompressible fluid flow.
Runtime: The artificial neural network produces results within a span of 20 seconds for a three-dimensional geometry, using the free computational resources allocated by the Google Colaboratory cloud-based computing system.
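The program summary does not show the network itself; a minimal sketch of the kind of fully connected regressor described — mapping flow/geometry parameters to a friction factor — might look as follows, with the input features and layer sizes as illustrative assumptions.

```python
# Small fully connected regressor for friction-factor prediction (sketch).
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(2,)),           # e.g. [Reynolds number, bend ratio]
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),                    # predicted friction factor
])
model.compile(optimizer="adam", loss="mse")

# X, y would come from the distributed .csv of RSM simulation data:
# model.fit(X, y, epochs=200, validation_split=0.2)
```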
This dataset contains the code to reproduce the five different segmentation results of the paper Lee et al. (2021). The original dataset, before applying these segmentation codes, can be found in Ruf & Steeb (2020). The segmentation methods adopted to identify the micro-fractures within the original dataset are local thresholding, the Sato filter, Chan-Vese, random forest, and a U-net model. The local threshold, Sato, and U-net methods are written in Python; the code requires a version above Python 3.7.7 with the tensorflow, keras, pandas, scipy, scikit, and numpy libraries. The workflow of the Chan-Vese method is implemented in Matlab 2018b. The result of the random forest method can be reproduced with the uploaded trained model in the open-source program ImageJ and the trainable Weka library. For further details of operation, please refer to the readme.txt file.
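As an illustration of one of the five listed methods, a Sato (tubeness) filter followed by a local threshold can be sketched with scikit-image; parameter values and the input file name are assumptions, not the paper's settings.

```python
import numpy as np
from skimage import io
from skimage.filters import sato, threshold_local

slice_ = io.imread("xrct_slice.tif", as_gray=True)   # hypothetical input slice
# Enhance dark, elongated structures (air-filled micro-fractures).
ridges = sato(slice_, sigmas=range(1, 4), black_ridges=True)
# A local threshold turns the ridge response into a binary fracture mask.
mask = ridges > threshold_local(ridges, block_size=51)
```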
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Attribute pruning is a simple yet effective post-processing technique that enforces individual fairness by zeroing out sensitive attribute weights in a pre-trained DNN's input layer (a minimal sketch of this operation is given after the research questions below). To ensure the generalizability of our results, we conducted experiments on 32 models and 4 widely used datasets, and compared attribute pruning's performance with 3 baseline post-processing methods (i.e., equalized odds, calibrated equalized odds, and ROC). In this study, we reveal the effectiveness of sensitive attribute pruning for small-scale DNN bias removal and discuss its usage in multi-attribute fairness estimation by answering the following research questions:
RQ1: How does single-attribute pruning perform in comparison to the existing post-processing methods?
By answering this research question, we aim to understand the accuracy and group fairness impact of single-attribute pruning on 32 models and compare them with 3 state-of-the-art post-processing methods.
RQ2: How does multi-attribute pruning impact and aid understanding of the original models?
By answering this research question, we investigate the accuracy impact of multi-attribute pruning on 24 models. Further, we investigate the prediction change brought by attribute pruning on different subgroups and discuss their implications on multi-attribute fairness estimation.
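The pruning operation referenced above can be sketched for a Keras model as follows; this is a generic illustration assuming a tabular model whose first Dense layer consumes the raw features directly, not the repository's exact implementation.

```python
import tensorflow as tf

def prune_sensitive_attributes(model, sensitive_idx):
    """Zero the input-layer weights of the given sensitive feature columns."""
    # Assume the first Dense layer receives the tabular features directly.
    first_dense = next(layer for layer in model.layers
                       if isinstance(layer, tf.keras.layers.Dense))
    weights = first_dense.get_weights()        # [kernel, bias] if use_bias=True
    for i in sensitive_idx:
        weights[0][i, :] = 0.0                 # cut the attribute's influence
    first_dense.set_weights(weights)
    return model

# Usage sketch: prune_sensitive_attributes(model, sensitive_idx=[3])
# (the column index of the sensitive attribute is an assumption)
```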
To comprehensively understand the impact of sensitive attribute pruning, we select four commonly used fairness datasets collected from different domains, namely Bank Marketing (BM), German Credit (GC), Adult Census (AC), and COMPAS. We select the four datasets because they provide a wide range of corresponding pre-trained models used in existing research. The introduction to the datasets is as follows:
Bank Marketing (BM): The Bank Marketing dataset consists of marketing data from a Portuguese bank, containing 45,222 instances with 16 attributes, and the biased attribute identified is age. The objective is to classify whether a client will subscribe to a term deposit.
German Credit (GC): The German Credit dataset includes 1,000 instances of individuals who have taken credit from a bank, each described by 20 attributes, with two sensitive attributes, sex and age; the single sensitive attribute to be evaluated in RQ1 is age, given that the subgroup positive rate difference (i.e., historical bias in the label) on this sensitive attribute is higher than sex. The task is to classify the credit risk of an individual.
Adult Census (AC): The Adult Census dataset comprises United States census data from 41,188 individuals after empty entry removal, with 13 attributes. The sensitive attributes in the dataset are sex and race; the single sensitive attribute to be evaluated in RQ1 is sex. The goal is to predict whether an individual earns more than $50,000 per year.
COMPAS: The COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) dataset is collected from a system widely used for criminal recidivism risk prediction, containing 6,172 individuals and 6 attributes. The sensitive attributes in the dataset are race and age; to keep aligned with previous research, the single sensitive attribute to be evaluated in RQ1 is race. The goal is to predict whether an individual will reoffend in the future.
To replicate the experiments, run the code in the src folder; the sub-folders contain the code for implementing the post-processing methods on each dataset. To obtain the basic results, run all the scripts in each folder. The results will be stored in the results folder; we also provide the code for the statistical analyses (i.e., paired t-tests) under this folder. To conduct the statistical analyses, run statistic_test.py and check the results in single_att_ttest.json.
RQ1: How does single-attribute pruning perform in comparison to the existing post-processing methods?
While ensuring individual fairness on the single attribute, attribute pruning does not significantly impact accuracy. It preserved the highest post-processing accuracy among the four methods on 23 out of 32 models. It can also improve the two group accuracies in general, but the improvements are insignificant and not always optimal in comparison to the other three methods. Further, given the theoretical difference between individual fairness and group fairness, attribute pruning may even harm group fairness when the observed dataset is not comprehensive enough to cover the whole data space.
RQ2: How does multi-attribute pruning impact and aid understanding of the original models?
According to our experiment on 24 models, multi-attribute pruning can also retain a certain level of accuracy while enhancing individual fairness. It can also be used to estimate multi-attribute group fairness in models with similar original accuracy based on the TPR difference before and after pruning the sensitive attributes.
├── data                        # The 4 datasets used in the study
├── models                      # Model files for the 32 models included in our experiment
├── results                     # Results for RQ1 and RQ2
│   ├── AC
│   ├── BM
│   ├── GC
│   ├── compas
│   ├── single_att_ttest.json   # Statistical analysis results
│   └── statistic_test.py
├── src                         # Code implementing the post-processing methods on each dataset
│   ├── AC
│   ├── BM
│   ├── GC
│   ├── compas
│   └── utils
├── tables
└── README.md
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PyLibAPIs.7z: contains public API data (MongoDB dump) for these frameworks:
TensorFlow
Keras
scikit-learn
Pandas
Flask
Django
Label.xlsx: contains issues and their labels
Breaking Changes for All Frameworks.pdf: contains the breaking change distributions of all six frameworks
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This is a dataset that contains all commit messages and their related metadata from 34 popular GitHub repositories. These repositories are:
Data as of Wed Apr 21 03:42:44 PM IST 2021
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This fMRI dataset was collected for the study "Informative neural representations of unseen contents during higher-order processing in human brains and deep artificial networks".
Code corresponding to the dataset: https://github.com/nmningmei/unconfeats