https://creativecommons.org/publicdomain/zero/1.0/
Data science beginners usually start with a curated dataset, but it is well known that in a real data science project most of the time is spent collecting, cleaning and organizing data. Domain expertise is also considered an important aspect of building good ML models. Being an automobile enthusiast, I took up the challenge of collecting images of two popular car models from a used-car website, where users upload pictures of the cars they want to sell, and then training a deep neural network to identify the car model from its images. While gathering the images I found that approximately 10 percent of the pictures did not correctly represent the intended car, and those pictures had to be deleted from the final data.
There are 4000 images of two popular Maruti Suzuki cars in India (Swift and WagonR), with 2000 pictures belonging to each model. The data is divided into a training set with 2400 images, a validation set with 800 images and a test set with 800 images. The data was randomized before splitting into training, validation and test sets.
A starter kernel is provided for Keras with a CNN. I have also created a GitHub project documenting advanced techniques in PyTorch and Keras for image classification, such as data augmentation, dropout, batch normalization and transfer learning.
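As a rough illustration only (this is not the actual starter kernel; the input size and the two-class softmax head below are assumptions), a small Keras CNN using dropout and batch normalization could look like this:

```python
from tensorflow.keras import layers, models

def build_baseline_cnn(input_shape=(224, 224, 3)):
    """Small CNN with batch normalization and dropout, two-class output."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Such a model could then be trained with an ImageDataGenerator flowing from the training and validation folders.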
With a small dataset like this, how much accuracy can we achieve, and is more data always better? The baseline model trained in Keras achieves 88% accuracy on the validation set; can we achieve even better performance, and by how much?
Is the data collected for the two car models representative of all such cars from across the country, or is there sample bias?
I would also like someone to extend the concept to build a use case where, if a user uploads an incorrect car picture, the ML model automatically flags it, for example a user uploading the wrong model or an image that is not a car at all.
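One possible approach, sketched below under stated assumptions (a trained two-class Keras model saved as car_model.h5 and a confidence threshold tuned on the validation set; both names are hypothetical), is to flag any upload whose top softmax probability falls below the threshold:

```python
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Hypothetical artifacts: a trained Swift-vs-WagonR classifier and a
# threshold tuned on the validation set; neither ships with the dataset.
model = load_model("car_model.h5")
CLASS_NAMES = ["swift", "wagonr"]
CONFIDENCE_THRESHOLD = 0.80

def classify_or_flag(image_path, target_size=(224, 224)):
    """Return the predicted model name, or 'flagged' if the network is unsure."""
    img = img_to_array(load_img(image_path, target_size=target_size)) / 255.0
    probs = model.predict(img[np.newaxis, ...], verbose=0)[0]
    if probs.max() < CONFIDENCE_THRESHOLD:
        return "flagged"  # likely the wrong model, or not a car at all
    return CLASS_NAMES[int(probs.argmax())]
```

A more robust variant would add an explicit "not a car" class trained on out-of-distribution images.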
https://creativecommons.org/publicdomain/zero/1.0/
The original dataset is from https://www.kaggle.com/datasets/andyczhao/covidx-cxr2
The data is separated into positive and negative classes based on the .txt file (see link).
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
datagen = ImageDataGenerator(
    rescale=1./255,               # normalize pixel values to [0, 1]
    rotation_range=20,            # random rotations, following the reference
    zoom_range=0.2,               # random zoom, following the reference
    width_shift_range=0.2,        # random horizontal shifts
    height_shift_range=0.2,       # random vertical shifts
    shear_range=0.2,              # add shear transformation
    brightness_range=(0.7, 1.3),  # brightness adjustment, wider than the reference's 0.3
    horizontal_flip=True,         # random horizontal flips
    fill_mode='nearest'           # fill new pixels with the nearest existing values
)
# Counts
current_count = len(os.listdir(input_dir))
target_count = 57199
required_augmented_count = target_count - current_count
print(f"Original negatives: {current_count}")
print(f"Required augmented images: {required_augmented_count}")
# Augment images until the required count is reached
augmented_count = 0
max_augmentations_per_image = 10  # I tried 5 and 10; this dataset was generated with 10

for img_file in os.listdir(input_dir):
    img_path = os.path.join(input_dir, img_file)
    img = load_img(img_path, target_size=(480, 480))  # 480 x 480, following the reference
    img_array = img_to_array(img)
    img_array = img_array.reshape((1,) + img_array.shape)

    # Generate multiple augmentations per image
    i = 0
    for batch in datagen.flow(
        img_array,
        batch_size=1,
        save_to_dir=output_dir,
        save_prefix='aug',
        save_format='jpeg'
    ):
        i += 1
        augmented_count += 1
        if i >= max_augmentations_per_image:
            break
        if augmented_count >= required_augmented_count:
            break

    if augmented_count >= required_augmented_count:
        break
I tried different values of max_augmentations_per_image, and also leaving the parameter unset; both ways generated augmented data (around 9,000 images) ...
positive_balanced:
```python
import os
import random

random.seed(42)
target_count = 20579
all_positive_images = os.listdir(positive_dir)
selected_positive_images = random.sample(all_positive_images, target_count)
```
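To materialize the balanced positive set, the sampled file names presumably need to be copied into a separate directory. A minimal sketch, where the destination path positive_balanced_dir is an assumption:

```python
import shutil

# Copy the sampled positives into a balanced directory.
# positive_balanced_dir is a hypothetical destination path.
os.makedirs(positive_balanced_dir, exist_ok=True)
for fname in selected_positive_images:
    shutil.copy(os.path.join(positive_dir, fname),
                os.path.join(positive_balanced_dir, fname))
```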
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains a Python script for classifying apple leaf diseases using a Vision Transformer (ViT) model. The dataset used is the Plant Village dataset, which contains images of apple leaves in four classes: Healthy, Apple Scab, Black Rot, and Cedar Apple Rust. The goal of the project is to classify apple leaf diseases into these four classes, and the script includes data preprocessing, model training, and evaluation steps.
The script imports matplotlib, seaborn, numpy, pandas, tensorflow, and sklearn; these libraries are used for data visualization, data manipulation, and building/training the deep learning model. A walk_through_dir function is used to explore the dataset directory structure and count the number of images in each class. The data is organized into Train, Val, and Test directories, each containing subdirectories for the four classes. ImageDataGenerator from Keras is used to apply data augmentation techniques such as rotation, horizontal flipping, and rescaling to the training data, which helps improve the model's generalization ability. The model includes a Patches layer that extracts patches from the images; this is a crucial step in Vision Transformers, where images are divided into smaller patches that are then processed by the transformer (see the sketch after the usage steps below). Results are visualized with seaborn to provide a clear understanding of the model's predictions.
Dataset Preparation: organize the data into Train, Val, and Test directories, with each directory containing subdirectories for each class (Healthy, Apple Scab, Black Rot, Cedar Apple Rust).
Install Required Libraries:
pip install tensorflow matplotlib seaborn numpy pandas scikit-learn
Run the Script
Analyze Results
Fine-Tuning
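Since the repository's exact Patches implementation is not reproduced here, the following is a minimal sketch in Keras along the lines of the standard ViT recipe; the patch size and layer details are assumptions:

```python
import tensorflow as tf

class Patches(tf.keras.layers.Layer):
    """Split a batch of images into flattened, non-overlapping patches."""
    def __init__(self, patch_size):
        super().__init__()
        self.patch_size = patch_size

    def call(self, images):
        batch_size = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding="VALID",
        )
        # Flatten the spatial grid of patches into a sequence of patch vectors.
        patch_dims = patches.shape[-1]
        return tf.reshape(patches, [batch_size, -1, patch_dims])
```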
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains images of 6 fruits and vegetables: apple, banana, bitter gourd, capsicum, orange, and tomato. The images of each fruit or vegetable are grouped into two categories: fresh and stale. The purpose behind the creation of this dataset is the development of a machine learning model to classify fruits and vegetables as fresh or stale. This feature is a part of our final year project titled ‘Food Aayush’. (Github Link)
To collect the images for the dataset, pictures of the fruits and vegetables were captured daily using a mobile phone camera. Depending on the visual properties of the fruit or vegetable in each image and on the day the image was captured, each image was labelled as fresh or stale. Additionally, videos of the fruits and vegetables were taken, and frames were extracted from these videos to collect a large number of images conveniently. The machine learning model requires a 224 x 224-pixel image, so the images were cropped to extract the center square and resized to 512 x 512 pixels using a data pre-processing library in Keras. Frame Extraction
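As an illustration of the frame-extraction and centre-crop/resize steps described above (the file paths, frame stride, and the use of OpenCV here are assumptions, not necessarily the exact tooling the authors used):

```python
import cv2
import os

def extract_and_crop_frames(video_path, out_dir, every_n=30, size=512):
    """Grab every n-th frame, centre-crop to a square, resize, and save."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            h, w = frame.shape[:2]
            side = min(h, w)
            top, left = (h - side) // 2, (w - side) // 2
            crop = frame[top:top + side, left:left + side]
            crop = cv2.resize(crop, (size, size))
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:05d}.jpg"), crop)
            saved += 1
        idx += 1
    cap.release()
    return saved
```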
Data Augmentation: We used the ImageDataGenerator class from Keras for augmentation. On average we created 20 augmentations per image, which did improve our model's accuracy.
We would like to give credit to this dataset as we have obtained the images in some of the classes from here. Dataset
Our BE final-year project, titled 'Food Aayush', is an application that can be used for classifying fruits and vegetables as fresh or stale, classifying cooking oils into different rancidity levels, and analysing various parameters related to the nutritional value of food and people's dietary intake. We trained a machine learning model for the classification of fruits and vegetables, and this dataset was created for training that model.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project focuses on developing an intelligent system capable of detecting and classifying diseases in plant leaves using image processing and deep learning techniques. Leveraging Convolutional Neural Networks (CNNs) and transfer learning, the system analyzes leaf images to identify signs of infection with high accuracy. It supports smart agriculture by enabling early disease detection, reducing crop loss, and providing actionable insights to farmers. The project uses datasets such as PlantVillage and integrates frameworks like TensorFlow, Keras, and PyTorch. The model can be deployed as a web or mobile application, offering a real-time solution for plant health monitoring in agricultural environments.
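As an illustration of the transfer-learning approach described above (the backbone, input size, and compile settings below are assumptions, not the project's exact configuration), a Keras sketch might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_transfer_model(num_classes, input_shape=(224, 224, 3)):
    """MobileNetV2 backbone (frozen) with a small classification head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet")
    base.trainable = False  # freeze pretrained weights for the first training stage
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Fine-tuning would typically follow by unfreezing the top layers of the backbone and retraining with a lower learning rate.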
https://user-images.githubusercontent.com/91852182/147305077-8b86ec92-ed26-43ca-860c-5812fea9b1d8.gif
Self-driving cars have become a trending subject, with significant improvements in the underlying technologies over the last decade. The purpose of this project is to train a neural network to drive an autonomous car agent on the tracks of Udacity's Car Simulator environment. Udacity has released the simulator as open-source software, and enthusiasts have hosted a competition (challenge) to teach a car how to drive using only camera images and deep learning. Driving a car autonomously requires learning to control the steering angle, throttle and brakes. The behavioral cloning technique is used to mimic human driving behavior in training mode on the track: a dataset is generated in the simulator by a user-driven car in training mode, and the deep neural network model then drives the car in autonomous mode. Ultimately, the car was able to run on Track 1, generalizing well. The project aims to reach the same accuracy on real-time data in the future.
https://user-images.githubusercontent.com/91852182/147298831-225740f9-6903-4570-8336-0c9f16676456.png
Udacity released an open source simulator for self-driving cars to depict a real-time environment. The challenge is to mimic the driving behavior of a human on the simulator with the help of a model trained by deep neural networks. The concept is called Behavioral Cloning, to mimic how a human drives. The simulator contains two tracks and two modes, namely, training mode and autonomous mode. The dataset is generated from the simulator by the user, driving the car in training mode. This dataset is also known as the “good” driving data. This is followed by testing on the track, seeing how the deep learning model performs after being trained by that user data.
https://user-images.githubusercontent.com/91852182/147298261-4d57a5c1-1fda-4654-9741-2f284e6d0479.png
The problem is solved in the following steps:
Technologies that are used in the implementation of this project and the motivation behind using these are described in this section.
TensorFlow: This is an open-source library for dataflow programming, widely used for machine learning applications. It is also used as a math library and for large-scale computation. For this project, Keras, a high-level API that uses TensorFlow as the backend, is used. Keras facilitates building models easily, as it is more user friendly.
Different libraries are available in Python that help in machine learning projects, and several of them have improved the performance of this project. A few of them are mentioned in this section. First, NumPy provides a collection of high-level math functions to support multi-dimensional matrices and arrays; it is used for faster computations over the weights (gradients) in the neural networks. Second, scikit-learn is a machine learning library for Python which features different algorithms and machine learning function packages. Another one is OpenCV (Open Source Computer Vision Library), which is designed for computational efficiency with a focus on real-time applications. In this project, OpenCV is used for image preprocessing and augmentation.
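For illustration, a typical OpenCV-based augmentation in behavioral cloning pipelines flips the camera frame horizontally and negates the steering angle, plus a random brightness change; the exact augmentations this project applies are not specified, so treat the sketch below as an assumed example:

```python
import cv2
import numpy as np

def augment_sample(image_bgr, steering_angle):
    """Randomly flip the frame and adjust brightness; fix the label accordingly."""
    img = image_bgr.copy()
    angle = steering_angle
    # Horizontal flip: a mirrored road means the steering angle changes sign.
    if np.random.rand() < 0.5:
        img = cv2.flip(img, 1)
        angle = -angle
    # Random brightness change in HSV space.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[:, :, 2] *= np.random.uniform(0.7, 1.3)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2], 0, 255)
    img = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
    return img, angle
```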
The project makes use of a Conda environment; Conda is an open-source package and environment management system for Python that simplifies package management and deployment and is well suited to large-scale data processing. The machine on which this project was built is a personal computer.
A CNN is a type of feed-forward neural network computing system that can be used to learn from input data. Learning is accomplished by determining a set of weights or filter values that allow the network to model the behavior of the training data. The desired output and the output generated by a CNN initialized with random weights will be different. This difference (the generated error) is backpropagated through the layers of the CNN to adjust the weights of the neurons, which in turn reduces the error and allows us to produce output closer to the desired one.
A CNN is good at capturing hierarchical and spatial structure in images. It utilizes filters that look at regions of an input image with a defined window size and map them to some output. It then slides the window by a defined stride to other regions, covering the whole image. Each convolutional filter layer thus captures the properties of the input image hierarchically in a series of subsequent layers: first details like lines, then shapes, then whole objects in later layers. This makes a CNN a good fit for taking in the images of a dataset and classifying them into their respective classes.
Another type of layer sometimes used in deep learning networks is the TimeDistributed layer. TimeDistributed layers are provided in Keras as wrapper layers: the wrapped layer is applied to every temporal slice of the input. The input is required to be at least three-dimensional, with the index after the batch dimension treated as the temporal dimension. A TimeDistributed wrapper can be applied to a Dense layer so that it operates on each timestep independently, and it can also be used with convolutional layers. The way they are written in Keras is also simple, as shown in Figure 1 and Figure 2.
https://user-images.githubusercontent.com/91852182/147298483-4f37a092-7e71-4ce6-9274-9a133d138a4c.png
Fig. 1: TimeDistributed Dense layer
https://user-images.githubusercontent.com/91852182/147298501-6459d968-a279-4140-9be3-2d3ea826d9f6.png
Fig. 2: TimeDistributed Convolution layer
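In case the figures do not render here, the following is a minimal sketch of the two patterns they illustrate; the layer sizes and input shapes are arbitrary, chosen only for the example:

```python
from tensorflow.keras import layers, models

# TimeDistributed Dense: the same Dense layer is applied independently to
# each of the 10 timesteps of a (timesteps, features) input.
dense_model = models.Sequential([
    layers.Input(shape=(10, 16)),
    layers.TimeDistributed(layers.Dense(8, activation="relu")),
])

# TimeDistributed Convolution: the same Conv2D layer is applied to each
# frame of a (timesteps, height, width, channels) input.
conv_model = models.Sequential([
    layers.Input(shape=(10, 64, 64, 3)),
    layers.TimeDistributed(layers.Conv2D(16, (3, 3), activation="relu")),
])
```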
We will first download the simulator to start our behavioural training process. Udacity has built a simulator for self-driving cars and made it open source for enthusiasts, so they can work on something close to a real-time environment. It is built on Unity, the video game development platform. The simulator offers configurable resolution and control settings and is very user friendly. The graphics and input configurations can be changed according to user preference and machine configuration, as shown in Figure 3. The user pushes the "Play!" button to enter the simulator user interface. You can enter the Controls tab to explore the keyboard controls, quite similar to a racing game, which can be seen in Figure 4.
https://user-images.githubusercontent.com/91852182/147298708-de15ebc5-2482-42f8-b2a2-8d3c59fceff4.png
Fig. 3: Configuration screen
https://user-images.githubusercontent.com/91852182/147298712-944e2c2d-e01d-459b-8a7d-3c5471bea179.png
Fig. 4: Controls Configuration
The first actual screen of the simulator can be seen in Figure 5, and its components are discussed below. The simulator involves two tracks. One of them can be considered simple and the other complex, as is evident in the screenshots in Figure 6 and Figure 7. The word "simple" here just means that it has fewer curvy stretches and is easier to drive on (refer to Figure 6). The "complex" track has steep elevations, sharp turns and a shadowed environment, and is tough to drive on, even for a user doing it manually (refer to Figure 7). There are two modes for driving the car in the simulator: (1) training mode and (2) autonomous mode. The training mode gives you the option of recording your run and capturing the training dataset. The small red sign at the top right of the screen in Figures 6 and 7 indicates that the car is being driven in training mode. The autonomous mode can be used to test the models and see whether they can drive on the track without human intervention. Also, if you try to press the controls to get the car back on track, it will immediately notify you that it has shifted to manual controls. The mode screenshot can be seen in Figure 8. Once we have mastered driving the car in the simulator using the keyboard controls, we can start using the record button to collect data. We will save the data from it in a specified folder, as you can see