14 datasets found

NASA Mars Rover
kaggle.com
zip
Updated Oct 8, 2023
Cite
Kush Tripathi (2023). NASA Mars Rover [Dataset]. https://www.kaggle.com/datasets/kushtripathi/nasa-mars-rover-captured-images-and-its-details
Explore at:
Available download formats: zip (101585155 bytes)
Dataset updated
Oct 8, 2023
Authors
Kush Tripathi
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Dataset Title: Exploring Mars: A Comprehensive Dataset of Rover Photos and Metadata

Description

This dataset provides an extensive collection of Mars rover images paired with in-depth metadata. Sourced from various Mars missions, this dataset is a treasure trove for anyone interested in space exploration, planetary science, or computer vision.

Components:

Photos: A curated set of high-definition images taken by different cameras onboard Mars rovers. These images capture a variety of terrains, weather conditions, and other Martian phenomena.

Details: A detailed CSV file accompanies these images, containing rich metadata like the type of camera used, the corresponding Martian sol, Earth date, and the rover responsible for each image.

Dataset Origin

The dataset was compiled from various Mars missions conducted over the years. Special care has been taken to include a diverse set of images to enable a wide range of analyses and applications.

Objective

As a learner delving into the field of Computer Vision, my objectives for this project are multi-fold:

Data Analysis: To perform exploratory data analysis (EDA) to understand the distribution of images based on attributes like camera type, date, and rover.

Color Analysis: To identify and visualize dominant colors across different sets of images. This could provide insights into Martian geology.

Texture and Pattern Recognition: To classify Martian terrains using texture and pattern recognition techniques.

Machine Learning: To potentially develop a predictive model that could classify images into predefined categories based on their features.
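
The data-analysis objective above can be sketched in a few lines. The snippet below counts images per camera and per rover from metadata rows. The column names (camera, sol, earth_date, rover) and the sample rows are assumptions made for illustration, since the description does not give the actual CSV header. It uses only the standard library; the same grouping could be done with Pandas.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the dataset's metadata CSV.
# The column names and values are assumed, not taken from the real file.
SAMPLE_CSV = """camera,sol,earth_date,rover
NAVCAM,1000,2015-05-30,Curiosity
MAST,1000,2015-05-30,Curiosity
NAVCAM,1001,2015-05-31,Curiosity
PANCAM,50,2004-02-23,Opportunity
"""


def count_by(rows, column):
    """Return a Counter of how many rows share each value of `column`."""
    return Counter(row[column] for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
camera_counts = count_by(rows, "camera")
rover_counts = count_by(rows, "rover")

# most_common() lists the cameras that contributed the most images first
for camera, n in camera_counts.most_common():
    print(f"{camera}: {n}")
```

Sorting the counts this way speaks directly to the first research question, which camera types contributed the most images.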

Research Questions

Which camera types have contributed the most to the dataset?

What can the dominant colors in the images tell us about Mars?

Can we classify Martian terrains into categories like rocky, sandy, and icy?

Is there a correlation between the type of terrain and other variables like camera type or date?

Tools and Technologies

I plan to utilize Python for this project, particularly libraries like OpenCV for image processing, Pandas for data manipulation, and Matplotlib/Seaborn for data visualization. For machine learning tasks, I will likely use scikit-learn or TensorFlow.

Learning and Development

This project serves as both a learning exercise and a stepping stone toward more complex computer vision projects. I aim to document my learning journey, challenges, and milestones in a series of Kaggle notebooks.

Collaboration and Feedback

I warmly invite the Kaggle community to offer suggestions, critiques, or even collaborate on this venture. Your insights could be invaluable in enhancing the depth and breadth of this project.
Brain Tumor CSV
kaggle.com
zip
Updated Oct 30, 2024
Cite
Akash Nath (2024). Brain Tumor CSV [Dataset]. https://www.kaggle.com/datasets/akashnath29/brain-tumor-csv/code
Explore at:
Available download formats: zip (538175483 bytes)
Dataset updated
Oct 30, 2024
Authors
Akash Nath
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
This dataset provides grayscale pixel values for brain tumor MRI images, stored in a CSV format for simplified access and ease of use. The goal is to create a "MNIST-like" dataset for brain tumors, where each row in the CSV file represents the pixel values of a single image in its original resolution. This format makes it convenient for researchers and developers to quickly load and analyze MRI data for brain tumor detection, classification, and segmentation tasks without needing to handle large image files directly.

Motivation and Use Cases

Brain tumor classification and segmentation are critical tasks in medical imaging, and datasets like these are valuable for developing and testing machine learning and deep learning models. While there are several publicly available brain tumor image datasets, they often consist of large image files that can be challenging to process. This CSV-based dataset addresses that by providing a compact and accessible format. Potential use cases include:

- Tumor Classification: Identifying different types of brain tumors, such as glioma, meningioma, and pituitary tumors, or distinguishing between tumor and non-tumor images.
- Tumor Segmentation: Applying pixel-level classification and segmentation techniques for tumor boundary detection.
- Educational and Rapid Prototyping: Ideal for educational purposes or quick experimentation without requiring large image processing capabilities.

Data Structure

This dataset is structured as a single CSV file where each row represents an image, and each column represents a grayscale pixel value. The pixel values are stored as integers ranging from 0 (black) to 255 (white).

CSV File Contents

Pixel Values: Each row contains the pixel values of a single grayscale image, flattened into a 1-dimensional array. The original image dimensions vary, and rows in the CSV will correspondingly vary in length.

Simplified Access: By using a CSV format, this dataset avoids the need for specialized image processing libraries and can be easily loaded into data analysis and machine learning frameworks like Pandas, Scikit-Learn, and TensorFlow.

How to Use This Dataset

Loading the Data: The CSV can be loaded using standard data analysis libraries, making it compatible with Python, R, and other platforms.

Data Preprocessing: Users may normalize pixel values (e.g., between 0 and 1) for deep learning applications.

Splitting Data: While this dataset does not predefine training and testing splits, users can separate rows into training, validation, and test sets.

Reshaping for Models: If needed, each row can be reshaped to the original dimensions (retrieved from the subfolder structure) to view or process as an image.
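
The loading, normalizing, and reshaping steps above can be sketched as follows. This is a minimal illustration, not the dataset author's code: the two sample rows and the (height, width) pairs are made up, and the real dimensions would have to come from the dataset's subfolder structure as described. Because rows vary in length, the sketch reads the file with the standard csv module; a ragged file may not load cleanly with a default pandas.read_csv call.

```python
import csv
import io

# Two hypothetical rows of flattened grayscale pixels (0-255).
# Rows differ in length because the source images differ in size.
SAMPLE_CSV = "0,128,255,64\n10,20,30,40,50,60\n"


def load_rows(handle):
    """Read a ragged CSV: one list of integer pixel values per image."""
    return [[int(v) for v in row] for row in csv.reader(handle) if row]


def normalize(pixels):
    """Scale 0-255 integers to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


def reshape(pixels, height, width):
    """Rebuild a 2-D image (list of rows) from a flat pixel list."""
    if len(pixels) != height * width:
        raise ValueError("height * width must equal the number of pixels")
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


rows = load_rows(io.StringIO(SAMPLE_CSV))
# Assumed dimensions for the two sample rows (2x2 and 2x3).
shapes = [(2, 2), (2, 3)]
images = [reshape(normalize(r), h, w) for r, (h, w) in zip(rows, shapes)]
print(images[0])
```

The length check in reshape guards against pairing a row with the wrong dimensions, which is easy to do when every image has a different size.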

Technical Details

Image Format: Grayscale MRI images, with pixel values ranging from 0 to 255.

Resolution: Original resolution, no resizing applied.

Size: Each row’s length varies according to the original dimensions of each MRI image.

Data Type: CSV file with integer pixel values.

Acknowledgments

This dataset is intended for research and educational purposes only. Users are encouraged to cite and credit the original data sources if using this dataset in any publications or projects. This is a derived CSV version aimed to simplify access and usability for machine learning and data science applications.
Code and dataset for publication "Laser Wakefield Accelerator modelling with...
zenodo.org
zip
Updated Jan 8, 2023
Cite
M. J. V. Streeter (2023). Code and dataset for publication "Laser Wakefield Accelerator modelling with Variational Neural Networks" [Dataset]. http://doi.org/10.5281/zenodo.7510352
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7510352
Dataset updated
Jan 8, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
M. J. V. Streeter
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Data and code for reproducing figures in published work.

High Power Laser Science and Engineering

https://doi.org/10.1017/hpl.2022.47

The code uses various Python packages, including TensorFlow.

The Conda environment was created (on 6th Jan 2022) with:
conda create --name tf tensorflow notebook tensorflow-probability pandas tqdm scikit-learn matplotlib seaborn protobuf opencv scipy scikit-image scikit-optimize Pillow PyAbel libclang flatbuffers gast --channel conda-forge
Neural Networks in Friction Factor Analysis of Smooth Pipe Bends
data.mendeley.com
Updated Dec 19, 2022
+ more versions
Cite
Adarsh Vasa (2022). Neural Networks in Friction Factor Analysis of Smooth Pipe Bends [Dataset]. http://doi.org/10.17632/sjvbwh5ckg.1
Explore at:
Unique identifier
https://doi.org/10.17632/sjvbwh5ckg.1
Dataset updated
Dec 19, 2022
Authors
Adarsh Vasa
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
PROGRAM SUMMARY

No. of lines in distributed program, including test data, etc.: 481

No. of bytes in distributed program, including test data, etc.: 14540.8

Distribution format: .py, .csv

Programming language: Python

Computer: Any workstation or laptop computer running TensorFlow, Google Colab, Anaconda, Jupyter, pandas, NumPy, Microsoft Azure and Alteryx.

Operating system: Windows, Mac OS and Linux.

Nature of problem: The Navier-Stokes equations are solved numerically in ANSYS Fluent using the Reynolds stress model for turbulence. The simulated friction factor values are validated against theoretical and experimental data from the literature. Artificial neural networks are then used for a prediction-based augmentation of the friction factor. The capabilities of the neural networks are discussed with regard to computational cost and domain limitations.

Solution method: The simulation data is obtained through Reynolds stress modelling of fluid flow through a pipe. This data is augmented using an artificial neural network model that predicts both within and outside the data domain.

Restrictions: The code used in this research is limited to smooth pipe bends, in which friction factor is analysed using a steady state incompressible fluid flow.

Runtime: The artificial neural network produces results within a span of 20 seconds for three-dimensional geometry, using the allocated free computational resources of Google Colaboratory cloud-based computing system.
V2 Balloon Detection Dataset
kaggle.com
zip
Updated Jul 7, 2022
Cite
vbookshelf (2022). V2 Balloon Detection Dataset [Dataset]. https://www.kaggle.com/vbookshelf/v2-balloon-detection-dataset
Explore at:
Available download formats: zip (49788043 bytes)
Dataset updated
Jul 7, 2022
Authors
vbookshelf
Description
Context

I needed a simple image dataset that I could use when trying different object detection algorithms for the first time. It had to be something that could be quickly understood and easily loaded. I didn't want to spend a lot of time doing EDA or trying to remember how the data is structured. Moreover, I wanted to be able to clearly see when a model's prediction was correct or when it had made a mistake. When working with chest x-ray images, for example, it takes an expert to know if a model's predictions are correct.

I found the Balloons dataset and simplified it. The original data is split into train and test sets and it has two json files that need to be parsed. In this new version, I copied all images into a single folder and replaced the json files with one csv file that can be easily loaded with Pandas.

Content

The dataset consists of 74 jpg images and one csv file. Each image contains one or more balloons.

The csv file has five columns:

fname - The image file name.

height - The image height.

width - The image width.

num_balloons - The number of balloons on the image.

bbox - The coordinates of each bounding box on the image.

The coordinates of each bbox are stored in a dictionary. The format is as follows:

{"xmin": 100, "ymin": 100, "xmax": 300, "ymax": 300}

where xmin and ymin are the coordinates of the top left corner, and xmax and ymax are the coordinates of the bottom right corner.

Each entry in the bbox column is a list of dictionaries. For example, if an image has two balloons and hence two bounding boxes, the entry will be as follows:

[{"xmin": 100, "ymin": 100, "xmax": 300, "ymax": 300}, {"xmin": 100, "ymin": 100, "xmax": 300, "ymax": 300}]

When loaded into a Pandas dataframe, all items in the bbox column are of type string. The strings can be converted to Python lists like this:

import ast

# convert each item in the bbox column from type str to type list
df['bbox'] = df['bbox'].apply(ast.literal_eval)
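
To make the bbox format concrete, here is a small standalone sketch that parses one bbox cell and computes the area of each box. The first box repeats the example above; the second box's coordinates and the helper name box_area are made up for illustration.

```python
import ast

# A bbox cell as it would appear after loading the CSV: a string.
# The second box's coordinates are invented for this example.
raw = (
    '[{"xmin": 100, "ymin": 100, "xmax": 300, "ymax": 300}, '
    '{"xmin": 10, "ymin": 20, "xmax": 60, "ymax": 120}]'
)


def box_area(box):
    """Area in pixels of one bounding box given as a corner dictionary."""
    return (box["xmax"] - box["xmin"]) * (box["ymax"] - box["ymin"])


boxes = ast.literal_eval(raw)  # str -> list of dicts
areas = [box_area(b) for b in boxes]
print(len(boxes), areas)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for parsing cells like this.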

Acknowledgements

Many thanks to Waleed Abdulla who created this dataset.

The original dataset can be downloaded and unzipped using this code:

!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
!unzip balloon_dataset.zip > /dev/null

Inspiration

Can you create an app that can look at an image and tell you:

- how many balloons are on the image, and
- what the colours of those balloons are?

This is something that could help blind people. To help you get started here's an example of a similar project.

License

In this blog post the dataset's creator mentions that the images were sourced from Flickr. All images have a "Commercial use & mods allowed" license.

Header image by andremsantana on Pixabay.
Bird Species Image Classification Dataset
kaggle.com
Updated Jun 11, 2025
Cite
Evil Spirit05 (2025). Bird Species Image Classification Dataset [Dataset]. https://www.kaggle.com/datasets/evilspirit05/birds-species-prediction
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 11, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Evil Spirit05
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains high-quality images of six distinct bird species, curated for use in image classification, computer vision, and biodiversity research tasks. Each bird species included in this dataset is well-represented, making it ideal for training and evaluating deep learning models.

Label	Species Name	Image Count
1	American Goldfinch	143
2	Emperor Penguin	139
3	Downy Woodpecker	137
4	Flamingo	132
5	Carmine Bee-eater	131
6	Barn Owl	129

📂 Dataset Highlights:

* Total Images: 811
* Classes: 6 unique bird species
* Balanced Labels: Nearly equal distribution across classes
* Use Cases: Image classification, model benchmarking, transfer learning, educational projects, biodiversity analysis
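
The "nearly equal distribution" claim can be checked directly from the image counts in the table. The sketch below sums the counts, computes the ratio between the largest and smallest class, and derives per-class weights with the common "balanced" heuristic, total / (number of classes × class count). Applying that heuristic here is an illustrative choice, not something the dataset description prescribes.

```python
# Image counts copied from the table above.
counts = {
    "American Goldfinch": 143,
    "Emperor Penguin": 139,
    "Downy Woodpecker": 137,
    "Flamingo": 132,
    "Carmine Bee-eater": 131,
    "Barn Owl": 129,
}

total = sum(counts.values())
imbalance_ratio = max(counts.values()) / min(counts.values())

# "Balanced" weighting heuristic: total / (n_classes * class_count).
# Rarer classes receive slightly larger weights.
class_weights = {name: total / (len(counts) * n) for name, n in counts.items()}

print(total)
print(round(imbalance_ratio, 3))
```

With a largest-to-smallest ratio this close to 1, class weighting is unlikely to matter much, which supports the "balanced" description.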

🧠 Potential Applications:

* Training deep learning models like CNNs for bird species recognition
* Fine-tuning pre-trained models using a small and balanced dataset
* Educational projects in ornithology and computer vision
* Biodiversity and wildlife conservation tech solutions

🛠️ Suggested Tools:

* Python (Pandas, NumPy, Matplotlib)
* TensorFlow / PyTorch for model development
* OpenCV for image preprocessing
* Streamlit for creating interactive demos
Supporting data for “Deep learning methods and applications to digital...
datahub.hku.hk
Updated Oct 3, 2024
Cite
Shichao Ma (2024). Supporting data for “Deep learning methods and applications to digital health” [Dataset]. http://doi.org/10.25442/hku.27060427.v1
Explore at:
Unique identifier
https://doi.org/10.25442/hku.27060427.v1
Dataset updated
Oct 3, 2024
Dataset provided by
HKU Data Repository
Authors
Shichao Ma
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
This repository contains three folders, which hold the data or the source code for the three main chapters (Chapters 3, 4, and 5) of the thesis. The folders are:

1) Dataset (Chapter 3): /PhysioNet2016 contains the phonocardiogram signals used in Chapters 3 and 4 as the upstream pretraining data. This is a public dataset. /SourceCode includes all the statistical analysis and visualization scripts for Chapter 3. Yaseen_dataset and PASCAL contain phonocardiogram signals with pathological features; Yaseen_dataset serves as the downstream finetuning dataset in Chapter 3, while the PASCAL dataset serves as the secondary testing dataset in Chapter 3.

2) Dataset (Chapter 4): /SourceCode includes all the statistical analysis and visualization scripts for Chapter 4.

3) Dataset (Chapter 5): PAD-UFES-20_processed contains dermatology images processed from the PAD-UFES-20 dataset, which is a public dataset. The dataset is used in Chapter 5. /SourceCode includes all the statistical analysis and visualization scripts for Chapter 5.

Several packages are mandatory to run the source code, including: Python > 3.6 (3.11 preferred), TensorFlow > 2.16, Keras > 3.3, NumPy > 1.26, Pandas > 2.2, SciPy > 1.13
Image enhancement code: time-resolved tomograms of EICP application using 3D...
darus.uni-stuttgart.de
Updated Feb 7, 2023
Cite
Dongwon Lee; Holger Steeb (2023). Image enhancement code: time-resolved tomograms of EICP application using 3D U-net [Dataset]. http://doi.org/10.18419/DARUS-2991
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.18419/DARUS-2991
Dataset updated
Feb 7, 2023
Dataset provided by
DaRUS
Authors
Dongwon Lee; Holger Steeb
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Dataset funded by
DFG
Description
This dataset contains the code to reproduce the results of "Time resolved micro-XRCT dataset of Enzymatically Induced Calcite Precipitation (EICP) in sintered glass bead columns", cf. https://doi.org/10.18419/darus-2227. The code takes "low-dose" images as input; these images contain many artifacts and much noise as a trade-off for fast data acquisition (6 min per dataset, compared with 3 hours per dataset in the normal "high-dose" configuration). These low-quality images can be improved with the help of a pre-trained model. The pre-trained model provided here was trained with pairs of "high-dose" and "low-dose" data from the above-mentioned EICP application. Examples of the training, input and output data can also be found in this dataset. Although only limited examples are shown here, the workflow and code can be extended to general image enhancement applications. The code requires a Python version above 3.7.7 with packages such as tensorflow, keras, pandas, scipy, scikit, numpy and patchify. For further details of operation, please refer to the readme.txt file.
Data from: Informative neural representations of unseen contents during...
openneuro.org
Updated Dec 10, 2021
Cite
Ning Mei; Roberto Santana; David Soto (2021). Informative neural representations of unseen contents during higher-order processing in human brains and deep artificial networks [Dataset]. http://doi.org/10.18112/openneuro.ds003927.v1.0.1
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds003927.v1.0.1
Dataset updated
Dec 10, 2021
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Ning Mei; Roberto Santana; David Soto
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
This fMRI dataset was collected for the study "Informative neural representations of unseen contents during higher-order processing in human brains and deep artificial networks".

Code corresponding to the dataset: https://github.com/nmningmei/unconfeats

System Information

Platform: Linux-3.10.0-514.el7.x86_64-x86_64-with-centos-7.3.1611-Core

CPU: x86_64: 16 cores

Python environment

Python: 3.6.3 |Anaconda, Inc.| (default, Nov 20 2017, 20:41:42) [GCC 7.2.0]

Numpy: 1.19.1

Scipy: 1.3.1

Matplotlib: 3.1.3

Scikit-learn: 0.24.2

Seaborn: 0.11.1

Pandas: 1.0.1

Tensorflow: 2.0.0

Pytorch: 1.7.1

Nilearn: 0.7.1

Nipype: 1.4.2

LegrandNico/metadPy

R environment

R base

R: 4.0.3 (for 3-way repeated measure ANOVAs)

Brain image processing backends

mricrogl

mricron: 10.2014

FSL: 6.0.0

Freesurfer: 6.0.0
Historical Data of Stocks Listed on NSE
kaggle.com
zip
Updated Dec 23, 2024
Cite
Sampath Gudibettumane (2024). Historical Data of Stocks Listed on NSE [Dataset]. https://www.kaggle.com/datasets/paramamithra/historical-data-of-stocks-listed-on-nse
Explore at:
Available download formats: zip (22 bytes)
Dataset updated
Dec 23, 2024
Authors
Sampath Gudibettumane
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Overview

This dataset provides daily stock prices for all companies listed on the National Stock Exchange (NSE) of India. The data spans several years and includes essential trading information that can be used for various financial analyses, stock market research, and machine learning applications.

Content

The dataset includes the following columns:

Date: The date of the trading day in YYYY-MM-DD format.

Open: The opening price of the stock on the given date.

High: The highest price of the stock on the given date.

Low: The lowest price of the stock on the given date.

Close: The closing price of the stock on the given date.

Adj Close: The adjusted closing price of the stock on the given date, which accounts for dividends, stock splits, and other corporate actions.

Volume: The number of shares traded on the given date.

Symbol: The unique ticker symbol of the stock.
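
As a small illustration of how these columns fit together, the sketch below groups rows by Symbol, orders them by Date, and computes simple daily returns from Adj Close. The rows, the placeholder symbols AAA and BBB, and the prices are all made up; only the column names come from the list above.

```python
from collections import defaultdict

# Hypothetical rows using the column names listed above; the symbols
# and prices are placeholders, not real NSE data.
rows = [
    {"Date": "2024-01-01", "Symbol": "AAA", "Adj Close": 100.0},
    {"Date": "2024-01-02", "Symbol": "AAA", "Adj Close": 110.0},
    {"Date": "2024-01-03", "Symbol": "AAA", "Adj Close": 99.0},
    {"Date": "2024-01-01", "Symbol": "BBB", "Adj Close": 50.0},
    {"Date": "2024-01-02", "Symbol": "BBB", "Adj Close": 55.0},
]


def daily_returns(rows):
    """Map each symbol to its list of (date, simple return) pairs."""
    by_symbol = defaultdict(list)
    for row in rows:
        by_symbol[row["Symbol"]].append((row["Date"], row["Adj Close"]))
    result = {}
    for symbol, series in by_symbol.items():
        series.sort()  # YYYY-MM-DD strings sort chronologically
        result[symbol] = [
            (date, price / prev_price - 1.0)
            for (_, prev_price), (date, price) in zip(series, series[1:])
        ]
    return result


returns = daily_returns(rows)
print(returns["AAA"])
```

Adj Close is used rather than Close so that dividends and splits do not show up as artificial price jumps in the return series.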

Data Source

The data has been sourced using the Yahoo Finance API, providing a reliable and comprehensive view of stock performance over time.

Usage

This dataset is ideal for:

Time series analysis and forecasting of stock prices.

Developing and testing trading algorithms.

Financial market research and trend analysis.

Machine learning projects related to finance and economics.

File Format

The dataset is available in CSV format, making it easy to load into data analysis and machine learning libraries such as pandas, scikit-learn, and TensorFlow.
Emotion Prediction with Quantum5 Neural Network AI
kaggle.com
zip
Updated Oct 19, 2025
Cite
EMİRHAN BULUT (2025). Emotion Prediction with Quantum5 Neural Network AI [Dataset]. https://www.kaggle.com/datasets/emirhanai/emotion-prediction-with-semi-supervised-learning
Explore at:
Available download formats: zip (2332683 bytes)
Dataset updated
Oct 19, 2025
Authors
EMİRHAN BULUT
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Emotion Prediction with Quantum5 Neural Network AI Machine Learning - By Emirhan BULUT

V1

I have created artificial intelligence software that predicts emotion from text you have written, using the Semi Supervised Learning method and the RC algorithm. The code is very simple and focused on solving the problem. I aim to create the 2nd version of the software using an RNN (Recurrent Neural Network). I hope this serves as an example you can use in your thesis and projects.

V2

I decided to apply a technique I had developed to the emotion dataset on which I had previously used semi-supervised learning. This technique is produced according to Quantum5 laws. I developed smart artificial intelligence software that can predict emotion with Quantum5 neuronal networks. I share this software with all humanity as open source on Kaggle. It is my first open source project in an NLP system with Quantum technology. Developing the NLP system with Quantum technology is very exciting!

Happy learning!

Emirhan BULUT

Head of AI and AI Inventor

Emirhan BULUT. (2022). Emotion Prediction with Quantum5 Neural Network AI [Data set]. Kaggle. https://doi.org/10.34740/KAGGLE/DS/2129637

The coding language used:

Python 3.9.8

Libraries Used:

Keras

Tensorflow

NumPy

Pandas

Scikit-learn (SKLEARN)

Developer Information:

Name-Surname: Emirhan BULUT

Contact (Email) : emirhan@isap.solutions

LinkedIn : https://www.linkedin.com/in/artificialintelligencebulut/

Kaggle: https://www.kaggle.com/emirhanai

Official Website: https://www.emirhanbulut.com.tr
Air Quality Index Prediction using Neural Networks
kaggle.com
zip
Updated Oct 27, 2025
Cite
Moiz Azad (2025). Air Quality Index Prediction using Neural Networks [Dataset]. https://www.kaggle.com/datasets/moizkhan00/air-quality-index-prediction-using-neural-networks
Explore at:
Available download formats: zip (1290288 bytes)
Dataset updated
Oct 27, 2025
Authors
Moiz Azad
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
🌍 Air Quality Index (AQI) Prediction using Neural Networks

This notebook focuses on predicting Air Quality Index (AQI) values by estimating Carbon Monoxide (CO) concentration using a Neural Network Regression Model trained on environmental pollutant data.

The model follows the EPA (Environmental Protection Agency) standard formula for converting CO concentration (in ppm) to AQI levels.

⚙️ Workflow Overview

Data Preprocessing

Cleaned and normalized the dataset

Removed date/time and irrelevant columns

Scaled input and output features using MinMaxScaler

Model Building (Neural Network)

Built a deep regression model using TensorFlow/Keras

Activation: ReLU

Optimizer: Adam

Loss: Mean Squared Error (MSE)

Prediction Phase

Model predicts CO concentration based on given input features

Predictions are inverse-transformed to get real-world ppm values

AQI Calculation (EPA Standard)

AQI computed using the official EPA breakpoint formula

Converts CO ppm into an AQI score ranging from 0–500

Visualization

Distribution of pollutants

Correlation heatmap

Comparison of Predicted CO vs AQI Levels

AQI Category visualization
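
The AQI calculation step in the workflow above amounts to a piecewise linear interpolation between breakpoints. The sketch below shows the idea for CO. It is not the notebook's code: the breakpoint table is written from memory of EPA guidance for 8-hour CO and should be verified against the current EPA table, and the function name co_to_aqi is made up for illustration.

```python
import math

# 8-hour CO breakpoints (ppm) with matching AQI ranges, as recalled from
# EPA guidance. This is an assumption; check the current EPA table.
CO_BREAKPOINTS = [
    (0.0, 4.4, 0, 50),
    (4.5, 9.4, 51, 100),
    (9.5, 12.4, 101, 150),
    (12.5, 15.4, 151, 200),
    (15.5, 30.4, 201, 300),
    (30.5, 40.4, 301, 400),
    (40.5, 50.4, 401, 500),
]


def co_to_aqi(co_ppm):
    """Interpolate linearly within the breakpoint band containing co_ppm."""
    # Truncate to one decimal place; the small epsilon guards against
    # floating-point products landing just below an integer.
    c = math.floor(co_ppm * 10 + 1e-9) / 10
    for bp_lo, bp_hi, i_lo, i_hi in CO_BREAKPOINTS:
        if bp_lo <= c <= bp_hi:
            return round((i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo)
    raise ValueError("CO concentration outside the 0-50.4 ppm table")


print(co_to_aqi(9.4))
```

Truncating to one decimal before the lookup means values such as 4.45 ppm fall into a band instead of the gap between 4.4 and 4.5.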

🧠 Why This Project?

Air pollution is one of the most pressing global issues today.
By combining machine learning with environmental science, this notebook helps predict pollution levels and interpret air quality using AI-driven insights.

📊 Tech Stack

Python

TensorFlow / Keras

NumPy, Pandas, Matplotlib, Seaborn

Scikit-learn

🏁 Results

✅ Accurate CO prediction using neural network regression
✅ Dynamic AQI computation based on EPA standards
✅ Clear and intuitive visualizations

🚀 "AI can’t clean the air — but it can help us understand how bad it really is."
GitHub Commit Messages Dataset
kaggle.com
zip
Updated Mar 2, 2021
Cite
Dhruvil Dave (2021). GitHub Commit Messages Dataset [Dataset]. https://www.kaggle.com/dsv/1988456
Explore at:
Available download formats: zip (561489165 bytes)
Dataset updated
Mar 2, 2021
Authors
Dhruvil Dave
License
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
License information was derived automatically
Description
Image credits: https://github.com

Introduction

This dataset contains all commit messages and their related metadata from 32 popular GitHub repositories. These repositories are:

tensorflow/tensorflow

pytorch/pytorch

torvalds/linux

python/cpython

rust-lang/rust

microsoft/TypeScript

microsoft/vscode

golang/go

numpy/numpy

scikit-learn/scikit-learn

openbsd/src

freebsd/freebsd-src

pandas-dev/pandas

scipy/scipy

tidyverse/ggplot2

kubernetes/kubernetes

postgres/postgres

nodejs/node

facebook/react

angular/angular

matplotlib/matplotlib

apache/httpd

nginx/nginx

opencv/opencv

ipython/ipython

rstudio/rstudio

jupyterlab/jupyterlab

gcc-mirror/gcc

apple/swift

denoland/deno

apache/spark

llvm/llvm-project

Credits

Image credits: Unsplash - yancymin
Image Classification by CNN
kaggle.com
zip
Updated Mar 4, 2024
Cite
Harsh Jaglan (2024). Image Classification by CNN [Dataset]. https://www.kaggle.com/datasets/harshjaglan01/image-classification-by-cnn/code
Explore at:
Available download formats: zip (311627190 bytes)
Dataset updated
Mar 4, 2024
Authors
Harsh Jaglan
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Automated Flower Identification Using Convolutional Neural Networks

This project aims to develop a model for identifying five different flower species (rose, tulip, sunflower, dandelion, daisy) using Convolutional Neural Networks (CNNs).

Description

The dataset consists of 5,000 images (1,000 images per class) collected from various online sources. The model achieved an accuracy of 98.58% on the test set.

Usage

This project requires Python 3.x and the following libraries:

TensorFlow: For building neural networks.

numpy: For numerical computing and array operations.

pandas: For data manipulation and analysis.

matplotlib: For creating visualizations such as line plots, bar plots, and histograms.

seaborn: For advanced data visualization and creating statistically-informed graphics.

scikit-learn: For machine learning algorithms and model training.

To run the project:

Clone this repository.

Install the required libraries.

Run the Jupyter Notebook: jupyter notebook flower_classification.ipynb

Additional Information

Link to code: https://github.com/Harshjaglan01/flower-classification-cnn

License: MIT License