100+ datasets found

Walmart Products Dataset – Free Product Data CSV
crawlfeeds.com
csv, zip
Updated Dec 2, 2025
Cite
Crawl Feeds (2025). Walmart Products Dataset – Free Product Data CSV [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
Available download formats: zip, csv
Dataset updated
Dec 2, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.

Key Features

Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.

CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.

Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.

Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.
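
As a rough illustration of how such a product CSV might be used, the sketch below computes the average price per brand with Python's standard csv module. The column names (title, brand, price, currency, availability) and the sample rows are assumptions made for this example; the real file's header and values may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file's column names and values may differ.
SAMPLE_CSV = """title,brand,price,currency,availability
Widget A,Acme,10.00,USD,InStock
Widget B,Acme,14.00,USD,OutOfStock
Gadget C,Globex,,USD,InStock
Gadget D,Globex,8.50,USD,InStock
"""


def average_price_by_brand(csv_file):
    """Return {brand: mean price}, skipping rows with a missing or invalid price."""
    totals = defaultdict(lambda: [0.0, 0])  # brand -> [price sum, row count]
    for row in csv.DictReader(csv_file):
        try:
            price = float(row["price"])
        except (KeyError, TypeError, ValueError):
            continue  # no usable price in this row
        brand = (row.get("brand") or "unknown").strip()
        totals[brand][0] += price
        totals[brand][1] += 1
    return {brand: total / count for brand, (total, count) in totals.items()}


averages = average_price_by_brand(io.StringIO(SAMPLE_CSV))
print(averages)
```

To run this against a downloaded file, pass an open file handle instead of the in-memory string, for example open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved.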

Who Benefits?

Data analysts & researchers exploring e-commerce trends or product catalog data.

Developers & data scientists building price-comparison tools, recommendation engines or ML models.

E-commerce strategists/marketers needing product metadata for competitive analysis or market research.

Students/hobbyists needing a free dataset for learning or demo projects.

Why Use This Dataset Instead of Manual Scraping?

Time-saving: No need to write scrapers or deal with rate limits.

Clean, structured data: All records are verified and already formatted in CSV, saving hours of cleaning.

Risk-free: Avoid Terms-of-Service issues or IP blocks that come with manual scraping.
Instant access: Free and immediately downloadable.
Data from: NICHE: A Curated Dataset of Engineered Machine Learning Projects...
figshare.com
txt
Updated May 30, 2023
Cite
Ratnadira Widyasari; Zhou YANG; Ferdian Thung; Sheng Qin Sim; Fiona Wee; Camellia Lok; Jack Phan; Haodi Qi; Constance Tan; Qijin Tay; David LO (2023). NICHE: A Curated Dataset of Engineered Machine Learning Projects in Python [Dataset]. http://doi.org/10.6084/m9.figshare.21967265.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.21967265.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Ratnadira Widyasari; Zhou YANG; Ferdian Thung; Sheng Qin Sim; Fiona Wee; Camellia Lok; Jack Phan; Haodi Qi; Constance Tan; Qijin Tay; David LO
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Machine learning (ML) has gained much attention and has been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts at filtering those projects to curate ML projects of high quality. The limited availability of such high-quality datasets poses an obstacle to understanding ML projects. To help clear this obstacle, we present NICHE, a manually labelled dataset consisting of 572 ML projects. Based on evidence of good software engineering practices, we label 441 of these projects as engineered and 131 as non-engineered. In this repository we provide the "NICHE.csv" file, which contains the list of project names along with their labels, descriptive information for every dimension, and several basic statistics, such as the number of stars and commits. This dataset can help researchers understand the practices that are followed in high-quality ML projects. It can also be used as a benchmark for classifiers designed to identify engineered ML projects.

GitHub page: https://github.com/soarsmu/NICHE
BIG DATA PROJECT
kaggle.com
zip
Updated Jun 7, 2024
Cite
Glitch_in_Vector (2024). BIG DATA PROJECT [Dataset]. https://www.kaggle.com/datasets/ermohammadamin/big-data-project
Available download formats: zip (6814558981 bytes)
Dataset updated
Jun 7, 2024
Authors
Glitch_in_Vector
Description
Dataset

This dataset was created by Glitch_in_Vector

Contents

Chunk_0 for me, Choose others as you want.
Project Management
catalog.data.gov
datasets.ai
+2 more
Updated May 2, 2025
Cite
Office of Project Management (2025). Project Management [Dataset]. https://catalog.data.gov/dataset/project-management
Dataset updated
May 2, 2025
Dataset provided by
Office of Project Management
Description
The Office of Project Management is the Department of Energy’s Enterprise Project Management Organization (EPMO). It provides leadership and assistance in developing and implementing DOE-wide policies, procedures, programs, and management systems pertaining to project management, and independently monitors, assesses, and reports on project execution performance. The office validates project performance baselines (scope, cost and schedule) of the Department’s largest construction and environmental clean-up projects prior to budget request to Congress, an active project portfolio totaling over $30 billion. The office also serves as Executive Secretariat for the Department’s Energy Systems Acquisition Advisory Board (ESAAB) and the Project Management Risk Committee (PMRC). In these capacities, the Director is accountable to the Deputy Secretary.
open-data-project
huggingface.co
Updated Nov 21, 2024
Cite
Damien Johnston (2024). open-data-project [Dataset]. https://huggingface.co/datasets/damien-johnston/open-data-project
Available format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 21, 2024
Authors
Damien Johnston
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
damien-johnston/open-data-project dataset hosted on Hugging Face and contributed by the HF Datasets community
Data from: Project 2 Dataset
kaggle.com
zip
Updated Nov 7, 2024
Cite
Jordan Hill NMTAFE (2024). Project 2 Dataset [Dataset]. https://www.kaggle.com/datasets/jordanhillnmtafe/project-2-dataset
Available download formats: zip (4187580 bytes)
Dataset updated
Nov 7, 2024
Authors
Jordan Hill NMTAFE
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Jordan Hill NMTAFE

Released under MIT

Capital Projects
catalog.data.gov
data.wprdc.org
+2 more
Updated Jan 24, 2023
Cite
City of Pittsburgh (2023). Capital Projects [Dataset]. https://catalog.data.gov/dataset/capital-projects-cfebb
Dataset updated
Jan 24, 2023
Dataset provided by
City of Pittsburgh
Description
City of Pittsburgh Capital Projects Budgets. NOTE: The data in this dataset has not been updated since 2021 because of a broken data feed. We're working to fix it.
Transportation Projects in Your Neighborhood
catalog.data.gov
datasets.ai
+3 more
Updated Jul 19, 2025
Cite
State of New York (2025). Transportation Projects in Your Neighborhood [Dataset]. https://catalog.data.gov/dataset/transportation-projects-in-your-neighborhood
Dataset updated
Jul 19, 2025
Dataset provided by
State of New York
Description
This data set contains DOT construction project information. Because the data is refreshed nightly from multiple data sources, any downloaded copy becomes stale quickly.
Project Data Cost for Prediction
kaggle.com
zip
Updated Sep 9, 2022
Cite
Edgar Poe (2022). Project Data Cost for Prediction [Dataset]. https://www.kaggle.com/datasets/edgarpoe/project-data-cost-for-prediction
Available download formats: zip (5157 bytes)
Dataset updated
Sep 9, 2022
Authors
Edgar Poe
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is constructed from project activity experience.

Columns:
not done - Whether the project was abandoned before completion (0 = done, 1 = not done)
time required - Estimated time to completion, in hours
cost - Cost per hour
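
The three columns can be combined into a simple per-project figure. The sketch below assumes that multiplying the estimated hours by the hourly cost gives a meaningful estimated total; the dataset description does not state this, and the rows shown are made up for illustration.

```python
# Made-up rows using the three described columns; not taken from the dataset.
rows = [
    {"not done": 0, "time required": 10.0, "cost": 25.0},
    {"not done": 1, "time required": 40.0, "cost": 30.0},
    {"not done": 0, "time required": 5.0, "cost": 20.0},
]


def estimated_total_cost(row):
    """Assumed derived value: estimated hours multiplied by cost per hour."""
    return row["time required"] * row["cost"]


totals = [estimated_total_cost(r) for r in rows]

# Share of projects flagged as not done (1 = not done).
not_done_share = sum(r["not done"] for r in rows) / len(rows)

print(totals, not_done_share)
```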
Materials Project Data
figshare.com
txt
Updated May 30, 2023
Cite
Anubhav Jain; Shyue Ping Ong; Geoffroy Hautier; Wei Chen; William Davidson Richards; Stephen Dacek; Shreyas Cholia; Dan Gunter; David Skinner; Gerbrand Ceder; Kristin Persson; Hacking Materials (2023). Materials Project Data [Dataset]. http://doi.org/10.6084/m9.figshare.7227749.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.7227749.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Anubhav Jain; Shyue Ping Ong; Geoffroy Hautier; Wei Chen; William Davidson Richards; Stephen Dacek; Shreyas Cholia; Dan Gunter; David Skinner; Gerbrand Ceder; Kristin Persson; Hacking Materials
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
A complete copy of the Materials Project database as of 10/18/2018. Mp_all files contain structure data for each material while mp_nostruct does not. Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module. Access to the current Materials Project is recommended through their API (good), pymatgen (better), or matminer (best).

Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.

Dataset discussed in: A. Jain*, S.P. Ong*, G. Hautier, W. Chen, W.D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, K.A. Persson (*=equal contributions) The Materials Project: A materials genome approach to accelerating materials innovation APL Materials, 2013, 1(1), 011002.

Dataset sourced from: https://materialsproject.org/

Citations for specific material properties available here: https://materialsproject.org/citing
Insurance Dataset
gts.ai
json
Updated Oct 16, 2022
Cite
GTS (2022). Insurance Dataset [Dataset]. https://gts.ai/case-study/insurance-dataset-annotation-services-for-precision-data-analysis/
Available download formats: json
Dataset updated
Oct 16, 2022
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Insurance Dataset project is an extensive initiative focused on collecting and analyzing insurance-related data from various sources.
Agile Project Data
kaggle.com
zip
Updated Oct 7, 2024
Cite
Next789 (2024). Agile Project Data [Dataset]. https://www.kaggle.com/datasets/next789/agile-project-data
Available download formats: zip (649 bytes)
Dataset updated
Oct 7, 2024
Authors
Next789
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Variables

Agile Effectiveness (measured on a Likert scale from 2 to 5): This variable captures how respondents perceive the effectiveness of Agile methodology in enhancing project management processes.

Risk Mitigation (Likert scale 2 to 5): This variable reflects respondents' views on how well Agile methodology supports the mitigation of risks throughout the project lifecycle.

Management Satisfaction (Likert scale 2 to 5): This variable measures how satisfied the management is with the outcomes of projects where Agile methodologies were implemented.

Supply Chain Improvement (Likert scale 2 to 5): This variable captures the perceived improvements in supply chain processes that result from using Agile methods.

Time Efficiency (Likert scale 2 to 5): This measures the impact of Agile methodology on improving the efficiency of time management within projects.

Cost Savings (percentage from 10% to 48%): This variable quantifies the percentage of cost savings achieved as a result of implementing Agile methods.

Project Success (binary: 0 = Failure, 1 = Success): This is the dependent variable and represents whether or not the project was considered successful.
Data extracted from GitHub repositories (training and test data-sets)
data.mendeley.com
Updated Aug 1, 2019
+ more versions
Cite
Youcef Bouziane (2019). Data extracted from GitHub repositories (training and test data-sets) [Dataset]. http://doi.org/10.17632/gt3f4jnbvn.3
Unique identifier
https://doi.org/10.17632/gt3f4jnbvn.3
Dataset updated
Aug 1, 2019
Authors
Youcef Bouziane
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the SQL tables of the training and test datasets used in our experimentation. These tables contain the preprocessed textual data (in a form of tokens) extracted from each training and test project. Besides the preprocessed textual data, this dataset also contains meta-data about the projects, GitHub topics, and GitHub collections. The GitHub projects are identified by the tuple “Owner” and “Name”. The descriptions of the table fields are attached to their respective data descriptions.
Pre-Application Projects
catalog.data.gov
data.oregon.gov
Updated Oct 11, 2025
Cite
data.oregon.gov (2025). Pre-Application Projects [Dataset]. https://catalog.data.gov/dataset/current-orca-projects
Dataset updated
Oct 11, 2025
Dataset provided by
data.oregon.gov
Description
This list includes all pipeline projects that have submitted an Intake. Some may be held at Intake due to early concept status or because the developer has reached their maximum project limit in ORCA.
Smart City Challenge Finalists Project Proposals - Calibration Data
catalog.data.gov
data.virginia.gov
+3 more
Updated Mar 16, 2025
+ more versions
Cite
USDOT (2025). Smart City Challenge Finalists Project Proposals - Calibration Data [Dataset]. https://catalog.data.gov/dataset/smart-city-challenge-finalists-project-proposals-calibration-data
Dataset updated
Mar 16, 2025
Dataset provided by
USDOT
Description
Analysis of the projects proposed by the seven finalists to USDOT's Smart City Challenge, including challenge addressed, proposed project category, and project description. The time reported for the speed profiles are between 2:00PM to 8:00PM in increments of 10 minutes.
UCI and OpenML Data Sets for Ordinal Quantification
data.niaid.nih.gov
zenodo.org
+1 more
Updated Jul 25, 2023
+ more versions
Cite
Bunse, Mirko; Moreo, Alejandro; Sebastiani, Fabrizio; Senz, Martin (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8177301
Dataset updated
Jul 25, 2023
Dataset provided by
Consiglio Nazionale delle Ricerche
TU Dortmund University
Authors
Bunse, Mirko; Moreo, Alejandro; Sebastiani, Fabrizio; Senz, Martin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.

We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.

Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
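
The ideas above can be sketched in a few lines of Python: binning a continuous label into ordinal classes, measuring the label distribution of a set, and drawing an APP-style target distribution uniformly from the probability simplex. This is only an illustration and not the authors' Julia implementation; the bin edges and values are hypothetical, and the smoothness filter that defines APP-OQ is left out.

```python
import bisect
import random
from collections import Counter


def bin_label(value, edges):
    """Map a continuous value to an ordinal class index using sorted bin edges."""
    return bisect.bisect_right(edges, value)


def prevalences(labels, n_classes):
    """Return the fraction of items in each class, which is what quantification predicts."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(n_classes)]


def draw_app_prevalences(n_classes, rng):
    """Draw a label distribution uniformly from the simplex by normalizing
    independent Exponential(1) variates (equivalent to a flat Dirichlet)."""
    weights = [rng.expovariate(1.0) for _ in range(n_classes)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
edges = [10.0, 20.0, 30.0]  # hypothetical bin edges, giving 4 ordinal classes
labels = [bin_label(v, edges) for v in [3.0, 12.5, 19.9, 20.0, 35.0]]
observed = prevalences(labels, n_classes=4)
target = draw_app_prevalences(4, rng)
print(labels, observed, target)
```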

Usage

You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.

Data Extraction: In your terminal, you can call either

make

(recommended), or

julia --project="." --eval "using Pkg; Pkg.instantiate()"
julia --project="." extract-oq.jl

Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.
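
Replicating a sample from one of the index files might look like the following Python sketch. The layout assumed here is a guess: each row of an index file is taken to be a comma-separated list of item positions, and whether those positions count from 0 or 1 is not stated in the description, so index_base is an explicit parameter. The feature_a column and all values are made up.

```python
import csv
import io

# Made-up extracted data; only the "class_label" column name comes from the description.
DATA_CSV = """class_label,feature_a
0,0.1
1,0.4
1,0.5
2,0.9
"""

# Made-up index rows; each row is assumed to list the items of one sample.
INDICES_CSV = """1,2,2,4
3,3,1,4
"""


def load_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def draw_samples(data_rows, indices_text, index_base):
    """Build one sample per index row by looking up the listed data items."""
    samples = []
    for line in csv.reader(io.StringIO(indices_text)):
        if not line:
            continue  # skip blank lines
        samples.append([data_rows[int(i) - index_base] for i in line])
    return samples


data = load_rows(DATA_CSV)
samples = draw_samples(data, INDICES_CSV, index_base=1)
first_labels = [row["class_label"] for row in samples[0]]
print(first_labels)
```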

Further Reading

Implementation of our experiments: https://github.com/mirkobunse/regularized-oq
Data from: Bio Project Dataset
universe.roboflow.com
zip
Updated Feb 11, 2025
+ more versions
Cite
BIO Project (2025). Bio Project Dataset [Dataset]. https://universe.roboflow.com/bio-project-aynkq/bio-project/dataset/1
Available download formats: zip
Dataset updated
Feb 11, 2025
Dataset authored and provided by
BIO Project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Objects Bounding Boxes
Description
Bio Project

## Overview

Bio Project is a dataset for object detection tasks - it contains Objects annotations for 831 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Data from: First Project Dataset
universe.roboflow.com
zip
Updated Jun 9, 2022
+ more versions
Cite
Joshika (2022). First Project Dataset [Dataset]. https://universe.roboflow.com/joshika/first-project-xreqr/model/1
Available download formats: zip
Dataset updated
Jun 9, 2022
Dataset authored and provided by
Joshika
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Cells Bounding Boxes
Description
First Project

## Overview

First Project is a dataset for object detection tasks - it contains Cells annotations for 364 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Data Cleaning Sample
borealisdata.ca
dataone.org
Updated Jul 13, 2023
Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
Available format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/ZCN177
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Sample data for exercises in Further Adventures in Data Cleaning.
Data from: Project Tank Dataset
universe.roboflow.com
zip
Updated Sep 9, 2023
+ more versions
Cite
Project tank (2023). Project Tank Dataset [Dataset]. https://universe.roboflow.com/project-tank/project-tank/dataset/1
Available download formats: zip
Dataset updated
Sep 9, 2023
Dataset authored and provided by
Project tank
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Tanks And Enemies Bounding Boxes
Description
Project Tank

## Overview

Project Tank is a dataset for object detection tasks - it contains Tanks And Enemies annotations for 924 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
