https://crawlfeeds.com/privacy_policy
Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.
Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.
CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.
Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.
Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.
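As a minimal sketch of how such a product CSV might be explored, the snippet below averages prices per brand using only the standard library. The header names (`title`, `brand`, `price`, `currency`, `availability`) and the sample rows are assumptions made up for illustration; the real file's headers may differ.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; the real file has ~2,100 records and
# its exact header names may differ from the ones assumed here.
SAMPLE_CSV = """title,brand,price,currency,availability
Example Widget,BrandA,10.00,USD,InStock
Example Gadget,BrandA,20.00,USD,OutOfStock
Example Gizmo,BrandB,5.50,USD,InStock
"""


def average_price_by_brand(csv_file):
    """Return {brand: mean price} for rows that have a parseable price."""
    totals = defaultdict(lambda: [0.0, 0])  # brand -> [sum of prices, count]
    for row in csv.DictReader(csv_file):
        try:
            price = float(row["price"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or malformed price
        entry = totals[row.get("brand") or "unknown"]
        entry[0] += price
        entry[1] += 1
    return {brand: total / count for brand, (total, count) in totals.items()}


averages = average_price_by_brand(io.StringIO(SAMPLE_CSV))
print(averages)  # {'BrandA': 15.0, 'BrandB': 5.5}
```

With the downloaded file, the same function could be called on `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.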

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Machine learning (ML) has gained much attention and has been incorporated into our daily lives. While there are numerous publicly available ML projects on open source platforms such as GitHub, there have been limited attempts to filter those projects and curate ML projects of high quality. The limited availability of such high-quality datasets poses an obstacle to understanding ML projects. To help clear this obstacle, we present NICHE, a manually labelled dataset consisting of 572 ML projects. Based on evidence of good software engineering practices, we label 441 of these projects as engineered and 131 as non-engineered. In this repository we provide the "NICHE.csv" file, which contains the list of project names along with their labels, descriptive information for every dimension, and several basic statistics, such as the number of stars and commits. This dataset can help researchers understand the practices that are followed in high-quality ML projects. It can also be used as a benchmark for classifiers designed to identify engineered ML projects.
GitHub page: https://github.com/soarsmu/NICHE
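A quick sanity check on a labelled file like this is to tally the labels. The sketch below assumes a column named `label` with the values `engineered` and `non-engineered`; both the column name and the value spellings are hypothetical and should be checked against the actual header of NICHE.csv. The demo rows are made up.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="label"):
    """Count how often each label value occurs in the given CSV."""
    return Counter(row[label_column] for row in csv.DictReader(csv_file))


# Tiny made-up stand-in for NICHE.csv (hypothetical column names and rows).
demo = io.StringIO(
    "project,label\n"
    "owner1/repoA,engineered\n"
    "owner2/repoB,non-engineered\n"
    "owner3/repoC,engineered\n"
)
counts = count_labels(demo)
print(counts["engineered"], counts["non-engineered"])  # 2 1
```

Run against the real file, the description above suggests the tally should come out as 441 engineered and 131 non-engineered projects.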

This dataset was created by Glitch_in_Vector
Chunk_0 for me, Choose others as you want.

the Department of Energy’s Enterprise Project Management Organization (EPMO), providing leadership and assistance in developing and implementing DOE-wide policies, procedures, programs, and management systems pertaining to project management, and independently monitors, assesses, and reports on project execution performance. The office validates project performance baselines (scope, cost and schedule) of the Department’s largest construction and environmental clean-up projects prior to budget request to Congress, an active project portfolio totaling over $30 billion. The office also serves as Executive Secretariat for the Department’s Energy Systems Acquisition Advisory Board (ESAAB) and the Project Management Risk Committee (PMRC). In these capacities, the Director is accountable to the Deputy Secretary.

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
damien-johnston/open-data-project dataset hosted on Hugging Face and contributed by the HF Datasets community

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Jordan Hill NMTAFE
Released under MIT

City of Pittsburgh Capital Projects Budgets. NOTE: The data in this dataset has not been updated since 2021 because of a broken data feed. We're working to fix it.

This data set contains DOT construction project information. The data is refreshed nightly from multiple data sources; therefore, the data becomes stale rather quickly.

https://creativecommons.org/publicdomain/zero/1.0/
This dataset is constructed from project activity experience.
Columns:
not done - Whether the project was abandoned before completion (0 = done, 1 = not done)
time required - Estimated time to completion, in hours
cost - Cost per hour
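The three columns can be combined into simple derived figures. The sketch below is one possible reading, not part of the dataset: it assumes the total estimated cost of an activity is hours multiplied by cost per hour, and the Python field names and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProjectActivity:
    not_done: int         # 0 = done, 1 = not done
    time_required: float  # estimated hours to completion
    cost: float           # cost per hour

    def estimated_total_cost(self) -> float:
        # Assumption: total cost = estimated hours * hourly cost.
        return self.time_required * self.cost


def completion_rate(activities):
    """Fraction of activities marked as done (not_done == 0)."""
    if not activities:
        raise ValueError("need at least one activity")
    done = sum(1 for a in activities if a.not_done == 0)
    return done / len(activities)


# Made-up example rows.
activities = [ProjectActivity(0, 10.0, 50.0), ProjectActivity(1, 4.0, 25.0)]
print([a.estimated_total_cost() for a in activities])  # [500.0, 100.0]
print(completion_rate(activities))  # 0.5
```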

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
A complete copy of the Materials Project database as of 10/18/2018. Mp_all files contain structure data for each material while mp_nostruct does not.
Available as Monty Encoder encoded JSON and as CSV. Recommended access method for these particular files is with the matminer Python package using the datasets module. Access to the current Materials Project is recommended through their API (good), pymatgen (better), or matminer (best).
Note on citations: If you found this dataset useful and would like to cite it in your work, please be sure to cite its original sources below rather than or in addition to this page.
Dataset discussed in: A. Jain*, S.P. Ong*, G. Hautier, W. Chen, W.D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, K.A. Persson (*=equal contributions). The Materials Project: A materials genome approach to accelerating materials innovation. APL Materials, 2013, 1(1), 011002.
Dataset sourced from: https://materialsproject.org/
Citations for specific material properties available here: https://materialsproject.org/citing

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Insurance Dataset project is an extensive initiative focused on collecting and analyzing insurance-related data from various sources.

Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Variables
Agile Effectiveness (measured on a Likert scale from 2 to 5): This variable captures how respondents perceive the effectiveness of Agile methodology in enhancing project management processes.
Risk Mitigation (Likert scale 2 to 5): This variable reflects respondents' views on how well Agile methodology supports the mitigation of risks throughout the project lifecycle.
Management Satisfaction (Likert scale 2 to 5): This variable measures how satisfied the management is with the outcomes of projects where Agile methodologies were implemented.
Supply Chain Improvement (Likert scale 2 to 5): This variable captures the perceived improvements in supply chain processes that result from using Agile methods.
Time Efficiency (Likert scale 2 to 5): This measures the impact of Agile methodology on improving the efficiency of time management within projects.
Cost Savings (percentage from 10% to 48%): This variable quantifies the percentage of cost savings achieved as a result of implementing Agile methods.
Project Success (binary: 0 = Failure, 1 = Success): This is the dependent variable and represents whether or not the project was considered successful.
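The ranges listed above lend themselves to a simple validation step before any modelling. The following is a minimal sketch under several assumptions: the snake_case field names are hypothetical, Likert responses are assumed to be stored as integers, and cost savings as a number in percent.

```python
# Hypothetical field names for the five Likert-scale variables.
LIKERT_FIELDS = (
    "agile_effectiveness",
    "risk_mitigation",
    "management_satisfaction",
    "supply_chain_improvement",
    "time_efficiency",
)


def validate_response(record):
    """Return a list of problems; an empty list means all values are in range."""
    problems = []
    for field in LIKERT_FIELDS:
        value = record.get(field)
        # Assumption: Likert answers are integers between 2 and 5 inclusive.
        if not isinstance(value, int) or not 2 <= value <= 5:
            problems.append(f"{field}: expected integer in 2..5, got {value!r}")
    savings = record.get("cost_savings")
    if not isinstance(savings, (int, float)) or not 10 <= savings <= 48:
        problems.append(f"cost_savings: expected 10..48 percent, got {savings!r}")
    if record.get("project_success") not in (0, 1):
        problems.append("project_success: expected 0 (failure) or 1 (success)")
    return problems


# Made-up records for illustration.
good = dict.fromkeys(LIKERT_FIELDS, 4) | {"cost_savings": 25.0, "project_success": 1}
bad = good | {"agile_effectiveness": 1, "cost_savings": 60}
print(validate_response(good))       # []
print(len(validate_response(bad)))   # 2
```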

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the SQL tables of the training and test datasets used in our experimentation. These tables contain the preprocessed textual data (in the form of tokens) extracted from each training and test project. Besides the preprocessed textual data, this dataset also contains metadata about the projects, GitHub topics, and GitHub collections. The GitHub projects are identified by the tuple “Owner” and “Name”. The descriptions of the table fields are attached to their respective data descriptions.

This list includes all pipeline projects that have submitted an Intake. Some may be held at Intake due to early concept status or because the developer has reached their maximum project limit in ORCA.

Analysis of the projects proposed by the seven finalists to USDOT's Smart City Challenge, including challenge addressed, proposed project category, and project description. The times reported for the speed profiles range from 2:00 PM to 8:00 PM in increments of 10 minutes.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.
With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.
We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.
Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
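The two protocols can be illustrated with a small sketch. It draws label distributions uniformly from the probability simplex (the APP idea) and then keeps the smoothest 20% (the APP-OQ idea). The smoothness measure used here, a sum of squared second-order differences, is an assumption chosen for illustration; the text above does not define how smoothness is measured, and the function names are hypothetical. The provided index files, not this sketch, are what reproduce the original samples.

```python
import random


def sample_prevalences(n_classes, rng):
    """Draw one label distribution uniformly from the probability simplex."""
    # Normalized i.i.d. exponential draws follow a flat Dirichlet(1, ..., 1).
    draws = [rng.expovariate(1.0) for _ in range(n_classes)]
    total = sum(draws)
    return [d / total for d in draws]


def roughness(p):
    """Sum of squared second-order differences; smaller means smoother.

    Assumption: this particular measure is illustrative only.
    """
    return sum((p[i - 1] - 2 * p[i] + p[i + 1]) ** 2 for i in range(1, len(p) - 1))


def app_oq(n_samples, n_classes, keep_fraction=0.2, seed=0):
    """Draw APP samples and keep only the smoothest fraction of them."""
    rng = random.Random(seed)
    samples = [sample_prevalences(n_classes, rng) for _ in range(n_samples)]
    samples.sort(key=roughness)
    return samples[: round(n_samples * keep_fraction)]


smooth = app_oq(n_samples=1000, n_classes=5)
print(len(smooth))  # 200
```

Sorting by roughness and truncating means neighbouring classes in the kept samples tend to have similar prevalences, which matches the ordinal setting described above.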
Usage
You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.
Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.
Data Extraction: In your terminal, you can call either
make
(recommended), or
julia --project="." --eval "using Pkg; Pkg.instantiate()"
julia --project="." extract-oq.jl
Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.
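Once a CSV has been extracted, the quantity of interest in quantification, the distribution of class labels, can be computed directly from the "class_label" column. The sketch below uses a made-up in-memory file; the `feature_1` column and the label values are hypothetical, and only the `class_label` header comes from the description above.

```python
import csv
import io
from collections import Counter


def class_prevalences(csv_file):
    """Return {label: relative frequency} based on the class_label column."""
    counts = Counter(row["class_label"] for row in csv.DictReader(csv_file))
    total = sum(counts.values())
    return {label: counts[label] / total for label in sorted(counts)}


# Made-up stand-in for one of the extracted CSV files.
demo = io.StringIO("class_label,feature_1\n1,0.3\n2,0.1\n1,0.9\n3,0.5\n")
prevalences = class_prevalences(demo)
print(prevalences)  # {'1': 0.5, '2': 0.25, '3': 0.25}
```

Labels are kept as strings here because the encoding of the ordinal classes in the extracted files is not stated.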
Further Reading
Implementation of our experiments: https://github.com/mirkobunse/regularized-oq

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Bio Project is a dataset for object detection tasks - it contains Objects annotations for 831 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
First Project is a dataset for object detection tasks - it contains Cells annotations for 364 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Project Tank is a dataset for object detection tasks - it contains Tanks And Enemies annotations for 924 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).