100+ datasets found
  1. h

    amazon-product-data-sample

    • huggingface.co
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Iftach Arbel, amazon-product-data-sample [Dataset]. https://huggingface.co/datasets/iarbel/amazon-product-data-sample
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    Iftach Arbel
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for "amazon-product-data-filter"

      Dataset Summary
    

    The Amazon Product Dataset contains product listing data from the Amazon US website. It can be used for various NLP and classification tasks, such as text generation, product type classification, attribute extraction, image recognition and more. NOTICE: This is a sample of the full Amazon Product Dataset, which contains 1K examples. Follow the link to gain access to the full dataset.

      Languages… See the full description on the dataset page: https://huggingface.co/datasets/iarbel/amazon-product-data-sample.
    
  2. Company Datasets for Business Profiling

    • datarade.ai
    Updated Feb 23, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
    Explore at:
    .json, .xml, .csv, .xlsAvailable download formats
    Dataset updated
    Feb 23, 2017
    Dataset authored and provided by
    Oxylabs
    Area covered
    Canada, Northern Mariana Islands, British Indian Ocean Territory, Isle of Man, Andorra, Moldova (Republic of), Tunisia, Taiwan, Bangladesh, Nepal
    Description

    Company Datasets for valuable business insights!

    Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

    These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

    • Owler: Gain valuable business insights and competitive intelligence. -AngelList: Receive fresh startup data transformed into actionable insights. -CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies. -Craft.co: Make data-informed business decisions with Craft.co's company datasets. -Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

    We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

    • Company name;
    • Size;
    • Founding date;
    • Location;
    • Industry;
    • Revenue;
    • Employee count;
    • Competitors.

    You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

    Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

    With Oxylabs Datasets, you can count on:

    • Fresh and accurate data collected and parsed by our expert web scraping team.
    • Time and resource savings, allowing you to focus on data analysis and achieving your business goals.
    • A customized approach tailored to your specific business needs.
    • Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!

  3. Product Catalog Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 22, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). Product Catalog Dataset [Dataset]. https://brightdata.com/products/datasets/product-catalog
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Apr 22, 2024
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    The Product Catalog Data provides a comprehensive overview of products across various categories. This dataset includes detailed product titles, descriptions, barcodes, category-specific attributes, weight, measurements, and imagery. It's tailored for marketplaces, eCommerce sites, and data analysts who require in-depth product information to enhance user experience, SEO, and product categorization.

    Popular Attributes:

    ✔ Detailed product information

    ✔ High-quality imagery

    ✔ Extensive attribute coverage

    ✔ Ideal for UX and SEO optimization

    ✔ Comprehensive product categorization

    Key Information:

    Rich dataset with 30+ attributes per product

    Pricing: Flexible subscription models

    Update Frequency: Daily updates

    Coverage: Global and specific markets

    Historical Data: 12 Months +

  4. Dairy Supply Chain Sales Dataset

    • zenodo.org
    • data.niaid.nih.gov
    pdf, zip
    Updated Jul 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dimitris Iatropoulos; Konstantinos Georgakidis; Konstantinos Georgakidis; Ilias Siniosoglou; Ilias Siniosoglou; Christos Chaschatzis; Christos Chaschatzis; Anna Triantafyllou; Anna Triantafyllou; Athanasios Liatifis; Athanasios Liatifis; Dimitrios Pliatsios; Dimitrios Pliatsios; Thomas Lagkas; Thomas Lagkas; Vasileios Argyriou; Vasileios Argyriou; Panagiotis Sarigiannidis; Panagiotis Sarigiannidis; Dimitris Iatropoulos (2024). Dairy Supply Chain Sales Dataset [Dataset]. http://doi.org/10.21227/smv6-z405
    Explore at:
    zip, pdfAvailable download formats
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Dimitris Iatropoulos; Konstantinos Georgakidis; Konstantinos Georgakidis; Ilias Siniosoglou; Ilias Siniosoglou; Christos Chaschatzis; Christos Chaschatzis; Anna Triantafyllou; Anna Triantafyllou; Athanasios Liatifis; Athanasios Liatifis; Dimitrios Pliatsios; Dimitrios Pliatsios; Thomas Lagkas; Thomas Lagkas; Vasileios Argyriou; Vasileios Argyriou; Panagiotis Sarigiannidis; Panagiotis Sarigiannidis; Dimitris Iatropoulos
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1.Introduction

    Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.

    One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer could notice that a particular product is selling well in a certain region, this information could be utilised to develop new products, optimise the supply chain or improve existing ones to meet the changing needs of customers.

    This dataset includes information about 7 of MEVGAL’s products [1]. According to the above information the data published will help researchers to understand the dynamics of the dairy market and its consumption patterns, which is creating the fertile ground for synergies between academia and industry and eventually help the industry in making informed decisions regarding product development, pricing and market strategies in the IoT playground. The use of this dataset could also aim to understand the impact of various external factors on the dairy market such as the economic, environmental, and technological factors. It could help in understanding the current state of the dairy industry and identifying potential opportunities for growth and development.

    2. Citation

    Please cite the following papers when using this dataset:

    1. I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted

    3. Dataset Modalities

    The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each of the different files include the logistics for that product on a daily basis for three years, from 2020 to 2022.

    3.1 Data Collection

    The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.

    The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.

    Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives & selling points.

    It is also important for MEVGAL to ensure that the data collection process conducted is in an ethical and compliant manner, adhering to data privacy laws and regulation. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.

    The published dataset is consisted of 13 features providing information about the date and the number of products that have been sold. Finally, the dataset was anonymised in consideration to the privacy requirement of the data owner (MEVGAL).

    File

    Period

    Number of Samples (days)

    product 1 2020.xlsx

    01/01/2020–31/12/2020

    363

    product 1 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 1 2022.xlsx

    01/01/2022–31/12/2022

    365

    product 2 2020.xlsx

    01/01/2020–31/12/2020

    363

    product 2 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 2 2022.xlsx

    01/01/2022–31/12/2022

    365

    product 3 2020.xlsx

    01/01/2020–31/12/2020

    363

    product 3 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 3 2022.xlsx

    01/01/2022–31/12/2022

    365

    product 4 2020.xlsx

    01/01/2020–31/12/2020

    363

    product 4 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 4 2022.xlsx

    01/01/2022–31/12/2022

    364

    product 5 2020.xlsx

    01/01/2020–31/12/2020

    363

    product 5 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 5 2022.xlsx

    01/01/2022–31/12/2022

    365

    product 6 2020.xlsx

    01/01/2020–31/12/2020

    362

    product 6 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 6 2022.xlsx

    01/01/2022–31/12/2022

    365

    product 7 2020.xlsx

    01/01/2020–31/12/2020

    362

    product 7 2021.xlsx

    01/01/2021–31/12/2021

    364

    product 7 2022.xlsx

    01/01/2022–31/12/2022

    365

    3.2 Dataset Overview

    The following table enumerates and explains the features included across all of the included files.

    Feature

    Description

    Unit

    Day

    day of the month

    -

    Month

    Month

    -

    Year

    Year

    -

    daily_unit_sales

    Daily sales - the amount of products, measured in units, that during that specific day were sold

    units

    previous_year_daily_unit_sales

    Previous Year’s sales - the amount of products, measured in units, that during that specific day were sold the previous year

    units

    percentage_difference_daily_unit_sales

    The percentage difference between the two above values

    %

    daily_unit_sales_kg

    The amount of products, measured in kilograms, that during that specific day were sold

    kg

    previous_year_daily_unit_sales_kg

    Previous Year’s sales - the amount of products, measured in kilograms, that during that specific day were sold, the previous year

    kg

    percentage_difference_daily_unit_sales_kg

    The percentage difference between the two above values

    kg

    daily_unit_returns_kg

    The percentage of the products that were shipped to selling points and were returned

    %

    previous_year_daily_unit_returns_kg

    The percentage of the products that were shipped to

  5. u

    Product Exchange/Bartering Data

    • cseweb.ucsd.edu
    json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, Product Exchange/Bartering Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain peer-to-peer trades from various recommendation platforms.

    Metadata includes

    • peer-to-peer trades

    • have and want lists

    • image data (tradesy)

  6. Retail Transactions Dataset

    • kaggle.com
    Updated May 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Prasad Patil (2024). Retail Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/prasad22/retail-transactions-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 18, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Prasad Patil
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:

    Context:

    Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.

    Inspiration:

    The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.

    Dataset Information:

    The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:

    • Transaction_ID: A unique identifier for each transaction, represented as a 10-digit number. This column is used to uniquely identify each purchase.
    • Date: The date and time when the transaction occurred. It records the timestamp of each purchase.
    • Customer_Name: The name of the customer who made the purchase. It provides information about the customer's identity.
    • Product: A list of products purchased in the transaction. It includes the names of the products bought.
    • Total_Items: The total number of items purchased in the transaction. It represents the quantity of products bought.
    • Total_Cost: The total cost of the purchase, in currency. It represents the financial value of the transaction.
    • Payment_Method: The method used for payment in the transaction, such as credit card, debit card, cash, or mobile payment.
    • City: The city where the purchase took place. It indicates the location of the transaction.
    • Store_Type: The type of store where the purchase was made, such as a supermarket, convenience store, department store, etc.
    • Discount_Applied: A binary indicator (True/False) representing whether a discount was applied to the transaction.
    • Customer_Category: A category representing the customer's background or age group.
    • Season: The season in which the purchase occurred, such as spring, summer, fall, or winter.
    • Promotion: The type of promotion applied to the transaction, such as "None," "BOGO (Buy One Get One)," or "Discount on Selected Items."

    Use Cases:

    • Market Basket Analysis: Discover associations between products and uncover buying patterns.
    • Customer Segmentation: Group customers based on purchasing behavior.
    • Pricing Optimization: Optimize pricing strategies and identify opportunities for discounts and promotions.
    • Retail Analytics: Analyze store performance and customer trends.

    Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.

  7. b

    eBay Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 30, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2022). eBay Datasets [Dataset]. https://brightdata.com/products/datasets/ebay
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Apr 30, 2022
    Dataset authored and provided by
    Bright Data
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Description

    Access our extensive eBay datasets that provide detailed information on product listings and seller performance. Gain insights into product details, pricing, item condition, seller ratings, shipping policies, and customer reviews. Free samples are available for evaluation. 400K+ records available Price starts at $250/100K records Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet. 100% ethical and compliant data collection Included datapoints:

    Product ID & URL Product Title & Images Seller Name, Rating & Reviews Price & Currency Item Condition Available & Sold Count Item Location & Shipping Details Return Policy Product Specifications Product Ratings & Customer Reviews Related & Sponsored Items And more

  8. Shopee Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jul 30, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). Shopee Dataset [Dataset]. https://brightdata.com/products/datasets/shopee
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    The Shopee Products Dataset is a comprehensive resource that empowers businesses, researchers, and analysts to gain a holistic view of the Shopee e-commerce ecosystem. Whether your goal is to conduct market analysis, optimize pricing strategies, understand customer behavior, or evaluate competitors, this dataset offers the essential information you need to make informed decisions and succeed in the dynamic world of Shopee. At its core, this dataset provides key attributes such as product ID, title, ratings, reviews, pricing details, and seller information, among others. These fundamental data elements offer insights into product performance, customer sentiment, and seller credibility.

  9. Zalando Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 17, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). Zalando Dataset [Dataset]. https://brightdata.com/products/datasets/zalando
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Apr 17, 2024
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Use our Zalando DE & UK products dataset to get a complete snapshot of new products, categories, pricing, and consumer reviews. Depending on your needs, you may purchase the entire dataset or a customized subset. Popular use cases: Identify product inventory gaps and increased demand for certain products, analyze consumer sentiment and define a pricing strategy by locating similar products and categories among your competitors. Beat your eCommerce competitors using a Zalando.de & Zalando.co.uk products dataset to get a complete overview of product pricing, product strategies, and customer reviews. The dataset includes all major data points: Product SKU Currency Timestamp Price Similar products Bought together products Top reviews Rating and more

  10. Amazon Products Database contains data on keywords and product listings...

    • datarade.ai
    .json
    Updated Sep 27, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    DataForSEO (2023). Amazon Products Database contains data on keywords and product listings ranking for them [Dataset]. https://datarade.ai/data-products/amazon-products-database-contains-data-on-keywords-and-produc-dataforseo
    Explore at:
    .jsonAvailable download formats
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    Authors
    DataForSEO
    Area covered
    Saudi Arabia, United States of America, Egypt, United Arab Emirates
    Description

    First of all, Amazon product datasets are indispensable for reverse engineering your rivals. For example, you can collect a list of keywords you already rank for or want to, and go through DataForSEO Amazon Products Database to find other sellers appearing as the top results for these terms.

    Next, you can narrow down the scope of your contenders to those performing the best. To do so, you can filter out sellers who won the “Amazon’s Choice” and those whose products got listed multiple times on the first page.

    Once you’ve compiled the final list of your challengers, Amazon Products Database will help you to quickly examine product titles, descriptions, prices, images, and other details that will let you grasp the main contributors to your competitors’ success. Once you’ve figured that out, you can start optimizing your product listings and pricing strategies to increase conversions.

    However, the number of use cases for Amazon product data isn’t limited to competitor analysis. It can be applied to monitoring product rankings, running price comparisons, and more.

  11. h

    Amazon-Product-Description

    • huggingface.co
    Updated Apr 8, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ateeq Azam (2025). Amazon-Product-Description [Dataset]. https://huggingface.co/datasets/Ateeqq/Amazon-Product-Description
    Explore at:
    Dataset updated
    Apr 8, 2025
    Authors
    Ateeq Azam
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Amazon Product Description Dataset

    This dataset is a cleaned version of Amazon Product Data. Cleaned by team at https://exnrt.com

    421K Unique Examples Empty description rows are being removed. Description Smaller then 200 characters are removed Convert to Proper Format Remove non-ASCII characters from both column And few more techniques

      Original Dataset
    

    This original dataset has 10 Million Examples. Original, Un-cleaned DataSet:… See the full description on the dataset page: https://huggingface.co/datasets/Ateeqq/Amazon-Product-Description.

  12. U

    LCMAP Hawaii Reference Data Product land cover, land use and change process...

    • data.usgs.gov
    • datasets.ai
    • +3more
    Updated Jan 7, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Josephine Horton; Steve Stehman; Roger Auch; Steven Kambly; Janis (CTR) (2025). LCMAP Hawaii Reference Data Product land cover, land use and change process attributes [Dataset]. http://doi.org/10.5066/P9X42T97
    Explore at:
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Authors
    Josephine Horton; Steve Stehman; Roger Auch; Steven Kambly; Janis (CTR)
    License

    U.S. Government Workshttps://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 2000 - Dec 31, 2019
    Area covered
    Hawaii
    Description

    This product contains plot location data for LCMAP Hawaii Reference Data in a .shp format as well as annual land cover, land use, and change process variables for each reference data plot in a separate .csv table. The same information available in the.csv file is also provided in a .xlsx format. The LCMAP Hawaii Reference Data Product was utilized for evaluation and validation of the Land Change Monitoring, Assessment, and Projection (LCMAP) land cover and land cover change products. The LCMAP Hawaii Reference Data Product includes the collection of an independent dataset of 600 30-meter by 30-meter plots across the island chain of Hawaii. The LCMAP Hawaii Reference Data Products collected variables related to primary and secondary land use, primary and secondary land cover(s), change processes, and other ancillary variables annually across Hawaii from 2000-2019. The sites in this dataset were collected via manual image interpretation. These samples were selected using a strat ...

  13. Walmart Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). Walmart Datasets [Dataset]. https://brightdata.com/products/datasets/walmart
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Use our constantly updated Walmart products dataset to get a complete snapshot of new products, categories, pricing, and consumer reviews. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases: Identify product inventory gaps and increased demand for certain products, analyze consumer sentiment and define a pricing strategy by locating similar products and categories among your competitors. The dataset includes all major data points: product, SKU, GTIN, currency,timestamp, price,a nd more. Get your Walmart dataset today!

  14. d

    Ecommerce Data - Product data, Seller data, Market data, Pricing data|...

    • datarade.ai
    Updated Jan 29, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    APISCRAPY (2024). Ecommerce Data - Product data, Seller data, Market data, Pricing data| Scrape all publicly available eCommerce data| 50% Cost Saving | Free Sample [Dataset]. https://datarade.ai/data-products/apiscrapy-mobile-app-data-api-scraping-service-app-intel-apiscrapy
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txtAvailable download formats
    Dataset updated
    Jan 29, 2024
    Dataset authored and provided by
    APISCRAPY
    Area covered
    Bosnia and Herzegovina, United States of America, China, Åland Islands, Norway, Spain, Malta, Switzerland, Isle of Man, Ukraine
    Description

    Note:- Only publicly available data can be worked upon

    In today's ever-evolving Ecommerce landscape, success hinges on the ability to harness the power of data. APISCRAPY is your strategic ally, dedicated to providing a comprehensive solution for extracting critical Ecommerce data, including Ecommerce market data, Ecommerce product data, and Ecommerce datasets. With the Ecommerce arena being more competitive than ever, having a data-driven approach is no longer a luxury but a necessity.

    APISCRAPY's forte lies in its ability to unearth valuable Ecommerce market data. We recognize that understanding the market dynamics, trends, and fluctuations is essential for making informed decisions.

    APISCRAPY's AI-driven ecommerce data scraping service presents several advantages for individuals and businesses seeking comprehensive insights into the ecommerce market. Here are key benefits associated with their advanced data extraction technology:

    1. Ecommerce Product Data: APISCRAPY's AI-driven approach ensures the extraction of detailed Ecommerce Product Data, including product specifications, images, and pricing information. This comprehensive data is valuable for market analysis and strategic decision-making.

    2. Data Customization: APISCRAPY enables users to customize the data extraction process, ensuring that the extracted ecommerce data aligns precisely with their informational needs. This customization option adds versatility to the service.

    3. Efficient Data Extraction: APISCRAPY's technology streamlines the data extraction process, saving users time and effort. The efficiency of the extraction workflow ensures that users can obtain relevant ecommerce data swiftly and consistently.

    4. Realtime Insights: Businesses can gain real-time insights into the dynamic Ecommerce Market by accessing rapidly extracted data. This real-time information is crucial for staying ahead of market trends and making timely adjustments to business strategies.

    5. Scalability: The technology behind APISCRAPY allows scalable extraction of ecommerce data from various sources, accommodating evolving data needs and handling increased volumes effortlessly.

    Beyond the broader market, a deeper dive into specific products can provide invaluable insights. APISCRAPY excels in collecting Ecommerce product data, enabling businesses to analyze product performance, pricing strategies, and customer reviews.

    To navigate the complexities of the Ecommerce world, you need access to robust datasets. APISCRAPY's commitment to providing comprehensive Ecommerce datasets ensures businesses have the raw materials required for effective decision-making.

    Our primary focus is on Amazon data, offering businesses a wealth of information to optimize their Amazon presence. By doing so, we empower our clients to refine their strategies, enhance their products, and make data-backed decisions.

    [Tags: Ecommerce data, Ecommerce Data Sample, Ecommerce Product Data, Ecommerce Datasets, Ecommerce market data, Ecommerce Market Datasets, Ecommerce Sales data, Ecommerce Data API, Amazon Ecommerce API, Ecommerce scraper, Ecommerce Web Scraping, Ecommerce Data Extraction, Ecommerce Crawler, Ecommerce data scraping, Amazon Data, Ecommerce web data]

  15. Store Sales - T.S Forecasting...Merged Dataset

    • kaggle.com
    Updated Dec 15, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shramana Bhattacharya (2021). Store Sales - T.S Forecasting...Merged Dataset [Dataset]. https://www.kaggle.com/shramanabhattacharya/store-sales-ts-forecastingmerged-dataset/code
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 15, 2021
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Shramana Bhattacharya
    Description

    This dataset is a merged dataset created from the data provided in the competition "Store Sales - Time Series Forecasting". The other datasets that were provided there apart from train and test (for example holidays_events, oil, stores, etc.) could not be used in the final prediction. According to my understanding, through the EDA of the merged dataset, we will be able to get a clearer picture of the other factors that might also affect the final prediction of grocery sales. Therefore, I created this merged dataset and posted it here for the further scope of analysis.

    ##### Data Description Data Field Information (This is a copy of the description as provided in the actual dataset)

    Train.csv - id: store id - date: date of the sale - store_nbr: identifies the store at which the products are sold. -**family**: identifies the type of product sold. - sales: gives the total sales for a product family at a particular store at a given date. Fractional values are possible since products can be sold in fractional units (1.5 kg of cheese, for instance, as opposed to 1 bag of chips). - onpromotion: gives the total number of items in a product family that were being promoted at a store on a given date. - Store metadata, including ****city, state, type, and cluster.**** - cluster is a grouping of similar stores. - Holidays and Events, with metadata NOTE: Pay special attention to the transferred column. A holiday that is transferred officially falls on that calendar day but was moved to another date by the government. A transferred day is more like a normal day than a holiday. To find the day that it was celebrated, look for the corresponding row where the type is Transfer. For example, the holiday Independencia de Guayaquil was transferred from 2012-10-09 to 2012-10-12, which means it was celebrated on 2012-10-12. Days that are type Bridge are extra days that are added to a holiday (e.g., to extend the break across a long weekend). These are frequently made up by the type Work Day which is a day not normally scheduled for work (e.g., Saturday) that is meant to pay back the Bridge. Additional holidays are days added to a regular calendar holiday, for example, as typically happens around Christmas (making Christmas Eve a holiday). - dcoilwtico: Daily oil price. Includes values during both the train and test data timeframes. (Ecuador is an oil-dependent country and its economic health is highly vulnerable to shocks in oil prices.)

    **Note: ***There is a transaction column in the training dataset which displays the sales transactions on that particular date. * Test.csv - The test data, having the same features like the training data. You will predict the target sales for the dates in this file. - The dates in the test data are for the 15 days after the last date in the training data. **Note: ***There is a no transaction column in the test dataset as was there in the training dataset. Therefore, while building the model, you might exclude this column and may use it only for EDA.*

    submission.csv - A sample submission file in the correct format.

  16. Pinterest Fashion Compatibility Dataset

    • kaggle.com
    Updated Oct 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ahmad (2023). Pinterest Fashion Compatibility Dataset [Dataset]. https://www.kaggle.com/datasets/pypiahmad/shop-the-look-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 30, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Ahmad
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Pinterest Fashion Compatibility dataset comprises images showcasing fashion products, each annotated with bounding boxes and associated with links directing to the corresponding products. This dataset facilitates the exploration of scene-based complementary product recommendation, aiming to complete the look presented in each scene by recommending compatible fashion items.

    Basic Statistics: - Scenes: 47,739 - Products: 38,111 - Scene-Product Pairs: 93,274

    Metadata: - Product IDs: Identifiers for the products featured in the images. - Bounding Boxes: Coordinates specifying the location of each product within the image.

    Example (fashion.json): The dataset contains JSON entries where each entry associates a product with a scene, along with the bounding box coordinates for the product within the scene. json { "product": "0027e30879ce3d87f82f699f148bff7e", "scene": "cdab9160072dd1800038227960ff6467", "bbox": [ 0.434097, 0.859363, 0.560254, 1.0 ] }

    Citation: If you utilize this dataset, please cite the following paper: Title: Complete the Look: Scene-based complementary product recommendation Authors: Wang-Cheng Kang, Eric Kim, Jure Leskovec, Charles Rosenberg, Julian McAuley Published in: CVPR, 2019 Link to paper

    Code and Additional Resources: For additional resources, sample code, and instructions on how to collect the product images from Pinterest, you can visit the GitHub repository.

    This dataset provides a rich ground for research and development in the domain of fashion-based image recognition, product recommendation, and the exploration of fashion styles and trends through machine learning and computer vision techniques.

  17. u

    Pinterest Fashion Compatibility

    • cseweb.ucsd.edu
    • beta.data.urbandatacentre.ca
    json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, Pinterest Fashion Compatibility [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    jsonAvailable download formats
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.

    Metadata includes

    • product IDs

    • bounding boxes

    Basic Statistics:

    • Scenes: 47,739

    • Products: 38,111

    • Scene-Product Pairs: 93,274

  18. h

    Prada.Product.prices.Sweden

    • huggingface.co
    Updated Nov 17, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Boutique (2023). Prada.Product.prices.Sweden [Dataset]. https://huggingface.co/datasets/DBQ/Prada.Product.prices.Sweden
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 17, 2023
    Dataset authored and provided by
    Data Boutique
    License

    https://choosealicense.com/licenses/unknown/https://choosealicense.com/licenses/unknown/

    Area covered
    Sweden
    Description

    Prada web scraped data

      About the website
    

    The Luxury Fashion Industry in the EMEA region, particularly in Sweden, is a thriving market with high demand for exclusive and high-end products. Prada, a renowned player in this industry, holds a significant presence. The industry is currently experiencing a significant shift towards digitalization and online retail, also known as Ecommerce, fueled by changing consumer behaviors and advancements in technology. A concrete example… See the full description on the dataset page: https://huggingface.co/datasets/DBQ/Prada.Product.prices.Sweden.

  19. Facebook Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jul 16, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2025). Facebook Datasets [Dataset]. https://brightdata.com/products/datasets/facebook
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Jul 16, 2025
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Access our extensive Facebook datasets that provide detailed information on public posts, pages, and user engagement. Gain insights into post performance, audience interactions, page details, and content trends with our ethically sourced data. Free samples are available for evaluation. Over 940M records available Price starts at $250/100K records Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet. 100% ethical and compliant data collection Included datapoints:

    Post ID Post Content & URL Date Posted Hashtags Number of Comments Number of Shares Likes & Reaction Counts (by type) Video View Count Page Name & Category Page Followers & Likes Page Verification Status Page Website & Contact Info Is Sponsored Post Attachments (Images/Videos) External Link Data And much more

  20. Market Basket Analysis

    • kaggle.com
    Updated Dec 9, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 9, 2021
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Aslan Ahmedov
    Description

    Market Basket Analysis

    Market basket analysis with Apriori algorithm

    The retailer wants to target customers with suggestions on itemset that a customer is most likely to purchase .I was given dataset contains data of a retailer; the transaction data provides data around all the transactions that have happened over a period of time. Retailer will use result to grove in his industry and provide for customer suggestions on itemset, we be able increase customer engagement and improve customer experience and identify customer behavior. I will solve this problem with use Association Rules type of unsupervised learning technique that checks for the dependency of one data item on another data item.

    Introduction

    Association Rule is most used when you are planning to build association in different objects in a set. It works when you are planning to find frequent patterns in a transaction database. It can tell you what items do customers frequently buy together and it allows retailer to identify relationships between the items.

    An Example of Association Rules

    Assume there are 100 customers, 10 of them bought Computer Mouth, 9 bought Mat for Mouse and 8 bought both of them. - bought Computer Mouth => bought Mat for Mouse - support = P(Mouth & Mat) = 8/100 = 0.08 - confidence = support/P(Mat for Mouse) = 0.08/0.09 = 0.89 - lift = confidence/P(Computer Mouth) = 0.89/0.10 = 8.9 This just simple example. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

    Strategy

    • Data Import
    • Data Understanding and Exploration
    • Transformation of the data – so that is ready to be consumed by the association rules algorithm
    • Running association rules
    • Exploring the rules generated
    • Filtering the generated rules
    • Visualization of Rule

    Dataset Description

    • File name: Assignment-1_Data
    • List name: retaildata
    • File format: . xlsx
    • Number of Row: 522065
    • Number of Attributes: 7

      • BillNo: 6-digit number assigned to each transaction. Nominal.
      • Itemname: Product name. Nominal.
      • Quantity: The quantities of each product per transaction. Numeric.
      • Date: The day and time when each transaction was generated. Numeric.
      • Price: Product price. Numeric.
      • CustomerID: 5-digit number assigned to each customer. Nominal.
      • Country: Name of the country where each customer resides. Nominal.

    imagehttps://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png">

    Libraries in R

    First, we need to load required libraries. Shortly I describe all libraries.

    • arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).
    • arulesViz - Extends package 'arules' with various visualization. techniques for association rules and item-sets. The package also includes several interactive visualizations for rule exploration.
    • tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.
    • readxl - Read Excel Files in R.
    • plyr - Tools for Splitting, Applying and Combining Data.
    • ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
    • knitr - Dynamic Report generation in R.
    • magrittr- Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.
    • dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
    • tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

    imagehttps://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png">

    Data Pre-processing

    Next, we need to upload Assignment-1_Data. xlsx to R to read the dataset.Now we can see our data in R.

    imagehttps://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png"> imagehttps://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png">

    After we will clear our data frame, will remove missing values.

    imagehttps://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png">

    To apply Association Rule mining, we need to convert dataframe into transaction data to make all items that are bought together in one invoice will be in ...

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Iftach Arbel, amazon-product-data-sample [Dataset]. https://huggingface.co/datasets/iarbel/amazon-product-data-sample

amazon-product-data-sample

iarbel/amazon-product-data-sample

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Authors
Iftach Arbel
License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically

Description

Dataset Card for "amazon-product-data-filter"

  Dataset Summary

The Amazon Product Dataset contains product listing data from the Amazon US website. It can be used for various NLP and classification tasks, such as text generation, product type classification, attribute extraction, image recognition and more. NOTICE: This is a sample of the full Amazon Product Dataset, which contains 1K examples. Follow the link to gain access to the full dataset.

  Languages… See the full description on the dataset page: https://huggingface.co/datasets/iarbel/amazon-product-data-sample.
Search
Clear search
Close search
Google apps
Main menu