100+ datasets found

Dairy Supply Chain Sales Dataset

zenodo.org
data.niaid.nih.gov

pdf, zip

Updated Jul 12, 2024

Cite

Dimitris Iatropoulos; Konstantinos Georgakidis; Ilias Siniosoglou; Christos Chaschatzis; Anna Triantafyllou; Athanasios Liatifis; Dimitrios Pliatsios; Thomas Lagkas; Vasileios Argyriou; Panagiotis Sarigiannidis (2024). Dairy Supply Chain Sales Dataset [Dataset]. http://doi.org/10.21227/smv6-z405

Available download formats: zip, pdf

Unique identifier

https://doi.org/10.21227/smv6-z405

Dataset updated

Jul 12, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

1. Introduction

Sales data collection is a crucial aspect of any manufacturing industry, as it provides valuable insights into product performance, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments such as the dairy supply chain.

One of the most important benefits of sales data collection is that it allows manufacturers to identify their most successful products and focus their efforts accordingly. For example, if a manufacturer notices that a particular product is selling well in a certain region, that information can be used to develop new products, improve existing ones, or optimise the supply chain to meet the changing needs of customers.

This dataset includes information about 7 of MEVGAL’s products [1]. The published data can help researchers understand the dynamics of the dairy market and its consumption patterns, creating fertile ground for synergies between academia and industry and, eventually, helping the industry make informed decisions regarding product development, pricing, and market strategies in the IoT playground. The dataset can also be used to study the impact of external factors on the dairy market, such as economic, environmental, and technological factors, and to assess the current state of the dairy industry and identify potential opportunities for growth and development.

2. Citation

Please cite the following paper when using this dataset:

I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted

3. Dataset Modalities

The dataset contains daily sales data for a series of dairy product codes offered by MEVGAL. In particular, it includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products in this dataset cover the daily sales and logistics of a variety of yogurt-based stock. Each file contains the daily logistics for one product over one year; together the files span three years, from 2020 to 2022.

3.1 Data Collection

The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.

The first step is to determine the specific data needed to support the business objectives of the industry, which in this publication’s case is the daily sales data.

Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case, this is done through direct communication and reports generated each day by representatives and selling points.

It is also important for MEVGAL to ensure that the data collection process is conducted in an ethical and compliant manner, adhering to data privacy laws and regulations. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.

The published dataset consists of 13 features providing information about the date and the number of products sold. Finally, the dataset was anonymised in consideration of the privacy requirements of the data owner (MEVGAL).

File	Period	Number of Samples (days)
product 1 2020.xlsx	01/01/2020–31/12/2020	363
product 1 2021.xlsx	01/01/2021–31/12/2021	364
product 1 2022.xlsx	01/01/2022–31/12/2022	365
product 2 2020.xlsx	01/01/2020–31/12/2020	363
product 2 2021.xlsx	01/01/2021–31/12/2021	364
product 2 2022.xlsx	01/01/2022–31/12/2022	365
product 3 2020.xlsx	01/01/2020–31/12/2020	363
product 3 2021.xlsx	01/01/2021–31/12/2021	364
product 3 2022.xlsx	01/01/2022–31/12/2022	365
product 4 2020.xlsx	01/01/2020–31/12/2020	363
product 4 2021.xlsx	01/01/2021–31/12/2021	364
product 4 2022.xlsx	01/01/2022–31/12/2022	364
product 5 2020.xlsx	01/01/2020–31/12/2020	363
product 5 2021.xlsx	01/01/2021–31/12/2021	364
product 5 2022.xlsx	01/01/2022–31/12/2022	365
product 6 2020.xlsx	01/01/2020–31/12/2020	362
product 6 2021.xlsx	01/01/2021–31/12/2021	364
product 6 2022.xlsx	01/01/2022–31/12/2022	365
product 7 2020.xlsx	01/01/2020–31/12/2020	362
product 7 2021.xlsx	01/01/2021–31/12/2021	364
product 7 2022.xlsx	01/01/2022–31/12/2022	365

3.2 Dataset Overview

The following table enumerates and explains the features included across all of the included files.

Feature	Description	Unit
Day	day of the month	-
Month	Month	-
Year	Year	-
daily_unit_sales	Daily sales - the amount of products, measured in units, that during that specific day were sold	units
previous_year_daily_unit_sales	Previous Year’s sales - the amount of products, measured in units, that during that specific day were sold the previous year	units
percentage_difference_daily_unit_sales	The percentage difference between the two above values	%
daily_unit_sales_kg	The amount of products, measured in kilograms, that during that specific day were sold	kg
previous_year_daily_unit_sales_kg	Previous Year’s sales - the amount of products, measured in kilograms, that during that specific day were sold, the previous year	kg
percentage_difference_daily_unit_sales_kg	The percentage difference between the two above values	%
daily_unit_returns_kg	The percentage of the products that were shipped to selling points and were returned	%
previous_year_daily_unit_returns_kg	The percentage of the products that were shipped to
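
The percentage_difference_* features listed above compare a day's value with the same day of the previous year. The description does not give the exact formula, so the following is only a minimal sketch that assumes the usual relative-change convention, (current - previous) / previous * 100. The function name and the sample values are hypothetical.

```python
from typing import Optional


def percentage_difference(current: float, previous: float) -> Optional[float]:
    """Relative change of `current` versus `previous`, in percent.

    Assumed formula (not confirmed by the dataset description):
        (current - previous) / previous * 100
    Returns None when the previous-year value is zero, because the
    relative change is undefined in that case.
    """
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


# Made-up values: 120 units sold today versus 100 on the same day last year.
print(percentage_difference(120, 100))
```

If the published values follow a different convention (for example, dividing by the current-year value), the result will differ, so it is worth checking a few rows of the files against this function before relying on it.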

Grocery Inventory
kaggle.com
Updated Mar 16, 2025
Cite
willian oliveira (2025). Grocery Inventory [Dataset]. http://doi.org/10.34740/kaggle/dsv/11053760
Unique identifier
https://doi.org/10.34740/kaggle/dsv/11053760
Dataset updated
Mar 16, 2025
Dataset provided by
Kaggle
Authors
willian oliveira
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This graph was created in R and Canva:

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F1a47e2e6e4836b86b065441359d5c9f0%2Fgraph1.gif?generation=1742159161939732&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F87de025c5703cb69483764c4fc9c58ab%2Fgraph2.gif?generation=1742159169346925&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2Fddf5001438c97c8c030333261685849b%2Fgraph3.png?generation=1742159174793142&alt=media

The dataset offers a comprehensive view of grocery inventory, covering 990 products across multiple categories such as Grains & Pulses, Beverages, Fruits & Vegetables, and more. It includes crucial details about each product, such as its unique identifier (Product_ID), name, category, and supplier information, including Supplier_ID and Supplier_Name. This dataset is particularly valuable for businesses aiming to optimize inventory management, sales tracking, and supply chain efficiency.

Key inventory-related fields include Stock_Quantity, which indicates the current stock level, and Reorder_Level, which determines when a product should be reordered. The Reorder_Quantity specifies how much stock to order when inventory falls below the reorder threshold. Additionally, Unit_Price provides insight into pricing, helping businesses analyze cost trends and profitability.
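
The reorder rule described above can be sketched in a few lines. This is a minimal illustration, not the dataset author's method: the field names mirror the Stock_Quantity, Reorder_Level, and Reorder_Quantity columns, while the Product class, the reorder_plan function, and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Product:
    product_id: str
    stock_quantity: int    # current stock level
    reorder_level: int     # threshold that triggers a reorder
    reorder_quantity: int  # how much to order once triggered


def reorder_plan(products: Iterable[Product]) -> Dict[str, int]:
    """Map product_id -> quantity to order for items below their reorder level."""
    return {
        p.product_id: p.reorder_quantity
        for p in products
        if p.stock_quantity < p.reorder_level
    }


# Made-up rows for illustration only.
inventory = [
    Product("P-001", stock_quantity=12, reorder_level=20, reorder_quantity=50),
    Product("P-002", stock_quantity=80, reorder_level=30, reorder_quantity=40),
]
print(reorder_plan(inventory))
```

The comparison is strict ("falls below"); whether stock exactly equal to the reorder level should also trigger an order is a policy choice the description leaves open.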

To manage product flow, the dataset includes dates such as Date_Received, which tracks when the product was added to the warehouse, and Last_Order_Date, marking the most recent procurement. For perishable goods, the Expiration_Date column is critical, allowing businesses to minimize waste by monitoring shelf life. The Warehouse_Location specifies where each product is stored, facilitating efficient inventory handling.
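
As a rough sketch of shelf-life monitoring with the Expiration_Date column, the snippet below lists items that expire within a chosen window. The function name, the 7-day default, and the sample items are assumptions made for illustration.

```python
from datetime import date
from typing import Iterable, List, Tuple


def expiring_soon(
    items: Iterable[Tuple[str, date]], today: date, within_days: int = 7
) -> List[str]:
    """Names of items whose expiration date is 0..within_days days from `today`.

    Items that have already expired (negative remaining days) are excluded.
    """
    return [
        name
        for name, expiration_date in items
        if 0 <= (expiration_date - today).days <= within_days
    ]


# Made-up items for illustration only.
stock = [
    ("milk", date(2025, 3, 20)),
    ("rice", date(2026, 1, 1)),
    ("yogurt", date(2025, 3, 10)),
]
print(expiring_soon(stock, today=date(2025, 3, 16)))
```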

Sales and performance metrics are also included. The Sales_Volume column records the total number of units sold, providing insights into consumer demand. Inventory_Turnover_Rate helps businesses assess how quickly a product sells and is replenished, ensuring better stock management. The dataset also tracks the Status of each product, indicating whether it is Active, Discontinued, or Backordered.

The dataset serves multiple purposes in inventory management, sales performance evaluation, supplier analysis, and product lifecycle tracking. Businesses can leverage this data to refine reorder strategies, ensuring optimal stock levels and avoiding stockouts or excessive inventory. Sales analysis can help identify high-demand products and slow-moving items, enabling better decision-making in pricing and promotions. Evaluating suppliers based on their performance, pricing, and delivery efficiency helps streamline procurement and improve overall supply chain operations.

Furthermore, the dataset can support predictive analytics by employing machine learning techniques to estimate reorder quantities, forecast demand, and optimize stock replenishment. Inventory turnover insights can aid in maintaining a balanced supply, preventing unnecessary overstocking or shortages. By tracking trends in sales, businesses can refine their marketing and distribution strategies, ensuring sustained profitability.

This dataset is designed for educational and demonstration purposes, offering fictional data under the Creative Commons Attribution 4.0 International License. Users are free to analyze, modify, and apply the data while providing proper attribution. Additionally, certain products are marked as discontinued or backordered, reflecting real-world inventory dynamics. Businesses dealing with perishable goods should closely monitor expiration and last order dates to avoid losses due to spoilage.

Overall, this dataset provides a versatile resource for those interested in inventory management, sales analysis, and supply chain optimization. By leveraging the structured data, businesses can make data-driven decisions to enhance operational efficiency and maximize profitability.
X company Data analysis Project
kaggle.com
Updated Sep 6, 2023
Cite
Ahmed Samir (2023). X company Data analysis Project [Dataset]. https://www.kaggle.com/datasets/ahmedsamir11111/x-company-data-analysis-project/versions/1
Dataset updated
Sep 6, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ahmed Samir
Description
About Dataset

The dataset contains information about sales transactions, including details such as the customer's age, gender, location, and the products sold. It includes both the cost of each product and the revenue generated from its sale, allowing profit and profit margins to be calculated. The customer age and gender fields can be used to analyze purchasing behavior across demographic groups. The dataset likely includes both numeric and categorical data, which require different types of analysis and visualization techniques. Overall, it appears to provide a comprehensive view of sales transactions, with potential for analysis at multiple levels, including by product, customer, and location. On its own, however, it does not offer any useful information or insights for decision makers.

- After understanding the dataset, I cleaned it and added some columns and calculations, such as Net profit and Age Status.
- I built a model in Power Pivot, calculated measures such as Total profit, COGS, and Total revenues, and built a KPI model.
- I then asked the following questions.

About distribution:
- What are the total revenues and profits?
- What is the best-selling country in terms of revenue?
- What are the five best-selling states in terms of revenue?
- What are the five lowest-selling states in terms of revenue?
- What is the position of age in relation to revenues?

About profitability:
- What are the total revenues and profits?
- What is each month's position in terms of revenues and profits?
- What is each month's position in terms of COGS?
- What are the top-selling categories in terms of revenue and profit?
- What are the three best-selling sub-categories in terms of profit?

About KPIs:
- What is each salesperson's position in terms of target?

I then answered these questions, analyzed the data, and visualized the results with dashboards.
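
The profit and profit-margin calculations mentioned in the description can be sketched as follows. The original work was done in Power Pivot; this Python version is only an illustration, and the function names and sample figures are hypothetical.

```python
from typing import Optional


def net_profit(revenue: float, cost: float) -> float:
    """Profit on a transaction: revenue minus cost."""
    return revenue - cost


def profit_margin(revenue: float, cost: float) -> Optional[float]:
    """Profit as a fraction of revenue; None when revenue is zero."""
    if revenue == 0:
        return None
    return (revenue - cost) / revenue


# Made-up transaction: sold for 200, cost 150.
print(net_profit(200, 150), profit_margin(200, 150))
```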
Cross sell data
kaggle.com
Updated Dec 30, 2020
Cite
AbhishekSatheesh (2020). Cross sell data [Dataset]. https://www.kaggle.com/datasets/zenblade93/cross-sell-data/data
Dataset updated
Dec 30, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
AbhishekSatheesh
Description
Dataset

This dataset was created by AbhishekSatheesh

Contents
Single Family Loan Sale Initiative
catalog.data.gov
s.cnmilf.com
Updated Mar 1, 2024
Cite
U.S. Department of Housing and Urban Development (2024). Single Family Loan Sale Initiative [Dataset]. https://catalog.data.gov/dataset/single-family-loan-sale-initiative-neighborhood-stabilization-outcome-pool-offering
Dataset updated
Mar 1, 2024
Dataset provided by
United States Department of Housing and Urban Development (http://www.hud.gov/)
Description
The FHA Office of Housing last conducted a series of mortgage loan sales under the Single Family Loan Sale (SFLS) Initiative in 2016. The current sales structure consisted of whole loan, competitive auctions, offering for purchase defaulted single family mortgages provided by FHA-approved loan servicers. The loans sold contained specified representations and warranties and may be sold with post-sale restrictions and/or reporting requirements. FHA sold loans in large national pools, as well as loan pools in designated geographical areas that are aimed at a neighborhood stabilization outcome (“NSO pools”).
Retail sale - monthly data
data.europa.eu
service.tib.eu
csv, html, tsv, xml
Updated Apr 29, 2022
Cite
Eurostat (2022). Retail sale - monthly data [Dataset]. https://data.europa.eu/data/datasets/haup9tqynbnrsqn5fnelrw?locale=en
Available download formats: tsv (265269), csv, xml, html
Dataset updated
Apr 29, 2022
Dataset authored and provided by
Eurostat (https://ec.europa.eu/eurostat)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Retail sale - monthly data
Company Datasets for Business Profiling
datarade.ai
Updated Feb 23, 2017
Cite
Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
Available download formats: .json, .xml, .csv, .xls
Dataset updated
Feb 23, 2017
Dataset authored and provided by
Oxylabs
Area covered
Taiwan, Bangladesh, Andorra, Tunisia, British Indian Ocean Territory, Nepal, Canada, Northern Mariana Islands, Isle of Man, Moldova (Republic of)
Description
Company Datasets for valuable business insights!

Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

Owler: Gain valuable business insights and competitive intelligence.

AngelList: Receive fresh startup data transformed into actionable insights.

CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies.

Craft.co: Make data-informed business decisions with Craft.co's company datasets.

Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

Company name;

Size;

Founding date;

Location;

Industry;

Revenue;

Employee count;

Competitors.

You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

With Oxylabs Datasets, you can count on:

Fresh and accurate data collected and parsed by our expert web scraping team.

Time and resource savings, allowing you to focus on data analysis and achieving your business goals.

A customized approach tailored to your specific business needs.

Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

Pricing Options:

Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

Experience a seamless journey with Oxylabs:

Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.

Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.

Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.

Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!
Sales Dataset
kaggle.com
Updated Jul 21, 2024
Cite
Ahmed Mohamed Ibrahim Mohamed (2024). Sales Dataset [Dataset]. https://www.kaggle.com/datasets/ahmedmohamedibrahim1/sales-dataset/data
Dataset updated
Jul 21, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ahmed Mohamed Ibrahim Mohamed
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Attribute information:

Row ID: A unique identifier for each row in the table
Order ID: The identifier for each sales order
Order Date: The date the order was placed
Ship Date: The date the order was shipped
Delivery Duration: The amount of time it took to deliver the order
Ship Mode: The shipping method used for the order
Customer ID: The identifier for the customer who placed the order
Customer Name: The name of the customer who placed the order
Country: The customer's country
City: The customer's city
State: The customer's state
Postal Code: The customer's postal code
Region: The customer's region
Product ID: The identifier for the product that was ordered
Category: The category of the product that was ordered (e.g., furniture, office supplies, technology)
Sub-Category: This attribute likely refers to a subcategory within a larger product category (e.g., Tables within Furniture). Values: Bookcases, Chairs, Labels, Tables, Storage, Furnishings, Art, Phones, Binders, Appliances, Paper, others.
Product Name: This attribute specifies the name of the product sold. Examples: Bush Somerset Collection Bookcase; Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back; Self-Adhesive Address Labels for Typewriters by Universal; Bretford CP4500 Series Slim Rectangular Table; others.
Sales: This attribute shows the total sales amount for each product. Values are listed in currency format.
Quantity: This attribute specifies the number of units sold for each product. Integer values.
Discount: This attribute indicates the discount offered on the product.
Discount Value: This attribute shows the total discount amount applied to the product.
Profit: This attribute shows the profit earned on the sale of each product.
COGS: This attribute likely refers to each product's Cost of Goods Sold. COGS = Sales - Profit
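
The derived attributes can be reproduced with simple arithmetic. The sketch below uses the stated relation COGS = Sales - Profit; it also assumes, without confirmation from the source, that Delivery Duration is the number of days between Order Date and Ship Date. Function names and sample values are hypothetical.

```python
from datetime import date


def cogs(sales: float, profit: float) -> float:
    """Cost of Goods Sold, using the relation given in the description."""
    return sales - profit


def delivery_duration_days(order_date: date, ship_date: date) -> int:
    """Days between order and shipment (assumed definition of Delivery Duration)."""
    return (ship_date - order_date).days


# Made-up order for illustration only.
print(cogs(250.0, 40.0))
print(delivery_duration_days(date(2024, 1, 1), date(2024, 1, 5)))
```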
Market Sale Ratio
data.virginia.gov
catalog.data.gov
Updated Apr 25, 2025
Cite
Fairfax County (2025). Market Sale Ratio [Dataset]. https://data.virginia.gov/dataset/market-sale-ratio
Available download formats: csv, arcgis geoservices rest api, zip, geojson, kml, html
Dataset updated
Apr 25, 2025
Dataset provided by
County of Fairfax
Authors
Fairfax County
Description
Residential market value estimates and most recent sales values for owned properties at a parcel level within Fairfax County as of the VALID_TO date in the attribute table.

For methodology and a data dictionary please view the IPLS data dictionary
Real Estate Sales 2001-2022 GL
catalog.data.gov
data.ct.gov
Updated Dec 20, 2024
Cite
data.ct.gov (2024). Real Estate Sales 2001-2022 GL [Dataset]. https://catalog.data.gov/dataset/real-estate-sales-2001-2018
Dataset updated
Dec 20, 2024
Dataset provided by
data.ct.gov
Description
The Office of Policy and Management maintains a listing of all real estate sales with a sales price of $2,000 or greater that occur between October 1 and September 30 of each year. For each sale record, the file includes: town, property address, date of sale, property type (residential, apartment, commercial, industrial or vacant land), sales price, and property assessment. Data are collected in accordance with Connecticut General Statutes, section 10-261a and 10-261b: https://www.cga.ct.gov/current/pub/chap_172.htm#sec_10-261a and https://www.cga.ct.gov/current/pub/chap_172.htm#sec_10-261b. Annual real estate sales are reported by grand list year (October 1 through September 30 each year). For instance, sales from 2018 GL are from 10/01/2018 through 9/30/2019. Some municipalities may not report data for certain years because when a municipality implements a revaluation, they are not required to submit sales data for the twelve months following implementation.
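
The grand list (GL) year convention described above, where the 2018 GL covers sales from 10/01/2018 through 9/30/2019, can be expressed as a small helper. This is an illustrative sketch; the function name is hypothetical.

```python
from datetime import date


def grand_list_year(sale_date: date) -> int:
    """Grand list year for a sale: the GL year runs October 1 through September 30."""
    # Sales in October-December belong to the GL year that starts that October;
    # sales in January-September belong to the GL year that started the prior October.
    return sale_date.year if sale_date.month >= 10 else sale_date.year - 1


print(grand_list_year(date(2018, 10, 1)))  # first day of the 2018 GL
print(grand_list_year(date(2019, 9, 30)))  # last day of the 2018 GL
```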
SKU-Level Transaction Data | Point-of-Sale (POS) Data | 1M+ Grocery,...
datarade.ai
Updated Jan 29, 2025
Cite
MealMe (2025). SKU-Level Transaction Data | Point-of-Sale (POS) Data | 1M+ Grocery, Restaurant, and Retail stores with SKU level transactions [Dataset]. https://datarade.ai/data-products/sku-level-transaction-data-point-of-sale-pos-data-1m-g-mealme
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Jan 29, 2025
Dataset provided by
MealMe, Inc.
Authors
MealMe
Area covered
Åland Islands, Swaziland, Ghana, Moldova (Republic of), Japan, New Zealand, Ecuador, Kosovo, Indonesia, Slovenia
Description
MealMe provides comprehensive grocery and retail SKU-level product data, including real-time pricing, from the top 100 retailers in the USA and Canada. Our proprietary technology ensures accurate and up-to-date insights, empowering businesses to excel in competitive intelligence, pricing strategies, and market analysis.

Retailers Covered: MealMe’s database includes detailed SKU-level data and pricing from leading grocery and retail chains such as Walmart, Target, Costco, Kroger, Safeway, Publix, Whole Foods, Aldi, ShopRite, BJ’s Wholesale Club, Sprouts Farmers Market, Albertsons, Ralphs, Pavilions, Gelson’s, Vons, Shaw’s, Metro, and many more. Our coverage spans the most influential retailers across North America, ensuring businesses have the insights needed to stay competitive in dynamic markets.

Key Features:

SKU-Level Granularity: Access detailed product-level data, including product descriptions, categories, brands, and variations.

Real-Time Pricing: Monitor current pricing trends across major retailers for comprehensive market comparisons.

Regional Insights: Analyze geographic price variations and inventory availability to identify trends and opportunities.

Customizable Solutions: Tailored data delivery options to meet the specific needs of your business or industry.

Use Cases:

Competitive Intelligence: Gain visibility into pricing, product availability, and assortment strategies of top retailers like Walmart, Costco, and Target.

Pricing Optimization: Use real-time data to create dynamic pricing models that respond to market conditions.

Market Research: Identify trends, gaps, and consumer preferences by analyzing SKU-level data across leading retailers.

Inventory Management: Streamline operations with accurate, real-time inventory availability.

Retail Execution: Ensure on-shelf product availability and compliance with merchandising strategies.

Industries Benefiting from Our Data:

CPG (Consumer Packaged Goods): Optimize product positioning, pricing, and distribution strategies.

E-commerce Platforms: Enhance online catalogs with precise pricing and inventory information.

Market Research Firms: Conduct detailed analyses to uncover industry trends and opportunities.

Retailers: Benchmark against competitors like Kroger and Aldi to refine assortments and pricing.

AI & Analytics Companies: Fuel predictive models and business intelligence with reliable SKU-level data.

Data Delivery and Integration:

MealMe offers flexible integration options, including APIs and custom data exports, for seamless access to real-time data. Whether you need large-scale analysis or continuous updates, our solutions scale with your business needs.

Why Choose MealMe?

Comprehensive Coverage: Data from the top 100 grocery and retail chains in North America, including Walmart, Target, and Costco.

Real-Time Accuracy: Up-to-date pricing and product information ensures competitive edge.

Customizable Insights: Tailored datasets align with your specific business objectives.

Proven Expertise: Trusted by diverse industries for delivering actionable insights.

MealMe empowers businesses to unlock their full potential with real-time, high-quality grocery and retail data. For more information or to schedule a demo, contact us today!
Data from: Malware Finances and Operations: a Data-Driven Study of the Value...
data.niaid.nih.gov
zenodo.org
Updated Jun 20, 2023
Cite
Nurmi, Juha (2023). Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8047204
Dataset updated
Jun 20, 2023
Dataset provided by
Niemelä, Mikko
Nurmi, Juha
Brumley, Billy
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The datasets demonstrate the malware economy and the value chain published in our paper, Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access, at the 12th International Workshop on Cyber Crime (IWCC 2023), part of the ARES Conference, published by the International Conference Proceedings Series of the ACM ICPS.

Using the well-documented scripts, it is straightforward to reproduce our findings. It takes an estimated 1 hour of human time and 3 hours of computing time to duplicate our key findings from MalwareInfectionSet; around one hour with VictimAccessSet; and minutes to replicate the price calculations using AccountAccessSet. See the included README.md files and Python scripts.

We choose to represent each victim by a single JavaScript Object Notation (JSON) data file. Data sources provide sets of victim JSON data files from which we've extracted the essential information and omitted Personally Identifiable Information (PII). We collected, curated, and modelled three datasets, which we publish under the Creative Commons Attribution 4.0 International License.

MalwareInfectionSet

We discover (and, to the best of our knowledge, document scientifically for the first time) that malware networks appear to dump their data collections online. We collected these infostealer malware logs available for free. We utilise 245 malware log dumps from 2019 and 2020 originating from 14 malware networks. The dataset contains 1.8 million victim files, with a dataset size of 15 GB.

VictimAccessSet

We demonstrate how Infostealer malware networks sell access to infected victims. Genesis Market focuses on user-friendliness and continuous supply of compromised data. Marketplace listings include everything necessary to gain access to the victim's online accounts, including passwords and usernames, but also detailed collection of information which provides a clone of the victim's browser session. Indeed, Genesis Market simplifies the import of compromised victim authentication data into a web browser session. We measure the prices on Genesis Market and how compromised device prices are determined. We crawled the website between April 2019 and May 2022, collecting the web pages offering the resources for sale. The dataset contains 0.5 million victim files, with a dataset size of 3.5 GB.

AccountAccessSet

The Database marketplace operates inside the anonymous Tor network. Vendors offer their goods for sale, and customers can purchase them with Bitcoins. The marketplace sells online accounts, such as PayPal and Spotify, as well as private datasets, such as driver's licence photographs and tax forms. We then collect data from Database Market, where vendors sell online credentials, and investigate similarly. To build our dataset, we crawled the website between November 2021 and June 2022, collecting the web pages offering the credentials for sale. The dataset contains 33,896 victim files, with a dataset size of 400 MB.

Credits

Authors

Billy Bob Brumley (Tampere University, Tampere, Finland)

Juha Nurmi (Tampere University, Tampere, Finland)

Mikko Niemelä (Cyber Intelligence House, Singapore)

Funding

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under project numbers 804476 (SCARE) and 952622 (SPIRS).

Alternative links to download: AccountAccessSet, MalwareInfectionSet, and VictimAccessSet.
NORA Sold Properties
catalog.data.gov
data.nola.gov
Updated Jul 5, 2025
+ more versions
data.nola.gov (2025). NORA Sold Properties [Dataset]. https://catalog.data.gov/dataset/nora-sold-properties
Explore at:
Dataset updated
Jul 5, 2025
Dataset provided by
data.nola.gov
Description
This data set is a listing of all property sales by NORA through the following disposition channels.
- Auction: Properties put up for auction and sold to the highest bidder.
- Development: Properties offered to development partners at a discounted rate to support the development of affordable housing.
- Lot Next Door: Properties sold to adjacent parcel owners, with discount opportunities for eligible participants.
- Alternative Land Use: Properties sold for development of green space and community gardens.
Note: this dataset contains duplicate addresses, which likely represent reversions or quitclaims that NORA sold again.
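The note about duplicate addresses suggests a quick check before counting sales. The following is a minimal Python sketch, assuming the export has been loaded into a list of dictionaries with a hypothetical `address` key (the real column names are not given here). The sample rows are made up for illustration.

```python
from collections import Counter

def duplicate_addresses(rows, key="address"):
    """Return {normalized address: count} for addresses appearing more than once."""
    # Normalize case and whitespace so trivially different spellings match.
    counts = Counter(" ".join(str(r[key]).upper().split()) for r in rows)
    return {addr: n for addr, n in counts.items() if n > 1}

# Made-up rows for illustration only; not taken from the NORA dataset.
sample = [
    {"address": "123 Example St", "channel": "Auction"},
    {"address": "123  example st", "channel": "Lot Next Door"},
    {"address": "456 Sample Ave", "channel": "Development"},
]
print(duplicate_addresses(sample))
```

Whether a repeated address really is a reversion or quitclaim would still need to be confirmed against the sale dates and channels in the actual data.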
Sales Dataset with Natural Language Statement
kaggle.com
Updated Oct 1, 2024
Gurpreet Singh India (2024). Sales Dataset with Natural Language Statement [Dataset]. https://www.kaggle.com/datasets/gurpreetsinghindia/sales-data-with-natural-language
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Oct 1, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Gurpreet Singh India
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This dataset contains 10,000 simulated sales transaction records, each represented in natural language with diverse sentence structures. It is designed to mimic how different users might describe the same type of transaction in varying ways, making it ideal for Natural Language Processing (NLP) tasks, text-based data extraction, and accounting automation projects.

Each record in the dataset includes the following fields:

- Sale Date: The date on which the transaction took place.
- Customer Name: A randomly generated customer name.
- Product: The type of product purchased.
- Quantity: The quantity of the product purchased.
- Unit Price: The price per unit of the product.
- Total Amount: The total price for the purchased products.
- Tax Rate: The percentage of tax applied to the transaction.
- Payment Method: The method by which the payment was made (e.g., Credit Card, Debit Card, UPI, etc.).
- Sentence: A natural language description of the sales transaction. The sentence structure is varied to simulate different ways people describe the same type of sales event.

Use Cases:
- NLP Training: This dataset is suitable for training models to extract structured information (e.g., date, customer, amount) from natural language descriptions of sales transactions.
- Accounting Automation: The dataset can be used to build or test systems that automate posting of sales transactions based on unstructured text input.
- Text Data Preprocessing: It provides a good resource for developing methods to preprocess and standardize varying formats of text descriptions.
- Chatbot Training: This dataset can help train chatbots or virtual assistants that handle accounting or customer inquiries by understanding different ways of expressing the same transaction details.

Key Features:
- High Variability: Sentences are structured in numerous ways to simulate natural human language variations.
- Randomized Data: Names, dates, products, quantities, prices, and payment methods are randomized, ensuring no duplication.
- Multi-Field Information: Each record contains key sales information essential for accounting and business use cases.

Potential Applications:
- Use for Named Entity Recognition (NER) tasks.
- Apply for information extraction challenges.
- Create pattern recognition models to understand different sentence structures.
- Test rule-based systems or machine learning models for sales data entry and accounting automation.
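As a rough illustration of the rule-based extraction mentioned above, here is a minimal Python sketch. The sentence template, the example sentence, and the field names in the pattern are all hypothetical; the dataset's real sentences vary in structure, so a single regular expression like this would only cover one phrasing.

```python
import re

# Hypothetical sentence template; the dataset's real sentences vary in structure.
PATTERN = re.compile(
    r"On (?P<date>\d{4}-\d{2}-\d{2}), (?P<customer>.+?) bought "
    r"(?P<quantity>\d+) units? of (?P<product>.+?) at "
    r"(?P<unit_price>\d+(?:\.\d+)?) each, paying by (?P<payment>.+?)\."
)

def extract_sale(sentence):
    """Return a dict of fields for the one template above, or None if no match."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields from strings.
    fields["quantity"] = int(fields["quantity"])
    fields["unit_price"] = float(fields["unit_price"])
    return fields

# Made-up sentence for illustration only; not taken from the dataset.
example = "On 2024-03-15, Asha Rao bought 3 units of Laptop at 499.50 each, paying by UPI."
print(extract_sale(example))
```

Sentences that do not fit the template return `None`, which gives a simple way to measure how much of the data a given rule set covers before turning to a trained model.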

License: Ensure that the dataset is appropriately licensed according to your intended use. For general public and research purposes, choose a CC0: Public Domain license, unless specific restrictions apply.
Real Estate Data | Property Listing, Sold Properties, Rankings, Agent...
datarade.ai
Grepsr, Real Estate Data | Property Listing, Sold Properties, Rankings, Agent Datasets | Global Coverage | For Competitive Property Pricing and Investment [Dataset]. https://datarade.ai/data-products/real-estate-property-data-grepsr-grepsr
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset authored and provided by
Grepsr
Area covered
Kazakhstan, Kuwait, Holy See, Malaysia, Australia, Spain, South Sudan, Tonga, Congo (Democratic Republic of the), Iraq
Description
Extract detailed property data points — address, URL, prices, floor space, overview, parking, agents, and more — from any real estate listings. The Rankings data contains the ranking of properties as they come in the SERPs of different property listing sites. Furthermore, with our real estate agents' data, you can directly get in touch with the real estate agents/brokers via email or phone numbers.

A. Use cases and applications possible with the data:

Property pricing - accurate property data for real estate valuation. Gather information about properties and their valuations from Federal, State, or County level websites. Monitor the real estate market across the country and decide the best time to buy or sell based on data.

Secure your real estate investment - Monitor foreclosures and auctions to identify investment opportunities. Identify areas within special economic and opportunity zones such as QOZs - cross-map that with commercial or residential listings to identify leads. Ensure the safety of your investments, property, and personnel by analyzing crime data prior to investing.

Identify hot, emerging markets - Gather rent, demographic, and population data to expand retail and e-commerce businesses and drive better investment decisions.

Profile a building’s retrofit history - A building permit is required before the start of any construction activity on a building, such as changing the building structure, remodeling, or installing new equipment. Many large cities publish historical building permits as public datasets. Use building permits to profile a city’s building retrofit history.

Study market changes - New construction data helps measure and evaluate the size, composition, and changes occurring within the housing and construction sectors.

Finding leads - Property records can reveal a wealth of information, such as how long an owner has currently lived in a home. US Census Bureau data and City-Data.com provide profiles of towns and city neighborhoods as well as demographic statistics. This data is available for free and can help agents increase their expertise in their communities and get a feel for the local market.

Searching for Targeted Leads - Focusing on small, niche areas of the real estate market can sometimes be the most efficient method of finding leads. For example, targeting high-end home sellers may take longer to develop a lead, but the payoff could be greater. Or, you may have a special interest or background in a certain type of home that would improve your chances of connecting with potential sellers. In these cases, focused data searches may help you find the best leads and develop relationships with future sellers.

How does it work?

Analyze sample data

Customize parameters to suit your needs

Add to your projects

Contact support for further customization
Direct selling, by method of sale and commodity, inactive
data.urbandatacentre.ca
datasets.ai
+3 more
Updated Oct 1, 2024
+ more versions
(2024). Direct selling, by method of sale and commodity, inactive [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-86d77342-97ec-44a5-988f-9ce76df1bc7a
Explore at:
Dataset updated
Oct 1, 2024
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
This table contains 151 series, with data for years 1966 - 1997 (not all combinations necessarily have data for all years), and is no longer being released. This table contains data described by the following dimensions (Not all combinations are available): Geography (1 item: Canada), Method of sale (6 items: Total direct sales; Sales from premises; Door-to-door sales; Sales by mail; ...), Commodity group (31 items: Total, all commodities; Meat, fish and poultry; Frozen food plans; Dairy products; ...).
LinkedIn Datasets
brightdata.com
.json, .csv, .xlsx
Updated Dec 17, 2021
Bright Data (2021). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Dec 17, 2021
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions.

Dataset Features

- Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.
- Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.
- Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

Customizable Subsets for Specific Needs

Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

Popular Use Cases

- Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.
- Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.
- Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.
- Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.
- AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.
Realtor.com Dataset | Property Listings | MLS Data | Real Estate Data |...
datarade.ai
.json, .csv, .txt
Updated Oct 4, 2023
CrawlBee (2023). Realtor.com Dataset | Property Listings | MLS Data | Real Estate Data | Residential Data | Realtime Real Estate Market Data [Dataset]. https://datarade.ai/data-products/crawlbee-realtor-com-dataset-property-listings-mls-dat-crawlbee
Explore at:
Available download formats: .json, .csv, .txt
Dataset updated
Oct 4, 2023
Dataset authored and provided by
CrawlBee
Area covered
United States of America
Description
Our Realtor.com (Multiple Listing Service) dataset represents one of the most exhaustive collections of real estate data available to the industry. It consolidates data from over 500 MLS aggregators across various regions, providing an unparalleled view of the property market.

Features:

Property Listings: Each listing provides comprehensive details about a property. This includes its physical address, number of bedrooms and bathrooms, square footage, lot size, type of property (e.g., single-family home, condo, townhome), and more.

Photographs and Virtual Tours: Visuals are crucial in the property market. Most listings are accompanied by high-quality photographs and, in many cases, virtual or 3D tours that allow potential buyers to explore properties remotely.

Pricing Information: Listings provide asking prices, and the dataset frequently updates to reflect price changes. Historical price data, which includes initial listing prices and any subsequent reductions or increases, is also available.

Transaction Histories: For sold properties, the dataset provides information about the date of sale, the sale price, and any discrepancies between the listing and sale prices.

Agent and Broker Information: Each listing typically has associated details about the property's real estate professional. This might include their name, contact details, and affiliated brokerage.

Open House Schedules: Open house dates and times are listed for properties that are actively being shown to potential buyers.

Analytical Insights:

Market Trends: By analyzing the dataset over time, one can glean insights into market dynamics, such as the rate of price appreciation or depreciation in certain areas, the average time properties stay on the market, and seasonality effects.

Neighborhood Data: With comprehensive geographical data, it becomes possible to understand neighborhood-specific trends. This is invaluable for potential buyers or real estate investors looking to identify burgeoning markets.

Price Comparisons: Realtors and potential buyers can benchmark properties against similar listings in the same area to determine if a property is priced appropriately.

Utility:

For Industry Professionals and Analysts: Beyond buyers and sellers, the dataset is a trove of information for real estate agents, brokers, analysts, and investors. They can harness this data to craft strategies, predict market movements, and serve their clients better.
Sale City, GA Population Breakdown by Gender Dataset: Male and Female...
neilsberg.com
csv, json
Updated Feb 24, 2025
+ more versions
Neilsberg Research (2025). Sale City, GA Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b25146a5-f25d-11ef-8c1b-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Georgia, Sale City
Variables measured
Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Sale City by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Sale City across both sexes and to determine which sex constitutes the majority.

Key observations

There is a majority of female population, with 58.09% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Scope of gender :

Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.

Variables / Data Columns

Gender: This column displays the Gender (Male / Female)

Population: This column shows the population of each gender in Sale City.

% of Total Population: This column displays the percentage distribution of each gender as a proportion of Sale City's total population. Please note that the sum of all percentages may not equal 100% due to rounding of values.
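The percentage column can be derived from the population column. A minimal Python sketch follows, assuming the shares are computed as each count divided by the total and rounded to two decimals (the rounding precision is an assumption). The counts are hypothetical and are not the ACS estimates for Sale City.

```python
def population_shares(counts):
    """Map each category to its share of the total, as a percentage rounded to 2 decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total population is zero")
    return {name: round(100 * n / total, 2) for name, n in counts.items()}

# Hypothetical counts for illustration; not the ACS estimates for Sale City.
shares = population_shares({"Male": 150, "Female": 250})
print(shares)
```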

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Sale City Population by Race & Ethnicity. You can refer the same here
Sale database 2019-2022
redivis.com
Updated Aug 11, 2022
Environmental Impact Data Collaborative (2022). Sale database 2019-2022 [Dataset]. https://redivis.com/datasets/sy4g-4h33mdm5n
Explore at:
Dataset updated
Aug 11, 2022
Dataset authored and provided by
Environmental Impact Data Collaborative
Description
The table Sale database 2019-2022 is part of the dataset Maryland Property Assessment - Summary, available at https://redivis.com/datasets/sy4g-4h33mdm5n. It contains 3968505 rows across 99 variables.

Dairy Supply Chain Sales Dataset

Explore at:

3 scholarly articles cite this dataset (View in Google Scholar)

2. Citation

Please cite the following papers when using this dataset:

I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted

3. Dataset Modalities

3.1 Data Collection

The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.

The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.

File	Period	Number of Samples (days)
product 1 2020.xlsx	01/01/2020–31/12/2020	363
product 1 2021.xlsx	01/01/2021–31/12/2021	364
product 1 2022.xlsx	01/01/2022–31/12/2022	365
product 2 2020.xlsx	01/01/2020–31/12/2020	363
product 2 2021.xlsx	01/01/2021–31/12/2021	364
product 2 2022.xlsx	01/01/2022–31/12/2022	365
product 3 2020.xlsx	01/01/2020–31/12/2020	363
product 3 2021.xlsx	01/01/2021–31/12/2021	364
product 3 2022.xlsx	01/01/2022–31/12/2022	365
product 4 2020.xlsx	01/01/2020–31/12/2020	363
product 4 2021.xlsx	01/01/2021–31/12/2021	364
product 4 2022.xlsx	01/01/2022–31/12/2022	364
product 5 2020.xlsx	01/01/2020–31/12/2020	363
product 5 2021.xlsx	01/01/2021–31/12/2021	364
product 5 2022.xlsx	01/01/2022–31/12/2022	365
product 6 2020.xlsx	01/01/2020–31/12/2020	362
product 6 2021.xlsx	01/01/2021–31/12/2021	364
product 6 2022.xlsx	01/01/2022–31/12/2022	365
product 7 2020.xlsx	01/01/2020–31/12/2020	362
product 7 2021.xlsx	01/01/2021–31/12/2021	364
product 7 2022.xlsx	01/01/2022–31/12/2022	365
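The file names in the table follow a regular `product <n> <year>.xlsx` pattern, which makes it easy to enumerate them programmatically. The sketch below is a minimal Python example, assuming the archive has been unpacked into a single directory (the `root` argument is a hypothetical location). It only builds the paths and checks which ones exist; it does not read any data.

```python
from pathlib import Path

PRODUCTS = range(1, 8)        # products 1 through 7, as listed in the table
YEARS = (2020, 2021, 2022)

def expected_files(root="."):
    """Build the 21 paths implied by the 'product <n> <year>.xlsx' naming pattern."""
    return [Path(root) / f"product {p} {y}.xlsx" for p in PRODUCTS for y in YEARS]

def missing_files(root="."):
    """Return the expected paths that are not present under root."""
    return [path for path in expected_files(root) if not path.exists()]

files = expected_files()
print(len(files), files[0].name, files[-1].name)
```

Each path could then be passed to a spreadsheet reader such as `pandas.read_excel`, which needs an Excel engine like `openpyxl` installed for `.xlsx` files.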

3.2 Dataset Overview

The following table enumerates and explains the features included across all of the included files.

Feature	Description	Unit
Day	day of the month	-
Month	Month	-
Year	Year	-
daily_unit_sales	Daily sales - the amount of products, measured in units, that during that specific day were sold	units
previous_year_daily_unit_sales	Previous Year’s sales - the amount of products, measured in units, that during that specific day were sold the previous year	units
percentage_difference_daily_unit_sales	The percentage difference between the two above values	%
daily_unit_sales_kg	The amount of products, measured in kilograms, that during that specific day were sold	kg
previous_year_daily_unit_sales_kg	Previous Year’s sales - the amount of products, measured in kilograms, that during that specific day were sold, the previous year	kg
percentage_difference_daily_unit_sales_kg	The percentage difference between the two above values	%
daily_unit_returns_kg	The percentage of the products that were shipped to selling points and were returned	%
previous_year_daily_unit_returns_kg	The percentage of the products that were shipped to
