75 datasets found

Hacker News Sentiment Analysis Dataset
cubig.ai
zip
Updated Jul 14, 2025
Cite
CUBIG (2025). Hacker News Sentiment Analysis Dataset [Dataset]. https://cubig.ai/store/products/586/hacker-news-sentiment-analysis-dataset
Available download formats: zip
Dataset updated
Jul 14, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Hacker News Sentiment Analysis Dataset is technology-community opinion data that provides a sentiment analysis (polarity, subjectivity, and sentiment category) for each of the top 141 Hacker News posts, along with each post's title, URL, points, and comment count.

2) Data Utilization
(1) Characteristics of the Hacker News Sentiment Analysis Dataset:
• The dataset includes polarity (-1 to 1), subjectivity (0 to 1), and category (positive/neutral/negative) columns that quantify the sentiment of comments using TextBlob, based on the top posts as of June 24, 2025.
• It was generated through web scraping and NLP preprocessing, and it allows quantitative comparison of community responses to technology news.
(2) The Hacker News Sentiment Analysis Dataset can be used for:
• Sentiment visualization of technology trends: Connect sentiment scores with post topics to visually analyze community response patterns to specific technology news, such as AI and policy.
• NLP model training: Sentiment classification models can be trained on comment data from real-world technical discussions, or the data can be applied to research on predicting the subjectivity of comments.
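The listing does not say how the category column is derived from the polarity score. As a minimal sketch, assuming a simple threshold rule, a polarity in the range -1 to 1 could be mapped to the three categories like this. The function name and the `neutral_band` width are hypothetical and are not taken from the dataset.

```python
def categorize(polarity: float, neutral_band: float = 0.05) -> str:
    """Map a polarity score in [-1, 1] to a sentiment category.

    neutral_band is an assumed cut-off; the dataset's real rule is not documented here.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1, 1]")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical polarity scores for three comments.
labels = [categorize(p) for p in (0.5, 0.0, -0.3)]
print(labels)
```

A wider neutral band puts more weakly worded comments in the neutral class, so the threshold is worth checking against the category column shipped with the data.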
ckanext-data-comparision
catalog.civicdataecosystem.org
Updated Jun 4, 2025
Cite
(2025). ckanext-data-comparision [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-data-comparision
Dataset updated
Jun 4, 2025
Description
The Data Comparision extension for CKAN lets users compare data from CSV and XLSX files through visualizations. It aims to enhance data analysis within CKAN by providing a direct visual comparison of datasets, which helps users understand their content. The extension is compatible with CKAN 2.9 and provides an extended feature set for data comparison.

Key Features:

CSV/XLSX Data Support: Compares data stored in common tabular formats such as CSV and XLSX files, using the extension's visualization capabilities.

Visual Data Comparison: Visualizes data for side-by-side comparison so that users can easily identify differences and similarities between datasets.

Chart.js Integration: Relies on the Chart.js library to generate the comparison visualizations, which gives access to a wide selection of chart formats.

Technical Integration: The extension must be added to the ckan.plugins setting in the CKAN configuration file (/etc/ckan/default/ckan.ini by default). It also requires installing Chart.js via npm. After these configuration steps and a CKAN restart, the plugin extends CKAN's user interface with data comparison features.

Benefits & Impact: By providing a comparative view of data stored in CSV and XLSX formats, the Data Comparison extension reduces the effort needed to analyze datasets and supports decisions based on clear, visually represented comparisons. Differences and similarities are spotted more quickly in visualized data than in raw tables.
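The ckan.plugins setting holds a space-separated list of plugin names. As a rough sketch of the configuration step, the helper below appends a plugin name to such a value without duplicating it. The plugin name `data_comparision` is an assumption made for illustration; the extension's own README gives the actual name to register.

```python
def enable_plugin(plugins_value: str, plugin: str) -> str:
    """Return a ckan.plugins value with `plugin` appended if it is missing."""
    names = plugins_value.split()
    if plugin not in names:
        names.append(plugin)
    return " ".join(names)


# "data_comparision" is a hypothetical plugin name, not confirmed by the listing.
current = "stats text_view"
updated = enable_plugin(current, "data_comparision")
print(f"ckan.plugins = {updated}")
```

The resulting line belongs in the CKAN configuration file, and CKAN must be restarted for the change to take effect.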
Books Dataset
universe.roboflow.com
zip
Updated Dec 8, 2022
Cite
npubooks (2022). Books Dataset [Dataset]. https://universe.roboflow.com/npubooks/books-ul2pd/model/54
Available download formats: zip
Dataset updated
Dec 8, 2022
Dataset authored and provided by
npubooks
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
10000 Bounding Boxes
Description
Here are a few use cases for this project:

Library Management: Use the "books" model to help librarians identify and categorize books based on their covers, making it easier to manage book inventory, re-shelve returned books, and locate misplaced books.

Bookstore Assistance: Implement the "books" model in bookstores' mobile apps or in-store kiosks, allowing customers to quickly find books they are looking for and discover related titles by snapping a picture of a book cover.

Book Club Recommendation Engine: Use the "books" model as the basis for a book club app that recommends new titles based on the book covers from previous selections, helping users explore new genres and authors.

Accessibility for Visually Impaired: Integrate the "books" model into software or devices designed for visually impaired individuals, helping them identify and select books without needing to rely on others for assistance.

Academic Research: Utilize the "books" model in academic research projects to identify books and their authors for literary, historical, or sociological studies, making it easier to analyze large datasets of visual book cover material.
Uber Customer Reviews Dataset (2024)
kaggle.com
zip
Updated Dec 19, 2024
Cite
Kanchana1990 (2024). Uber Customer Reviews Dataset (2024) [Dataset]. https://www.kaggle.com/datasets/kanchana1990/uber-customer-reviews-dataset-2024/code
Available download formats: zip (469693 bytes)
Dataset updated
Dec 19, 2024
Authors
Kanchana1990
License
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Description
Title: Uber Customer Reviews Dataset (2024)
Subtitle: Sentiment Analysis and Insights from 12,000+ Google Play Store Reviews

Dataset Overview

This dataset contains over 12,000 customer reviews of the Uber app collected from the Google Play Store. The reviews provide insights into user experiences, including ratings, feedback on services, and developer responses. The data is cleaned and anonymized to ensure privacy compliance and ethical usage. It serves as a valuable resource for sentiment analysis, natural language processing (NLP), and machine learning applications.

Data Science Applications

Sentiment Analysis:

Classify reviews as positive, neutral, or negative based on score and content.

Identify trends in customer satisfaction over time.

Natural Language Processing (NLP):

Perform topic modeling to uncover themes in customer feedback.

Generate word clouds to visualize frequently used words in reviews.

Machine Learning:

Train a sentiment classification model using review text as input.

Predict customer satisfaction based on textual feedback.

Business Insights:

Analyze user feedback to identify strengths (e.g., "friendly drivers") and weaknesses (e.g., "long wait times").

Compare app performance across different versions (appVersion).

Time Series Analysis:

Track changes in average scores over time using the at column.

Identify seasonal patterns or anomalies in customer feedback.

Customer Behavior Analysis:

Study the correlation between score and thumbsUpCount to understand review impact.

Analyze the effect of developer replies (replyContent) on customer satisfaction.
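Two of the applications above, labeling reviews from the score column and tracking average scores over time with the at column, can be sketched with the standard library alone. The sample rows, the score-to-label cut-offs, and the timestamp format are assumptions made for illustration; they are not taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical rows that reuse the dataset's column names.
reviews = [
    {"score": 5, "at": "2024-11-03 09:15:00"},
    {"score": 1, "at": "2024-11-20 18:40:00"},
    {"score": 3, "at": "2024-12-01 12:00:00"},
    {"score": 4, "at": "2024-12-15 08:30:00"},
]


def label_from_score(score: int) -> str:
    """Assumed rule: 1-2 negative, 3 neutral, 4-5 positive."""
    if score <= 2:
        return "negative"
    if score == 3:
        return "neutral"
    return "positive"


def monthly_average(rows):
    """Average score per calendar month, keyed by 'YYYY-MM'."""
    buckets = defaultdict(list)
    for row in rows:
        stamp = datetime.strptime(row["at"], "%Y-%m-%d %H:%M:%S")
        buckets[stamp.strftime("%Y-%m")].append(row["score"])
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


labels = [label_from_score(r["score"]) for r in reviews]
averages = monthly_average(reviews)
print(labels, averages)
```

Score-derived labels are only a weak proxy for sentiment, so a text-based classifier trained on the content column would normally be evaluated against them rather than assumed to agree.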

Column Descriptors

| Column Name | Description |
|---|---|
| userName | Anonymized username of the reviewer. |
| userImage | URL of the reviewer's profile image (if available). |
| content | Text content of the review describing the user's experience. |
| score | Numerical rating given by the user (1–5). |
| thumbsUpCount | Number of likes received by the review. |
| reviewCreatedVersion | App version at the time of review creation (if available). |
| at | Timestamp indicating when the review was posted. |
| replyContent | Developer's response to the review (if any). |
| repliedAt | Timestamp indicating when the developer replied (if any). |
| appVersion | App version string associated with the review (if available). |

Ethically Mined Data

This dataset was collected in compliance with ethical web scraping practices:
- Data was sourced from publicly available Google Play Store reviews.
- Personally identifiable information (PII) such as email addresses, images, or phone numbers has been removed.

Acknowledgements

Google Play Store: For providing access to publicly available app reviews.

DALL·E 3: For generating a visually appealing dataset image.
Company Product Sales Analysis & BI Report
kaggle.com
zip
Updated Oct 25, 2023
Cite
Oluwabori Abiodun-Johnson (2023). Company Product Sales Analysis & BI Report [Dataset]. https://www.kaggle.com/datasets/oluwaboriaj/pizza-company-sales-bi-report
Available download formats: zip (15967889 bytes)
Dataset updated
Oct 25, 2023
Authors
Oluwabori Abiodun-Johnson
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This is a self-guided project.

PROBLEM STATEMENT: What underlying trends in the Pizza Sales data could the company be missing that would aid a gap analysis of its business sales?

OBJECTIVES:
1. Generate Key Performance Indicators (KPIs) from the Pizza Sales data to gain insight into underlying business performance.
2. Visualize important aspects of the Pizza Sales data to gain insight and understand key trends.

I explored the CSV dataset to uncover patterns within the Pizza Sales data, which spanned a calendar year.

I used Microsoft SQL Server Management Studio (SSMS) to perform exploratory data analysis (EDA) and identify trends and sales patterns.

I then used Microsoft Power BI to build visualizations that present my analytical findings to technical and non-technical viewers.

STEPS COMPLETED:
- Data importation
- SQL data analysis query writing
- Data cleaning
- Data processing
- Data visualization
- Report/dashboard development
Data from: Creating a multi-track classical music performance dataset for...
data-staging.niaid.nih.gov
datasetcatalog.nlm.nih.gov
zip
Updated Mar 20, 2019
Cite
Bochen Li; Xinzhao Liu; Karthik Dinesh; Zhiyao Duan; Gaurav Sharma (2019). Creating a multi-track classical music performance dataset for multi-modal music analysis: challenges, insights, and applications [Dataset]. http://doi.org/10.5061/dryad.ng3r749
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.ng3r749
Dataset updated
Mar 20, 2019
Authors
Bochen Li; Xinzhao Liu; Karthik Dinesh; Zhiyao Duan; Gaurav Sharma
License
https://spdx.org/licenses/CC0-1.0.html
Description
We introduce a dataset for facilitating audio-visual analysis of musical performances. The dataset comprises 44 simple multi-instrument classical music pieces assembled from coordinated but separately recorded performances of individual tracks. For each piece, we provide the musical score in MIDI format, the audio recordings of the individual tracks, the audio and video recording of the assembled mixture, and ground-truth annotation files including frame-level and note-level transcriptions. We describe our methodology for the creation of the dataset, particularly highlighting our approaches for addressing the challenges involved in maintaining synchronization and expressiveness. We demonstrate the high quality of synchronization achieved with our proposed approach by comparing the dataset against existing widely-used music audio datasets. We anticipate that the dataset will be useful for the development and evaluation of existing music information retrieval (MIR) tasks, as well as for novel multi-modal tasks. We benchmark two existing MIR tasks (multi-pitch analysis and score-informed source separation) on the dataset and compare against other existing music audio datasets. Additionally, we consider two novel multi-modal MIR tasks (visually informed multi-pitch analysis and polyphonic vibrato analysis) enabled by the dataset and provide evaluation measures and baseline systems for future comparisons (from our recent work). Finally, we propose several emerging research directions that the dataset enables.
Classic confusion matrix to visually analyze the classification performance...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 11, 2023
Cite
David C. Molik; DeAndre Tomlinson; Shane Davitt; Eric L. Morgan; Matthew Sisk; Benjamin Roche; Natalie Meyers; Michael E. Pfrender (2023). Classic confusion matrix to visually analyze the classification performance of an algorithm. [Dataset]. http://doi.org/10.1371/journal.pntd.0008755.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pntd.0008755.t001
Dataset updated
Jun 11, 2023
Dataset provided by
PLOS: http://plos.org/
Authors
David C. Molik; DeAndre Tomlinson; Shane Davitt; Eric L. Morgan; Matthew Sisk; Benjamin Roche; Natalie Meyers; Michael E. Pfrender
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classic confusion matrix to visually analyze the classification performance of an algorithm.
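For readers unfamiliar with the term, a confusion matrix counts how often each actual class was assigned each predicted class. The sketch below builds one from two label lists. The labels and predictions are made up for illustration and are unrelated to the table in the cited paper.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes, both in `labels` order."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical ground truth and classifier output.
actual = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]

matrix = confusion_matrix(actual, predicted, ["pos", "neg"])
for row in matrix:
    print(row)
```

The diagonal holds the correct predictions, so accuracy is the sum of the diagonal divided by the total count.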
Dataset used for analysis.
plos.figshare.com
xlsx
Updated Jul 10, 2025
Cite
Chi Thanh Bui; Thi Thuy An Ngo; Huynh Khanh Long Chau; Nguyen Phuc Nguyen Tran (2025). Dataset used for analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0328093.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0328093.s002
Dataset updated
Jul 10, 2025
Dataset provided by
PLOS: http://plos.org/
Authors
Chi Thanh Bui; Thi Thuy An Ngo; Huynh Khanh Long Chau; Nguyen Phuc Nguyen Tran
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In today’s digital landscape, visual content plays a crucial role in shaping consumer behavior. This study explores how visual electronic word-of-mouth (eWOM) on social media influences online purchase intention, applying the Stimulus-Organism-Response (SOR) framework. Using Partial Least Squares Structural Equation Modeling (PLS-SEM) to analyze data from 335 social media users, this study examines the effects of visual eWOM’s quality, quantity, and credibility on consumer perceptions, attitudes, and ultimately their purchase intentions. Our findings reveal that the quality and credibility of visual eWOM significantly enhance perceived information usefulness and its adoption by consumers. Information quantity, however, primarily influences attitudes towards the information, but does not directly drive its adoption. Contrary to expectations, information usefulness alone cannot predict purchase intention. Instead, information adoption emerges as a key mediator, indicating that consumers must actively engage with and internalize visual content for it to impact their buying behavior. This underscores that the effectiveness of visual eWOM is not solely based on its characteristics but depends on consumers’ active engagement and processing. These insights highlight the need for content that is not only visually appealing but also credible and engaging to facilitate information adoption and drive purchase intentions. This study enhances the understanding of visual eWOM’s impact on online purchasing and provides valuable insights for marketers aiming to optimize digital engagement strategies.
Uber Trip Analysis with Power BI
kaggle.com
zip
Updated Jul 23, 2025
Cite
Sahil Raj (2025). Uber Trip Analysis with Power BI [Dataset]. https://www.kaggle.com/datasets/ssrai7/uber-trip-analysis-with-power-bi/code
Available download formats: zip (12995785 bytes)
Dataset updated
Jul 23, 2025
Authors
Sahil Raj
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
🚖 Uber Data Analysis Dashboard (Power BI)

This dataset is part of a dashboard project that analyzes Uber ride behavior across different time patterns – built using Microsoft Power BI.

🔍 Project Highlights:

Analyze ride volumes across hours, days, and months

See peak times and hotspots visually

Interactive visuals built in Power BI

Cleaned and prepared Excel dataset also provided
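The project does this analysis in Power BI, but the idea of counting ride volume per hour and finding the peak can be sketched in a few lines of Python. The pickup timestamps and their format below are hypothetical; the listing does not describe the column layout of the Excel file.

```python
from collections import Counter
from datetime import datetime

# Hypothetical pickup timestamps; the real file's columns are not documented here.
pickups = [
    "2024-06-01 08:05",
    "2024-06-01 08:45",
    "2024-06-01 17:30",
    "2024-06-02 08:10",
]


def rides_per_hour(timestamps):
    """Count rides by hour of day (0-23)."""
    hours = (datetime.strptime(t, "%Y-%m-%d %H:%M").hour for t in timestamps)
    return Counter(hours)


counts = rides_per_hour(pickups)
peak_hour, peak_rides = counts.most_common(1)[0]
print(f"Peak hour: {peak_hour}:00 with {peak_rides} rides")
```

Swapping `.hour` for `.weekday()` or `.month` gives the same breakdown by day of week or by month.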

📂 Files:

Uber Trip Details.xlsx – Cleaned dataset

Uber.pbix – Power BI Dashboard file

🌐 Related Links:

🔗 GitHub Project: https://github.com/ssrAiLab/Uber-PowerBI-Project.git

🔗 LinkedIn Post: (to be added after posting)

Feel free to fork, reuse, or share feedback!
Amazon pets category images dataset
crawlfeeds.com
jpg, zip
Updated Sep 14, 2024
Cite
Crawl Feeds (2024). Amazon pets category images dataset [Dataset]. https://crawlfeeds.com/datasets/amazon-pets-category-images-dataset
Available download formats: jpg, zip
Dataset updated
Sep 14, 2024
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
The Amazon Pets Category Images Dataset is a curated collection of high-resolution images sourced from the pet products category on Amazon. This dataset contains images across various subcategories, such as pet food, toys, grooming tools, bedding, and accessories. With a wide range of products for pets like dogs, cats, birds, and more, this dataset is perfect for researchers, developers, and businesses interested in studying product visuals, conducting market analysis, or training AI models focused on pet-related imagery.

The dataset consists solely of product images, without accompanying metadata or descriptions, offering a straightforward resource for visual analysis, product comparison, or training image-based machine learning models.

Key Features:

Product Categories: Includes pet food, toys, grooming products, bedding, and accessories.

Image Quality: High-resolution images suitable for detailed visual analysis and machine learning.

Pet Types: Covers a variety of pet-related products for dogs, cats, birds, fish, and more.

Source: Extracted from Amazon’s pet products section.

Image Count: Hundreds to thousands of images based on different product categories.

Use Cases:

Image Classification and AI Training: Use the dataset to train machine learning models for product identification and categorization.

Product Comparison: Compare different product designs and packaging in the pet industry using visual data.

Visual Marketing Research: Analyze how pet products are visually represented on Amazon for trends and branding strategies.

Content Creation: Leverage these images for creating pet-related content, including marketing, advertising, and social media.
Facebook Dataset
cubig.ai
zip
Updated May 20, 2025
Cite
CUBIG (2025). Facebook Dataset [Dataset]. https://cubig.ai/store/products/269/facebook-dataset
Available download formats: zip
Dataset updated
May 20, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• The Facebook Dataset is social network analysis data covering Facebook users' activity patterns, interactions, likes, friendships, gender, and age. It can be used to identify key user groups that contribute to business growth and to develop recommendation strategies.

2) Data Utilization
(1) Characteristics of the Facebook Dataset:
• The dataset consists of numerical and categorical variables such as user ID, gender, age, number of friends, number of likes (mobile/web), number of friend requests, number of likes received/sent, and frequency of activities, allowing you to analyze user-specific behavioral characteristics and interaction patterns from multiple angles.
(2) The Facebook Dataset can be used for:
• Core user group targeting and recommendation strategies: Use key characteristics such as gender, age, frequency of activity, friends, and likes to identify user groups that have a significant impact on business growth and to develop customized content and advertising recommendation strategies.
• Analysis of usage behavior and platform trends: By analyzing data such as the distribution of mobile and web-based likes, activity patterns by age and gender, and friend relationship formation, you can visually explore changes in user behavior and major trends within the platform.
Titanic Dataset
cubig.ai
zip
Updated May 29, 2025
Cite
CUBIG (2025). Titanic Dataset [Dataset]. https://cubig.ai/store/products/393/titanic-dataset
Available download formats: zip
Dataset updated
May 29, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• Based on passenger information from the Titanic, which sank in 1912, the Titanic Dataset is a representative binary classification dataset that includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.

2) Data Utilization
(1) Characteristics of the Titanic Dataset:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical). Variables such as Age, Cabin, and Embarked have some missing values, which makes the dataset suitable for preprocessing and feature engineering practice.
(2) The Titanic Dataset can be used for:
• Development of survival prediction models: Key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
• Analysis of factors influencing survival: By analyzing the relationship between survival rates and variables such as gender, age, and socioeconomic status, you can statistically and visually explore which groups had a higher survival probability.
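The group-level survival analysis described above comes down to a grouped mean of the Survived column. A minimal sketch follows, using a handful of invented rows rather than the real 891 samples, so the printed rates are illustrative only.

```python
from collections import defaultdict


def survival_rate_by(rows, key):
    """Share of survivors within each group of `key` (Survived is 0 or 1)."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += row["Survived"]
    return {group: survivors[group] / totals[group] for group in totals}


# Invented sample rows that reuse the dataset's column names.
rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
    {"Sex": "male", "Survived": 0},
]

rates = survival_rate_by(rows, "Sex")
print(rates)
```

The same helper works for any categorical column, for example passenger class or port of embarkation, once missing values have been handled.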
Biological Data Visualization Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Biological Data Visualization Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-biological-data-visualization-market
Available download formats: csv, pptx, pdf
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Biological Data Visualization Market Outlook

The global biological data visualization market size was valued at approximately USD 800 million in 2023 and is expected to reach USD 2.2 billion by 2032, growing at a Compound Annual Growth Rate (CAGR) of 12%. The rising volume of biological data generated through various research activities and the increasing need for advanced analytical tools are key factors driving this market's growth. The integration of artificial intelligence and machine learning in data visualization tools, combined with the growing application of biological data visualization in personalized medicine, are also significant growth drivers.
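The growth figures above can be checked with the standard CAGR formula, (end / start) ** (1 / years) - 1. The short calculation below uses only the numbers quoted in the paragraph and assumes nine compounding years, from 2023 to 2032.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


start = 800e6        # USD, 2023 valuation quoted in the report
end = 2.2e9          # USD, 2032 forecast quoted in the report
years = 2032 - 2023  # assumed compounding period

implied_rate = cagr(start, end, years)
projected_end = start * (1 + 0.12) ** years

print(f"Implied CAGR: {implied_rate:.1%}")
print(f"Value after {years} years at 12%: USD {projected_end / 1e9:.2f} billion")
```

The implied rate comes out just under 12%, and compounding USD 800 million at 12% for nine years gives roughly USD 2.2 billion, so the quoted figures are consistent with each other.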

One of the primary growth factors of the biological data visualization market is the exponential increase in biological data generation due to advancements in high-throughput technologies such as next-generation sequencing (NGS), mass spectrometry, and microarray technology. These technologies produce vast amounts of data that require sophisticated visualization tools for proper analysis and interpretation. Without effective visualization, the potential insights and discoveries within this data may remain untapped, underscoring the market's critical role in modern biological research.

Additionally, the increasing prevalence of complex diseases and the subsequent demand for personalized medicine are fueling the demand for advanced data visualization tools. Personalized medicine relies heavily on the analysis of genetic, proteomic, and other biological data to tailor treatments to individual patients. Effective visualization tools facilitate the interpretation of this complex data, enabling healthcare providers to make informed clinical decisions. This trend is expected to drive substantial growth in the biological data visualization market over the forecast period.

Moreover, there is a growing adoption of cloud-based visualization solutions. Cloud deployment offers significant advantages, including scalability, cost-effectiveness, and accessibility from various locations. This is particularly beneficial for academic and research institutions and smaller biotech companies with limited resources. The integration of cloud computing with advanced visualization tools is expected to further propel market growth, as it allows for more efficient handling and analysis of large datasets.

From a regional perspective, North America currently holds the largest market share, driven by significant investments in research and development, advanced healthcare infrastructure, and high adoption rates of advanced technologies. Europe follows closely, with substantial growth attributed to government support for research initiatives and a strong presence of pharmaceutical and biotech companies. The Asia Pacific region is anticipated to witness the highest CAGR, owing to increasing investments in biotech research, growing healthcare infrastructure, and expanding adoption of advanced technologies in countries like China and India.

In the realm of Life Sciences Analytics, the role of data visualization is becoming increasingly pivotal. Life Sciences Analytics involves the use of data-driven insights to enhance research and development, clinical trials, and patient care. By leveraging advanced visualization tools, researchers and healthcare professionals can gain a deeper understanding of complex biological data, leading to more informed decisions and innovative solutions. The integration of Life Sciences Analytics with data visualization not only facilitates the interpretation of vast datasets but also accelerates the discovery of new patterns and correlations, ultimately advancing the field of personalized medicine.

Component Analysis

The biological data visualization market by component is segmented into software and services. Software solutions constitute the bulk of the market, providing tools that are essential for processing and visually representing complex biological data. These software tools range from basic data plotting programs to advanced systems incorporating machine learning algorithms for predictive modeling. The demand for these tools is driven by their ability to handle large datasets, provide user-friendly interfaces, and offer real-time data visualization capabilities, which are crucial for both research and clinical applications.

In contrast, the services segment, although smaller, plays a crucial role in the market. Services include co
IRIS FLOWER-plot images dataset
kaggle.com
zip
Updated Jun 6, 2024
Cite
Rijab Butt (2024). IRIS FLOWER-plot images dataset [Dataset]. https://www.kaggle.com/datasets/irijabbutt/iris-flower-plot-image-dataset
Available download formats: zip (191988944 bytes)
Dataset updated
Jun 6, 2024
Authors
Rijab Butt
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
IRIS FLOWER SCATTER PLOT IMAGES DATASET

Overview

This dataset is derived from the well-known Iris flower dataset and contains 5000 images in PNG format. These images represent scatter plots that visually capture the relationships between different pairs of features in the Iris dataset. The original Iris dataset consists of 150 samples from three species of Iris flowers (Iris setosa, Iris versicolor, and Iris virginica), with each sample having four features: sepal length, sepal width, petal length, and petal width. The scatter plot images in this dataset provide visual insights into how these features correlate and differentiate the three species.

Dataset Description

Total Images: 5000 PNG images

Image Format: PNG

Resolution: High-resolution scatter plots (resolution details can be specified)

Source: Derived from the Iris dataset available in Scikit-learn

Feature Pairs: Scatter plots are generated for all possible pairs of features (sepal length vs. sepal width, petal length vs. petal width, etc.)

Features of the Dataset

Diverse Visual Representations: The dataset includes scatter plots with various feature pairings, providing comprehensive visual analysis of feature relationships.

Species Differentiation: Each scatter plot clearly distinguishes between the three species of Iris flowers using different colors or markers.

High Quality: The images are generated with high-quality plotting techniques to ensure clarity and precision in the representation of data points.

Annotations: Scatter plots are annotated with axes labels and legends to facilitate easy interpretation.

Randomized Samples: The dataset contains 5000 images, which implies multiple scatter plots for each pair of features, with randomized sample selections to cover different aspects and variations within the dataset.

Use Cases

Data Visualization: Ideal for educational purposes to demonstrate data visualization techniques and the importance of scatter plots in exploratory data analysis.

Machine Learning: Useful for training machine learning models on image recognition tasks, particularly in distinguishing between different species based on visual patterns.

Research and Analysis: Can be used in research studies that require a large number of scatter plot images for testing new algorithms in image processing or pattern recognition.

Conclusion

The Iris Flower Scatter Plot Images Dataset provides a rich resource for visual data analysis, machine learning training, and educational purposes. By leveraging the classic Iris dataset, it offers a unique way to explore feature relationships through high-quality scatter plot images.
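To make "all possible pairs of features" concrete, the four Iris features yield six unordered pairs. The sketch below enumerates them with the standard library. How the author actually sampled points and rendered the 5000 images is not stated, so only the pairing step is shown.

```python
from itertools import combinations

FEATURES = ["sepal length", "sepal width", "petal length", "petal width"]

# Every unordered pair of distinct features: C(4, 2) = 6 scatter-plot axes.
pairs = list(combinations(FEATURES, 2))

for x_axis, y_axis in pairs:
    print(f"{x_axis} vs. {y_axis}")
```

With six pairs and 5000 images, each pairing must appear many times, which matches the description's note about randomized sample selections.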
Data from: Camera Movements Dataset
universe.roboflow.com
zip
Updated Dec 9, 2023
Cite
Diploma thesis (2023). Camera Movements Dataset [Dataset]. https://universe.roboflow.com/diploma-thesis/camera-movements
Available download formats: zip
Dataset updated
Dec 9, 2023
Dataset authored and provided by
Diploma thesis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Veins EoFK Polygons
Description
Here are a few use cases for this project:

Manga Content Curation: Utilize the "elements in manga" computer vision model to categorize manga based on specific visual features (cry, vein, chibi, kimono, focus, movement), making it easier for readers to discover content according to their interests and preferences.

Automatic Manga Translation Assistance: Assist translators working on manga localization by identifying key visual elements (cry, vein, chibi, kimono, focus, movement) to inform an accurate and culturally sensitive translation of the material, preserving the original artistic intent.

Artistic Style Analysis: Compare and contrast different manga artists' styles and techniques by analyzing the use of key visual elements (cry, vein, chibi, kimono, focus, movement) in their creations, providing valuable insights for aspiring manga creators and enthusiasts.

Manga Storytelling Aid: Enhance storytelling in manga creation by using the "elements in manga" computer vision model to analyze the impact of specific visual elements (cry, vein, chibi, kimono, focus, movement) on storytelling, pacing, and emotional impact.

Manga Comics Accessibility: Improve accessibility for visually impaired readers by combining the "elements in manga" computer vision model with natural language processing to generate detailed and accurate image descriptions based on the presence of key visual elements (cry, vein, chibi, kimono, focus, movement), making manga content more accessible through screen readers or braille displays.
Imagedetection Dataset
universe.roboflow.com
zip
Updated Apr 1, 2023
Cite
custom yolov5 (2023). Imagedetection Dataset [Dataset]. https://universe.roboflow.com/custom-yolov5-fwa2b/imagedetection-kf1ww/model/2
Available download formats: zip
Dataset updated
Apr 1, 2023
Dataset authored and provided by
custom yolov5
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Letters Bounding Boxes
Description
Here are a few use cases for this project:

Educational Application: This model could be used in educational applications or games designed for children learning to recognize letters or digits. It could help in providing immediate feedback to learners by identifying whether the written letter or digit is correct.

Document Analysis: The model could be applied for document analysis and capturing data from written or printed material, including books, bills, notes, letters, and more. The numbers and special characters capability could be used for capturing amounts, expressions, or nuances in the text.

Accessibility Software: This model could be integrated into accessibility software applications aimed at assisting visually impaired individuals. It can analyze images or real-time video to read out the identified letters, figures, and special characters.

License Plate Recognition: Given its ability to recognize a wide array of symbols, the model could be useful for extracting information from license plates, aiding in security and law enforcement settings.

Handwritten Forms Processing: This computer vision model could be utilized to extract and categorize data from handwritten forms or applications, aiding in the automation of data entry tasks in various organizations.
National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania
microdata.worldbank.org
datacatalog.ihsn.org
Updated Mar 17, 2021
Cite
National Bureau of Statistics (2021). National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania [Dataset]. https://microdata.worldbank.org/index.php/catalog/3814
Dataset updated
Mar 17, 2021
Dataset authored and provided by
National Bureau of Statistics
Time period covered
2008 - 2015
Area covered
Tanzania
Description
Abstract

Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. Because this BID for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.

The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means of conducting not only user-driven analyses using convenient, standardized tools, but also for monitoring MKUKUTA, FYDP II, and other national level development indicators reported by the NPS.

The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
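The pooling idea described above can be illustrated with a minimal sketch. It assumes each round is a list of records and that a `round` label is attached when the rounds are stacked; the field names `hh_id` and `consumption` and all values are hypothetical, not actual NPS-UPD variables or data.

```python
# Hypothetical round-specific records; field names and values are invented
# for illustration and are not actual NPS variables or data.
rounds = {
    "R1": [{"hh_id": 1, "consumption": 100.0}, {"hh_id": 2, "consumption": 80.0}],
    "R2": [{"hh_id": 1, "consumption": 110.0}, {"hh_id": 2, "consumption": 85.0}],
    "R3": [{"hh_id": 1, "consumption": 120.0}],
}

def pool_rounds(rounds_by_label):
    """Stack all rounds into one list, tagging each record with its round."""
    pooled = []
    for label, records in rounds_by_label.items():
        for record in records:
            pooled.append({"round": label, **record})
    return pooled

def household_history(pooled, hh_id):
    """Return (round, consumption) pairs for one household from pooled data."""
    return [(r["round"], r["consumption"]) for r in pooled if r["hh_id"] == hh_id]

pooled = pool_rounds(rounds)
print(household_history(pooled, 1))
```

Once the rounds are stacked with a round label, following one household over time is a simple filter rather than a merge across separate files, which is the convenience the pooled design aims for.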

Geographic coverage

Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.

Analysis unit

Households

Individuals

Universe

The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.

Kind of data

Sample survey data [ssd]

Sampling procedure

While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e. attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), which limits the growth of this selection bias, the longitudinal cohorts were refreshed for the NPS 2014/15 to ensure that estimates remain representative while retaining a primary sample large enough to preserve cohesion within panel analysis. A newly completed Population and Housing Census (PHC) in 2012, providing updated population figures along with changes in administrative boundaries, offered an opportunity to realign the NPS sample and reduce the collective bias potentially introduced through attrition.

To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.

Mode of data collection

Face-to-face [f2f]

Research instrument

The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies whether the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD (“U” on the survey instrument), (2) R4, (3) R3, (4) R2, and (5) R1, with the latter four sections presenting each questionnaire in its original form at the time of its respective dissemination.

The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.

The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPD code listing).
Market Value Analysis 2015
catalog.data.gov
data.nola.gov
Updated Jul 12, 2025
Cite
data.nola.gov (2025). Market Value Analysis 2015 [Dataset]. https://catalog.data.gov/dataset/market-value-analysis-2015
Dataset updated
Jul 12, 2025
Dataset provided by
data.nola.gov
Description
Our Normative Assumptions when Analyzing Markets:
• Public subsidy is scarce and it alone cannot create a market;
• Public subsidy must be used to leverage, or clear the path for, private investment;
• In distressed markets, invest into strength (e.g., major institutions, transportation hubs, environmental amenities) – “Build from Strength”;
• All parts of a city are customers of the services and resources that it has to offer;
• Decisions to invest and/or deploy governmental programs must be based on objectively gathered data and sound quantitative and qualitative analysis.

Preparing the MVA:
1. Take all of the data layers and geocode to Census block groups.
2. Inspect and validate those data layers.
3. Using a statistical cluster analysis, identify areas that share a common constellation of characteristics.
4. Map the result.
5. Visually inspect areas of the City for conformity with the statistical/spatial representation.
6. Re-solve and re-inspect until we achieve an accurate representation.
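The clustering step in the MVA preparation can be sketched as follows. The description says only "statistical cluster analysis" and does not name a method, so plain k-means is used here purely as a stand-in; the two indicators, their values, and the starting centroids are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster (keep old centroid if empty).
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical block groups: (median sale price in $1000s, vacancy rate in %).
block_groups = [(300, 2), (320, 3), (310, 1), (80, 25), (90, 30), (70, 28)]

# Starting centroids are fixed so the sketch is deterministic.
labels, centroids = kmeans(block_groups, centroids=[(300, 2), (80, 25)])
print(labels)
```

With real indicators on different scales, the values would normally be standardized first, because Euclidean distance is otherwise dominated by the variable with the largest range. The "re-solve and re-inspect" step corresponds to rerunning the clustering with different settings and checking the mapped result against what is observed on the ground.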
Visualization of networks – analyzing and visualizing connections between...
meta4ds.fokus.fraunhofer.de
unknown, zip
Updated Oct 4, 2021
Cite
Zenodo (2021). Visualization of networks – analyzing and visualizing connections between (planned) NFDI consortia [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/oai-zenodo-org-5548239?locale=en
Available download formats: zip (17160587), unknown
Dataset updated
Oct 4, 2021
Dataset authored and provided by
Zenodo (http://zenodo.org/)
Description
This is a fix-release for some broken links in the README. Thanks to @HenningTimm for the community-driven support.

Dorothea Strecker, Sama Majidian, Lukas C. Bossert, Évariste Demandt

This repository contains materials used during the workshop "Visualization of networks – analyzing and visualizing connections between (planned) NFDI consortia" at the NFDI4Ing Community Meeting (NFDI4Ing Konferenz) 2021 on September 28. During the workshop, the network of (planned) NFDI consortia was visualized and analyzed using the statistical software R and the library igraph.

Abstract: Currently, Germany's National Research Data Infrastructure spans a network of nine funded consortia from the first round and ten from the second round. This workshop enables you to visually display and analyze the network of consortia in your internet browser via a remote Jupyter Notebook. The workshop follows the tradition of literate programming. No prior programming experience and no locally installed software are needed. Let's weave and tangle!

Slides: The presentation slides for the workshop are stored in the file "NFDI4Ing_Community_Meeting_2021.pdf".

Jupyter Notebook for visualization of networks with R: In the interactive part of the workshop we worked with Jupyter Notebooks. The documented sample solution is stored in various formats in the folder Notebook. Direct exports from the Jupyter Notebook are provided in the following formats: Jupyter Notebook (R), PDF (via LuaLaTeX), org-mode, Markdown, Rscript, Webpage, WebSlides.

This repository is licensed under the MIT License.
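As a rough illustration of the kind of network analysis described, the sketch below builds a small undirected graph and ranks nodes by their number of connections. The workshop itself used R and igraph; this is a standard-library Python stand-in, and the node names and edges are placeholders rather than actual consortia data.

```python
from collections import defaultdict

# Hypothetical undirected links between placeholder consortia; the real
# workshop data and the R/igraph code are not reproduced here.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def degrees(adjacency):
    """Number of neighbours per node, a basic measure of connectedness."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}

adjacency = build_adjacency(edges)
degree = degrees(adjacency)
# List nodes from most to least connected (ties broken alphabetically).
for node in sorted(degree, key=lambda n: (-degree[n], n)):
    print(node, degree[node])
```

Degree is only one of several measures that could be computed on such a network; it is used here because it needs nothing beyond the edge list.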
Women National Basketball Association Shots Dataset
cubig.ai
zip
Updated Jun 12, 2025
Cite
CUBIG (2025). Women National Basketball Association Shots Dataset [Dataset]. https://cubig.ai/store/products/469/women-national-basketball-association-shots-dataset
Available download formats: zip
Dataset updated
Jun 12, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Women National Basketball Association Shots Dataset compiles a total of 41,497 shot attempts in the WNBA during the 2021–2022 season, including the game ID, type, success or failure, scoring value, spatial coordinates, team and score status, and remaining time per quarter.

2) Data Utilization
(1) The Women National Basketball Association Shots Dataset has the following characteristics:
• This dataset mixes categorical and numerical variables and contains both the spatial and the contextual information of each shot.
(2) The Women National Basketball Association Shots Dataset can be used for:
• Predicting Shot Success: By training a machine learning classification model on the x and y coordinates and the game situation information, the success or failure of each shot attempt can be predicted.
• Shot Position Pattern Analysis: Create a heat map or contour plot based on coordinate_x and coordinate_y to visually analyze frequently attempted shot positions and success patterns.
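The shot position analysis can be sketched as a simple grid count, which is the data a heat map would be drawn from. The column names `coordinate_x` and `coordinate_y` come from the description; the `made` field, the cell size, and all values below are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical shot records. `coordinate_x` and `coordinate_y` follow the
# column names in the description; `made` and all values are invented.
shots = [
    {"coordinate_x": 1.0, "coordinate_y": 2.0, "made": True},
    {"coordinate_x": 3.0, "coordinate_y": 4.0, "made": False},
    {"coordinate_x": 12.0, "coordinate_y": 3.0, "made": True},
    {"coordinate_x": 14.0, "coordinate_y": 1.0, "made": True},
    {"coordinate_x": 2.0, "coordinate_y": 11.0, "made": False},
]

def shot_grid(records, cell_size=10.0):
    """Bin shots into square cells and count attempts and makes per cell."""
    grid = defaultdict(lambda: {"attempts": 0, "made": 0})
    for shot in records:
        cell = (int(shot["coordinate_x"] // cell_size),
                int(shot["coordinate_y"] // cell_size))
        grid[cell]["attempts"] += 1
        grid[cell]["made"] += int(shot["made"])
    return grid

grid = shot_grid(shots)
for cell, stats in sorted(grid.items()):
    rate = stats["made"] / stats["attempts"]
    print(cell, stats["attempts"], f"{rate:.2f}")
```

The per-cell attempt counts show where shots are taken most often, and the per-cell success rate gives a crude baseline that a trained classifier would be expected to improve on.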
