https://cubig.ai/store/terms-of-service
1) Data Introduction • The Hacker News Sentiment Analysis Dataset is a technology-community opinion dataset that provides sentiment analysis (polarity, subjectivity, and sentiment category) for each of the top 141 Hacker News posts, along with each post's title, URL, points, and comment count.
2) Data Utilization
(1) Hacker News Sentiment Analysis Dataset has characteristics that:
• This dataset includes polarity (−1 to 1), subjectivity (0 to 1), and category (positive/neutral/negative) columns that quantify comment sentiment using TextBlob, based on the top posts as of June 24, 2025.
• It was generated through web scraping and NLP preprocessing, and allows quantitative comparison of community responses to technology news.
(2) Hacker News Sentiment Analysis Dataset can be used to:
• Visualize technology sentiment trends: Connect sentiment scores with post topics to visually analyze community response patterns to specific technology news, such as AI and policy stories.
• Train NLP models: Sentiment classification models can be trained on comment data containing real-world technical discussions, or the data can be applied to research on predicting comment subjectivity.
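As a sketch of how such columns can be derived: TextBlob's `.sentiment` returns a (polarity, subjectivity) pair, and the polarity score can then be binned into a category. The thresholds below are illustrative assumptions, not the dataset's documented cutoffs:

```python
# Sketch of deriving the dataset's category column from a polarity score.
# The +/-0.05 thresholds are assumptions for illustration only.

def categorize(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment category."""
    if polarity > 0.05:
        return "positive"
    if polarity < -0.05:
        return "negative"
    return "neutral"

# With TextBlob (not run here):
# from textblob import TextBlob
# polarity = TextBlob(comment_text).sentiment.polarity

for p in (0.6, 0.0, -0.3):
    print(p, categorize(p))
```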
The Data Comparison extension for CKAN allows users to compare data from CSV and XLSX files through visualizations. The extension aims to enhance data analysis capabilities within CKAN by providing a direct visual comparison of datasets, making their content easier to understand. It is compatible with CKAN 2.9, providing an extended feature set for data comparison.
Key Features:
CSV/XLSX Data Support: Enables the comparison of data stored in common tabular formats such as CSV and XLSX files by leveraging the extension's visualization capabilities.
Visual Data Comparison: Supports side-by-side visual comparisons, allowing users to easily identify differences and similarities between datasets.
Chart.js Integration: Relies on the Chart.js library for generating the comparison visualizations, ensuring compatibility and a wide selection of chart formats.
Technical Integration: The extension must be added to the ckan.plugins setting in the CKAN configuration file (/etc/ckan/default/ckan.ini by default), and Chart.js must be installed via npm. After these configurations and a CKAN restart, the plugin extends CKAN's user interface with data comparison features.
Benefits & Impact: By providing a comparative view of data stored in CSV and XLSX formats, the Data Comparison extension reduces the effort needed to analyze datasets and supports better decision-making based on clear, visually represented comparisons. Because the data is visualized, differences and similarities stand out more quickly than in raw tables.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Library Management: Use the "books" model to help librarians identify and categorize books based on their covers, making it easier to manage book inventory, re-shelve returned books, and locate misplaced books.
Bookstore Assistance: Implement the "books" model in bookstores' mobile apps or in-store kiosks, allowing customers to quickly find books they are looking for and discover related titles by snapping a picture of a book cover.
Book Club Recommendation Engine: Use the "books" model as the basis for a book club app that recommends new titles based on the book covers from previous selections, helping users explore new genres and authors.
Accessibility for Visually Impaired: Integrate the "books" model into software or devices designed for visually impaired individuals, helping them identify and select books without needing to rely on others for assistance.
Academic Research: Utilize the "books" model in academic research projects to identify books and their authors for literary, historical, or sociological studies, making it easier to analyze large datasets of visual book cover material.
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Title: Uber Customer Reviews Dataset (2024)
Subtitle: Sentiment Analysis and Insights from 12,000+ Google Play Store Reviews
This dataset contains over 12,000 customer reviews of the Uber app collected from the Google Play Store. The reviews provide insights into user experiences, including ratings, feedback on services, and developer responses. The data is cleaned and anonymized to ensure privacy compliance and ethical usage. It serves as a valuable resource for sentiment analysis, natural language processing (NLP), and machine learning applications.
Sentiment Analysis: Classify review sentiment using the score and content columns.
Natural Language Processing (NLP): Extract topics and keywords from the review text.
Machine Learning: Train models to predict ratings from review content.
Business Insights: Track user satisfaction across releases (appVersion).
Time Series Analysis: Analyze review trends over time using the at column.
Customer Behavior Analysis: Combine score and thumbsUpCount to understand review impact, and study the effect of developer replies (replyContent) on customer satisfaction.

| Column Name | Description |
| --- | --- |
| userName | Anonymized username of the reviewer. |
| userImage | URL of the reviewer's profile image (if available). |
| content | Text content of the review describing the user's experience. |
| score | Numerical rating given by the user (1–5). |
| thumbsUpCount | Number of likes received by the review. |
| reviewCreatedVersion | App version at the time of review creation (if available). |
| at | Timestamp indicating when the review was posted. |
| replyContent | Developer's response to the review (if any). |
| repliedAt | Timestamp indicating when the developer replied (if any). |
| appVersion | App version string associated with the review (if available). |
This dataset was collected in compliance with ethical web scraping practices:
- Data was sourced from publicly available Google Play Store reviews.
- Personally identifiable information (PII) such as email addresses, images, or phone numbers has been removed.
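As a minimal sketch of working with these columns, the average rating per app version can be computed as below. The rows here are hypothetical stand-ins for the actual CSV export, whose filename is not specified in the dataset description:

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows standing in for the dataset's CSV export.
sample = io.StringIO(
    "userName,content,score,thumbsUpCount,at,appVersion\n"
    "user_001,Great service,5,12,2024-03-01,4.512.10\n"
    "user_002,App keeps crashing,1,30,2024-03-02,4.512.10\n"
    "user_003,Driver was late,2,4,2024-03-03,4.511.9\n"
)

# Average rating per app version: version -> [score sum, review count].
totals = defaultdict(lambda: [0, 0])
for row in csv.DictReader(sample):
    s, c = totals[row["appVersion"]]
    totals[row["appVersion"]] = [s + int(row["score"]), c + 1]

averages = {v: s / c for v, (s, c) in totals.items()}
print(averages)
```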
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is a self-guided project.
PROBLEM STATEMENT: What underlying trends in our Pizza Sales data could the company be missing that would aid in a gap analysis of its business sales?
OBJECTIVES: 1. Generate Key Performance Indicators (KPIs) from the Pizza Sales data to gain insight into underlying business performance. 2. Visualize important aspects of the Pizza Sales data to surface and understand key trends.
I dove into the CSV dataset to uncover patterns within the Pizza Sales data, which spanned a full calendar year.
I used Microsoft SQL Server Management Studio (SSMS) to perform EDA (Exploratory Data Analysis), identifying trends and sales patterns.
Having completed that, I used Microsoft Power BI to create visualizations that present my analytical findings to technical and non-technical viewers.
STEPS COMPLETED: data importation, SQL analysis query writing, data cleaning, data processing, data visualization, and report/dashboard development.
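A minimal illustration of the KPI step, using hypothetical rows in place of the actual pizza sales CSV (the column names here are assumptions):

```python
# Hypothetical rows standing in for the pizza sales CSV; columns assumed.
orders = [
    {"order_id": 1, "pizza_name": "Margherita", "quantity": 2, "total_price": 24.0},
    {"order_id": 1, "pizza_name": "Pepperoni",  "quantity": 1, "total_price": 14.5},
    {"order_id": 2, "pizza_name": "Margherita", "quantity": 1, "total_price": 12.0},
]

# Typical pizza-sales KPIs: revenue, pizzas sold, order count, average order value.
total_revenue = sum(r["total_price"] for r in orders)
total_pizzas = sum(r["quantity"] for r in orders)
total_orders = len({r["order_id"] for r in orders})
avg_order_value = total_revenue / total_orders

print(total_revenue, total_pizzas, total_orders, round(avg_order_value, 2))
```

The same aggregations map directly onto SQL (`SUM`, `COUNT(DISTINCT order_id)`) for the SSMS step.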
https://spdx.org/licenses/CC0-1.0.html
We introduce a dataset for facilitating audio-visual analysis of musical performances. The dataset comprises 44 simple multi-instrument classical music pieces assembled from coordinated but separately recorded performances of individual tracks. For each piece, we provide the musical score in MIDI format, the audio recordings of the individual tracks, the audio and video recording of the assembled mixture, and ground-truth annotation files including frame-level and note-level transcriptions. We describe our methodology for the creation of the dataset, particularly highlighting our approaches for addressing the challenges involved in maintaining synchronization and expressiveness. We demonstrate the high quality of synchronization achieved with our proposed approach by comparing the dataset against existing widely-used music audio datasets. We anticipate that the dataset will be useful for the development and evaluation of existing music information retrieval (MIR) tasks, as well as for novel multi-modal tasks. We benchmark two existing MIR tasks (multi-pitch analysis and score-informed source separation) on the dataset and compare against other existing music audio datasets. Additionally, we consider two novel multi-modal MIR tasks (visually informed multi-pitch analysis and polyphonic vibrato analysis) enabled by the dataset and provide evaluation measures and baseline systems for future comparisons (from our recent work). Finally, we propose several emerging research directions that the dataset enables.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Classic confusion matrix to visually analyze the classification performance of an algorithm.
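A confusion matrix tabulates true labels (rows) against predicted labels (columns), so correct classifications fall on the diagonal. A minimal sketch with illustrative labels:

```python
# Build a confusion matrix for a 3-class problem; labels are illustrative.
labels = ["cat", "dog", "bird"]
y_true = ["cat", "dog", "dog", "bird", "cat", "bird"]
y_pred = ["cat", "dog", "cat", "bird", "cat", "dog"]

idx = {lab: i for i, lab in enumerate(labels)}
matrix = [[0] * len(labels) for _ in labels]  # rows: true, cols: predicted
for t, p in zip(y_true, y_pred):
    matrix[idx[t]][idx[p]] += 1

for lab, row in zip(labels, matrix):
    print(f"{lab:>5}: {row}")
```

Off-diagonal cells show exactly which classes the algorithm confuses, which is what makes the matrix useful for visual analysis.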
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In today’s digital landscape, visual content plays a crucial role in shaping consumer behavior. This study explores how visual electronic word-of-mouth (eWOM) on social media influences online purchase intention, applying the Stimulus-Organism-Response (SOR) framework. Using Partial Least Squares Structural Equation Modeling (PLS-SEM) to analyze data from 335 social media users, this study examines the effects of visual eWOM’s quality, quantity, and credibility on consumer perceptions, attitudes, and ultimately their purchase intentions. Our findings reveal that the quality and credibility of visual eWOM significantly enhance perceived information usefulness and its adoption by consumers. Information quantity, however, primarily influences attitudes towards the information, but does not directly drive its adoption. Contrary to expectations, information usefulness alone cannot predict purchase intention. Instead, information adoption emerges as a key mediator, indicating that consumers must actively engage with and internalize visual content for it to impact their buying behavior. This underscores that the effectiveness of visual eWOM is not solely based on its characteristics but depends on consumers’ active engagement and processing. These insights highlight the need for content that is not only visually appealing but also credible and engaging to facilitate information adoption and drive purchase intentions. This study enhances the understanding of visual eWOM’s impact on online purchasing and provides valuable insights for marketers aiming to optimize digital engagement strategies.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset is part of a dashboard project that analyzes Uber ride behavior across different time patterns – built using Microsoft Power BI.
Feel free to fork, reuse, or share feedback!
https://crawlfeeds.com/privacy_policy
The Amazon Pets Category Images Dataset is a curated collection of high-resolution images sourced from the pet products category on Amazon. This dataset contains images across various subcategories, such as pet food, toys, grooming tools, bedding, and accessories. With a wide range of products for pets like dogs, cats, birds, and more, this dataset is perfect for researchers, developers, and businesses interested in studying product visuals, conducting market analysis, or training AI models focused on pet-related imagery.
The dataset consists solely of product images, without accompanying metadata or descriptions, offering a straightforward resource for visual analysis, product comparison, or training image-based machine learning models.
Key Features:
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Facebook Data is a social-network analysis dataset covering Facebook users' activity patterns, interactions, likes, friendships, gender, and age. It can be used to identify key user groups that contribute to business growth and to develop recommendation strategies.
2) Data Utilization
(1) Facebook Data has characteristics that:
• This dataset consists of numerical and categorical variables such as user ID, gender, age, number of friends, number of likes (mobile/web), number of friend requests, number of likes received/sent, and frequency of activity, allowing user-specific behavioral characteristics and interaction patterns to be analyzed from multiple angles.
(2) Facebook Data can be used to:
• Core user group targeting and recommendation strategies: Use key characteristics such as gender, age, activity frequency, friends, and likes to identify user groups with a significant impact on business growth and to develop customized content and advertising recommendation strategies.
• Analysis of usage behavior and platform trends: By analyzing data such as the distribution of mobile versus web likes, age- and gender-specific activity patterns, and friendship formation, you can visually explore changes in user behavior and major trends within the platform.
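One simple behavioral feature from the mobile/web like counts is each user's share of likes given via mobile. The records below are hypothetical, with column names assumed from the description:

```python
# Hypothetical user records; field names assumed from the dataset description.
users = [
    {"age": 23, "gender": "female", "mobile_likes": 150, "www_likes": 20},
    {"age": 23, "gender": "male",   "mobile_likes": 80,  "www_likes": 60},
    {"age": 45, "gender": "female", "mobile_likes": 30,  "www_likes": 90},
]

# Share of likes given via mobile, per user.
for u in users:
    total = u["mobile_likes"] + u["www_likes"]
    u["mobile_share"] = u["mobile_likes"] / total if total else 0.0

print([round(u["mobile_share"], 2) for u in users])
```

Aggregating this share by age or gender group is one way to explore the mobile-versus-web usage trends mentioned above.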
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Titanic Dataset, based on passenger records from the Titanic, which sank in 1912, is a classic binary-classification dataset that includes demographic and boarding variables such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.
2) Data Utilization
(1) Titanic Dataset has characteristics that:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical), and variables such as Age, Cabin, and Embarked contain missing values, making it well suited for preprocessing and feature-engineering practice.
(2) Titanic Dataset can be used to:
• Development of survival prediction models: Key features such as passenger class, sex, age, and fare can be used to predict survival with machine learning classifiers such as logistic regression, random forests, and SVMs.
• Analysis of survival influencing factors: By analyzing the correlation between survival rates and variables such as sex, age, and socioeconomic status, you can statistically and visually explore which groups had a higher probability of survival.
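Before modeling, the influencing-factor analysis often starts with group survival rates. A minimal sketch on hypothetical records (the real dataset has 891 rows):

```python
from collections import defaultdict

# Hypothetical passenger records illustrating survival-rate analysis by group.
passengers = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male",   "Pclass": 1, "Survived": 0},
    {"Sex": "male",   "Pclass": 3, "Survived": 0},
    {"Sex": "male",   "Pclass": 3, "Survived": 1},
]

# group -> [survivors, total]
counts = defaultdict(lambda: [0, 0])
for p in passengers:
    c = counts[p["Sex"]]
    c[0] += p["Survived"]
    c[1] += 1

rates = {g: s / n for g, (s, n) in counts.items()}
print(rates)  # survival rate per sex on this toy sample
```

Swapping the grouping key to "Pclass" (or a combination) gives the class- and status-based comparisons described above.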
https://dataintelo.com/privacy-and-policy
The global biological data visualization market size was valued at approximately USD 800 million in 2023 and is expected to reach USD 2.2 billion by 2032, growing at a Compound Annual Growth Rate (CAGR) of 12%. The rising volume of biological data generated through various research activities and the increasing need for advanced analytical tools are key factors driving this market's growth. The integration of artificial intelligence and machine learning in data visualization tools, combined with the growing application of biological data visualization in personalized medicine, are also significant growth drivers.
One of the primary growth factors of the biological data visualization market is the exponential increase in biological data generation due to advancements in high-throughput technologies such as next-generation sequencing (NGS), mass spectrometry, and microarray technology. These technologies produce vast amounts of data that require sophisticated visualization tools for proper analysis and interpretation. Without effective visualization, the potential insights and discoveries within this data may remain untapped, underscoring the market's critical role in modern biological research.
Additionally, the increasing prevalence of complex diseases and the subsequent demand for personalized medicine are fueling the demand for advanced data visualization tools. Personalized medicine relies heavily on the analysis of genetic, proteomic, and other biological data to tailor treatments to individual patients. Effective visualization tools facilitate the interpretation of this complex data, enabling healthcare providers to make informed clinical decisions. This trend is expected to drive substantial growth in the biological data visualization market over the forecast period.
Moreover, there is a growing adoption of cloud-based visualization solutions. Cloud deployment offers significant advantages, including scalability, cost-effectiveness, and accessibility from various locations. This is particularly beneficial for academic and research institutions and smaller biotech companies with limited resources. The integration of cloud computing with advanced visualization tools is expected to further propel market growth, as it allows for more efficient handling and analysis of large datasets.
From a regional perspective, North America currently holds the largest market share, driven by significant investments in research and development, advanced healthcare infrastructure, and high adoption rates of advanced technologies. Europe follows closely, with substantial growth attributed to government support for research initiatives and a strong presence of pharmaceutical and biotech companies. The Asia Pacific region is anticipated to witness the highest CAGR, owing to increasing investments in biotech research, growing healthcare infrastructure, and expanding adoption of advanced technologies in countries like China and India.
In the realm of Life Sciences Analytics, the role of data visualization is becoming increasingly pivotal. Life Sciences Analytics involves the use of data-driven insights to enhance research and development, clinical trials, and patient care. By leveraging advanced visualization tools, researchers and healthcare professionals can gain a deeper understanding of complex biological data, leading to more informed decisions and innovative solutions. The integration of Life Sciences Analytics with data visualization not only facilitates the interpretation of vast datasets but also accelerates the discovery of new patterns and correlations, ultimately advancing the field of personalized medicine.
The biological data visualization market by component is segmented into software and services. Software solutions constitute the bulk of the market, providing tools that are essential for processing and visually representing complex biological data. These software tools range from basic data plotting programs to advanced systems incorporating machine learning algorithms for predictive modeling. The demand for these tools is driven by their ability to handle large datasets, provide user-friendly interfaces, and offer real-time data visualization capabilities, which are crucial for both research and clinical applications.
In contrast, the services segment, although smaller, plays a crucial role in the market. Services include co
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is derived from the well-known Iris flower dataset and contains 5000 images in PNG format. These images represent scatter plots that visually capture the relationships between different pairs of features in the Iris dataset. The original Iris dataset consists of 150 samples from three species of Iris flowers (Iris setosa, Iris versicolor, and Iris virginica), with each sample having four features: sepal length, sepal width, petal length, and petal width. The scatter plot images in this dataset provide visual insights into how these features correlate and differentiate the three species.
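With four features there are six unordered feature pairs behind such scatter plots; matplotlib's `scatter` would render the actual images, but the pair enumeration itself is simple:

```python
from itertools import combinations

# The four Iris features; each unordered pair yields one scatter-plot axis pairing.
features = ["sepal length", "sepal width", "petal length", "petal width"]
pairs = list(combinations(features, 2))
print(len(pairs), pairs[0])
```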
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Manga Content Curation: Utilize the "elements in manga" computer vision model to categorize manga based on specific visual features (cry, vein, chibi, kimono, focus, movement), making it easier for readers to discover content according to their interests and preferences.
Automatic Manga Translation Assistance: Assist translators working on manga localization by identifying key visual elements (cry, vein, chibi, kimono, focus, movement) to inform an accurate and culturally sensitive translation of the material, preserving the original artistic intent.
Artistic Style Analysis: Compare and contrast different manga artists' styles and techniques by analyzing the use of key visual elements (cry, vein, chibi, kimono, focus, movement) in their creations, providing valuable insights for aspiring manga creators and enthusiasts.
Manga Storytelling Aid: Enhance storytelling in manga creation by using the "elements in manga" computer vision model to analyze the impact of specific visual elements (cry, vein, chibi, kimono, focus, movement) on storytelling, pacing, and emotional impact.
Manga Comics Accessibility: Improve accessibility for visually impaired readers by combining the "elements in manga" computer vision model with natural language processing to generate detailed and accurate image descriptions based on the presence of key visual elements (cry, vein, chibi, kimono, focus, movement), making manga content more accessible through screen readers or braille displays.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Educational Application: This model could be used in educational applications or games designed for children learning to recognize letters or digits. It could help in providing immediate feedback to learners by identifying whether the written letter or digit is correct.
Document Analysis: The model could be applied for document analysis and capturing data from written or printed material, including books, bills, notes, letters, and more. The numbers and special characters capability could be used for capturing amounts, expressions, or nuances in the text.
Accessibility Software: This model could be integrated into accessibility software applications aimed at assisting visually impaired individuals. It can analyze images or real-time video to read out the identified letters, figures, and special characters.
License Plate Recognition: Given its ability to recognize a wide array of symbols, the model could be useful for extracting information from license plates, aiding in security and law enforcement settings.
Handwritten Forms Processing: This computer vision model could be utilized to extract and categorize data from handwritten forms or applications, aiding in the automation of data entry tasks in various organizations.
Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.
This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. As this Basic Information Document (BID) for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.
The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means of conducting not only user-driven analyses using convenient, standardized tools, but also for monitoring MKUKUTA, FYDP II, and other national level development indicators reported by the NPS.
The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.
The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.
Sample survey data [ssd]
While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e., attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), minimizing the escalation of this selection bias, a refresh of longitudinal cohorts was carried out for the NPS 2014/15 to ensure proper representativeness of estimates while maintaining a sufficient primary sample to preserve cohesion within the panel analysis. A newly completed Population and Housing Census (PHC) in 2012, providing updated population figures along with changes in administrative boundaries, offered the opportunity to realign the NPS sample and mitigate cumulative bias potentially introduced through attrition.
To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.
Face-to-face [f2f]
The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies whether the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD – “U” on the survey instrument, (2) R4, (3) R3, (4) R2, and (5) R1 – the latter four sections presenting each questionnaire in its original form at the time of its respective dissemination.
The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.
The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes, and skip patterns (corresponding to their respective round-specific NPS datasets, despite any variance from other rounds or from the comprehensive UPD code listing).
Our Normative Assumptions when Analyzing Markets:
• Public subsidy is scarce and it alone cannot create a market;
• Public subsidy must be used to leverage, or clear the path for, private investment;
• In distressed markets, invest into strength (e.g., major institutions, transportation hubs, environmental amenities) – “Build from Strength”;
• All parts of a city are customers of the services and resources that it has to offer;
• Decisions to invest and/or deploy governmental programs must be based on objectively gathered data and sound quantitative and qualitative analysis.
Preparing the MVA:
1. Take all of the data layers and geocode to Census block groups.
2. Inspect and validate those data layers.
3. Using a statistical cluster analysis, identify areas that share a common constellation of characteristics.
4. Map the result.
5. Visually inspect areas of the City for conformity with the statistical/spatial representation.
6. Re-solve and re-inspect until we achieve an accurate representation.
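The cluster-analysis step can be sketched with a toy k-means run. The two indicators and their values below are hypothetical stand-ins for the MVA's full variable set, and the MVA's actual clustering method is not specified here:

```python
import random

random.seed(0)

# Hypothetical block-group indicators, e.g. (median sale price, vacancy rate).
points = [(100, 0.05), (110, 0.04), (300, 0.20), (320, 0.22), (105, 0.06)]

def kmeans(points, k=2, iters=20):
    """Minimal k-means: assign points to nearest center, recompute centers."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        centers = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

clusters = kmeans(points)
print(sorted(len(c) for c in clusters))  # block groups per market cluster
```

Each resulting cluster is a candidate "market type" that would then be mapped and visually inspected (steps 4–6).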
This is a fix release for some broken links in the README. Thanks to @HenningTimm for the community-driven support.
------
Dorothea Strecker, Sama Majidian, Lukas C. Bossert, Évariste Demandt
This repository contains materials used during the workshop "Visualization of networks – analyzing and visualizing connections between (planned) NFDI consortia" at the NFDI4Ing Community Meeting (NFDI4Ing Konferenz) 2021 on September 28. During the workshop, the network of (planned) NFDI consortia was visualized and analyzed using the statistical software R and the igraph library.
Abstract: Currently, Germany's National Research Data Infrastructure spans a network of nine funded consortia from the first round and ten from the second round. This workshop enables you to visually display and analyze the network of consortia in your internet browser via a remote Jupyter Notebook. The workshop follows the tradition of literate programming. No prior experience in programming and no locally installed software needed – let's weave and tangle!
Slides: The presentation slides for the workshop are stored in the file "NFDI4Ing_Community_Meeting_2021.pdf".
Jupyter Notebook for visualization of networks with R: In the interactive part of the workshop we worked with Jupyter Notebooks. The documented sample solution is stored in various formats in the folder Notebook. Direct exports from the Jupyter Notebook are provided in the following formats: Jupyter Notebook (R), PDF (via LuaLaTeX), org-mode, Markdown, R script, webpage, WebSlides.
This repository is licensed under the MIT License.
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Women National Basketball Association Shots Dataset compiles a total of 41,497 shot attempts in the WNBA during the 2021–2022 season, including the game ID, shot type, success or failure, point value, spatial coordinates, team and score status, and time remaining per quarter.
2) Data Utilization
(1) Women National Basketball Association Shots Dataset has characteristics that:
• This dataset mixes categorical and numerical variables, and contains both the spatial and contextual information of each shot.
(2) Women National Basketball Association Shots Dataset can be used to:
• Predicting shot success: By training a machine learning classification model on x/y coordinates and game-situation information, the success or failure of each shot attempt can be predicted.
• Shot position pattern analysis: Create a heat map or contour plot based on coordinate_x and coordinate_y to visually analyze frequently attempted shot positions and success patterns.
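The heat-map idea reduces to binning shot coordinates into grid cells and counting. The column names coordinate_x and coordinate_y come from the description; the shot values and bin size below are assumptions:

```python
from collections import Counter

# Hypothetical shot records; coordinate_x / coordinate_y per the description.
shots = [
    {"coordinate_x": 12.3, "coordinate_y": 4.1},
    {"coordinate_x": 12.9, "coordinate_y": 4.8},
    {"coordinate_x": 30.0, "coordinate_y": 18.5},
]

BIN = 5  # cell size in court units (assumed)
heat = Counter(
    (int(s["coordinate_x"] // BIN), int(s["coordinate_y"] // BIN)) for s in shots
)
print(heat.most_common(1))  # the most frequently attempted cell
```

Plotting `heat` as a 2-D grid (e.g. with matplotlib's `imshow`) yields the shot-position heat map described above.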