100+ datasets found

student data analysis
kaggle.com
Updated Nov 17, 2023
Cite
maira javeed (2023). student data analysis [Dataset]. https://www.kaggle.com/datasets/mairajaveed/student-data-analysis
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
maira javeed
Description
In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.

Key Objectives:

Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.

Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.

Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.

Dataset Details:

The dataset used in this analysis contains information about students, including their age, gender, parental education, lunch type, and test scores in subjects like mathematics, reading, and writing.

Analysis Highlights:

We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.

By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.

Why This Matters:

Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.

Acknowledgments:

We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.

Please Note:

This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
Orange dataset table
figshare.com
xlsx
Updated Mar 4, 2022
Cite
Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.19146410.v1
Dataset updated
Mar 4, 2022
Dataset provided by
figshare
Authors
Rui Simões
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced and contains no missing values, and the data were standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.
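The decision tree settings above name gain ratio as the split criterion. As a rough illustration only (this is not the Orange implementation), gain ratio divides the information gain of a candidate split by the entropy of the split itself. The sketch below computes it in plain Python; the labels and the candidate split are invented for illustration and do not come from the dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(labels, branches):
    """Gain ratio of a split: information gain divided by split information.

    `labels` holds the class of every sample; `branches` holds the branch
    each sample is sent to by the candidate split (same order as `labels`).
    """
    total = len(labels)
    groups = {}
    for label, branch in zip(labels, branches):
        groups.setdefault(branch, []).append(label)
    # Weighted entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(branches)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 8 samples, two classes, one candidate binary split.
labels = ["control"] * 4 + ["treated"] * 4
branches = ["left", "left", "left", "left", "right", "right", "right", "left"]
print(round(gain_ratio(labels, branches), 3))
```

Normalizing by the split information penalizes splits that scatter samples across many small branches, which is the usual motivation for preferring gain ratio over raw information gain.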
Web Data Analysis
kaggle.com
Updated Feb 25, 2021
Cite
Anup Pandey (2021). Web Data Analysis [Dataset]. https://www.kaggle.com/datasets/pandanup/web-data-analysis/data
Explore at:
Dataset updated
Feb 25, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Anup Pandey
Description
Dataset

This dataset was created by Anup Pandey

Contents
Insurance Dataset
gts.ai
json
Updated Oct 16, 2022
Cite
GTS (2022). Insurance Dataset [Dataset]. https://gts.ai/case-study/insurance-dataset-annotation-services-for-precision-data-analysis/
Explore at:
Available download formats: json
Dataset updated
Oct 16, 2022
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Insurance Dataset project is an extensive initiative focused on collecting and analyzing insurance-related data from various sources.
Customer Segmentation Data
kaggle.com
Updated Mar 11, 2024
Cite
Raval Smit (2024). Customer Segmentation Data [Dataset]. https://www.kaggle.com/datasets/ravalsmit/customer-segmentation-data
Explore at:
Dataset updated
Mar 11, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Raval Smit
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides comprehensive customer data suitable for segmentation analysis. It includes anonymized demographic, transactional, and behavioral attributes, allowing for detailed exploration of customer segments. Leveraging this dataset, marketers, data scientists, and business analysts can uncover valuable insights to optimize targeted marketing strategies and enhance customer engagement. Whether you're looking to understand customer behavior or improve campaign effectiveness, this dataset offers a rich resource for actionable insights and informed decision-making.

Key Features:

- Anonymized demographic, transactional, and behavioral data.
- Suitable for customer segmentation analysis.
- Opportunities to optimize targeted marketing strategies.
- Valuable insights for improving campaign effectiveness.
- Ideal for marketers, data scientists, and business analysts.

Usage Examples:

- Segmenting customers based on demographic attributes.
- Analyzing purchase behavior to identify high-value customer segments.
- Optimizing marketing campaigns for targeted engagement.
- Understanding customer preferences and tailoring product offerings accordingly.
- Evaluating the effectiveness of marketing strategies and iterating for improvement.

Explore this dataset to unlock actionable insights and drive success in your marketing initiatives!
Walmart products free dataset
crawlfeeds.com
csv, zip
Updated Apr 27, 2025
Cite
Crawl Feeds (2025). Walmart products free dataset [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Apr 27, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Discover the Walmart Products Free Dataset, featuring 2,000 records in CSV format. This dataset includes detailed information about various Walmart products, such as names, prices, categories, and descriptions.

It’s perfect for data analysis, e-commerce research, and machine learning projects. Download now and kickstart your insights with accurate, real-world data.
Sales Performance Report DQLab Store
kaggle.com
Updated Oct 4, 2021
Cite
Dhawy Farras Putra (2021). Sales Performance Report DQLab Store [Dataset]. https://www.kaggle.com/datasets/dhawyfarrasputra/sales-performance-report-dqlab-store/code
Explore at:
Dataset updated
Oct 4, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dhawy Farras Putra
Description
Context

You are provided with historical sales data from 2009 to 2012. The data contain three product categories: office supplies, technology, and furniture. Each category has several sub-categories. The company also runs promotions in the form of discounts.

Two CSV files are provided in the dataset. raw_data.csv is the unformatted file, with 5499 rows and 1 column; clean_data.csv is the formatted file, with 5499 rows and 10 columns.

Content

Attribute Information:
- order_id: unique order number
- order_status: status of the order, either finished or returned
- customer: customer name
- order_date: date of the order
- order_quantity: the quantity in a particular order
- sales: sales generated by a particular order, in IDR (Indonesian Rupiah)
- discount: a discount percentage
- discount_value: sales multiplied by discount, in IDR (Indonesian Rupiah)
- product_category: the category of the product
- product_sub_category: a subcategory of the product category
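Since discount_value is described as sales multiplied by discount, that relationship can serve as a quick consistency check on clean_data.csv. The following is a minimal sketch under stated assumptions: the sample rows and order_status strings are invented, and `discount` is assumed to be stored as a fraction (0.10 for 10%), which the description does not confirm. The helper name `check_discount_values` is hypothetical.

```python
import csv
import io

# Hypothetical sample rows using column names from the attribute list;
# the values are invented and `discount` is assumed to be a fraction.
sample = io.StringIO(
    "order_id,order_status,sales,discount,discount_value\n"
    "1001,order finished,200000,0.10,20000\n"
    "1002,order returned,500000,0.05,25000\n"
)


def check_discount_values(rows, tolerance=0.5):
    """Return order_ids whose discount_value differs from sales * discount."""
    mismatches = []
    for row in rows:
        expected = float(row["sales"]) * float(row["discount"])
        if abs(expected - float(row["discount_value"])) > tolerance:
            mismatches.append(row["order_id"])
    return mismatches


mismatched = check_discount_values(csv.DictReader(sample))
print(mismatched)
```

If the real file stores discount as a whole-number percentage instead, the expected value would need to be divided by 100 before comparing.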

Acknowledgements

DQLab is an Online Data Science Learning Center to produce data practitioners who can make an impact. This dataset is part of a project in order to build analytical skills and apply knowledge to industry problems.

Source

Project Data Analysis for Retail: Sales Performance Report: https://academy.dqlab.id/main/package/project/182?pf=0
‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 21, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-deep-learning-a-z-ann-dataset-e900/a6df077a/?iid=030-068&v=presentation
Explore at:
Dataset updated
Nov 21, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Deep Learning A-Z - ANN dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/filippoo/deep-learning-az-ann on 21 November 2021.

--- Dataset description provided by original source is as follows ---

Context

This is the dataset used in the section "ANN (Artificial Neural Networks)" of the Udemy course from Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), called Deep Learning A-Z™: Hands-On Artificial Neural Networks. The dataset is very useful for beginners of Machine Learning, and a simple playground where to compare several techniques/skills.

It can be freely downloaded here: https://www.superdatascience.com/deep-learning/

The story: A bank is investigating a very high rate of customers leaving the bank. Here is a dataset of 10,000 records to investigate and predict which customers are more likely to leave the bank soon.

The story of the story: I'd like to compare several techniques (better if not alone, and with the experience of several Kaggle users) to improve my basic knowledge on Machine Learning.

Content

I will write more later, but the column names are largely self-explanatory.

Acknowledgements

Udemy instructors Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), and their efforts to provide this dataset to their students.

Inspiration

Which methods score best with this dataset? Which are fastest (or, executable in a decent time)? Which are the basic steps with such a simple dataset, very useful to beginners?

--- Original source retains full ownership of the source dataset ---
Dataset of books called Data analysis with SPSS : a first course in applied...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Data analysis with SPSS : a first course in applied statistics [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Data+analysis+with+SPSS+%3A+a+first+course+in+applied+statistics
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 2 rows and is filtered where the book is Data analysis with SPSS : a first course in applied statistics. It features 7 columns including author, publication date, language, and book publisher.
Artificial Intelligence Training Dataset Report
archivemarketresearch.com
doc, pdf, ppt
Updated Feb 21, 2025
Cite
Archive Market Research (2025). Artificial Intelligence Training Dataset Report [Dataset]. https://www.archivemarketresearch.com/reports/artificial-intelligence-training-dataset-38645
Explore at:
Available download formats: pdf, ppt, doc
Dataset updated
Feb 21, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global Artificial Intelligence (AI) Training Dataset market is projected to reach $1605.2 million by 2033, exhibiting a CAGR of 9.4% from 2025 to 2033. The surge in demand for AI training datasets is driven by the increasing adoption of AI and machine learning technologies in various industries such as healthcare, financial services, and manufacturing. Moreover, the growing need for reliable and high-quality data for training AI models is further fueling the market growth. Key market trends include the increasing adoption of cloud-based AI training datasets, the emergence of synthetic data generation, and the growing focus on data privacy and security. The market is segmented by type (image classification dataset, voice recognition dataset, natural language processing dataset, object detection dataset, and others) and application (smart campus, smart medical, autopilot, smart home, and others). North America is the largest regional market, followed by Europe and Asia Pacific. Key companies operating in the market include Appen, Speechocean, TELUS International, Summa Linguae Technologies, and Scale AI. Artificial Intelligence (AI) training datasets are critical for developing and deploying AI models. These datasets provide the data that AI models need to learn, and the quality of the data directly impacts the performance of the model. The AI training dataset market landscape is complex, with many different providers offering datasets for a variety of applications. The market is also rapidly evolving, as new technologies and techniques are developed for collecting, labeling, and managing AI training data.
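The projection above gives a 2033 market size and a CAGR but no starting value. As a worked illustration, the implied 2025 base can be back-computed by dividing the final value by the compounded growth factor. The sketch assumes 8 annual compounding periods (2025 to 2033); the report does not state the base year value, so the result is only an implied figure, and the function name is hypothetical.

```python
def implied_base_value(final_value, cagr, years):
    """Value at the start of the period implied by a final value and a CAGR."""
    return final_value / (1 + cagr) ** years


# Figures from the description; the 8 compounding periods are an assumption.
base_2025 = implied_base_value(1605.2, 0.094, 2033 - 2025)
print(f"Implied 2025 market size: ${base_2025:.1f} million")
```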
‘E-Shop Clothing Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Aug 11, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘E-Shop Clothing Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-e-shop-clothing-dataset-5607/latest
Explore at:
Dataset updated
Aug 11, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘E-Shop Clothing Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/adityawisnugrahas/eshop-clothing-dataset on 11 August 2021.

--- Dataset description provided by original source is as follows ---

Data description “e-shop clothing 2008”

Variables:

YEAR (2008)

========================================================

MONTH -> from April (4) to August (8)

========================================================

DAY -> day number of the month

========================================================

ORDER -> sequence of clicks during one session

========================================================

COUNTRY -> variable indicating the country of origin of the IP address with the following categories:

1-Australia 2-Austria 3-Belgium 4-British Virgin Islands 5-Cayman Islands 6-Christmas Island 7-Croatia 8-Cyprus 9-Czech Republic 10-Denmark 11-Estonia 12-unidentified 13-Faroe Islands 14-Finland 15-France 16-Germany 17-Greece 18-Hungary 19-Iceland 20-India 21-Ireland 22-Italy 23-Latvia 24-Lithuania 25-Luxembourg 26-Mexico 27-Netherlands 28-Norway 29-Poland 30-Portugal 31-Romania 32-Russia 33-San Marino 34-Slovakia 35-Slovenia 36-Spain 37-Sweden 38-Switzerland 39-Ukraine 40-United Arab Emirates 41-United Kingdom 42-USA 43-biz (*.biz) 44-com (*.com) 45-int (*.int) 46-net (*.net) 47-org (*.org)

========================================================

SESSION ID -> variable indicating session id (short record)

========================================================

PAGE 1 (MAIN CATEGORY) -> concerns the main product category: 1-trousers 2-skirts 3-blouses 4-sale

========================================================

PAGE 2 (CLOTHING MODEL) -> contains information about the code for each product (217 products)

========================================================

COLOUR -> colour of product

1-beige 2-black 3-blue 4-brown 5-burgundy 6-gray 7-green 8-navy blue 9-of many colors 10-olive 11-pink 12-red 13-violet 14-white

========================================================

LOCATION -> photo location on the page, the screen has been divided into six parts:

1-top left 2-top in the middle 3-top right 4-bottom left 5-bottom in the middle 6-bottom right

========================================================

MODEL PHOTOGRAPHY -> variable with two categories:

1-en face 2-profile

========================================================

PRICE -> price in US dollars

========================================================

PRICE 2 -> variable informing whether the price of a particular product is higher than the average price for the entire product category

1-yes 2-no

========================================================

PAGE -> page number within the e-store website (from 1 to 5)
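The PRICE 2 variable above is a derived flag: it records whether a product's price exceeds the average price of its main category. A minimal sketch of that derivation follows. The (category, price) pairs are invented, and the average is taken over the listed records; whether the original authors averaged over clicks or over distinct products is not stated in the description.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, price) pairs; category codes follow
# PAGE 1 (MAIN CATEGORY), e.g. 1-trousers, 2-skirts.
clicks = [(1, 38), (1, 52), (1, 43), (2, 33), (2, 57)]


def price2_flags(records):
    """Return 1 (yes) if a price is above its category mean, else 2 (no)."""
    by_category = defaultdict(list)
    for category, price in records:
        by_category[category].append(price)
    averages = {c: mean(p) for c, p in by_category.items()}
    return [1 if price > averages[category] else 2 for category, price in records]


flags = price2_flags(clicks)
print(flags)
```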

++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I want to know how to solve this data regarding any problem (clustering, regression, classification, EDA)

Source: https://archive.ics.uci.edu/ml/datasets/clickstream+data+for+online+shopping

--- Original source retains full ownership of the source dataset ---
BBC Datasets
brightdata.com
.json, .csv, .xlsx
Updated Sep 6, 2025
Cite
Bright Data (2025). BBC Datasets [Dataset]. https://brightdata.com/products/datasets/bbc
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Sep 6, 2025
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Unlock the full potential of BBC broadcast data with our comprehensive dataset featuring transcripts, program schedules, headlines, topics, and multimedia resources. This all-in-one dataset is designed to empower media analysts, researchers, journalists, and advocacy groups with actionable insights for media analysis, transparency studies, and editorial assessments.

Dataset Features

- Transcripts: Access detailed broadcast transcripts, including headlines, content, author details, and publication dates. Perfect for analyzing media framing, topic frequency, and news narratives across various programs.
- Program Schedules: Explore program schedules with accurate timing, show names, and related metadata to track news coverage patterns and identify trends.
- Topics and Keywords: Analyze categorized topics and keywords to understand content diversity, editorial focus, and recurring themes in news broadcasts.
- Multimedia Content: Gain access to videos, images, and related articles linked to each broadcast for a holistic understanding of the news presentation.
- Metadata: Includes critical data points like publication dates, last updates, content URLs, and unique IDs for easier referencing and cross-analysis.

Customizable Subsets for Specific Needs

Our BBC dataset is fully customizable to match your research or analytical goals. Focus on transcripts for in-depth media framing analysis, extract multimedia for content visualization studies, or dive into program schedules for broadcast trend analysis. Tailor the dataset to ensure it aligns with your objectives for maximum efficiency and relevance.

Popular Use Cases

- Media Analysis: Evaluate news framing, content diversity, and topic coverage to assess editorial direction and media focus.
- Transparency Studies: Analyze journalistic standards, corrections, and retractions to assess media integrity and accountability.
- Audience Engagement: Identify recurring topics and trends in news content to understand audience preferences and behavior.
- Market Analysis: Track media coverage of key industries, companies, and topics to analyze public sentiment and industry relevance.
- Journalistic Integrity: Use transcripts and metadata to evaluate adherence to reporting practices, fairness, and transparency in news coverage.
- Research and Scholarly Studies: Leverage transcripts and multimedia to support academic studies in journalism, media criticism, and political discourse analysis.

Whether you are evaluating transparency, conducting media criticism, or tracking broadcast trends, our BBC dataset provides you with the tools and insights needed for in-depth research and strategic analysis. Customize your access to focus on the most relevant data points for your unique needs.
Data Analytics 2 Dataset
universe.roboflow.com
zip
Updated Jan 1, 2025
Cite
TRAFFIC SIGNS DATA ANALYTICS (2025). Data Analytics 2 Dataset [Dataset]. https://universe.roboflow.com/traffic-signs-data-analytics/data-analytics-2
Explore at:
Available download formats: zip
Dataset updated
Jan 1, 2025
Dataset authored and provided by
TRAFFIC SIGNS DATA ANALYTICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
TRAFFIC LIGHTS Gztl Bounding Boxes
Description
DATA ANALYTICS 2

## Overview

DATA ANALYTICS 2 is a dataset for object detection tasks. It contains TRAFFIC LIGHTS Gztl annotations for 8,579 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Dataset of books called Longitudinal data analysis
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Longitudinal data analysis [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Longitudinal+data+analysis
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Longitudinal data analysis. It features 7 columns including author, publication date, language, and book publisher.
Project Management Dataset
kaggle.com
Updated Feb 21, 2025
Cite
Hosam Mhmd Ali (2025). Project Management Dataset [Dataset]. https://www.kaggle.com/datasets/hosammhmdali/project-management-dataset/data
Explore at:
Dataset updated
Feb 21, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hosam Mhmd Ali
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Link to the project explanation on YouTube: https://user264629.psee.ly/69ch9f
Dataset of book subjects that contain Data analysis in business research : a...
workwithdata.com
Updated Nov 7, 2024
Cite
Work With Data (2024). Dataset of book subjects that contain Data analysis in business research : a step-by-step nonparametric approach [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Data+analysis+in+business+research+:+a+step-by-step+nonparametric+approach&j=1&j0=books
Explore at:
Dataset updated
Nov 7, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book subjects. It has 2 rows and is filtered where the book is Data analysis in business research : a step-by-step nonparametric approach. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Job Postings Dataset for Labour Market Research and Insights
datarade.ai
Updated Sep 20, 2023
Cite
Oxylabs (2023). Job Postings Dataset for Labour Market Research and Insights [Dataset]. https://datarade.ai/data-products/job-postings-dataset-for-labour-market-research-and-insights-oxylabs
Explore at:
Available download formats: .json, .xml, .csv, .xls
Dataset updated
Sep 20, 2023
Dataset authored and provided by
Oxylabs
Area covered
Anguilla, Zambia, Switzerland, Jamaica, Luxembourg, British Indian Ocean Territory, Togo, Tajikistan, Sierra Leone, Kyrgyzstan
Description
Introducing Job Posting Datasets: Uncover labor market insights!

Elevate your recruitment strategies, forecast future labor industry trends, and unearth investment opportunities with Job Posting Datasets.

Job Posting Datasets Source:

Indeed: Access datasets from Indeed, a leading employment website known for its comprehensive job listings.

Glassdoor: Receive ready-to-use employee reviews, salary ranges, and job openings from Glassdoor.

StackShare: Access StackShare datasets to make data-driven technology decisions.

Job Posting Datasets provide meticulously acquired and parsed data, freeing you to focus on analysis. You'll receive clean, structured, ready-to-use job posting data, including job titles, company names, seniority levels, industries, locations, salaries, and employment types.

Choose your preferred dataset delivery options for convenience:

- Receive datasets in various formats, including CSV, JSON, and more.
- Opt for storage solutions such as AWS S3, Google Cloud Storage, and more.
- Customize data delivery frequencies, whether one-time or per your agreed schedule.

Why Choose Oxylabs Job Posting Datasets:

Fresh and accurate data: Access clean and structured job posting datasets collected by our seasoned web scraping professionals, enabling you to dive into analysis.

Time and resource savings: Focus on data analysis and your core business objectives while we efficiently handle the data extraction process cost-effectively.

Customized solutions: Tailor our approach to your business needs, ensuring your goals are met.

Legal compliance: Partner with a trusted leader in ethical data collection. Oxylabs is a founding member of the Ethical Web Data Collection Initiative, aligning with GDPR and CCPA best practices.

Pricing Options:

Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

Experience a seamless journey with Oxylabs:

Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.

Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.

Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.

Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

Effortlessly access fresh job posting data with Oxylabs Job Posting Datasets.
‘Harry Potter Movies Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Sep 30, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Harry Potter Movies Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-harry-potter-movies-dataset-4175/64069c86/?iid=001-085&v=presentation
Explore at:
Dataset updated
Sep 30, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Harry Potter Movies Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kornflex/harry-potter-movies-dataset on 30 September 2021.

--- Dataset description provided by original source is as follows ---

Harry Potter movies datasets

Content

This repo contains scripts/transcripts of the Harry Potter movie saga.

movies.csv
This file contains information about the movies

movie: movie name

released_year: the year of the movie release

running_time: the running time of the movie in minutes

budget: budget of the movie in $

box_office: movie box office in $

movies.csv
This file contains dialogs of the movie

movie: movie name

chapter: chapter of the movie according to the script

character: character speaking

dialog: dialog of the character speaking

Notes

I'm not totally sure that the scripts are 100% complete, and sometimes scenes from the script did not make it into the movie. Feel free to let me know if you see missing parts!

--- Original source retains full ownership of the source dataset ---
Retail Sales Dataset
kaggle.com
Updated Aug 22, 2023
Mohammad Talib (2023). Retail Sales Dataset [Dataset]. https://www.kaggle.com/datasets/mohammadtalib786/retail-sales-dataset/data
Dataset updated
Aug 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mohammad Talib
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Welcome to the Retail Sales and Customer Demographics Dataset! This synthetic dataset has been meticulously crafted to simulate a dynamic retail environment, providing an ideal playground for those eager to sharpen their data analysis skills through exploratory data analysis (EDA). With a focus on retail sales and customer characteristics, this dataset invites you to unravel intricate patterns, draw insights, and gain a deeper understanding of customer behavior.

Dataset Overview:

This dataset is a snapshot of a fictional retail landscape, capturing essential attributes that drive retail operations and customer interactions. It includes key details such as Transaction ID, Date, Customer ID, Gender, Age, Product Category, Quantity, Price per Unit, and Total Amount. These attributes enable a multifaceted exploration of sales trends, demographic influences, and purchasing behaviors.
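
A first step with a file shaped like this could be a basic consistency check. The sketch below is only illustrative: the column names are taken from the overview above, but the sample rows, the date format, and the helper `inconsistent_rows` are invented, and the rule that Total Amount should equal Quantity times Price per Unit is an assumption that the description does not state.

```python
import csv
import io

# Placeholder rows invented for illustration; only the column names come from the description.
RETAIL_CSV = """Transaction ID,Date,Customer ID,Gender,Age,Product Category,Quantity,Price per Unit,Total Amount
1,2023-01-05,CUST001,Male,34,Category A,3,50,150
2,2023-02-11,CUST002,Female,26,Category B,2,500,1000
3,2023-02-12,CUST003,Female,45,Category A,1,30,25
"""


def inconsistent_rows(csv_text):
    """Return Transaction IDs where Total Amount != Quantity * Price per Unit.

    Treating Total Amount as this product is an assumption, not a documented rule.
    """
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        expected = int(row["Quantity"]) * int(row["Price per Unit"])
        if expected != int(row["Total Amount"]):
            bad.append(row["Transaction ID"])
    return bad


flagged = inconsistent_rows(RETAIL_CSV)
print(flagged)
```

In this made-up sample, only the third row is flagged, because 1 × 30 does not equal 25.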

Why Explore This Dataset?

Realistic Representation: Though synthetic, the dataset mirrors real-world retail scenarios, allowing you to practice analysis within a familiar context.

Diverse Insights: From demographic insights to product preferences, the dataset offers a broad spectrum of factors to investigate.

Hypothesis Generation: As you perform EDA, you'll have the chance to formulate hypotheses that can guide further analysis and experimentation.

Applied Learning: Uncover actionable insights that retailers could use to enhance their strategies and customer experiences.

Questions to Explore:

How do customer age and gender influence purchasing behavior?

Are there discernible patterns in sales across different time periods?

Which product categories hold the highest appeal among customers?

What are the relationships between age, spending, and product preferences?

How do customers adapt their shopping habits during seasonal trends?

Are there distinct purchasing behaviors based on the number of items bought per transaction?

What insights can be gleaned from the distribution of product prices within each category?
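
Several of the questions above come down to grouping transactions and summing or comparing amounts. The following is a minimal sketch of that idea using only the standard library; the transactions are invented placeholders, and the helpers `age_band` and `total_by`, as well as the choice of ten-year age bands, are assumptions made for illustration.

```python
from collections import defaultdict

# Placeholder transactions invented for illustration.
transactions = [
    {"Gender": "Male", "Age": 34, "Product Category": "Category A", "Total Amount": 150},
    {"Gender": "Female", "Age": 26, "Product Category": "Category B", "Total Amount": 1000},
    {"Gender": "Female", "Age": 45, "Product Category": "Category A", "Total Amount": 30},
    {"Gender": "Male", "Age": 52, "Product Category": "Category B", "Total Amount": 500},
]


def age_band(age):
    """Map an age to a decade label such as '30-39' (the banding is an arbitrary choice)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def total_by(rows, key_func):
    """Sum Total Amount over rows grouped by key_func(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_func(row)] += row["Total Amount"]
    return dict(totals)


by_category = total_by(transactions, lambda r: r["Product Category"])
by_gender_and_band = total_by(transactions, lambda r: (r["Gender"], age_band(r["Age"])))
print(by_category)
print(by_gender_and_band)
```

Passing a different key function (for example, one that extracts the month from Date) would reuse the same helper for the time-period questions.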

Your EDA Journey:

Prepare to immerse yourself in a world of data-driven exploration. Through data visualization, statistical analysis, and correlation examination, you'll uncover the nuances that define retail operations and customer dynamics. EDA isn't just about numbers—it's about storytelling with data and extracting meaningful insights that can influence strategic decisions.

Embrace the Retail Sales and Customer Demographics Dataset as your canvas for discovery. As you traverse the landscape of this synthetic retail environment, you'll refine your analytical skills, pose intriguing questions, and contribute to the ever-evolving narrative of the retail industry. Happy exploring!
Shopping Mall Customer Data Segmentation Analysis
kaggle.com
Updated Aug 4, 2024
DataZng (2024). Shopping Mall Customer Data Segmentation Analysis [Dataset]. https://www.kaggle.com/datasets/datazng/shopping-mall-customer-data-segmentation-analysis/data
Dataset updated
Aug 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
DataZng
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Demographic Analysis of Shopping Behavior: Insights and Recommendations

Dataset Information: The Shopping Mall Customer Segmentation Dataset comprises 15,079 unique entries, featuring Customer ID, age, gender, annual income, and spending score. This dataset assists in understanding customer behavior for strategic marketing planning.

Cleaned Data Details: The data has been cleaned and standardized, leaving 15,079 unique entries with the attributes Customer ID, age, gender, annual income, and spending score. Marketing analysts can use it to develop mall-specific marketing strategies.
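
One simple way to segment customers on these attributes is to split them at the median of annual income and spending score. The sketch below illustrates that rule only; it is not the method used by the dataset's authors, and the sample customers, the field names, and the `segment` helper are invented for illustration.

```python
from statistics import median

# Placeholder customers invented for illustration; field names paraphrase the listed attributes.
customers = [
    {"customer_id": 1, "annual_income": 20000, "spending_score": 80},
    {"customer_id": 2, "annual_income": 90000, "spending_score": 85},
    {"customer_id": 3, "annual_income": 95000, "spending_score": 15},
    {"customer_id": 4, "annual_income": 25000, "spending_score": 10},
]


def segment(rows):
    """Label each customer by whether income and spending score exceed the sample median."""
    income_cut = median(c["annual_income"] for c in rows)
    score_cut = median(c["spending_score"] for c in rows)
    labels = {}
    for c in rows:
        income = "high income" if c["annual_income"] > income_cut else "low income"
        spend = "high spending" if c["spending_score"] > score_cut else "low spending"
        labels[c["customer_id"]] = f"{income}, {spend}"
    return labels


segments = segment(customers)
print(segments)
```

A median split is easy to explain but coarse; a clustering algorithm would be the usual next step if finer segments are needed.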

Challenges Faced:

1. Data Cleaning: Overcoming inconsistencies and missing values required meticulous attention.

2. Statistical Analysis: Interpreting demographic data accurately demanded collaborative effort.

3. Visualization: Crafting informative visuals to convey insights effectively posed design challenges.

Research Topics:

1. Consumer Behavior Analysis: Exploring psychological factors driving purchasing decisions.

2. Market Segmentation Strategies: Investigating effective targeting based on demographic characteristics.

Suggestions for Project Expansion:

1. Incorporate External Data: Integrate social media analytics or geographic data to enrich customer insights.

2. Advanced Analytics Techniques: Explore advanced statistical methods and machine learning algorithms for deeper analysis.

3. Real-Time Monitoring: Develop tools for agile decision-making through continuous customer behavior tracking.

This summary outlines the demographic analysis of shopping behavior, highlighting key insights, dataset characteristics, team contributions, challenges, research topics, and suggestions for project expansion. Leveraging these insights can enhance marketing strategies and drive business growth in the retail sector.

References

OpenAI. (2022). ChatGPT [Computer software]. Retrieved from https://openai.com/chatgpt

Mustafa, Z. (2022). Shopping Mall Customer Segmentation Data [Data set]. Kaggle. Retrieved from https://www.kaggle.com/datasets/zubairmustafa/shopping-mall-customer-segmentation-data

Donkeys. (n.d.). Kaggle Python API [Jupyter Notebook]. Kaggle. Retrieved from https://www.kaggle.com/code/donkeys/kaggle-python-api/notebook

Pandas-Datareader. (n.d.). Retrieved from https://pypi.org/project/pandas-datareader/

maira javeed (2023). student data analysis [Dataset]. https://www.kaggle.com/datasets/mairajaveed/student-data-analysis

student data analysis

Student Performance Analysis

Dataset updated

Nov 17, 2023

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

maira javeed

Description

In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.

Key Objectives:

Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.
Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.
Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.

Dataset Details:

The dataset used in this analysis contains information about students, including their age, gender, parental education, lunch type, and test scores in subjects like mathematics, reading, and writing.

Analysis Highlights:

We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.
By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.
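
A very simple starting point for comparing factors is to average each student's three test scores and then compare group means. The sketch below shows that idea with the standard library only. The records, the field names, the category values such as "standard" and "free/reduced", and the helper `mean_score_by` are placeholders invented for illustration; a difference in group means on its own does not establish that a factor is significant.

```python
from collections import defaultdict
from statistics import mean

# Placeholder records invented for illustration; field names and values are assumptions.
students = [
    {"parental_education": "high school", "lunch": "standard",
     "math": 60, "reading": 70, "writing": 65},
    {"parental_education": "bachelor's degree", "lunch": "standard",
     "math": 80, "reading": 85, "writing": 90},
    {"parental_education": "high school", "lunch": "free/reduced",
     "math": 50, "reading": 55, "writing": 45},
]


def mean_score_by(records, factor):
    """Average each student's three subject scores, then average those per factor value."""
    groups = defaultdict(list)
    for record in records:
        overall = mean([record["math"], record["reading"], record["writing"]])
        groups[record[factor]].append(overall)
    return {value: mean(scores) for value, scores in groups.items()}


by_education = mean_score_by(students, "parental_education")
by_lunch = mean_score_by(students, "lunch")
print(by_education)
print(by_lunch)
```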

Why This Matters:

Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.

Acknowledgments:

We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.

Please Note:

This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
