100+ datasets found

Walmart Products Dataset – Free Product Data CSV
crawlfeeds.com
csv, zip
Updated Dec 2, 2025
Cite
Crawl Feeds (2025). Walmart Products Dataset – Free Product Data CSV [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Dec 2, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.

Key Features

Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.

CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.

Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.

Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.

Who Benefits?

Data analysts & researchers exploring e-commerce trends or product catalog data.

Developers & data scientists building price-comparison tools, recommendation engines or ML models.

E-commerce strategists/marketers needing product metadata for competitive analysis or market research.

Students/hobbyists needing a free dataset for learning or demo projects.

Why Use This Dataset Instead of Manual Scraping?

Time-saving: No need to write scrapers or deal with rate limits.

Clean, structured data: All records are verified and already formatted in CSV, saving hours of cleaning.

Risk-free: Avoid Terms-of-Service issues or IP blocks that come with manual scraping.
Instant access: Free and immediately downloadable.
2019 Public Data File - Students
catalog.data.gov
data.cityofnewyork.us
Updated Nov 29, 2024
Cite
data.cityofnewyork.us (2024). 2019 Public Data File - Students [Dataset]. https://catalog.data.gov/dataset/2019-public-data-file-students
Explore at:
Dataset updated
Nov 29, 2024
Dataset provided by
data.cityofnewyork.us
Description
The NYC School Survey collects feedback from families, students and teachers on their learning environment. It helps build an understanding of how families, students, and teachers perceive their school. School leaders use feedback from the survey to reflect and make improvements to schools and programs. Each year all parents, teachers and students in grades 6-12 take the NYC School Survey. The survey is aligned to the DOE's Framework for Great Schools. It is designed to collect important information about each school's ability to support student success.
Students Data Analysis
kaggle.com
zip
Updated Jul 20, 2022
Cite
MOMONO (2022). Students Data Analysis [Dataset]. https://www.kaggle.com/datasets/erqizhou/students-data-analysis
Explore at:
Available download formats: zip (2174 bytes)
Dataset updated
Jul 20, 2022
Authors
MOMONO
Description
A small excerpt from a real dataset, with a few minor changes to protect students' private information. Permission has been given.

Goals

You are going to help teachers using only the data: 1. Prediction: identify what makes a strong student who can successfully apply to a graduate school, whether abroad or not. 2. Application: help those who fail to get into a graduate school with job-search advice.

Tips

Educational data may have subtle structures; hierarchies and heterogeneity are probably involved. Simple regressions are unlikely to help much. Also, keep an eye on collinearity in some of the indicators, which were collected by teachers who have long since forgotten their statistics.

Not all students are free to choose whether to apply to a graduate school; some were born with privileges.

Some of the students have been trying (or planning to try) to get into a graduate school for years. You are responsible for giving advice that is accurate for their circumstances.

About the Data

Some of the original structure has been deleted or censored. What remains:

Basic data:
- ID
- class: categorical. Students were initially divided into 2 classes, and teachers suspect that students in different classes may perform significantly differently.
- gender
- race: categorical and censored
- GPA: real number (float)

Some teachers assume that scores in math courses perfectly represent a student's likelihood of success:
- Algebra: real number, Advanced Algebra
- ......

Some assume that students' backgrounds significantly affect their choices and likelihood of success. These fields are all censored:
- from1: students' home locations
- from2: a probably poor indicator of preference for mathematics
- from3: how students applied to this university (undergraduate)
- from4: a probably poor indicator of family background; 0 indicates more wealth, 4 more poverty

The final indicator y:
- 0: failed to get into the graduate school; may apply again or look for a job in the future
- 1: success, inland
- 2: success, abroad
Student Performance Dataset: Academic Insights 10K
kaggle.com
zip
Updated Dec 1, 2024
Cite
Nadeem Majeed (2024). Student Performance Dataset: Academic Insights 10K [Dataset]. https://www.kaggle.com/datasets/nadeemajeedch/students-performance-10000-clean-data-eda
Explore at:
Available download formats: zip (129033 bytes)
Dataset updated
Dec 1, 2024
Authors
Nadeem Majeed
License
https://cdla.io/sharing-1-0/
Description
The dataset includes:

Roll Number: Represents the roll number of the student.

Gender: Useful for analyzing performance differences between male and female students.

Race/Ethnicity: Allows analysis of academic performance trends across different racial or ethnic groups.

Parental Level of Education: Indicates the educational background of the student's family.

Lunch: Shows whether students receive a free or reduced lunch, which is often a socioeconomic indicator.

Test Preparation Course: This tells whether students completed a test prep course, which could impact their performance.

Math Score: Provides a measure of each student's performance in math, used to calculate averages or trends across various demographics.

Science Score: Evaluates students' science knowledge, which can be analyzed to assess each student's overall scientific understanding.

Reading Score: Measures performance in reading, allowing for insights into literacy and comprehension levels among students.

Writing Score: Evaluates students' writing skills, which can be analyzed to assess overall literacy and expression.

Total Score: Shows the total marks achieved by the student out of 400.

Grade: Grade achieved by the student. "A" grade if Total marks >= 320, "B" grade if Total marks >= 250, "C" grade if Total marks >= 200, "D" grade if Total marks >= 150 and Fail if < 150.
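
The grade thresholds above can be written as a small function. This is a minimal sketch, assuming the total is a number between 0 and 400; the function name grade_from_total is hypothetical and is not part of the dataset.

```python
def grade_from_total(total: float) -> str:
    """Map a total score (out of 400) to the grade bands listed above."""
    if total >= 320:
        return "A"
    if total >= 250:
        return "B"
    if total >= 200:
        return "C"
    if total >= 150:
        return "D"
    return "Fail"


# Check each boundary and the value just below it.
for total in (400, 320, 319, 250, 249, 200, 199, 150, 149):
    print(total, grade_from_total(total))
```

Checking the thresholds from highest to lowest means each branch only needs a single comparison.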

Student Performance

kaggle.com

zip

Updated Oct 7, 2022

Cite

Aman Chauhan (2022). Student Performance [Dataset]. https://www.kaggle.com/datasets/whenamancodes/student-performance

Explore at:

Available download formats: zip (106753 bytes)

Dataset updated

Oct 7, 2022

Authors

Aman Chauhan

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This data addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social and school-related features, and the data was collected using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).

Attributes for both Maths.csv (Math course) and Portuguese.csv (Portuguese language course) datasets:

Columns	Description
school	student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira)
sex	student's sex (binary: 'F' - female or 'M' - male)
age	student's age (numeric: from 15 to 22)
address	student's home address type (binary: 'U' - urban or 'R' - rural)
famsize	family size (binary: 'LE3' - less or equal to 3 or 'GT3' - greater than 3)
Pstatus	parent's cohabitation status (binary: 'T' - living together or 'A' - apart)
Medu	mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
Fedu	father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education)
Mjob	mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
Fjob	father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
reason	reason to choose this school (nominal: close to 'home', school 'reputation', 'course' preference or 'other')
guardian	student's guardian (nominal: 'mother', 'father' or 'other')
traveltime	home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 - >1 hour)
studytime	weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
failures	number of past class failures (numeric: n if 1<=n<3, else 4)
schoolsup	extra educational support (binary: yes or no)
famsup	family educational support (binary: yes or no)
paid	extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
activities	extra-curricular activities (binary: yes or no)
nursery	attended nursery school (binary: yes or no)
higher	wants to take higher education (binary: yes or no)
internet	Internet access at home (binary: yes or no)
romantic	with a romantic relationship (binary: yes or no)
famrel	quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
freetime	free time after school (numeric: from 1 - very low to 5 - very high)
goout	going out with friends (numeric: from 1 - very low to 5 - very high)
Dalc	workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
Walc	weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
health	current health status (numeric: from 1 - very bad to 5 - very good)
absences	number of school absences (numeric: from 0 to 93)
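
Several columns in the table above store small integer codes rather than readable labels. The following is a minimal sketch of decoding them, with the code-to-label mappings copied from the table. The helper name decode_row and the example row are hypothetical, and the sketch assumes each row arrives as a dict of strings, as csv.DictReader would produce.

```python
# Code-to-label mappings copied from the attribute table.
EDUCATION = {
    0: "none",
    1: "primary education (4th grade)",
    2: "5th to 9th grade",
    3: "secondary education",
    4: "higher education",
}
TRAVELTIME = {1: "<15 min", 2: "15 to 30 min", 3: "30 min to 1 hour", 4: ">1 hour"}
STUDYTIME = {1: "<2 hours", 2: "2 to 5 hours", 3: "5 to 10 hours", 4: ">10 hours"}

# Which mapping applies to which column.
DECODERS = {
    "Medu": EDUCATION,
    "Fedu": EDUCATION,
    "traveltime": TRAVELTIME,
    "studytime": STUDYTIME,
}


def decode_row(row: dict) -> dict:
    """Return a copy of the row with coded columns replaced by their labels."""
    decoded = dict(row)
    for column, mapping in DECODERS.items():
        if column in decoded:
            decoded[column] = mapping[int(decoded[column])]
    return decoded


# Hypothetical row for illustration only; the values are not taken from the dataset.
example = {"school": "GP", "Medu": "4", "Fedu": "2", "traveltime": "1", "studytime": "3"}
print(decode_row(example))
```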

These grades are related to the course subject, Math or Portuguese:

Grade	Description
G1	first period grade (numeric: from 0 to 20)
G2	second period grade (numeric: from 0 to 20)
G3	final grade (numeric: from 0 to 20, output target)
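
Because G3 correlates strongly with G1 and G2, it can help to make the choice of whether to use the period grades as predictors explicit. The sketch below shows one way to split a row into features and target under that choice. The function name split_features_target and the example row are hypothetical; only the column names G1, G2 and G3 come from the table above.

```python
def split_features_target(row: dict, use_period_grades: bool = False):
    """Split one row into (features, target), where the target is G3.

    With use_period_grades=False, G1 and G2 are dropped from the features,
    which is the harder but more useful setting described in the note above.
    """
    target = int(row["G3"])
    excluded = {"G3"} if use_period_grades else {"G1", "G2", "G3"}
    features = {name: value for name, value in row.items() if name not in excluded}
    return features, target


# Hypothetical row for illustration only; the values are not taken from the dataset.
row = {"school": "GP", "age": "17", "studytime": "2", "G1": "12", "G2": "13", "G3": "13"}

print(split_features_target(row))
print(split_features_target(row, use_period_grades=True))
```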

School Attendance by Student Group and District, 2021-2022
catalog.data.gov
data.ct.gov
Updated Jun 21, 2025
Cite
data.ct.gov (2025). School Attendance by Student Group and District, 2021-2022 [Dataset]. https://catalog.data.gov/dataset/school-attendance-by-student-group-and-district-2021-2022
Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.ct.gov
Description
This dataset includes the attendance rate for public school students PK-12 by student group and by district during the 2021-2022 school year. Student groups include:
- Students experiencing homelessness
- Students with disabilities
- Students who qualify for free/reduced lunch
- English learners
- All high needs students
- Non-high needs students
- Students by race/ethnicity (Hispanic/Latino of any race, Black or African American, White, All other races)

Attendance rates are provided for each student group by district and for the state. Students who are considered high needs include students who are English language learners, who receive special education, or who qualify for free and reduced lunch. When no attendance data is displayed in a cell, data have been suppressed to safeguard student confidentiality, or to ensure that statistics based on a very small sample size are not interpreted as equally representative as those based on a sufficiently larger sample size. For more information on CSDE data suppression policies, please visit http://edsight.ct.gov/relatedreports/BDCRE%20Data%20Suppression%20Rules.pdf.
Fictional Student Performance Dataset
kaggle.com
zip
Updated Nov 4, 2023
Cite
Muhammad Bin Imran (2023). Fictional Student Performance Dataset [Dataset]. https://www.kaggle.com/datasets/muhammadbinimran/fictional-student-performance-dataset
Explore at:
Available download formats: zip (14161 bytes)
Dataset updated
Nov 4, 2023
Authors
Muhammad Bin Imran
Description
Dataset Name: Fictional Student Performance Dataset

Description: The "Fictional Student Performance Dataset" is a comprehensive collection of fictional student records designed for educational and analytical purposes. This dataset comprises 500 student profiles and their associated attributes, making it a valuable resource for exploring various aspects of student performance and data analysis.

Attributes:

StudentID: A unique identifier for each student, facilitating individual tracking and analysis.
Name: The name of each student, ensuring the dataset's personalization.
Age: The age of each student, providing demographic information.
Gender: The gender of each student, offering insights into gender-based performance trends.
Grade: A continuous variable representing the academic performance of students, which can be used for regression and prediction tasks.
Attendance: A percentage value denoting the attendance rate of each student, enabling attendance-related analyses.
FinalExamScore: A continuous variable indicating the final exam score achieved by each student, making it suitable for evaluation and prediction tasks.

Use Cases:

Educational Research: This dataset is ideal for educational institutions and researchers to analyze student performance and identify factors that influence academic outcomes.
Machine Learning Practice: It is an excellent resource for data science enthusiasts and students looking to practice various machine learning techniques, such as regression, classification, and clustering.
Predictive Modeling: The "Grade" and "FinalExamScore" attributes can be used to develop predictive models to forecast student performance.
Gender-Based Analysis: Explore gender-based trends in student performance and attendance.
Attendance Impact: Investigate the correlation between attendance and academic success.

Disclaimer: Please note that this dataset is entirely fictional and created for educational and practice purposes. Any resemblance to real individuals or institutions is purely coincidental.

Citation: If you use this dataset in your research or projects, kindly acknowledge its source as the "Fictional Student Performance Dataset"

Data Generation: The dataset was generated using a combination of randomization and scripting to ensure that it does not contain any real or personally identifiable information.

Feel free to explore and utilize this dataset for educational purposes, data analysis, or machine learning exercises. It is intended to foster learning and experimentation in data science.
Federal School Code List for Free Application for Federal Student Aid...
datasets.ai
catalog.data.gov
Updated Aug 12, 2023
Cite
Department of Education (2023). Federal School Code List for Free Application for Federal Student Aid (FAFSA) [Dataset]. https://datasets.ai/datasets/federal-school-code-list-for-free-application-for-federal-student-aid-fafsa-6cefe
Explore at:
Available download formats: 47
Dataset updated
Aug 12, 2023
Dataset provided by
United States Department of Education (https://ed.gov/)
Authors
Department of Education
Description
The Federal School Code List contains the unique codes assigned by the Department of Education for schools participating in the Title IV federal student aid programs. Students can enter these codes on the Free Application for Federal Student Aid (FAFSA) to indicate which postsecondary schools they want to receive their financial application results. The Federal School Code List is a searchable document in Excel format. The list will be updated on the first of February, May, August, and November of each calendar year.
Predict students' dropout and academic success
kaggle.com
zip
Updated Jan 3, 2023
Cite
The Devastator (2023). Predict students' dropout and academic success [Dataset]. https://www.kaggle.com/datasets/thedevastator/higher-education-predictors-of-student-retention
Explore at:
Available download formats: zip (89332 bytes)
Dataset updated
Jan 3, 2023
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Predict students' dropout and academic success

Investigating the Impact of Social and Economic Factors

About this dataset

This dataset provides a comprehensive view of students enrolled in various undergraduate degrees offered at a higher education institution. It includes demographic data, socio-economic factors and academic performance information that can be used to analyze the possible predictors of student dropout and academic success. This dataset contains multiple disjoint databases consisting of relevant information available at the time of enrollment, such as application mode, marital status, course chosen and more. Additionally, this data can be used to estimate overall student performance at the end of each semester by assessing curricular units credited/enrolled/evaluated/approved as well as their respective grades. Finally, we have unemployment rate, inflation rate and GDP from the region, which can help us further understand how economic factors play into student dropout rates or academic success outcomes. This analysis tool can provide insight into what motivates students to stay in school or abandon their studies across a wide range of disciplines such as agronomy, design, education, nursing, journalism, management, social service or technologies.

How to use the dataset

This dataset can be used to understand and predict student dropouts and academic outcomes. The data includes a variety of demographic, social-economic and academic performance factors related to the students enrolled in higher education institutions. The dataset provides valuable insights into the factors that affect student success and could be used to guide interventions and policies related to student retention.

Using this dataset, researchers can investigate two key questions:
- Which specific predictive factors are linked with student dropout or completion?
- How do different features interact with each other?

For example, researchers could explore whether any demographic characteristics (e.g., gender, age at enrollment, etc.) or immersion conditions (e.g., unemployment rate in the region) are associated with higher student success rates, as well as what implications poverty has for educational outcomes. Answering these questions generates research insight that can give administrators critical information for formulating strategies that promote successful degree completion among students from diverse backgrounds in their institutions.

To use this dataset effectively, researchers should familiarize themselves with all the variables it provides, including categorical (qualitative) variables such as gender or application mode; numerical variables such as number of curricular units at the beginning of semesters or age at enrollment; ordinal variables such as marital status; trends over time such as inflation rate or GDP; and frequency measures such as percentage of scholarship holders. Researchers should also make sure they are aware of any potential bias in the data before running an analysis, for example whether one population is underrepresented compared to another, as this could lead to unexpected results if not taken into account. Finally, practitioners should note that this Kaggle dataset contains only one semester's worth of information on each admission intake; studies conducted over a longer period might provide more accurate results.

Research Ideas

Prediction of Student Retention: This dataset can be used to develop predictive models that can identify student risk factors for dropout and take early interventions to improve student retention rate.

Improved Academic Performance: By using this data, higher education institutions could better understand their students' academic progress and identify areas of improvement from both an individual and institutional perspective. This will enable them to develop targeted courses, activities, or initiatives that enhance academic performance more effectively and efficiently.

Accessibility Assistance: Using the demographic information included in the dataset, institutions could develop s...
Free or Reduced-price Meal Eligibility - Datasets - CTData.org
data.ctdata.org
Updated Mar 16, 2016
Cite
(2016). Free or Reduced-price Meal Eligibility - Datasets - CTData.org [Dataset]. http://data.ctdata.org/dataset/free-or-reduced-price-meal-eligibility
Explore at:
Dataset updated
Mar 16, 2016
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Eligibility indicates students from families whose total income is at or below 185 percent of the poverty level. Household income below 130 percent of the poverty level qualifies students for free meals. Household income between 130 and 185 percent of the poverty level qualifies students for reduced-price meals. Connecticut State Department of Education collects data for grades PreK through 12 on a school year basis. CTdata.org carries annual school year data for grades K through 3.
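
The income thresholds above can be sketched as a small classification function. This is an illustration only: the function name meal_eligibility is hypothetical, the poverty level must be supplied by the caller (no real threshold values are assumed here), and treating an income of exactly 130 percent as reduced-price is an assumption, since the description says only "below 130" and "between 130 and 185".

```python
def meal_eligibility(household_income: float, poverty_level: float) -> str:
    """Classify a household by income as a percentage of the poverty level."""
    if poverty_level <= 0:
        raise ValueError("poverty_level must be positive")
    percent = 100 * household_income / poverty_level
    if percent < 130:
        return "free"
    if percent <= 185:  # assumption: exactly 130 percent falls in this band
        return "reduced-price"
    return "not eligible"


# Hypothetical incomes against a hypothetical poverty level of 10000.
for income in (12000, 15000, 18500, 20000):
    print(income, meal_eligibility(income, 10000))
```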
Data from: Student grade prediction dataset
data.mendeley.com
Updated Jun 16, 2022
Cite
Nonso Nnamoko (2022). Student grade prediction dataset [Dataset]. http://doi.org/10.17632/wf8568hxb7.1
Explore at:
Unique identifier
https://doi.org/10.17632/wf8568hxb7.1
Dataset updated
Jun 16, 2022
Authors
Nonso Nnamoko
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset provides a collection of 160 instances belonging to two classes ('pass' = 136 and 'fail' = 24). The data is an anonymised, statistically sound and reliable representation of the original data collected from students studying computer science modules at a UK University. Each instance is made up of 19 features plus the class label. Eight of the features represent students' online behaviour including bio information retrieved from Virtual Learning Environment. Eleven of the features represent students' neighbourhood influence retrieved from Office for Students database. The data has been compiled and made available in de-facto/de-jure standard open formats (CSV and JSON).

This data was collected and used in a research study undertaken by academics and researchers at Computer Science Department, Edge Hill University, United Kingdom. To encourage reproducibility of the experiments and results reported, the data is provided in the exact training-validation-testing splits used in the experiments.
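
With 136 'pass' and 24 'fail' instances the classes are imbalanced, so it is worth knowing what accuracy a trivial classifier would reach. The sketch below computes the majority-class baseline from the counts given in the description, plus inverse-frequency class weights. The weighting is a common heuristic swapped in here for illustration, not something the dataset authors describe, and both function names are hypothetical.

```python
# Class counts as stated in the dataset description.
counts = {"pass": 136, "fail": 24}


def majority_baseline(counts: dict):
    """Return the most frequent label and the accuracy of always predicting it."""
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    return label, counts[label] / total


def balanced_class_weights(counts: dict) -> dict:
    """Inverse-frequency weights: total / (number of classes * class count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


label, accuracy = majority_baseline(counts)
print(f"always predicting {label!r} gives accuracy {accuracy:.2f}")
print(balanced_class_weights(counts))
```

Any model evaluated on this data should be compared against that baseline rather than against 50 percent.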
First Generation College Students Experiences - Qualitative Dataset 2021
search.dataone.org
dataverse.harvard.edu
Updated Nov 14, 2023
Cite
Watts, Gavin (2023). First Generation College Students Experiences - Qualitative Dataset 2021 [Dataset]. http://doi.org/10.7910/DVN/YCXBNF
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/YCXBNF
Dataset updated
Nov 14, 2023
Dataset provided by
Harvard Dataverse
Authors
Watts, Gavin
Description
The experiences of first-generation college students (FGCS) can guide the development of effective practices for supporting and retaining this population in higher education settings. Multiple themes emerged via qualitative interviews with ten FGCS participants, including: challenges/barriers within instruction/classroom communication, financial struggles, academic strategies, and perseverance/motivations related to family and academics. Findings show needs for clear communication/expectations within higher education settings, social supports/relationships outside of the campus settings, as well as acknowledgment and reinforcement for academic successes. Additionally, these findings align with previous research showing FGCS to be underprepared and under-supported in applying for, enrolling in, and paying for college.
Provision of Universal Free School Meals in two secondary schools - Datasets...
data.bris.ac.uk
Updated Oct 19, 2022
Cite
(2022). Provision of Universal Free School Meals in two secondary schools - Datasets - data.bris [Dataset]. https://data.bris.ac.uk/data/dataset/3rdt9hl6g2in72krp33j9693w0
Explore at:
Dataset updated
Oct 19, 2022
Description
In order to request access to this data please complete the data request form.*
* University of Bristol staff should use this form instead.

The PHIRST research team has worked in partnership with Hammersmith and Fulham colleagues from public health, children and adult services, to create an evaluation study that takes into account the priorities and concerns of all interested parties within the borough. It focuses on the following research questions:
1) Is UFSM feasible in secondary schools?
2) What is the impact of UFSM on student hunger, school attendance and behaviour, and food that is eaten in school?
3) What is the impact of UFSM on family finance and food security?
4) What do students, carers and school staff see as the reasons UFSM leads to these outcomes?
5) What are the things that help or prevent UFSM being delivered effectively in secondary schools?
6) Could UFSM in secondary schools be a cost-effective approach to addressing student hunger?

We i) interviewed students, parents/carers, school staff and catering staff from the two schools receiving UFSM, and senior leaders in eight other secondary schools, ii) ran student surveys in the two UFSM schools and in two comparison schools, and iii) looked at information about student attendance, academic work and behaviour collected by the local authority and by schools before and after UFSM was introduced. We also worked with a group of student co-researchers in both UFSM schools. They advised on the content and format of our interviews and survey and helped us to plan observations of their school lunch times. These students did the observations themselves and shared their findings with the study team.
Datasets for Sentiment Analysis
zenodo.org
csv
Updated Dec 10, 2023
Cite
Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.10157504
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Julie R. Campos Arias (Repository creator)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.

----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains ratings and reviews for 1K+ Amazon products, as listed on the official Amazon website. The data was scraped in January 2023 from the official Amazon website.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
product_id - Product ID
product_name - Name of the Product
category - Category of the Product
discounted_price - Discounted Price of the Product
actual_price - Actual Price of the Product
discount_percentage - Percentage of Discount for the Product
rating - Rating of the Product
rating_count - Number of people who voted for the Amazon rating
about_product - Description about the Product
user_id - ID of the user who wrote review for the Product
user_name - Name of the user who wrote review for the Product
review_id - ID of the user review
review_title - Short review
review_content - Long review
img_link - Image Link of the Product
product_link - Official Website Link of the Product
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5331 rows contain only negative samples and the last 5331 rows contain only positive samples, so the data should be shuffled before use.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
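Because the rows are ordered by class, a reproducible shuffle is a sensible first step before splitting the data. The sketch below uses only the Python standard library and assumes the CSV header names are exactly reviews and labels, as the column description suggests. The helper name load_and_shuffle and the in-memory sample rows are hypothetical stand-ins for data_rt.csv.

```python
import csv
import io
import random


def load_and_shuffle(csv_file, seed=42):
    """Read (review, label) rows from an open CSV file and shuffle them reproducibly."""
    reader = csv.DictReader(csv_file)
    # Assumed header names: "reviews" and "labels".
    rows = [(row["reviews"], int(row["labels"])) for row in reader]
    random.Random(seed).shuffle(rows)  # fixed seed keeps the order repeatable
    return rows


# Tiny in-memory stand-in for data_rt.csv: negatives first, positives last.
sample = io.StringIO(
    "reviews,labels\n"
    "dull and lifeless,0\n"
    "a tedious mess,0\n"
    "warm and funny,1\n"
    "a small gem,1\n"
)
shuffled = load_and_shuffle(sample)
labels = [label for _, label in shuffled]
```

For the real file, the same function could be called on `open("data_rt.csv", newline="", encoding="utf-8")`, assuming the file sits in the working directory.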
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
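The description says the division column is derived from the polarity score but does not give the cut-offs. A minimal sketch of such a mapping follows, assuming a symmetric neutral band around zero. The function name, the label strings, and the default threshold are all assumptions and may differ from what the dataset author used.

```python
def polarity_to_division(polarity, neutral_band=0.0):
    """Map a polarity score in [-1, 1] to a categorical sentiment label.

    The thresholds are an assumption; the dataset description does not state them.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical polarity scores, for illustration only.
divisions = [polarity_to_division(p) for p in (0.6, 0.0, -0.25)]
```

Widening neutral_band (for example to 0.1) would move weakly scored reviews into the neutral class.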
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9,930 Amazon reviews, with star ratings, for the 10 latest (as of mid-2019) Bluetooth earphone devices. It is intended for learning how to train machine learning models for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
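The division column here is described as a categorical label generated from the ReviewStar score. One common convention is sketched below, together with a class count to check label balance. The cut-offs (4 to 5 stars positive, 3 neutral, 1 to 2 negative), the function name, and the sample ratings are hypothetical, since the source does not state the rule actually used.

```python
from collections import Counter


def star_to_division(review_star):
    """Map a 1-5 star rating to a sentiment label (assumed cut-offs)."""
    if review_star >= 4:
        return "positive"
    if review_star <= 2:
        return "negative"
    return "neutral"


# Hypothetical star ratings, not rows from AllProductReviews2.csv.
stars = [5, 4, 3, 1, 2, 5]
division_counts = Counter(star_to_division(s) for s in stars)
```

Inspecting division_counts before training shows how skewed the labels are under the chosen cut-offs.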
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review in unix time), reviewTime (raw time of the review) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
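Since unixReviewTime stores the review time as a Unix timestamp, it usually needs converting before any date-based analysis. The sketch below assumes the values are seconds since 1970-01-01 UTC. The function name and the sample timestamp are hypothetical and not taken from the file.

```python
from datetime import datetime, timezone


def unix_to_utc(unix_seconds):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(unix_seconds), tz=timezone.utc)


# Hypothetical timestamp for illustration.
review_dt = unix_to_utc(1393545600)
review_date = review_dt.date().isoformat()
```

Passing an explicit timezone avoids results that depend on the local settings of the machine running the code.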
4-year Cohort High School Graduation Rate - Datasets - CTData.org
data.ctdata.org
Updated Mar 29, 2016
Cite
(2016). 4-year Cohort High School Graduation Rate - Datasets - CTData.org [Dataset]. http://data.ctdata.org/dataset/4-year-cohort-high-school-graduation-rate
Explore at:
Dataset updated
Mar 29, 2016
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The variable examined is graduation status after four years of high school. Early and summer graduates are counted as graduates after four years. The "other" rate includes students who dropped out of high school, enrolled in a GED program, transferred to post-secondary education, or have unknown status. Special education students who were still in school after four years but subsequently graduated are not included in the "still enrolled" rate due to Individuals with Disabilities Education Act (IDEA) restrictions. The subgroups reported are gender, race/ethnicity, English language learners, special education students, and students eligible for free or reduced-price meals (FRPM). These data replace the rate of students enrolled in 12th grade in September who graduated the following June. The Connecticut State Department of Education (SDE) collects data longitudinally by four-year cohorts. SDE reports, and CTdata.org carries, graduation rates of four-year cohorts annually.
Udemy Dataset
brightdata.com
.json, .csv, .xlsx
Updated May 7, 2024
Cite
Bright Data (2024). Udemy Dataset [Dataset]. https://brightdata.com/products/datasets/udemy
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
May 7, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
We'll tailor a Udemy dataset to meet your unique needs, encompassing course titles, user engagement metrics, completion rates, demographic data of learners, enrollment numbers, review scores, and other pertinent metrics.

Leverage our Udemy datasets for diverse applications to bolster strategic planning and market analysis. Scrutinizing these datasets enables organizations to grasp learner preferences and online education trends, facilitating nuanced educational program development and learning initiatives. Customize your access to the entire dataset or specific subsets as per your business requisites.

Popular use cases involve optimizing educational content based on engagement insights, enhancing learning strategies through targeted learner segmentation, and identifying and forecasting trends to stay ahead in the online education landscape.
Student Study Performance
kaggle.com
zip
Updated Mar 7, 2024
Cite
Bhavik Jikadara (2024). Student Study Performance [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/student-study-performance
Explore at:
Available download formats: zip (8907 bytes)
Dataset updated
Mar 7, 2024
Authors
Bhavik Jikadara
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Problem Statement:

This project examines how student performance (test scores) is affected by other variables such as gender, ethnicity, parental level of education, lunch and test preparation course.

Content

This data set consists of the marks secured by the students in various subjects.
gender - sex of the student (male/female)
race/ethnicity - ethnicity of the student (Group A, B, C, D, E)
parental level of education - parents' final education (bachelor's degree, some college, master's degree, associate's degree, high school)
lunch - lunch before the test (standard or free/reduced)
test preparation course - completed or not completed before the test
math score
reading score
writing score

Inspiration:

To understand the influence of the parents' background, test preparation, etc. on students' performance.
Students Covered Under Tobacco-Free School Policy
data.ok.gov
catalog.data.gov
+1more
csv
Updated Oct 31, 2019
Cite
OKStateStat (2019). Students Covered Under Tobacco-Free School Policy [Dataset]. https://data.ok.gov/dataset/students-covered-under-tobacco-free-school-policy
Explore at:
Available download formats: csv
Dataset updated
Oct 31, 2019
Dataset authored and provided by
OKStateStat
Description
Increase the percentage of students covered under a 24/7 tobacco-free school policy from 74% in 2012 to 86% by 2018.
School Attendance by District, 2020-2021
catalog.data.gov
data.ct.gov
+2more
Updated Jun 28, 2025
+ more versions
Cite
data.ct.gov (2025). School Attendance by District, 2020-2021 [Dataset]. https://catalog.data.gov/dataset/school-attendance-by-district-2020-2021
Explore at:
Dataset updated
Jun 28, 2025
Dataset provided by
data.ct.gov
Description
This dataset includes the attendance rate for public school students PK-12 by district during the 2020-2021 school year. Attendance rates are provided for each district for the overall student population and for the high needs student population. Students who are considered high needs include students who are English language learners, who receive special education, or who qualify for free and reduced lunch. When no attendance data is displayed in a cell, data have been suppressed to safeguard student confidentiality, or to ensure that statistics based on a very small sample size are not interpreted as equally representative as those based on a sufficiently larger sample size. For more information on CSDE data suppression policies, please visit http://edsight.ct.gov/relatedreports/BDCRE%20Data%20Suppression%20Rules.pdf.
Adoption of AI in Education
data.mendeley.com
Updated Feb 5, 2024
Cite
Pivithuru Kumarasinghe (2024). Adoption of AI in Education [Dataset]. http://doi.org/10.17632/hwpxz98swn.1
Explore at:
Unique identifier
https://doi.org/10.17632/hwpxz98swn.1
Dataset updated
Feb 5, 2024
Authors
Pivithuru Kumarasinghe
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset examines the opinions of undergraduate students in Hangzhou, China, who are studying management, on the use of Artificial Intelligence (AI) in education. The data set is guided by the Diffusion of Innovations Theory (DOI) and explores the factors that influence students' intentions to use AI technologies. The survey was conducted using a random sampling method to ensure comprehensive data collection from 420 Chinese students aged 18-21. The methodology used in this survey was rigorous.

