100+ datasets found
  1. Top Rated Movies

    • kaggle.com
    Updated Sep 10, 2024
    Cite
    Marium Masroor (2024). Top Rated Movies [Dataset]. https://www.kaggle.com/datasets/mariumfaheem666/top-rated-movies
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Marium Masroor
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Beginner Friendly:

    This is a beginner-friendly dataset that is easy to understand and a good starting point for anyone's journey. Students of machine learning and data analytics can use it to learn the basic Python libraries.

    Preprocessing:

    To keep it simple, only four basic columns are included, which give quick information about top-rated movies from around the world. This dataset can help build an understanding of preprocessing and data handling before applying any mathematical models.

    Visualizations also give a comprehensive description of popular movies.

    Let's get started. Happy coding!

  2. A deceptively simple dataset for beginners

    • kaggle.com
    Updated Jan 17, 2019
    Cite
    Ashish Gupta (2019). A deceptively simple dataset for beginners [Dataset]. https://www.kaggle.com/ashishguptadss/a-deceptively-simple-dataset-for-beginners/kernels
    Dataset updated
    Jan 17, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ashish Gupta
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Ashish Gupta

    Released under Database: Open Database, Contents: Database Contents

    Contents

  3. ‘Beginner's Classification Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Beginner's Classification Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-beginner-s-classification-dataset-8154/cf913145/?iid=001-404&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Beginner's Classification Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sveneschlbeck/beginners-classification-dataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This beginner-friendly binary classification dataset contains a .csv file with pre-cleaned data - ideal for beginners who want to test out new algorithmic approaches to classification problems.

    Content

    The dataset contains only three columns:

    • age
    • interest
    • success

    The content can be applied to various things, e.g. how successfully different people learn new sports.

    Take a look at the notebook "Decision Border Visualizer" to see how or where a binary classification algorithm draws the separation line(s) for distinguishing purposes.

    Data Source

    Jannis Seemann

    --- Original source retains full ownership of the source dataset ---

  4. Introduction to Pandas for Beginners

    • kaggle.com
    Updated Feb 15, 2024
    Cite
    LeKyThanhLiem (2024). Introduction to Pandas for Beginners [Dataset]. https://www.kaggle.com/datasets/lekythanhliem/introduction-to-pandas-for-beginners
    Dataset updated
    Feb 15, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    LeKyThanhLiem
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by LeKyThanhLiem

    Released under MIT

    Contents

  5. ‘Titanic Solution for Beginner's Guide’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 14, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic Solution for Beginner's Guide’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-solution-for-beginner-s-guide-03a8/ae3641d4/?iid=014-162&v=presentation
    Dataset updated
    Feb 14, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Titanic Solution for Beginner's Guide’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 14 February 2022.

    --- Dataset description provided by original source is as follows ---

    Overview

    The data has been split into two groups:

    training set (train.csv)
    test set (test.csv)
    

    The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

    The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

    We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
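
    A minimal sketch of how such a baseline submission could be produced. It assumes the test file has `PassengerId` and `Sex` columns and that the submission uses `PassengerId` and `Survived`; these exact header names are assumptions (the data dictionary lists variables in lowercase), and the sample rows are made up for illustration.

```python
import csv
import io


def gender_baseline(test_csv_text):
    """Predict survival (1) for female passengers and death (0) for everyone else."""
    reader = csv.DictReader(io.StringIO(test_csv_text))
    rows = []
    for row in reader:
        # Assumed column names: "PassengerId" and "Sex".
        survived = 1 if row["Sex"].strip().lower() == "female" else 0
        rows.append({"PassengerId": row["PassengerId"], "Survived": survived})
    return rows


def to_csv(rows):
    """Serialize prediction rows as submission-style CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["PassengerId", "Survived"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up rows standing in for test.csv.
SAMPLE_TEST_CSV = "PassengerId,Sex\n892,male\n893,female\n894,male\n"

submission = to_csv(gender_baseline(SAMPLE_TEST_CSV))
print(submission)
```

    With a real test.csv, the file contents would be read from disk and passed to `gender_baseline` in place of the sample string.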

    Data Dictionary

    Variable   Definition                                    Key
    survival   Survival                                      0 = No, 1 = Yes
    pclass     Ticket class                                  1 = 1st, 2 = 2nd, 3 = 3rd
    sex        Sex
    Age        Age in years
    sibsp      # of siblings / spouses aboard the Titanic
    parch      # of parents / children aboard the Titanic
    ticket     Ticket number
    fare       Passenger fare
    cabin      Cabin number
    embarked   Port of Embarkation                           C = Cherbourg, Q = Queenstown, S = Southampton

    Variable Notes

    pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

    age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5.

    sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).

    parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
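
    The variable notes can be turned into small helper functions, sketched below. The function names and the `family_size` feature are hypothetical illustrations of the feature engineering mentioned in the overview, not part of the original dataset. Ages below 1 are treated as infant ages rather than estimates, which is an assumption about how to read the xx.5 rule.

```python
# Mapping taken from the pclass note: ticket class as a proxy for SES.
SES_BY_PCLASS = {1: "Upper", 2: "Middle", 3: "Lower"}


def age_is_estimated(age):
    """Return True when an age follows the xx.5 convention for estimates.

    Ages below 1 are fractional by definition, so they are not
    counted as estimates here (an assumption).
    """
    return age >= 1 and age % 1 == 0.5


def family_size(sibsp, parch):
    """Hypothetical engineered feature: relatives aboard plus the passenger."""
    return sibsp + parch + 1


print(SES_BY_PCLASS[3], age_is_estimated(28.5), family_size(1, 2))
```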

    --- Original source retains full ownership of the source dataset ---

  6. ‘Data visualization for beginners’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Data visualization for beginners’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-data-visualization-for-beginners-394c/a2ae32fd/?iid=000-351&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Data visualization for beginners’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/chryzal/data-visualization-for-beginners on 28 January 2022.

    --- No further description of dataset provided by original source ---

    --- Original source retains full ownership of the source dataset ---

  7. Magic The Gathering dataset

    • kaggle.com
    Updated Oct 22, 2022
    Cite
    Antonio Pelusi (2022). Magic The Gathering dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/4373482
    Dataset updated
    Oct 22, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Antonio Pelusi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A simple dataset to start learning the basics of data analytics. The code section contains a simple Python script (using PyMongo to connect to MongoDB and pandas to print) that uses this dataset.

    Click the following link for more information on using the Python script (mtgdb.py): MTGDB

  8. twt-kaggle-data

    • huggingface.co
    Updated Dec 8, 2023
    Cite
    megha manoj (2023). twt-kaggle-data [Dataset]. https://huggingface.co/datasets/mochi-skz/twt-kaggle-data
    Dataset updated
    Dec 8, 2023
    Authors
    megha manoj
    Description

    mochi-skz/twt-kaggle-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. DSEval-Kaggle Dataset

    • paperswithcode.com
    Updated Feb 26, 2024
    + more versions
    Cite
    Yuge Zhang; Qiyang Jiang; Xingyu Han; Nan Chen; Yuqing Yang; Kan Ren (2024). DSEval-Kaggle Dataset [Dataset]. https://paperswithcode.com/dataset/dseval
    Dataset updated
    Feb 26, 2024
    Authors
    Yuge Zhang; Qiyang Jiang; Xingyu Han; Nan Chen; Yuqing Yang; Kan Ren
    Description

    In this paper, we introduce a novel benchmarking framework designed specifically for evaluations of data science agents. Our contributions are three-fold. First, we propose DSEval, an evaluation paradigm that enlarges the evaluation scope to the full lifecycle of LLM-based data science agents. We also cover aspects including but not limited to the quality of the derived analytical solutions or machine learning models, as well as potential side effects such as unintentional changes to the original data. Second, we incorporate a novel bootstrapped annotation process letting LLMs themselves generate and annotate the benchmarks with a "human in the loop". A novel language (i.e., DSEAL) has been proposed and the derived four benchmarks have significantly improved the benchmark scalability and coverage, with largely reduced human labor. Third, based on DSEval and the four benchmarks, we conduct a comprehensive evaluation of various data science agents from different aspects. Our findings reveal the common challenges and limitations of the current works, providing useful insights and shedding light on future research on LLM-based data science agents.

    This is one of the DSEval benchmarks.

  10. ‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Deep Learning A-Z - ANN dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-deep-learning-a-z-ann-dataset-3c75/cb36262b/?iid=013-193&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Deep Learning A-Z - ANN dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/filippoo/deep-learning-az-ann on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This is the dataset used in the section "ANN (Artificial Neural Networks)" of the Udemy course by Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), called Deep Learning A-Z™: Hands-On Artificial Neural Networks. The dataset is very useful for machine learning beginners and is a simple playground in which to compare several techniques/skills.

    It can be freely downloaded here: https://www.superdatascience.com/deep-learning/

    The story: A bank is investigating a very high rate of customers leaving the bank. Here is a dataset of 10,000 records to investigate and predict which customers are more likely to leave the bank soon.

    The story of the story: I'd like to compare several techniques (better if not alone, and with the experience of several Kaggle users) to improve my basic knowledge on Machine Learning.

    Content

    I will write more later, but the column names are self-explanatory.

    Acknowledgements

    Udemy instructors Kirill Eremenko (Data Scientist & Forex Systems Expert) and Hadelin de Ponteves (Data Scientist), and their efforts to provide this dataset to their students.

    Inspiration

    Which methods score best with this dataset? Which are fastest (or, executable in a decent time)? Which are the basic steps with such a simple dataset, very useful to beginners?

    --- Original source retains full ownership of the source dataset ---

  11. test-dataset-kaggle

    • huggingface.co
    Updated Feb 15, 2024
    Cite
    Gholamreza Dar (2024). test-dataset-kaggle [Dataset]. https://huggingface.co/datasets/Gholamreza/test-dataset-kaggle
    Dataset updated
    Feb 15, 2024
    Authors
    Gholamreza Dar
    Description

    Gholamreza/test-dataset-kaggle dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. BirdCLEF-Challenge2023-Kaggle

    • huggingface.co
    Cite
    Bernardo Cecchetto, BirdCLEF-Challenge2023-Kaggle [Dataset]. https://huggingface.co/datasets/bernardocecchetto/BirdCLEF-Challenge2023-Kaggle
    Authors
    Bernardo Cecchetto
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset contains audio recordings of 264 species of birds singing, all of which were processed as follows:

    • Stereo to mono
    • Resampled to 16 kHz
    • High-pass filter (1500 Hz, filter order of 16)
    • Normalized

    The raw dataset was provided by the BirdCLEF 2023 challenge on Kaggle. You can access it at https://www.kaggle.com/competitions/birdclef-2023/data

  13. Python Package List

    • opendatabay.com
    Updated Jun 28, 2025
    Cite
    Datasimple (2025). Python Package List [Dataset]. https://www.opendatabay.com/data/ai-ml/8608c317-7e93-4c90-8670-143e459975c6
    Dataset updated
    Jun 28, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Software and Technology
    Description

    Context

    A significant amount of software is available in Kaggle's Python notebook. I had hoped to find a reference somewhere listing which Python packages were available and what each one did.

    When I didn't find what I was looking for, I decided to build this dataset instead.

    Content

    This dataset was assembled in four steps:

    1. Code inside a Kaggle notebook was used to gather the names of over 600 installed packages.
    2. A package list was scraped from Anaconda and cross-referenced against the notebook package list.
    3. The roughly 400 packages that remained were carefully queried from the Python Package Index using its JSON API.
    4. The results were collated into a manifest.

    Reference

    • Anaconda's 64-bit Linux Python package list.
    • HTML Scraping - The Hitchhiker's Guide to Python
    • The PyPI JSON API.
    • Rate Limiting for the PyPI API

    Acknowledgements

    Thanks to @nagadomi for the original script. Thanks to the Kaggle team for creating a powerful notebook environment. Photo by j zamora.

    License

    CC0

    Original Data Source: Python Package List

  14. Python Basic for Beginners

    • kaggle.com
    Updated Nov 16, 2022
    + more versions
    Cite
    Nadeem Taj (2022). Python Basic for Beginners [Dataset]. https://www.kaggle.com/datasets/nadeemtaj407/python-basic-for-beginners
    Dataset updated
    Nov 16, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nadeem Taj
    Description

    Dataset

    This dataset was created by Nadeem Taj

    Contents

  15. Kaggle Dataset

    • universe.roboflow.com
    Updated Apr 22, 2025
    Cite
    Moiz (2025). Kaggle Dataset [Dataset]. https://universe.roboflow.com/moiz-wklhw/kaggle-dataset-dxie4
    Available download formats: zip
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    Moiz
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Kaggle Dataset

    ## Overview
    
    Kaggle Dataset is a dataset for object detection tasks - it contains Objects annotations for 617 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [Public Domain license](https://creativecommons.org/licenses/Public Domain).
    
  16. ‘Kaggle Competitions Top 100’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 14, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Kaggle Competitions Top 100’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-kaggle-competitions-top-100-961d/latest
    Dataset updated
    Feb 14, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Kaggle Competitions Top 100’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vivovinco/kaggle-competitions-top-100 on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset contains the top 100 of the Kaggle competitions ranking. The dataset will be updated every month.

    Content

    100 rows and 13 columns. Column descriptions are listed below.

    • User : Name of the user
    • Tier : Grandmaster, Master or Expert
    • Company/School : Company/School info of the user if mentioned
    • Country : Country info of the user if mentioned
    • Competitions_Num : Number of competitions joined
    • Competitions_Gold : Number of competitions gold medals won
    • Competitions_Silver : Number of competitions silver medals won
    • Competitions_Bronze : Number of competitions bronze medals won
    • Datasets_Num : Number of public datasets
    • Notebooks_Num : Number of public notebooks
    • Discussions_Num : Number of topics/comments posted
    • Points : Total points
    • Profile : Link of Kaggle profile
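
    As a rough illustration of working with these columns, the sketch below totals medals per user and counts users per tier using only the standard library. It assumes the CSV header uses the column names exactly as listed above; the sample rows and user names are made up.

```python
import csv
import io
from collections import Counter

MEDAL_COLUMNS = ("Competitions_Gold", "Competitions_Silver", "Competitions_Bronze")

# Made-up rows; only a subset of the 13 columns is shown.
SAMPLE_CSV = """User,Tier,Competitions_Gold,Competitions_Silver,Competitions_Bronze
user_a,Grandmaster,10,5,2
user_b,Master,1,4,3
user_c,Grandmaster,7,0,1
"""


def medal_totals(csv_text):
    """Map each user to the sum of their gold, silver, and bronze medals."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["User"]] = sum(int(row[col]) for col in MEDAL_COLUMNS)
    return totals


def users_per_tier(csv_text):
    """Count how many users fall into each tier."""
    return Counter(row["Tier"] for row in csv.DictReader(io.StringIO(csv_text)))


totals = medal_totals(SAMPLE_CSV)
tiers = users_per_tier(SAMPLE_CSV)
print(totals, dict(tiers))
```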

    Acknowledgements

    Data from Kaggle. Image from Smartcat.

    If you're reading this, please upvote.

    --- Original source retains full ownership of the source dataset ---

  17. Data from: KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Cite
    Quaranta, Luigi (2024). KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4468522
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Lanubile, Filippo
    Quaranta, Luigi
    Calefato, Fabio
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    KGTorrent is a dataset of Python Jupyter notebooks from the Kaggle platform.

    The dataset is accompanied by a MySQL database containing metadata about the notebooks and the activity of Kaggle users on the platform. The information to build the MySQL database has been derived from Meta Kaggle, a publicly available dataset containing Kaggle metadata.

    In this package, we share the complete KGTorrent dataset (consisting of the dataset itself plus its companion database), as well as the specific version of Meta Kaggle used to build the database.

    More specifically, the package comprises the following three compressed archives:

    KGT_dataset.tar.bz2, the dataset of Jupyter notebooks;

    KGTorrent_dump_10-2020.sql.tar.bz2, the dump of the MySQL companion database;

    MetaKaggle27Oct2020.tar.bz2, a copy of the Meta Kaggle version used to build the database.

    Moreover, we include KGTorrent_logical_schema.pdf, the logical schema of the KGTorrent MySQL database.

  18. issues-kaggle-notebooks

    • huggingface.co
    Cite
    Hugging Face Smol Models Research, issues-kaggle-notebooks [Dataset]. https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face Smol Models Research
    Description

    GitHub Issues & Kaggle Notebooks

    GitHub Issues & Kaggle Notebooks is a collection of two code datasets intended for language model training. They are sourced from GitHub issues and from notebooks on the Kaggle platform. These datasets are a modified part of the StarCoder2 model training corpus, specifically the bigcode/StarCoder2-Extras dataset. We reformat the samples to remove StarCoder2's special tokens and use natural text to delimit comments in issues and display… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks.

  19. Resistors Kaggle Dataset

    • universe.roboflow.com
    Updated Jun 28, 2024
    Cite
    RCR (2024). Resistors Kaggle Dataset [Dataset]. https://universe.roboflow.com/rcr-mjqgv/resistors-kaggle
    Available download formats: zip
    Dataset updated
    Jun 28, 2024
    Dataset authored and provided by
    RCR
    License

    Attribution 4.0 (CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Resistor Bounding Boxes
    Description

    Resistors Kaggle

    ## Overview
    
    Resistors Kaggle is a dataset for object detection tasks - it contains Resistor annotations for 1,000 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  20. Amazon Product Reviews Dataset

    • opendatabay.com
    Updated Jun 8, 2025
    Cite
    Datasimple (2025). Amazon Product Reviews Dataset [Dataset]. https://www.opendatabay.com/data/dataset/6051da05-ace4-44ca-baca-85efdd809836
    Available download formats: .csv
    Dataset updated
    Jun 8, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Reviews & Ratings
    Description

    The dataset consists of samples from Amazon Ratings for select products. The reviews were picked randomly, and the corpus has nearly 1.6k reviews from different customers.
    Amazon aims to understand what the main topics of these reviews are in order to classify them for easier search.
    Can you build a strong model that differentiates the topics based on the review corpus?

    Acknowledgements

    The dataset is sourced from Kaggle.

    Objective: Understand the dataset and perform the necessary cleanup. Build a strong topic modelling algorithm to classify the topics.

    Original Data Source: Amazon Product Reviews Dataset

Top Rated Movies

Simple Linear Regression

202 scholarly articles cite this dataset (View in Google Scholar)