100+ datasets found
  1. DSEval-Kaggle Dataset

    • paperswithcode.com
    Updated Apr 19, 2024
    Cite
    Yuge Zhang; Qiyang Jiang; Xingyu Han; Nan Chen; Yuqing Yang; Kan Ren (2024). DSEval-Kaggle Dataset [Dataset]. https://paperswithcode.com/dataset/dseval
    Authors
    Yuge Zhang; Qiyang Jiang; Xingyu Han; Nan Chen; Yuqing Yang; Kan Ren
    Description

    In this paper, we introduce a novel benchmarking framework designed specifically for evaluations of data science agents. Our contributions are three-fold. First, we propose DSEval, an evaluation paradigm that enlarges the evaluation scope to the full lifecycle of LLM-based data science agents. We also cover aspects including but not limited to the quality of the derived analytical solutions or machine learning models, as well as potential side effects such as unintentional changes to the original data. Second, we incorporate a novel bootstrapped annotation process letting LLMs themselves generate and annotate the benchmarks with a "human in the loop". A novel language (i.e., DSEAL) has been proposed and the four derived benchmarks have significantly improved the benchmark scalability and coverage, with greatly reduced human labor. Third, based on DSEval and the four benchmarks, we conduct a comprehensive evaluation of various data science agents from different aspects. Our findings reveal the common challenges and limitations of the current works, providing useful insights and shedding light on future research on LLM-based data science agents.

    This is one of the DSEval benchmarks.

  2. Performance vs. Predicted Performance

    • kaggle.com
    Updated Dec 21, 2022
    Cite
    Calathea21 (2022). Performance vs. Predicted Performance [Dataset]. http://doi.org/10.34740/kaggle/dsv/4752670
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Calathea21
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains information about high school students and their actual and predicted performance on an exam. Most of the information, including some general information about the students and their grade for an exam, was based on an already existing dataset, while the predicted exam performance was based on a human experiment. In this experiment, participants were shown short descriptions of the students (based on the information in the original data) and had to rank and grade them according to their expected performance. Prior to this task, some participants were exposed to "Stereotype Activation", suggesting that boys perform less well in school than girls.

    Description of *original_data.csv*

    Based on this dataset (which is also available on kaggle), we extracted a number of student profiles that participants had to make grade predictions for. For more information about this dataset we refer to the corresponding kaggle page: https://www.kaggle.com/datasets/uciml/student-alcohol-consumption

    Note that we performed some preprocessing on the original data:

    • The original data consisted of two parts: the information about students following a Maths course and the information about students following a Portuguese course. Since in both datasets the same type of information was recorded, we merged both datasets and added a column "subject" to show which course each student belongs to.

    • We excluded all data where G3 = 0 (i.e. the grade for the last exam = 0)

    • From original_data.csv we randomly sampled 856 students that participants in our study had to make grade predictions for.

    Description of *CompleteDataAndBiases.csv*

    index - this column corresponds to the indices in the file "original_data.csv". Through these indices, it is possible to add columns from the original data to the dataset with the grade predictions.

    ParticipantID - the ID of the participant who made the performance predictions for the corresponding student. Predictions needed to be made for 856 students, and each participant made 8 predictions total. Thus there are 107 different participant IDs

    name - to make the prediction task more engaging for participants, each of the 8 student profiles that participants had to grade and rank was randomly matched to one of four boys' or girls' names (depending on the sex of the student).

    sex - the sex of each student, either female (F) or male (M). For benchmarking fair ML algorithms, this can be used as the sensitive attribute. We assume that in the fair version of the decision variable ("Pass"), no sex discrimination occurs. The biased versions of the variable ("Predicted Pass") are mostly discriminatory towards male students.

    studytime - this variable is taken from the original dataset and denotes how long a student studied for their exam. In the original data this variable consisted of four levels (less than 2 hours vs. 2-5 hours vs. 5-10 hours vs. more than 10 hours). We binned the latter two levels together and encoded this column numerically from 1-3.

    freetime - Originally, this variable ranged from 1 (very low) to 5 (very high). We binned this variable into three categories, where level 1 and 2 are binned, as well as level 4 and 5.

    romantic - Binary variable, denoting whether the student is in a romantic relationship or not.

    Walc - This variable shows how much alcohol each student consumes in the weekend. Originally it ranged from 1 to 5 (5 corresponding to the highest alcohol consumption), but we binned the last two levels together.

    goout - This variable shows how often a student goes out in a week. Originally it ranged from 1 to 5 (5 corresponding to going out very often), but we binned the last two levels together.

    Parents_edu - This variable was not present in the original dataset. Instead, the original dataset consisted of two variables "mum_edu" and "dad_edu". We obtained "Parents_edu" by taking the higher one of both. The variable consists of 4 levels, where 4 = highest level of education.

    absences - This variable shows the number of absences per student. Originally it ranged from 0 - 93, but because large numbers of absences were infrequent we binned all absences of >=7 into one level.

    reason - The reason why a student chose to go to the school in question. The levels are: close to home, school's reputation, school's curriculum, and other.

    G3 - The actual grade each student received for the final exam of the course, ranging from 0-20.

    Pass - A binary variable showing whether G3 is a passing grade (i.e. >=10) or not.

    Predicted Grade - The grade the student was predicted to receive in our experiment

    Predicted Rank - In our ex...
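    The linkage described above (joining CompleteDataAndBiases.csv back to original_data.csv via the "index" column, with "Pass" defined as G3 >= 10) can be sketched in pandas. This is a hypothetical sketch with toy stand-in frames; in practice the two files would be loaded with pd.read_csv.

```python
import pandas as pd

# Toy stand-ins for original_data.csv and CompleteDataAndBiases.csv
# (hypothetical values; the real data has many more columns and rows).
original = pd.DataFrame({"G3": [12, 8, 15]}, index=[0, 1, 2])
predictions = pd.DataFrame({
    "index": [0, 1, 2],
    "ParticipantID": [1, 1, 2],
    "Predicted Grade": [11, 10, 14],
})

# Pull columns from the original data into the prediction table
# via the "index" column, as the dataset description suggests.
merged = predictions.merge(original, left_on="index", right_index=True, how="left")

# "Pass" means a passing final grade, i.e. G3 >= 10.
merged["Pass"] = merged["G3"] >= 10
print(merged[["index", "G3", "Predicted Grade", "Pass"]])
```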

  3. 💾DDR4 Memory Benchmarks📊

    • kaggle.com
    Updated May 3, 2022
    Cite
    Alan Jo (2022). 💾DDR4 Memory Benchmarks📊 [Dataset]. https://www.kaggle.com/alanjo/ddr4-memory-benchmarks/discussion
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alan Jo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    An enhanced version of this dataset can be found here 🙂

    Context

    Benchmarks allow for easy comparison between multiple RAM kits by scoring their performance on a standardized series of tests, and they are useful in many instances, such as when buying or building a new PC.

    Content

    Newest data as of May 3rd, 2022. This dataset contains benchmarks of DDR4 memory models.

    Acknowledgements

    Data scraped from PassMark

    If you enjoyed this dataset, here are some similar datasets you may like 😎

  4. SNIPS Natural Language Understanding Benchmark:

    • kaggle.com
    Updated Jun 25, 2024
    Cite
    renan ferreira (2024). SNIPS Natural Language Understanding Benchmark: [Dataset]. https://www.kaggle.com/datasets/renanaferreira/snips-natural-language-understanding-benchmark
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    renan ferreira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Natural Language Understanding benchmark

    This repository contains the results of three benchmarks that compare natural language understanding services offering:

    1. Built-in intents (Apple’s SiriKit, Amazon’s Alexa, Microsoft’s Luis, Google’s API.ai, and Snips.ai) on a selection of various intents. This benchmark was performed in December 2016. Its results are described at length in the following post.
    2. Custom intent engines (Google's API.ai, Facebook's Wit, Microsoft's Luis, Amazon's Alexa, and Snips' NLU) for seven chosen intents. This benchmark was performed in June 2017. Its results are described in a paper and a blog post.
    3. An extension of Braun et al., 2017 (Google's API.AI, Microsoft's Luis, IBM's Watson, Rasa). This experiment replicates the analysis made by Braun et al., 2017, published in "Evaluating Natural Language Understanding Services for Conversational Question Answering Systems" as part of the SIGDIAL 2017 proceedings. Snips and Rasa are added. Details are available in a paper and a blog post.

    The data is provided for each benchmark and more details about the methods are available in the README file in each folder.

    Any publication based on these datasets must include a full citation to the following paper in which the results were published by the Snips Team:

    Coucke A. et al., "Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces," 2018 (https://arxiv.org/abs/1805.10190),

    accepted for a spotlight presentation at the Privacy in Machine Learning and Artificial Intelligence workshop colocated with ICML 2018.

    The Snips team joined Sonos in November 2019. These open datasets remain available and their access is now managed by the Sonos Voice Experience Team. Please email sve-research@sonos.com with any questions.

  5. Data from: Assessing predictive performance of supervised machine learning...

    • data.niaid.nih.gov
    • datadryad.org
    • +1 more
    zip
    Updated May 23, 2023
    Cite
    Evans Omondi (2023). Assessing predictive performance of supervised machine learning algorithms for a diamond pricing model [Dataset]. http://doi.org/10.5061/dryad.wh70rxwrh
    Dataset provided by
    Strathmore University
    Authors
    Evans Omondi
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The diamond is 58 times harder than any other mineral in the world, and its elegance as a jewel has long been appreciated. Forecasting diamond prices is challenging due to nonlinearity in important features such as carat, cut, clarity, table, and depth. Against this backdrop, the study conducted a comparative analysis of the performance of multiple supervised machine learning models (regressors and classifiers) in predicting diamond prices. Eight supervised machine learning algorithms were evaluated in this work, including Multiple Linear Regression, Linear Discriminant Analysis, eXtreme Gradient Boosting, Random Forest, k-Nearest Neighbors, Support Vector Machines, Boosted Regression and Classification Trees, and Multi-Layer Perceptron. The analysis is based on data preprocessing, exploratory data analysis (EDA), training the aforementioned models, assessing their accuracy, and interpreting their results. Based on the performance metric values and analysis, it was discovered that eXtreme Gradient Boosting was the most optimal algorithm in both classification and regression, with an R2 score of 97.45% and an accuracy value of 74.28%. As a result, eXtreme Gradient Boosting was recommended as the optimal regressor and classifier for forecasting the price of a diamond specimen.

    Methods

    Kaggle, a data repository with thousands of datasets, was used in the investigation. It is an online community for machine learning practitioners and data scientists, as well as a robust, well-researched, and sufficient resource for analyzing various data sources. On Kaggle, users can search for and publish various datasets. In a web-based data-science environment, they can study datasets and construct models.

  6. Kaggle, IHME and LANL Forecasts

    • kaggle.com
    Updated May 26, 2020
    Cite
    Anthony Goldbloom (2020). Kaggle, IHME and LANL Forecasts [Dataset]. https://www.kaggle.com/antgoldbloom/covid19-epidemiological-benchmarking-dataset/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anthony Goldbloom
    Description

    Dataset

    This dataset was created by Anthony Goldbloom


  7. ‘STUDENTS PERFORMANCE DATASET’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘STUDENTS PERFORMANCE DATASET’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-students-performance-dataset-18ad/98348783/?iid=001-124&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘STUDENTS PERFORMANCE DATASET’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/balavashan/students-performance-dataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset contains information about students and their performance, which one can try to predict using a simple machine learning algorithm.

    Content

    It contains specific information about the students' schooling, family issues, personal relationships, Internet access, and so on.

    Acknowledgements

    This data was gathered from numerous sources; thanks a lot @uci_repository.

    Inspiration

    Try to find out the issues affecting the students, as they are future personalities, to help safeguard their schooling and youth.

    --- Original source retains full ownership of the source dataset ---

  8. Chess XAI Benchmark

    • kaggle.com
    Updated Jul 25, 2022
    Cite
    smuecke (2022). Chess XAI Benchmark [Dataset]. https://www.kaggle.com/datasets/smuecke/chess-xai-benchmark
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    smuecke
    Description

    This data set contains 30 million chess positions along with a label that indicates if the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:

    • No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
    • Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
    • Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.

    The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns. The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares). The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library. We provide code for converting between different data and explanation representations.
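    As a minimal illustration of the explanation encoding, a 64-bit mask can be decoded without any chess software, assuming the usual bit convention (bit 0 = a1 ... bit 63 = h8, the one used by the Python chess library's SquareSet): each set bit marks one highlighted square.

```python
# Hypothetical helper: decode a 64-bit explanation mask into square
# names, assuming bit 0 = a1 ... bit 63 = h8 (the convention of
# chess.SquareSet in the python-chess package).
def mask_to_squares(mask: int) -> list[str]:
    files = "abcdefgh"
    return [
        files[i % 8] + str(i // 8 + 1)  # file letter + rank number
        for i in range(64)
        if mask & (1 << i)
    ]

# Example: a mask with the bits for a1, e4 and h8 set.
mask = (1 << 0) | (1 << 28) | (1 << 63)
print(mask_to_squares(mask))  # ['a1', 'e4', 'h8']
```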

    Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skip the first ten moves, as they lead to many duplicate positions.

  9. ‘Body performance Data’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Body performance Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-body-performance-data-aa99/53970b34/?iid=005-594&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Body performance Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kukuroo3/body-performance-data on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset records a performance grade together with age and several exercise performance measurements.

    Content

    data shape : (13393, 12)

    • age : 20 ~64
    • gender : F,M
    • height_cm : (If you want to convert to feet, divide by 30.48)
    • weight_kg
    • body fat_%
    • diastolic : diastolic blood pressure (min)
    • systolic : systolic blood pressure (min)
    • gripForce
    • sit and bend forward_cm
    • sit-ups counts
    • broad jump_cm
    • class : A,B,C,D ( A: best) / stratified

    Source

    link (Korea Sports Promotion Foundation). Some post-processing and filtering has been done on the raw data.

    --- Original source retains full ownership of the source dataset ---

  10. Clustering benchmark datasets

    • kaggle.com
    Updated Feb 12, 2018
    Cite
    Krid Jin (2018). Clustering benchmark datasets [Dataset]. https://www.kaggle.com/vasopikof/clustering-benchmark-datasets/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Krid Jin
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    Clustering benchmark datasets published by School of Computing, University of Eastern Finland

    Content

    2D scatter points and labels; the formatting needs to be processed first.

    Find more at https://cs.joensuu.fi/sipu/datasets/
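    Since the files are plain text with whitespace-separated coordinates, processing the formatting can be as simple as numpy's loadtxt. A minimal sketch with an inline stand-in for one of the point files (the coordinate values here are illustrative):

```python
import io
import numpy as np

# Inline stand-in for a benchmark point file (e.g. s1.txt):
# whitespace-separated x/y coordinates, one point per line.
raw = io.StringIO("664159 550946\n665845 557965\n597173 575538\n")
points = np.loadtxt(raw)
print(points.shape)  # (3, 2)
```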

    Acknowledgements

    @misc{ClusteringDatasets, author = {Pasi Fränti et al.}, title = {Clustering datasets}, year = {2015}, url = {http://cs.uef.fi/sipu/datasets/}}

    Inspiration

    With standard, well-known benchmarks, various clustering algorithms can be run and compared through a number of kernels.

  11. Student Performance

    • kaggle.com
    Updated Oct 7, 2022
    Cite
    Aman Chauhan (2022). Student Performance [Dataset]. https://www.kaggle.com/datasets/whenamancodes/student-performance
    Dataset provided by
    Kaggle
    Authors
    Aman Chauhan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data approaches student achievement in secondary education at two Portuguese schools. The data attributes include student grades, demographic, social and school-related features, and it was collected by using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).

    Attributes for both Maths.csv (Math course) and Portuguese.csv (Portuguese language course) datasets:

    Column - Description
    school - student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira)
    sex - student's sex (binary: 'F' - female or 'M' - male)
    age - student's age (numeric: from 15 to 22)
    address - student's home address type (binary: 'U' - urban or 'R' - rural)
    famsize - family size (binary: 'LE3' - less or equal to 3 or 'GT3' - greater than 3)
    Pstatus - parent's cohabitation status (binary: 'T' - living together or 'A' - apart)
    Medu - mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 – 5th to 9th grade, 3 – secondary education or 4 – higher education)
    Fedu - father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 – 5th to 9th grade, 3 – secondary education or 4 – higher education)
    Mjob - mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
    Fjob - father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other')
    reason - reason to choose this school (nominal: close to 'home', school 'reputation', 'course' preference or 'other')
    guardian - student's guardian (nominal: 'mother', 'father' or 'other')
    traveltime - home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 - >1 hour)
    studytime - weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
    failures - number of past class failures (numeric: n if 1<=n<3, else 4)
    schoolsup - extra educational support (binary: yes or no)
    famsup - family educational support (binary: yes or no)
    paid - extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
    activities - extra-curricular activities (binary: yes or no)
    nursery - attended nursery school (binary: yes or no)
    higher - wants to take higher education (binary: yes or no)
    internet - Internet access at home (binary: yes or no)
    romantic - with a romantic relationship (binary: yes or no)
    famrel - quality of family relationships (numeric: from 1 - very bad to 5 - excellent)
    freetime - free time after school (numeric: from 1 - very low to 5 - very high)
    goout - going out with friends (numeric: from 1 - very low to 5 - very high)
    Dalc - workday alcohol consumption (numeric: from 1 - very low to 5 - very high)
    Walc - weekend alcohol consumption (numeric: from 1 - very low to 5 - very high)
    health - current health status (numeric: from 1 - very bad to 5 - very good)
    absences - number of school absences (numeric: from 0 to 93)

    These grades are related with the course subject, Math or Portuguese:

    Grade - Description
    G1 - first period grade (numeric: from 0 to 20)
    G2 - second period grade (numeric: from 0 to 20)
    G3 - final grade (numeric: from 0 to 20, output target)
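    The strong G1/G2/G3 correlation noted above is easy to check once a file is loaded (e.g. df = pd.read_csv("Maths.csv"); the filename and separator are assumptions). A minimal sketch with a toy frame standing in for the real data (the grade values are illustrative):

```python
import pandas as pd

# Toy stand-in for the grade columns of Maths.csv (illustrative values).
df = pd.DataFrame({
    "G1": [10, 5, 15, 12, 8],
    "G2": [11, 6, 14, 12, 7],
    "G3": [11, 5, 15, 13, 7],
})

# Correlation of the period grades G1 and G2 with the final grade G3.
corr = df[["G1", "G2", "G3"]].corr()["G3"]
print(corr)
```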


  12. Student Engagement

    • kaggle.com
    Updated Nov 23, 2022
    Cite
    The Devastator (2022). Student Engagement [Dataset]. https://www.kaggle.com/datasets/thedevastator/student-engagement-with-tableau-a-data-science-p
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Student Engagement

    Predicting Engagement and Exam Performance

    By [source]

    About this dataset

    This dataset contains information on student engagement with Tableau, including quizzes, exams, and lessons. The data includes the course title, the rating of the course, the date the course was rated, the exam category, the exam duration, whether the answer was correct or not, the number of quizzes completed, the number of exams completed, the number of lessons completed, the date engaged, the exam result, and more

    How to use the dataset

    The 'Student Engagement with Tableau' dataset offers insights into student engagement with the Tableau software. The data includes information on courses, exams, quizzes, and student learning.

    This dataset can be used to examine how students use Tableau, what kind of engagement leads to better learning outcomes, and whether certain course or exam characteristics are associated with student engagement.

    Research Ideas

    • Creating a heat map of student engagement by course and location
    • Determining which courses are most popular among students from different countries
    • Identifying patterns in students' exam results

    Acknowledgements

    Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: 365_course_info.csv
    • course_title - The title of the course. (String)

    File: 365_course_ratings.csv
    • course_rating - The rating given to the course by the student. (Numeric)
    • date_rated - The date on which the course was rated. (Date)

    File: 365_exam_info.csv
    • exam_category - The category of the exam. (Categorical)
    • exam_duration - The duration of the exam in minutes. (Numerical)

    File: 365_quiz_info.csv
    • answer_correct - Whether or not the student answered the question correctly. (Boolean)

    File: 365_student_engagement.csv
    • engagement_quizzes - The number of times a student has engaged with quizzes. (Numeric)
    • engagement_exams - The number of times a student has engaged with exams. (Numeric)
    • engagement_lessons - The number of times a student has engaged with lessons. (Numeric)
    • date_engaged - The date of the student's engagement. (Date)

    File: 365_student_exams.csv
    • exam_result - The result of the exam. (Categorical)
    • exam_completion_time - The time it took to complete the exam. (Numerical)
    • date_exam_completed - The date the exam was completed. (Date)

    File: 365_student_hub_questions.csv
    • date_question_asked - The date the question was asked. (Date)

    File: 365_student_info.csv
    • student_country - The country of the student. (Categorical)
    • date_registered - The date the student registered for the course. (Date)

    File: 365_student_learning.csv | Column name | Description | |:--------------------|:------------------------------...

  13. BrowseComp: A Benchmark for Browsing Agents

    • kaggle.com
    zip
    Updated Jun 4, 2025
    Cite
    Open Benchmarks (2025). BrowseComp: A Benchmark for Browsing Agents [Dataset]. https://www.kaggle.com/datasets/open-benchmarks/browsecomp-a-benchmark-for-browsing-agents
    Dataset authored and provided by
    Open Benchmarks
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Relevant links:
    • leaderboard: Coming soon
    • implementation:
    • publication: https://arxiv.org/abs/2504.12516
    • original repository: https://github.com/openai/simple-evals/tree/main

    Abstract

    We present BrowseComp, a simple yet challenging benchmark for measuring the ability for agents to browse the web. BrowseComp comprises 1,266 questions that require persistently navigating the internet in search of hard-to-find, entangled information. Despite the difficulty of the questions, BrowseComp is simple and easy-to-use, as predicted answers are short and easily verifiable against reference answers. BrowseComp for browsing agents can be seen as analogous to how programming competitions are an incomplete but useful benchmark for coding agents. While BrowseComp sidesteps challenges of a true user query distribution, like generating long answers or resolving ambiguity, it measures the important core capability of exercising persistence and creativity in finding information.

  14. ChitroJera

    • kaggle.com
    Updated Nov 12, 2024
    Md Sakib Ul Rahman Sourove (2024). ChitroJera [Dataset]. https://www.kaggle.com/datasets/sourove/chitrojera
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/

    Description

    Visual Question Answering (VQA) poses the problem of answering a natural-language question about a visual context. Bangla, despite being a widely spoken language, is considered low-resource in the realm of VQA due to the lack of a proper benchmark dataset. The absence of such datasets challenges models that are known to be performant in other languages. Furthermore, existing Bangla VQA datasets offer little cultural relevance and are largely adapted from their foreign counterparts. To address these challenges, we introduce a large-scale Bangla VQA dataset titled ChitroJera, totaling over 15k samples, where diverse and locally relevant data sources are used. We assess the performance of text encoders, image encoders, multimodal models, and our novel dual-encoder models. The experiments reveal that the pretrained dual-encoders outperform other models of their scale. We also evaluate the performance of large language models (LLMs) using prompt-based techniques, with LLMs achieving the best performance. Given the underdeveloped state of existing datasets, we envision ChitroJera expanding the scope of vision-language tasks in Bangla.

  15. UAV-benchmark

    • kaggle.com
    Updated Mar 13, 2023
    Dmitry Sokolov (2023). UAV-benchmark [Dataset]. https://www.kaggle.com/datasets/dimka11/uavbenchmark
    Description

    Dataset

    This dataset was created by Dmitry Sokolov


  16. M4 Forecasting Competition Dataset

    • kaggle.com
    zip
    Updated Mar 9, 2020
    Sri Yogesh (2020). M4 Forecasting Competition Dataset [Dataset]. https://www.kaggle.com/yogesh94/m4-forecasting-competition-dataset
    Download: zip (83502902 bytes)
    License: CC0 1.0 Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The M4 Forecasting Competition Dataset

    The M4 competition, conducted in 2018, is a continuation of the Makridakis forecasting competitions. It required submitting both point forecasts and prediction intervals.

    More Details

    A paper describing the competition, the various benchmarks, and the competing approaches was published in a special edition of the International Journal of Forecasting; it is open access and can be found here.

    Code for benchmarks

    The code for the various benchmarks on this dataset can be found in the following GitHub repository.

    Source

    The data is available at both the GitHub repository and the official website of the MOFC.
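    M4's headline point-forecast accuracy metric was the symmetric MAPE (sMAPE), used alongside MASE; this is background from the competition rather than something stated in the listing above. A minimal sketch of sMAPE:

    ```python
    def smape(actual, forecast):
        """Symmetric MAPE as used in the M4 competition: the mean of
        200 * |y - yhat| / (|y| + |yhat|) over the forecast horizon."""
        assert len(actual) == len(forecast)
        total = 0.0
        for y, f in zip(actual, forecast):
            denom = abs(y) + abs(f)
            # Convention: a term with zero actual and zero forecast contributes 0.
            total += 0.0 if denom == 0 else 200.0 * abs(y - f) / denom
        return total / len(actual)
    ```

    A perfect forecast scores 0; the metric is bounded above by 200.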

  17. benchmark

    • kaggle.com
    Updated May 4, 2023
    KoHanE (2023). benchmark [Dataset]. https://www.kaggle.com/datasets/kohane/benchmark/code
    Description

    Dataset

    This dataset was created by KoHanE


  18. AbRank: Antibody Affinity Ranking

    • kaggle.com
    Updated May 16, 2025
    Aurélien Pélissier (2025). AbRank: Antibody Affinity Ranking [Dataset]. https://www.kaggle.com/datasets/aurlienplissier/abrank
    License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/

    Description

    AbRank is a large-scale benchmark and evaluation framework that reframes affinity prediction as a pairwise ranking problem. It aggregates over 380,000 binding assays from nine heterogeneous sources, spanning diverse antibodies, antigens, and experimental conditions, and introduces standardized data splits that systematically increase distribution shift, from local perturbations such as point mutations to broad generalization across novel antigens and antibodies. To ensure robust supervision, AbRank defines a 10-confident ranking framework by filtering out comparisons with marginal affinity differences, focusing training on pairs with at least a 10-fold difference in measured binding strength.
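    The 10-fold filtering step described above can be sketched as a pair-construction pass over assays against the same antigen. The tuple layout and the `build_ranked_pairs` helper are illustrative assumptions, not AbRank's actual schema:

    ```python
    def build_ranked_pairs(assays, min_fold=10.0):
        """From (antibody_id, binding_strength) measurements against one antigen,
        keep only ordered pairs whose binding strengths differ by at least
        `min_fold`; the stronger binder is listed first in each pair."""
        pairs = []
        for i, (ab_a, aff_a) in enumerate(assays):
            for ab_b, aff_b in assays[i + 1:]:
                if aff_a == 0 or aff_b == 0:
                    continue  # skip unusable measurements
                ratio = aff_a / aff_b
                if ratio >= min_fold:
                    pairs.append((ab_a, ab_b))  # a binds >= 10x stronger than b
                elif 1.0 / ratio >= min_fold:
                    pairs.append((ab_b, ab_a))  # b binds >= 10x stronger than a
        return pairs
    ```

    Pairs with less than a 10-fold gap are simply dropped, so the ranking model never trains on marginal comparisons.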

  19. Audio Benchmark

    • kaggle.com
    Updated Sep 28, 2023
    Lisa Sharapova (2023). Audio Benchmark [Dataset]. https://www.kaggle.com/datasets/lallucycle/audio-benchmark
    Description

    Dataset

    This dataset was created by Lisa Sharapova


  20. ZeroShot LLM4TS Benchmark

    • kaggle.com
    Updated May 26, 2024
    Vittorio Rossi (2024). ZeroShot LLM4TS Benchmark [Dataset]. https://www.kaggle.com/datasets/vittoriorossi/zeroshot-llm4ts-benchmark
    License: MIT License, https://opensource.org/licenses/MIT

    Description

    The "ZeroShot LLM4TS Benchmark" dataset is designed to evaluate the performance of large language models (LLMs) in zero-shot time series forecasting tasks. The dataset contains various time series data from different domains, providing a comprehensive benchmark for testing LLM capabilities in forecasting without prior training on the specific data.
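    Zero-shot forecasting with an LLM is commonly done by serializing the numeric history into a text prompt and parsing numbers back out of the completion. A minimal sketch of that round trip (the fixed-precision, comma-separated format is one common convention, not something this dataset prescribes):

    ```python
    def serialize_series(values, decimals=2):
        """Render a numeric history as comma-separated fixed-precision text,
        ready to be placed in an LLM prompt for continuation."""
        return ", ".join(f"{v:.{decimals}f}" for v in values)

    def parse_forecast(completion, horizon):
        """Extract the first `horizon` numbers from the model's completion,
        skipping any tokens that are not parseable as floats."""
        out = []
        for token in completion.replace(",", " ").split():
            try:
                out.append(float(token))
            except ValueError:
                continue
            if len(out) == horizon:
                break
        return out
    ```

    The parsed forecast can then be scored against the held-out tail of each series with any standard error metric.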
