2 datasets found
  1. Data from: Data Science Problems Dataset

    • paperswithcode.com
    Updated Nov 17, 2022
    Cite
    Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan (2022). Data Science Problems Dataset [Dataset]. https://paperswithcode.com/dataset/data-science-problems
    Authors
    Shubham Chandel; Colin B. Clement; Guillermo Serrato; Neel Sundaresan
    Description

    Evaluate a natural language code generation model on real data science pedagogical notebooks! Data Science Problems (DSP) includes well-posed data science problems in Markdown, along with unit tests to verify correctness and a Docker environment for reproducible execution. About 1/3 of the notebooks in this benchmark also include data dependencies, so the benchmark can not only test a model's ability to chain together complex tasks but also evaluate its solutions on real data! See our paper, Training and Evaluating a Jupyter Notebook Data Science Assistant, for more details about state-of-the-art results and other properties of the dataset.

  2. Data from: Data Science Problems

    • gitee.com
    • opendatalab.com
    • +2 more
    Updated Nov 2, 2021
    Cite
    (2021). Data Science Problems [Dataset]. https://gitee.com/mirrors_microsoft/DataScienceProblems
    License

    https://github.com/microsoft/DataScienceProblems/blob/main/LICENSE.txt

    Description

    Evaluate a natural language code generation model on real data science pedagogical notebooks! Data Science Problems (DSP) includes well-posed data science problems in Markdown, along with unit tests to verify correctness and a Docker environment for reproducible execution. About 1/3 of the notebooks in this benchmark also include data dependencies, so the benchmark can not only test a model's ability to chain together complex tasks but also evaluate its solutions on real data! See our paper, Training and Evaluating a Jupyter Notebook Data Science Assistant (https://arxiv.org/abs/2201.12901), for more details about state-of-the-art results and other properties of the dataset.

