77 datasets found
  1. openai_humaneval

    • huggingface.co
    Updated Jan 1, 2022
    Cite
    OpenAI (2022). openai_humaneval [Dataset]. https://huggingface.co/datasets/openai/openai_humaneval
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 1, 2022
    Dataset authored and provided by
    OpenAI (http://openai.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for OpenAI HumanEval

      Dataset Summary
    

    The HumanEval dataset released by OpenAI includes 164 programming problems with a function signature, docstring, body, and several unit tests. They were handwritten to ensure they are not included in the training set of code generation models.

      Supported Tasks and Leaderboards

      Languages

    The programming problems are written in Python and contain English natural text in comments and docstrings… See the full description on the dataset page: https://huggingface.co/datasets/openai/openai_humaneval.
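
    As a quick orientation, a minimal sketch of loading this dataset with the Hugging Face datasets library; the field names follow the dataset card, so verify them against the dataset viewer:

    from datasets import load_dataset

    # HumanEval ships a single "test" split with 164 problems
    ds = load_dataset("openai/openai_humaneval", split="test")
    problem = ds[0]
    print(problem["task_id"])      # e.g. "HumanEval/0"
    print(problem["prompt"])       # function signature + docstring to complete
    print(problem["entry_point"])  # name of the function exercised by the tests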

  2. HumanEval-X

    • opendatalab.com
    zip
    Updated Jan 1, 2023
    + more versions
    Cite
    Tsinghua University (2023). HumanEval-X [Dataset]. https://opendatalab.com/OpenDataLab/humaneval-x
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 1, 2023
    Dataset provided by
    Zhipu AI (http://zhipuai.cn/)
    Tsinghua University
    Huawei
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    HumanEval-X is a benchmark for evaluating the multilingual ability of code generative models. It consists of 820 high-quality human-crafted data samples (each with test cases) in Python, C++, Java, JavaScript, and Go, and can be used for various tasks, such as code generation and translation.
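
    For concreteness, a minimal sketch of loading one language split, assuming the Hugging Face mirror THUDM/humaneval-x and its per-language configurations (the repository and configuration names are an assumption; check the dataset page):

    from datasets import load_dataset

    # each language ("python", "cpp", "java", "js", "go") is a separate configuration
    ds = load_dataset("THUDM/humaneval-x", "python", split="test")
    print(len(ds))          # 164 problems per language, 820 across all five
    print(ds[0]["prompt"])  # declaration plus docstring in the target language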

  3. OpenAI HumanEval (Coding Challenges & Unit-tests)

    • kaggle.com
    Updated Nov 21, 2022
    Cite
    The Devastator (2022). OpenAI HumanEval (Coding Challenges & Unit-tests) [Dataset]. https://www.kaggle.com/datasets/thedevastator/handcrafted-dataset-for-code-generation-models
    Explore at:
    Croissant
    Dataset updated
    Nov 21, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    OpenAI HumanEval (Coding Challenges & Unit-tests)

    164 programming problems with a function signature, docstring, body, unittests

    Source

    Huggingface Hub: link

    About this dataset

    The OpenAI HumanEval dataset is a handcrafted set of 164 programming problems designed to challenge code generation models. The problems include a function signature, docstring, body, and several unit tests, all handwritten to ensure they're not included in the training set of code generation models. The entry point for each problem is the prompt, making it an ideal dataset for testing the ability of natural language processing and machine learning models to generate Python programs from scratch.

    How to use the dataset

    To use this dataset, simply download the zip file and extract it. The resulting directory will contain the following files:

    • canonical_solution.py: The solution to the problem. (String)
    • entry_point.py: The entry point for the problem. (String)
    • prompt.txt: The prompt for the problem. (String)
    • test.py: The unit tests for the problem.

    Research Ideas

    • The dataset could be used to develop a model that generates programs from natural language.
    • The dataset could be used to develop a model that completes or debugs programs.
    • The dataset could be used to develop a model that writes unit tests for programs

    Acknowledgements

    License

    License: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: test.csv

    | Column name        | Description                                                                                       |
    |:-------------------|:--------------------------------------------------------------------------------------------------|
    | prompt             | A natural language description of the programming problem. (String)                               |
    | canonical_solution | The correct Python code solution to the problem. (String)                                         |
    | test               | A set of unit tests that the generated code must pass in order to be considered correct. (String) |
    | entry_point        | The starting point for the generated code. (String)                                               |
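
    To make the column layout concrete, a small sketch that checks a canonical solution against its unit tests, assuming the test.csv layout above and HumanEval's convention that the test column defines check(candidate):

    import pandas as pd

    df = pd.read_csv("test.csv")
    row = df.iloc[0]

    # prompt + canonical_solution forms a complete function definition;
    # exec'ing the test column defines check(candidate), which raises on failure.
    # Only run trusted code this way; sandbox model-generated candidates.
    namespace = {}
    exec(row["prompt"] + row["canonical_solution"], namespace)
    exec(row["test"], namespace)
    namespace["check"](namespace[row["entry_point"]])
    print("canonical solution passes its unit tests")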

  4. openai-humaneval

    • opendatalab.com
    zip
    Updated Dec 16, 2023
    Cite
    Anthropic AI (2023). openai-humaneval [Dataset]. https://opendatalab.com/OpenDataLab/openai-humaneval
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Anthropic (https://anthropic.com/)
    OpenAI (http://openai.com/)
    Zipline
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The HumanEval dataset released by OpenAI includes 164 programming problems with a function signature, docstring, body, and several unit tests. They were handwritten to ensure they are not included in the training set of code generation models.

  5. humanevalpack

    • huggingface.co
    Updated Apr 15, 2024
    Cite
    BigCode (2024). humanevalpack [Dataset]. https://huggingface.co/datasets/bigcode/humanevalpack
    Explore at:
    Dataset updated
    Apr 15, 2024
    Dataset authored and provided by
    BigCode
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for HumanEvalPack

      Dataset Summary
    

    HumanEvalPack is an extension of OpenAI's HumanEval to cover 6 total languages across 3 tasks. The Python split is exactly the same as OpenAI's Python HumanEval. The other splits are translated by humans (similar to HumanEval-X but with additional cleaning, see here). Refer to the OctoPack paper for more details.

    Languages: Python, JavaScript, Java, Go, C++, Rust

    OctoPack🐙🎒: Data: CommitPack (4TB of GitHub commits)… See the full description on the dataset page: https://huggingface.co/datasets/bigcode/humanevalpack.
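
    A minimal sketch of pulling one language split with the datasets library; the configuration and field names follow the dataset card and should be double-checked there:

    from datasets import load_dataset

    # one configuration per language; the "python" split matches OpenAI's HumanEval exactly
    ds = load_dataset("bigcode/humanevalpack", "rust", split="test")
    ex = ds[0]
    print(ex["declaration"])  # function declaration in the target language
    print(ex["instruction"])  # natural-language instruction variant of the problem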

  6. HumanEval

    • huggingface.co
    Updated Aug 27, 2025
    Cite
    Princeton-AI (2025). HumanEval [Dataset]. https://huggingface.co/datasets/Gen-Verse/HumanEval
    Explore at:
    Dataset updated
    Aug 27, 2025
    Dataset authored and provided by
    Princeton-AI
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Gen-Verse/HumanEval dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. instructhumaneval

    • huggingface.co
    • opendatalab.com
    Updated Jun 29, 2023
    Cite
    CodeParrot (2023). instructhumaneval [Dataset]. https://huggingface.co/datasets/codeparrot/instructhumaneval
    Explore at:
    Croissant
    Dataset updated
    Jun 29, 2023
    Dataset authored and provided by
    CodeParrot
    Description

    Instruct HumanEval

      Summary
    

    InstructHumanEval is a modified version of OpenAI HumanEval. For each prompt, we extracted its signature, its docstring, and its header to create a flexible setting that allows evaluating instruction-tuned LLMs. The delimiters used in the instruction-tuning procedure can be used to build an instruction that lets the model elicit its best capabilities. The prompt can be built as follows… See the full description on the dataset page: https://huggingface.co/datasets/codeparrot/instructhumaneval.
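
    As an illustration of that construction, a hedged sketch of building an instruction prompt; the column names follow the dataset card, and the chat delimiters are model-specific placeholders, not part of the dataset:

    from datasets import load_dataset

    ds = load_dataset("codeparrot/instructhumaneval", split="test")
    ex = ds[0]

    # wrap the natural-language instruction in the model's own chat delimiters,
    # then append the code context (imports + signature) for the model to complete
    user_token, assistant_token = "<|user|>", "<|assistant|>"  # placeholders: depend on the instruction-tuned model
    prompt = f"{user_token}\n{ex['instruction']}\n{assistant_token}\n{ex['context']}\n"
    print(prompt)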

  8. Reorganized-humaneval

    • huggingface.co
    Updated Mar 11, 2025
    + more versions
    Cite
    Yixin He (2025). Reorganized-humaneval [Dataset]. https://huggingface.co/datasets/HeyixInn0/Reorganized-humaneval
    Explore at:
    Dataset updated
    Mar 11, 2025
    Authors
    Yixin He
    Description

    HeyixInn0/Reorganized-humaneval dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. humaneval-pro

    • huggingface.co
    Updated Dec 31, 2024
    + more versions
    Cite
    CodeEval-Pro (2024). humaneval-pro [Dataset]. https://huggingface.co/datasets/CodeEval-Pro/humaneval-pro
    Explore at:
    Croissant
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    CodeEval, Inc.
    Authors
    CodeEval-Pro
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Evaluation dataset for HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task (arxiv.org/abs/2412.21199).

  10. openai-human-eval

    • kaggle.com
    Updated Apr 10, 2024
    Cite
    Inoichan (2024). openai-human-eval [Dataset]. https://www.kaggle.com/datasets/inoueu1/openai-human-eval/discussion
    Explore at:
    Croissant
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Inoichan
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Inoichan

    Released under MIT

    Contents

  11. humaneval-fix-starcoder

    • huggingface.co
    Updated Feb 13, 2024
    + more versions
    Cite
    Eitan Turok (2024). humaneval-fix-starcoder [Dataset]. https://huggingface.co/datasets/eitanturok/humaneval-fix-starcoder
    Explore at:
    Croissant
    Dataset updated
    Feb 13, 2024
    Authors
    Eitan Turok
    Description

    eitanturok/humaneval-fix-starcoder dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. bc-humaneval

    • opendatalab.com
    • huggingface.co
    zip
    Updated Jan 9, 2024
    Cite
    Google Research (2024). bc-humaneval [Dataset]. https://opendatalab.com/OpenDataLab/bc-humaneval
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    Google (http://google.com/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The BabelCode-HumanEval (BC-HumanEval) dataset converts the HumanEval dataset released by OpenAI into 16 programming languages.

  13. humaneval

    • huggingface.co
    + more versions
    Cite
    Xiaosen Zheng, humaneval [Dataset]. https://huggingface.co/datasets/xszheng2020/humaneval
    Explore at:
    Authors
    Xiaosen Zheng
    Description

    xszheng2020/humaneval dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. HumanEval-V-Benchmark

    • huggingface.co
    Updated May 2, 2025
    Cite
    HumanEval-V (2025). HumanEval-V-Benchmark [Dataset]. https://huggingface.co/datasets/HumanEval-V/HumanEval-V-Benchmark
    Explore at:
    Croissant
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    HumanEval-V
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    HumanEval-V: Benchmarking High-Level Visual Reasoning with Complex Diagrams in Coding Tasks

    📄 Paper •
    🏠 Home Page •
    💻 GitHub Repository •
    🏆 Leaderboard •
    🤗 Dataset Viewer

    HumanEval-V is a novel benchmark designed to evaluate the diagram understanding and reasoning capabilities of Large Multimodal Models (LMMs) in programming contexts. Unlike existing benchmarks, HumanEval-V focuses on coding tasks that require sophisticated visual reasoning over… See the full description on the dataset page: https://huggingface.co/datasets/HumanEval-V/HumanEval-V-Benchmark.

  15. Replication package for "The Art of Repair: Optimizing Iterative Program...

    • zenodo.org
    xz, zip
    Updated May 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Fernando Vallecillos Ruiz; Max Hort; Leon Moonen (2025). Replication package for "The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models" [Dataset]. http://doi.org/10.5281/zenodo.15294696
    Explore at:
    Available download formats: xz, zip
    Dataset updated
    May 6, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Fernando Vallecillos Ruiz; Max Hort; Leon Moonen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains the replication package for the paper "The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models" by Fernando Vallecillos Ruiz, Max Hort, and Leon Moonen, accepted for the research track of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE 2025). A preprint of the paper is included.

    The source code is distributed under the MIT license, and except for 3rd party datasets that come with their own license, all documentation, data, models and results in this repository are distributed under the CC BY 4.0 license.

    Repository Overview

    This repository contains the necessary scripts, data, and resources to replicate the experiments presented in our conference paper. The structure of this repository has been organized to facilitate ease of use for researchers interested in reproducing our results, conducting similar analyses, or building upon our work.

    Repository Structure

    | Folder                        | Description |
    |:------------------------------|:------------|
    | analysis                      | Jupyter notebook scripts used to generate tables and visual analyses. These scripts assist in visualizing results, comparing metrics, and summarizing data from the experiments. The outputs can be easily exported for further use. |
    | apr_training                  | The dataset used for the Automated Program Repair (APR) training phase. This data is utilized by the scripts in train_src/ for fine-tuning the models. |
    | benchmarks                    | JSON files representing different benchmarks, specifically HumanEval-Java and Defects4J. In this work, we have primarily focused on and revised HumanEval-Java. |
    | inference_and_validation_src  | Python scripts used to generate patches and validate them across different benchmarks. These scripts play a critical role in producing and assessing model outputs. |
    | inference_scripts             | Bash scripts used to automate the process of submitting inference and validation jobs to the compute cluster. This facilitates multiple iterations of inference and validation in a streamlined manner. |
    | models*                       | The fine-tuned machine learning models used in the experiments. These models are the output of the fine-tuning process and are referenced by the inference scripts. |
    | results                       | All the outputs from the models in JSON format, generated during the inference process. These files represent the raw experimental results. |
    | train_src                     | Python scripts for model fine-tuning. These scripts include methods for performing both full model training and LoRA fine-tuning for parameter-efficient updates. |
    | validation_benchmark_dataset  | The benchmark datasets used during validation. |

    * Note that all contents except for the model files from the models/ folder are included in the compressed zip file in this Zenodo repository. The model files are uploaded separately to the repository to facilitate individual downloads, as several of them are relatively large (9.5-11.2GB).

    Detailed Folder Descriptions

    Analysis (analysis/)

    This folder contains Jupyter notebook scripts used to generate tables and visual analyses of the experimental data. These scripts are designed to assist in visualizing results, comparing performance metrics, and summarizing experimental outcomes. Researchers can easily export the generated tables to spreadsheets for further processing or visualization. The outputs help in validating the experiment's consistency and provide insights into the performance of various model configurations.

    Inference and Validation Source (inference_and_validation_src/)

    The Python scripts in this folder are used for generating patches and validating them against predefined benchmarks. We utilize the "Fire" library to parse parameters and execute the relevant methods efficiently. This folder contains:

    • Scripts for generating patches directly from the benchmark data or using iterative approaches.
    • Validation utilities for Defects4J and HumanEval benchmarks to ensure the generated patches are functional and comply with benchmark requirements.

    Key components include:

    • Patch generation logic.
    • Validation commands for HumanEval and Defects4J benchmarks.
    • Utilities to verify data integrity of generated JSON files.

    Training Source (train_src/)

    This folder contains the scripts used for model fine-tuning:

    • full_finetune.py: This script performs full fine-tuning of a model on a given training dataset. It updates all trainable parameters to achieve optimal model performance on the target task.

    • lora_finetune.py: This script implements LoRA (Low-Rank Adaptation) fine-tuning. LoRA is a parameter-efficient fine-tuning approach where only a smaller subset of model parameters are updated, making it effective for resource-constrained tasks.
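
    For orientation, a minimal sketch of the LoRA idea behind lora_finetune.py, using the Hugging Face peft library; the base model and hyperparameters here are illustrative assumptions, not the paper's configuration:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")  # placeholder base model
    lora_cfg = LoraConfig(
        r=16,                                 # rank of the low-rank update matrices
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # which projections get adapters is an assumption
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_cfg)
    model.print_trainable_parameters()  # only the adapter weights are trainable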

    Inference Scripts (inference_scripts/)

    These Bash scripts are designed to automate the inference process by submitting multiple iterations of inference and validation jobs to the compute cluster. The scripts create job dependencies, ensuring that all necessary tasks are completed in a logical sequence.

    The available inference scripts include:

    • model_inferencing_adjustable_FULL_d4j_big.sh: Executes inference for specified model configurations with multiple iterations and outputs per iteration.
    • model_inferencing_adjustable_FULL_d4j_lora_big.sh: Similar to the previous script, but optimized for LoRA-based models.

    These scripts accept three parameters:

    • MODEL: The name of the model, as found in the models/ folder.
    • NUM_ITERATIONS: The number of iterations to run.
    • NUM_OUTPUTS: The number of outputs generated in each iteration.

    Citation and Zenodo links

    We hope this package serves as a useful resource for reproducing and expanding upon our research results. Please cite this work by referring to the published paper:

    Fernando Vallecillos Ruiz, Max Hort, and Leon Moonen, 2025. The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models. In proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE 2025), ACM, 12 pages.

    @inproceedings{ruiz2025:art,
      title = {{The Art of Repair: Optimizing Iterative Program Repair with 
           Instruction-Tuned Models}},
      author = {Ruiz, Fernando Vallecillos and Hort, Max and Moonen, Leon},
      booktitle = {{Proceedings of the 29th International Conference on Evaluation 
             and Assessment in Software Engineering (EASE)}},
      year = {2025},
      pages = {12},
      publisher = {{ACM}},
      language = {en}
    }

    The replication package is archived on Zenodo with DOI: 10.5281/zenodo.15294695.

     
  16. humaneval-for-solidity-25

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    BrainDAO (2025). humaneval-for-solidity-25 [Dataset]. https://huggingface.co/datasets/braindao/humaneval-for-solidity-25
    Explore at:
    Croissant
    Dataset updated
    Apr 3, 2025
    Dataset authored and provided by
    BrainDAO
    Description

    braindao/humaneval-for-solidity-25 dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. codellama-generations

    • aifasthub.com
    • huggingface.co
    Updated Sep 12, 2025
    Cite
    BigCode (2025). codellama-generations [Dataset]. https://www.aifasthub.com/datasets/bigcode/codellama-generations
    Explore at:
    Croissant
    Dataset updated
    Sep 12, 2025
    Dataset authored and provided by
    BigCode
    Description

    Here you can find the solutions generated by the Code Llama models for the HumanEval and MultiPL-E benchmarks used in the Big Code Models Leaderboard: https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard.

  18. AdaDecode-CodeLlama-13B-Instruct-HumanEval

    • huggingface.co
    Updated Jan 28, 2025
    + more versions
    Cite
    Yu Meng's Lab (2025). AdaDecode-CodeLlama-13B-Instruct-HumanEval [Dataset]. https://huggingface.co/datasets/meng-lab/AdaDecode-CodeLlama-13B-Instruct-HumanEval
    Explore at:
    Dataset updated
    Jan 28, 2025
    Dataset authored and provided by
    Yu Meng's Lab
    Description

    meng-lab/AdaDecode-CodeLlama-13B-Instruct-HumanEval dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. humaneval-mbpp-codegen-qa

    • huggingface.co
    Updated Apr 2, 2023
    + more versions
    Cite
    Oliver Stanley (2023). humaneval-mbpp-codegen-qa [Dataset]. https://huggingface.co/datasets/OllieStanley/humaneval-mbpp-codegen-qa
    Explore at:
    Croissant
    Dataset updated
    Apr 2, 2023
    Authors
    Oliver Stanley
    Description

    Dataset Card for "humaneval-mbpp-codegen-qa"

    This dataset contains prompt-reply (question-answer) pairs where the prompt is to create a Python function which satisfies the functionality described in a specified docstring. The responses are then the generated functions.

  20. humaneval

    • huggingface.co
    Updated Aug 11, 2025
    Cite
    Leon (2025). humaneval [Dataset]. https://huggingface.co/datasets/Leon-Leee/humaneval
    Explore at:
    Dataset updated
    Aug 11, 2025
    Authors
    Leon
    Description

    Leon-Leee/humaneval dataset hosted on Hugging Face and contributed by the HF Datasets community
