CodeXGLUE is a benchmark dataset and open challenge for code intelligence. It includes a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark for CODE. It includes 14 datasets for 10 diversified code intelligence tasks covering the following scenarios:
code-code: clone detection, defect detection, cloze test, code completion, code repair, and code-to-code translation
text-code: natural language code search, text-to-code generation
code-text: code summarization
text-text: documentation translation
A brief summary of CodeXGLUE is provided in the figure, including tasks, datasets, languages, split sizes, baseline systems, providers, and a short definition of each task. Datasets highlighted in BLUE are newly introduced.
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_ct_code_to_text"
Dataset Summary
CodeXGLUE code-to-text dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Text/code-to-text. The dataset we use comes from CodeSearchNet, and we filter it as follows:
Remove examples whose code cannot be parsed into an abstract syntax tree.
Remove examples whose documentation contains fewer than 3 or more than 256 tokens.
Remove examples whose documentation contains special tokens (e.g. … See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_ct_code_to_text.
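The three filters above can be sketched in plain Python. This is only a rough sketch: the official preprocessing scripts use language-specific parsers and tokenizers rather than `ast` and whitespace splitting, and the special-token list shown here is illustrative, not the official one.

```python
import ast

def keep_example(code: str, doc: str) -> bool:
    """Return True if the (code, doc) pair survives all three filters."""
    # Filter 1: drop examples whose code cannot be parsed into an AST.
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    # Filter 2: drop examples whose documentation has < 3 or > 256 tokens.
    tokens = doc.split()
    if len(tokens) < 3 or len(tokens) > 256:
        return False
    # Filter 3: drop examples whose documentation contains special tokens
    # (the substrings below are illustrative placeholders).
    if any(special in doc for special in ("<img", "http://", "https://")):
        return False
    return True
```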
Concode dataset
Concode is a widely used code generation dataset from Iyer et al.'s EMNLP 2018 paper "Mapping Language to Code in Programmatic Context". It is a large dataset of over 100,000 examples consisting of Java classes from online code repositories; the accompanying paper develops a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. Data statistics of the Concode dataset are shown in the table below:
Train 100… See the full description on the dataset page: https://huggingface.co/datasets/AhmedSSoliman/CodeXGLUE-CONCODE.
https://github.com/microsoft/CodeXGLUE#license
CodeXGLUE code-to-code-trans dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/code-to-code-trans. The dataset is collected from several public repos, including Lucene (http://lucene.apache.org/), POI (http://poi.apache.org/), JGit (https://github.com/eclipse/jgit/) and Antlr (https://github.com/antlr/). We collect both the Java and C# versions of the code and find the parallel functions. After removing duplicates and functions with empty bodies, we split the whole dataset into training, validation and test sets.
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_code_completion_token"
Dataset Summary
CodeXGLUE CodeCompletion-token dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/CodeCompletion-token. The task is to predict the next code token given the context of previous tokens. Models are evaluated by token-level accuracy. Code completion is one of the most widely used features in software development through IDEs. An effective code completion tool could improve software… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_code_completion_token.
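The token-level accuracy metric named above can be sketched as follows. This is a minimal sketch assuming pre-tokenized predictions and references of matching length; the function name is illustrative, not from the official evaluation script.

```python
def token_accuracy(predictions, references):
    """Fraction of positions where the predicted token equals the
    ground-truth token, across all sequences."""
    correct = total = 0
    for pred_seq, ref_seq in zip(predictions, references):
        for pred_tok, ref_tok in zip(pred_seq, ref_seq):
            correct += int(pred_tok == ref_tok)
            total += 1
    return correct / total if total else 0.0
```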
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_defect_detection"
Dataset Summary
CodeXGLUE Defect-detection dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/Defect-detection. Given a piece of source code, the task is to identify whether it is insecure code that may attack software systems, e.g. through resource leaks, use-after-free vulnerabilities, or DoS attacks. We treat the task as binary classification (0/1), where 1 stands for insecure code and 0 for secure… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_defect_detection.
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_code_refinement"
Dataset Summary
CodeXGLUE code-refinement dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/code-refinement. We use the dataset released by this paper (https://arxiv.org/pdf/1812.08693.pdf). The source side is a Java function with bugs and the target side is the refined one. All function and variable names are normalized. The dataset contains two subsets (i.e. small and medium) based on… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_code_refinement.
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_clone_detection_big_clone_bench"
Dataset Summary
CodeXGLUE Clone-detection-BigCloneBench dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/Clone-detection-BigCloneBench. Given two codes as input, the task is binary classification (0/1), where 1 stands for semantic equivalence and 0 for others. Models are evaluated by F1 score. The dataset we use is BigCloneBench, filtered following the paper… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_clone_detection_big_clone_bench.
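The F1 metric used for this clone/non-clone classification can be sketched in plain Python (a minimal sketch of the standard binary F1 over 0/1 labels; in practice one would typically call a library routine such as sklearn.metrics.f1_score):

```python
def f1_score(preds, labels):
    """Binary F1 for 0/1 predictions, treating label 1 (clone) as positive."""
    tp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(preds, labels) if p == 0 and l == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```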
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_tc_text_to_code"
Dataset Summary
CodeXGLUE text-to-code dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Text-Code/text-to-code. The dataset we use is crawled and filtered from Microsoft Documentation, whose documents are located at https://github.com/MicrosoftDocs/.
Supported Tasks and Leaderboards
machine-translation: The dataset can be used to train a model for generating Java code from an English natural… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_tc_text_to_code.
https://github.com/microsoft/CodeXGLUE#license
CodeXGLUE text-to-text dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Text-Text/text-to-text. The dataset we use is crawled and filtered from Microsoft Documentation, whose documents are located at https://github.com/MicrosoftDocs/.
ShijiaD/CodeXGLUE-Code-Description dataset hosted on Hugging Face and contributed by the HF Datasets community
https://github.com/microsoft/CodeXGLUE#license
CodeXGLUE NL-code-search-Adv dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Text-Code/NL-code-search-Adv. The dataset we use comes from CodeSearchNet, and we filter it as follows:
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_code_completion_line"
Dataset Summary
CodeXGLUE CodeCompletion-line dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/CodeCompletion-line. The task is to complete the unfinished line given the previous context. Models are evaluated by exact match and edit similarity. We propose the line completion task to test a model's ability to autocomplete a line. Most code completion systems behave well at token-level completion, but fail in… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_code_completion_line.
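The two metrics named above can be sketched as follows. This is an approximate sketch: the official CodeXGLUE evaluation computes edit similarity with a Levenshtein-ratio implementation, while the stdlib `difflib.SequenceMatcher` used here is a similar but not identical stand-in; function names are illustrative.

```python
import difflib

def exact_match(pred: str, ref: str) -> bool:
    """Whitespace-insensitive exact match of the completed line."""
    return pred.strip() == ref.strip()

def edit_similarity(pred: str, ref: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means identical strings."""
    return difflib.SequenceMatcher(None, pred, ref).ratio()
```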
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this online repository, we release the source code of each of the selected techniques as well as the experiment results from each technique (stored in the Results.zip file). For each technique, we also provide our scripts to fine-tune the approach on the CodeSearchNet-Python dataset. For example, finetune.sh/inference.sh are used to fine-tune/evaluate CodeBERT and are located under "CodeBERT/CodeBERT".
Our evaluation dataset CodeSearchNet is a well-known benchmark and it can be downloaded on its official webpage.
The code to calculate the evaluation metrics is reused from CodeBLEU.
Below is a piece of code generated by CodeT5. In this case, CodeT5 generates a statement recurrently, which leads to a syntactic error. Despite that, the code itself fulfills certain functionality, which is why it still achieves a CodeBLEU of 24.9%.
def makeMimiLocal(filename):
try:
with open(filename, 'rb') as f:
data = f.read()
except IOError:
data = b''
data = data.decode('utf-8')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\x00', b'\x00')
data = data.replace(b'\
We also release the 100 randomly selected queries, as well as the code generated by ChatGPT, in chatGPT.jsonl.
CAPYBARA. This dataset is published as part of the paper "Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries". It includes both the training/evaluation data and the raw data.
The data_split folder contains .pickle files with the test and validation repos; the train repos are all the remaining repos. It also contains a .pickle file with a dictionary that specifies the optimization level for each repository.
In the processed_data folder, the processed datasets can be found in .csv format. The columns of the CSV are the summaries, the original documentation, the repo, the source and decompiled code, the function name, and a unique identifier. We also include the deduplicated samples in separate CSVs.
The processed training files can be found in the training_data folder. Source C, decompiled, demiStripped, and stripped code can each be found in their corresponding folders and are split into deduplicated and regular datasets. The data is further split into .jsonl files for the train, test, and validation sets. These .jsonl files can be loaded into CodeT5 and CodeXGlue as is.
The raw_data folder contains all the stripped and decompiled functions without any pre-processing applied. The columns of the CSV are the repo, the location, the original code, the corresponding decompiled code, the function name, a unique identifier key, and the corresponding documentation for both the decompiled and stripped functions.
License: Copyright 2022 ########## Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
ShijiaD/CodeXGLUE-AST-Docstring dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_clone_detection_poj_104"
Dataset Summary
CodeXGLUE Clone-detection-POJ-104 dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/Clone-detection-POJ-104. Given a piece of code and a collection of candidates as input, the task is to return the top-K codes with the same semantics. Models are evaluated by MAP score. We use the POJ-104 dataset for this task.
Supported Tasks and Leaderboards
document-retrieval: The… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_clone_detection_poj104.
https://choosealicense.com/licenses/c-uda/
Dataset Card for "code_x_glue_cc_cloze_testing_all"
Dataset Summary
CodeXGLUE ClozeTesting-all dataset, available at https://github.com/microsoft/CodeXGLUE/tree/main/Code-Code/ClozeTesting-all. Cloze tests are widely adopted in Natural Language Processing to evaluate the performance of trained language models. The task aims to predict the answer for a blank given the blank's context, which can be formulated as a multi-choice classification problem. Here we… See the full description on the dataset page: https://huggingface.co/datasets/google/code_x_glue_cc_cloze_testing_all.
MIT Licensehttps://opensource.org/licenses/MIT
This dataset is imported from CodeXGLUE and pre-processed using their script.
Where to find in Semeru:
The dataset can be found at /nfs/semeru/semeru_datasets/code_xglue/code-to-code/Clone-detection-POJ-104 in Semeru
CodeXGLUE -- Clone Detection (POJ-104)
Task Definition
Given a piece of code and a collection of candidates as input, the task is to return the top-K codes with the same semantics. Models are evaluated by MAP@R score. MAP@R is defined as the mean of… See the full description on the dataset page: https://huggingface.co/datasets/semeru/Code-Code-CloneDetection-POJ104.
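The MAP@R computation can be sketched as follows. This is a rough sketch under the usual definition (for each query, R is the number of relevant items, and precision is averaged over the positions of correct hits within the top-R retrieved results); the function names are illustrative, not from the official evaluator.

```python
def average_precision_at_r(retrieved, relevant):
    """AP@R for one query: average precision@k at each correct hit
    among the top-R retrieved items, divided by R = |relevant|."""
    r = len(relevant)
    hits, precisions = 0, []
    for k, item in enumerate(retrieved[:r], start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / r if r else 0.0

def map_at_r(all_retrieved, all_relevant):
    """Mean of AP@R over all queries."""
    aps = [average_precision_at_r(ret, rel)
           for ret, rel in zip(all_retrieved, all_relevant)]
    return sum(aps) / len(aps) if aps else 0.0
```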