Can Developers Prompt? A Controlled Experiment for Code Documentation Generation [Replication Package]

    Cite
    Hans-Alexander Kruse; Tim Puhlfürß; Walid Maalej (2024). Can Developers Prompt? A Controlled Experiment for Code Documentation Generation [Replication Package] [Dataset]. http://doi.org/10.5281/zenodo.13744961
    Available download formats: zip
    Dataset updated: Sep 11, 2024
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Hans-Alexander Kruse; Tim Puhlfürß; Walid Maalej
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Summary of Artifacts

    This is the replication package for the paper 'Can Developers Prompt? A Controlled Experiment for Code Documentation Generation', presented at the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME), October 6 to 11, 2024, in Flagstaff, AZ, USA.

    Full Abstract

    Large language models (LLMs) bear great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of or unable to apply prompt engineering techniques. Especially students perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher quality documentation by just including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and which support they need for certain tasks.

    Author Information

    Name                 | Affiliation         | Email
    Hans-Alexander Kruse | Universität Hamburg | hans-alexander.kruse@studium.uni-hamburg.de
    Tim Puhlfürß         | Universität Hamburg | tim.puhlfuerss@uni-hamburg.de
    Walid Maalej         | Universität Hamburg | walid.maalej@uni-hamburg.de

    Citation Information

    @inproceedings{kruse-icsme-2024,
      author={Kruse, Hans-Alexander and Puhlf{\"u}r{\ss}, Tim and Maalej, Walid},
      booktitle={2024 IEEE International Conference on Software Maintenance and Evolution},
      title={Can Developers Prompt? A Controlled Experiment for Code Documentation Generation},
      year={2024},
      doi={tba},
    }
    

    Artifacts Overview

    1. Preprint

    The file kruse-icsme-2024-preprint.pdf is the preprint version of the official paper. You should read the paper in detail to understand the study, especially its methodology and results.

    2. Results

    The folder results includes two subfolders, described below.

    Demographics RQ1 RQ2

    The subfolder Demographics RQ1 RQ2 provides the Jupyter Notebook file evaluation.ipynb for analyzing (1) the participants' submissions to the digital survey and (2) the ad-hoc prompts that the experimental group entered into their tool. Hence, this notebook provides demographic information about the participants and the results for research questions 1 and 2. Please refer to the README file inside this subfolder for steps to install and run the notebook.

    RQ2

    The subfolder RQ2 contains further subfolders with Microsoft Excel files specific to the results of research question 2:

    • The subfolder UEQ contains three copies of the official User Experience Questionnaire (UEQ) analysis Excel tool, filled with the data of all participants, of the students, and of the professionals, respectively.
    • The subfolder Open Coding contains three Excel files with the open-coding results for the free-text answers in which participants could add positive and negative comments about their experience at the end of the survey. The Consensus file provides the finalized version of the open-coding process.

    3. Extension

    The folder extension contains the code of the Visual Studio Code (VS Code) extension developed in this study to generate code documentation with predefined prompts. Please refer to the README file inside the folder for installation steps. Alternatively, you can install the deployed version of this tool, called Code Docs AI, from the VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=re-devtools.code-docs-ai

    The tool for generating code documentation with ad-hoc prompts can be installed directly from the VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=zhang-renyang.chat-gpt. We did not include the code of this extension in the replication package due to a license conflict (GPLv3 vs. MIT).

    4. Survey

    The folder survey contains PDFs of the digital survey in two versions:

    • The file Survey.pdf contains the rendered version of the survey (how it was presented to participants).
    • The file SurveyOptions.pdf is an export from the LimeSurvey web platform. Its main purpose is to provide the technical answer codes, e.g., AO01 and AO02, that correspond to the rendered answer texts, e.g., Yes and No. This helps if you want to analyze the CSV files inside the results folder directly (instead of using the Jupyter Notebook), as the CSVs contain the answer codes, not the answer texts; see the sketch after this list. Please note that an export issue caused page 9 to be almost blank. This problem is negligible, as the question on that page only contained one free-text answer field.
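
    If you analyze the raw CSVs instead of using the Jupyter Notebook, the first step is usually to map the answer codes back to their answer texts. The following is a minimal sketch of that step and is not part of the package: the CSV filename, the column name, and the concrete code-to-text mapping are hypothetical placeholders; the actual codes must be looked up in SurveyOptions.pdf.

    import pandas as pd

    # Hypothetical mapping; look up the real codes and answer texts for each
    # question in SurveyOptions.pdf.
    ANSWER_TEXTS = {
        "AO01": "Yes",
        "AO02": "No",
    }

    # Hypothetical file and column names; adjust them to the actual CSV files
    # inside the results folder.
    df = pd.read_csv("results/survey_export.csv")
    df["used_llm_before"] = df["used_llm_before"].map(ANSWER_TEXTS)

    print(df["used_llm_before"].value_counts())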

    5. Appendix

    The folder appendix provides additional material about the study:

    • The subfolder tool_screenshots contains screenshots of both tools.
    • The file few_shots.txt lists the few-shot examples used for the predefined prompt tool (see the sketch after this list).
    • The file test_functions.py lists the functions used in the experiment.
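
    For readers unfamiliar with few-shot prompting, the sketch below illustrates how a predefined prompt could be assembled from example pairs of functions and docstrings. It is only an illustration under assumed structure: the instruction text, example functions, and docstrings are hypothetical and are not the shots listed in few_shots.txt.

    # Hypothetical few-shot prompt assembly; the actual shots used by the
    # predefined prompt tool are listed in few_shots.txt.
    FEW_SHOTS = [
        (
            "def add(a, b):\n    return a + b",
            '"""Return the sum of a and b."""',
        ),
        (
            "def is_even(n):\n    return n % 2 == 0",
            '"""Return True if n is even, otherwise False."""',
        ),
    ]

    def build_prompt(target_function: str) -> str:
        """Concatenate the example pairs and the target function into one prompt."""
        parts = ["Write a concise Python docstring for the given function.\n"]
        for code, docstring in FEW_SHOTS:
            parts.append(f"Function:\n{code}\nDocstring:\n{docstring}\n")
        parts.append(f"Function:\n{target_function}\nDocstring:")
        return "\n".join(parts)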

    Revisions

    Version | Changelog
    1.0.0   | Initial upload
    1.1.0   | Add paper preprint. Update abstract.
    1.2.0   | Update replication package based on ICSME Artifact Track reviews

    License

    See LICENSE file.
