Lessons On Developer–AI Collaboration From 580 GitHub Conversations

:::info
Authors

Huizi Hao
Kazi Amit Hasan
Hong Qin
Marcos Macedo
Yuan Tian
Steven H. H. Ding
Ahmed E. Hassan

:::

Table Of Links

Abstract

1 Introduction

2 Data Collection

3 RQ1: What types of software engineering inquiries do developers present to ChatGPT in the initial prompt?

4 RQ2: How do developers present their inquiries to ChatGPT in multi-turn conversations?

5 RQ3: What are the characteristics of the sharing behavior?

6 Discussions

7 Threats to Validity

8 Related Work

9 Conclusion and Future Work

References

Abstract

ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 580 ChatGPT conversations that developers shared on GitHub: 210 in pull requests (PRs) and 370 in issues. We manually examined the content of the conversations and characterized the dynamics of the sharing behavior, i.e., understanding the rationale behind the sharing, identifying the locations where the conversations were shared, and determining the roles of the developers who shared them.

Our main observations are:

(1) Developers seek ChatGPT’s assistance across 16 types of software engineering inquiries. In conversations shared in both PRs and issues, the most frequently encountered inquiry categories include code generation, conceptual questions, how-to guides, issue resolution, and code review.

(2) Developers frequently engage with ChatGPT via multi-turn conversations where each prompt can fulfill various roles, such as unveiling initial or new tasks, iterative follow-up, and prompt refinement. Multi-turn conversations account for 33.2% of the conversations shared in PRs and 36.9% in issues.

(3) In collaborative coding, developers leverage shared conversations with ChatGPT to facilitate their role-specific contributions, whether as authors of PRs or issues, code reviewers, or collaborators on issues.

Our work serves as the first step towards understanding the dynamics between developers and ChatGPT in collaborative software development and opens up new directions for future research on the topic.

Introduction

Recent advances in Foundation Models (FMs) hold considerable promise for automating various software engineering tasks. FM-powered tools like GitHub Copilot, Amazon CodeWhisperer, and OpenAI ChatGPT are now embraced by many professional software practitioners (Zhang et al., 2023). Such technologies acquire their capabilities from massive, typically natural-language, data sets and can make recommendations to software developers, providing source code completion, automatic generation of documentation, or other types of software engineering support.

Current research on FMs for software engineering has primarily focused on evaluating the effectiveness of these models, including specialized versions modified through prompt engineering or fine-tuning, against traditional automated software engineering solutions using standard benchmarks (Jiang et al., 2023; Lu et al., 2023; Hou et al., 2023; Siddiq et al., 2023; Deng et al., 2024; Guo et al., 2024). In contrast, a limited number of studies (Vaithilingam et al., 2022; Ziegler et al., 2022; Mozannar et al., 2022; Barke et al., 2023; Liang et al., 2024) have investigated how FM-powered tools are practically employed by software developers within the software development life cycle.

Most of these studies focus on GitHub Copilot, an FM-powered code generation tool, via small-scale (20-42 participants) user studies or surveys. The usage of ChatGPT in software development encompasses a broader spectrum. Unlike Copilot, which primarily aids in code completion and generation, ChatGPT is designed to generate human-like text based on the input it receives, i.e., the prompt. This makes ChatGPT a more versatile tool that is applicable to a wider range of tasks beyond coding.

Despite its potential, there is no research on the dynamics of developer interactions with ChatGPT, the challenges faced, and the opportunities it offers for open-source projects. Moreover, given that modern software systems are crafted by teams rather than isolated individuals, the capacity of ChatGPT to bolster and transform collaborative practices has not been explored. This potential extends well beyond its current use in individual tasks, suggesting a significant yet unexplored impact of FM-powered tools on team-based software development.

To fill the gap, we analyze developers’ shared ChatGPT conversations within GitHub issues and pull requests (PRs). We postulate that these shared conversations will not only reveal how developers engage with ChatGPT in the quality assurance of open-source software projects, such as resolving issues, but also uncover the usage of the shared conversation in collaborative coding, such as contributing to PRs and issues. Link sharing was introduced in ChatGPT in May 2023 (OpenAI, 2024b), allowing users to generate a unique URL for a ChatGPT conversation. This feature enables users to share a snapshot of an entire conversation up to the point of the link being shared.

Such shared conversations can help highlight important messages, reference discussions for collaborative purposes, or create a point for future reference (OpenAI, 2024a). Figure 1 shows an example of a conversation with ChatGPT being shared within a code review comment on a GitHub pull request. The shared conversation contains one turn – a complete cycle of a developer posing a question or problem and ChatGPT responding.

The code reviewer asked the pull request author to use a JavaScript library named Day.js to parse and format dates. To demonstrate how Day.js can be used, the reviewer asked ChatGPT how to replace a code snippet using Day.js, and ChatGPT responded. The pull request author then implemented the code suggested by ChatGPT. This example demonstrates the potential benefit of sharing a conversation with ChatGPT in a collaborative coding environment like GitHub.
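As a rough illustration of how such shared links might be located in the text of a PR or issue comment, the sketch below scans a string for shared-conversation URLs. The URL shape (a `/share/` path followed by a UUID on `chat.openai.com` or `chatgpt.com`) and the helper name `find_shared_links` are assumptions made for this example; they are not taken from the paper and do not describe the authors' data collection procedure.

```python
import re
from typing import List

# Assumption: shared links look like https://chat.openai.com/share/<UUID>
# (or the same path on chatgpt.com). The paper does not specify this format.
SHARE_LINK_RE = re.compile(
    r"https://(?:chat\.openai\.com|chatgpt\.com)/share/"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def find_shared_links(text: str) -> List[str]:
    """Return the unique shared-conversation URLs in `text`, in first-seen order."""
    unique = {}
    for match in SHARE_LINK_RE.finditer(text):
        # dict keys keep insertion order, so duplicates are dropped
        # while the order of first appearance is preserved.
        unique.setdefault(match.group(0), None)
    return list(unique)


# Hypothetical review comment; the UUID is made up for illustration.
comment = (
    "Could you use Day.js here? Example: "
    "https://chat.openai.com/share/12345678-1234-1234-1234-1234567890ab"
)
links = find_shared_links(comment)
```

A pattern this strict will miss links whose format differs from the assumed one, so anything built on it would need to be checked against real data.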

To better understand the characteristics of shared conversations and their implications on collaborative software engineering practice, we manually analyzed a dataset containing 580 shared ChatGPT conversations within GitHub PRs and issues, specifically, 210 in PRs and 370 in issues. Our empirical study addresses the following three research questions (RQs):

==RQ1: What types of software engineering inquiries do developers present to ChatGPT in the initial prompt?== In this RQ, we manually examined the content of all initial prompts, 580 prompts in total, in the collected shared conversations. We developed a taxonomy composed of 16 types of software engineering-related inquiries. The most frequently encountered inquiries are about code generation, addressing conceptual questions, providing how-to guidance, assisting with issue resolution, and conducting code reviews.

==RQ2: How do developers present their inquiries to ChatGPT in multi-turn conversations?== In this RQ, we manually examined how developers interact with ChatGPT in multi-turn conversations, which consist of several rounds of prompts and responses between a developer and ChatGPT. Specifically, we investigated how developers structure their follow-up prompts after the initial prompt. We developed a taxonomy composed of seven types of roles a prompt plays within a multi-turn conversation. Our findings reveal that developers actively engage in multiple rounds of interactions with the aim of improving the quality of ChatGPT’s responses. This is primarily achieved by posting follow-up questions and refining prior prompts.

==RQ3: What are the characteristics of the sharing behavior?== In this RQ, we investigate the patterns in the sharing behaviors of developers. More specifically, we examine where those conversations are shared, by whom, and for what purpose. Our analysis reveals that developers utilize shared conversations as a way to complement their role-specific contributions, facilitating a more efficient and transparent collaborative process.

The main contributions of this paper are as follows:

(1) We present a comprehensive analysis of how developers share their conversations with ChatGPT within the context of open-source projects, particularly focusing on GitHub pull requests and issues. Our study introduces two taxonomies to classify the dynamics of these interactions: one for the types of software engineering inquiries in developers’ prompts and another for the roles of prompts in multi-turn conversations with ChatGPT. To support further research, our replication package contains 580 and 654 manually annotated prompts aligned with these two taxonomies, respectively.

(2) Our study reveals the multifaceted usage of ChatGPT in software engineering. Beyond assisting with technical tasks, we observe the usage of shared conversations to support collaboration among developers in open-source projects.

(3) We provide implications for developers and software engineering researchers based on our findings. These implications offer insights to further improve the use of FM-powered tools like ChatGPT in collaborative software development.

We organize the remainder of the paper as follows. Section 2 introduces our dataset. Sections 3-5 present the methodology and answers for each of the three research questions. Section 6 discusses the implications of our findings for practitioners and researchers. Section 7 presents threats to validity, and Section 8 presents related work. Finally, Section 9 concludes the paper.