5.3 LLM, Students’ pAIr Programmer?
As reviewed in Section 2, most current studies that evaluate the efficacy of Copilot are conducted with experienced software developers. If we estimate Copilot’s problem-solving abilities to be those of an average student in an introductory programming class, then pairing it with a professional software developer who has much more expertise may not bring the professional enough benefit. Given the current capabilities of LLMs, a student-AI pair programming setup therefore seems the most promising to explore. The next question is: how should we best support student-AI pair programming?
Re-prioritize programming skills. Co-working with AI requires a special skill set, and future work could explore how to help students develop these crucial skills. Bird et al. [12] argued that the popularity of LLM-based programming assistants will make reviewing code an increasingly important skill for developers. Nonetheless, in the interviews conducted by Perscheid et al. [64], none of the professional developers remembered being trained in debugging at school. There is already a rich literature on debugging and testing instruction [2, 50, 77], but logistical challenges such as the lack of instructional time persist [23, 50], and educators need to better equip students with the debugging and testing skills needed to work with unreliable AI.
Integrate AIEd frameworks. On the theoretical side, Holstein et al. [33] developed a framework that maps ways in which humans and AI can mutually augment each other in education, for example, by augmenting interpretation, action, scalability, and capacity. Future work can use existing theories from the AI in education (AIEd) space to improve the design of the AI pAIr programming partner, and further investigate whether LLMs bring new focuses and affordances to previous human-AI education frameworks.
Support explanation and communication with students. Previous attempts to use an AI agent as a pair programming partner have shown some preliminary success in knowledge transfer and retention [28, 69], and the limitation discussed was the lack of discussion and explanation [41]. Now that an LLM-based agent can support more natural interaction and provide good-quality explanations in the introductory programming context [42], it would be interesting to explore whether LLM-based AI could resolve some of the limitations reported in earlier work on pedagogical and conversational agents. Self-reflection and explanation techniques may also be adopted to make up for the communication that occurs in human-human pair programming.
Match expertise with students. As discussed in Section 4, matching expertise is a tricky problem. Lui and Chan [45] found that an expert-expert pair may not gain as much of an advantage over a solo expert as a novice-novice pair gains over a solo novice. Meanwhile, pairing two novices together raises concerns of “the blind leading the blind,” but pairing a novice with an expert may lower the novice’s self-esteem [4]. Given all these complexities, a student-AI pair in which we care only about the student’s learning gains opens up many research questions. If we have full control of the perceived skill level of the AI partner, should we configure it to be similar to the student, slightly higher-skilled, or a lot better? Would it be beneficial to have both a peer AI agent and a tutor AI agent that assists when students get stuck?
Avoid over-helping students. For programming learners, it would be important to configure the LLM-based programming assistant to avoid over-helping. A few studies have examined novice interaction with Copilot [66] or with a customized programming environment built on the LLM-based code generation model Codex [39]. Prather et al. [66] found that novices do have unique interaction patterns with Copilot and a tendency to rely on and trust the generated code too much. Kazemitabaar et al. [39] discussed design implications including controlling over-use and supporting complete novices. There have also been concerns about academic integrity and changing perceptions of learning when LLM-based programming tools become easily accessible to students [10, 66, 68], which need further exploration in the context of student-AI pair programming.
Boost students’ self-confidence. Last but not least, pair programming has been shown to be especially beneficial for students with lower self-efficacy and self-confidence levels [81] and for women [47], which could make it a pedagogical tool to engage more vulnerable or underrepresented populations in CS. When an AI is introduced into pair programming, would the same benefits be retained? How should we present the AI differently to make it compatible with students of different confidence levels? How do we mitigate the risks of unreliable but seemingly authoritative AI? LLMs may be an opportunity to address some existing challenges in student-student pair programming (as summarized in Table 2), but many open questions remain.
Authors:
(1) Qianou Ma (Corresponding author), Carnegie Mellon University, Pittsburgh, USA;
(2) Tongshuang Wu, Carnegie Mellon University, Pittsburgh, USA;
(3) Kenneth Koedinger, Carnegie Mellon University, Pittsburgh, USA.