9.5. Issue 5: No code, and the dilemma of the direct answer
Finally, it is not a foregone conclusion that users are even interested in code. As Blackwell’s model of attention investment notes, in many cases the user may be content to perform an action manually, rather than invest in creating a reusable automation (Blackwell, 2002a; J. Williams et al., 2020). Spreadsheet users, in particular, are often not sensitive to the level of automation or automatability of a given workflow, using a mix of manual, automated, and semi-automated techniques to achieve the goal at hand (Pandita et al., 2018).
Spreadsheet users often need ad-hoc transformations of their data that they will, in all likelihood, never need again. We may well be able to express such a transformation as a program, but if the user is interested in the output and not the program, is it important, or even necessary, to tell the user that a program was involved at all? One can argue that increasing the user’s awareness of the flexibility and fallibility of the process that delivers an inferred result (i.e., enabling them to critically evaluate the output; Sarkar et al., 2015) can build agency, confidence, trust, and resilience. This issue is related to information retrieval’s “dilemma of the direct answer” (Potthast et al., 2021), raised in response to the growing practice of search engines directly answering queries in addition to listing retrieved results.
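To make the scenario concrete, consider a hypothetical one-off task of this kind, sketched below in Python with pandas (the data and column names are illustrative, not drawn from any study). An assistant could perform the transformation via generated code that the user never sees:

```python
import pandas as pd

# Hypothetical one-off task: a sheet with a "Name" column holding full
# names, where the user wants separate first- and last-name columns.
df = pd.DataFrame({"Name": ["Ada Lovelace", "Grace Hopper"]})

# The ad-hoc transformation a tool might generate behind the scenes;
# the user sees only the resulting columns, never this code.
df[["First", "Last"]] = df["Name"].str.split(" ", n=1, expand=True)

print(df)
```

Whether to surface such a program to the user, so that they can inspect, reuse, or critique it, is exactly the design decision at stake here.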
However, if the language of the generated code is unrelated to any language familiar to the end user, or the user is a complete novice, it is exceedingly difficult for them to make sense of the code, as Lau et al. (2021) showed in their study of Excel users encountering Python. Yet there are socio-technical motivations for targeting an unfamiliar language: long-term experience with LLM assistance (Section 7) suggests that it is most effective when paired with high-level APIs that capture common use cases well, and the Python ecosystem has an unparalleled set of libraries and APIs for data wrangling. An LLM-assisted tool that emits Excel formulas is therefore less likely to solve the user’s problem than one that emits Python. In the longer term, this tension might be mitigated by developing an equally rich set of data manipulation libraries in the Excel formula language.
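To illustrate the gap in API coverage that this argument turns on, compare a common wrangling task, unpivoting a wide table into a long one, across the two target languages. In pandas the entire use case is a single library call, whereas expressing it in traditional Excel formulas requires considerably more scaffolding. A minimal sketch, with hypothetical table contents and column names:

```python
import pandas as pd

# Hypothetical wide-format sheet: one row per product, one column per quarter.
wide = pd.DataFrame({
    "Product": ["A", "B"],
    "Q1": [10, 20],
    "Q2": [15, 25],
})

# A single high-level call captures the whole "unpivot" use case;
# this is the kind of compact target an LLM can reliably emit.
long = wide.melt(id_vars="Product", var_name="Quarter", value_name="Sales")

print(long)
```

It is this compactness that a richer library of data manipulation functions in the Excel formula language would aim to replicate.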