Table of Links
Abstract and 1 Introduction
2 Related Work
2.1 Program Synthesis
2.2 Creativity Support Tools for Animation
2.3 Generative Tools for Design
3 Formative Steps
4 Logomotion System and 4.1 Input
4.2 Preprocess Visual Information
4.3 Visually-Grounded Code Synthesis
5 Evaluations
5.1 Evaluation: Program Repair
5.2 Methodology
5.3 Findings
6 Evaluation with Novices
7 Discussion and 7.1 Breaking Away from Templates
7.2 Generating Code Around Visuals
7.3 Limitations
8 Conclusion and References
5.1 Evaluation: Program Repair
We next conducted an evaluation centered on the visually-grounded program repair stage to methodically understand the sorts of errors LogoMotion made (RQ3) and the sorts it was capable of debugging (RQ4). We provide an empirical analysis of this stage under different experimental settings.
5.2 Methodology
68% of the animation runs were error-free from the outset (i.e., after program synthesis) and did not require program repair. The remaining 32% required the program repair stage. Within this stage, we varied 1) a hyperparameter 𝑘 and 2) whether or not image context was provided. 𝑘 upper-bounded the number of attempts the LLM could take to solve a bug and was varied from 1 to 4. Varying 𝑘 is modeled after the pass@k methodology proposed with HumanEval [21], in which 𝑘 code samples are generated per problem and the fraction of problems solved is the solve rate. Here, the pass@k framework is applied in the context of program repair / self-refinement.
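As a concrete illustration (not LogoMotion's actual implementation), the following Python sketch shows how a 𝑘-attempt repair loop and the resulting solve-rate metric could be organized. The names run_program, request_repair, and the image_context argument are hypothetical stand-ins for the animation runtime and the LLM repair call; only the overall loop structure reflects the setup described above.

```python
from typing import Callable, Optional, Tuple

def repair_with_k_attempts(
    code: str,
    k: int,
    run_program: Callable[[str], Optional[str]],               # returns None on success, else an error trace (assumed)
    request_repair: Callable[[str, str, Optional[bytes]], str],  # LLM call proposing a fixed program (assumed)
    image_context: Optional[bytes] = None,                      # experimental condition: include or omit image context
) -> Tuple[str, bool]:
    """Try to fix a failing program with up to k LLM repair attempts."""
    for _ in range(k):
        error = run_program(code)
        if error is None:
            return code, True               # solved before exhausting the attempt budget
        code = request_repair(code, error, image_context)
    # Check whether the final proposed fix runs error-free.
    return code, run_program(code) is None

def solve_rate(
    failing_programs: list[str],
    k: int,
    run_program: Callable[[str], Optional[str]],
    request_repair: Callable[[str, str, Optional[bytes]], str],
    image_context: Optional[bytes] = None,
) -> float:
    """Pass@k-style metric: fraction of failing programs repaired within k attempts."""
    if not failing_programs:
        return 0.0
    solved = sum(
        repair_with_k_attempts(p, k, run_program, request_repair, image_context)[1]
        for p in failing_programs
    )
    return solved / len(failing_programs)
```

In this sketch, raising 𝑘 only increases the number of repair calls allowed per failing program, which mirrors how the solve rate is reported as a function of 𝑘 in the findings.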
Authors:
(1) Vivian Liu, Columbia University ([email protected]);
(2) Rubaiat Habib Kazi, Adobe Research ([email protected]);
(3) Li-Yi Wei, Adobe Research ([email protected]);
(4) Matthew Fisher, Adobe Research ([email protected]);
(5) Timothy Langlois, Adobe Research ([email protected]);
(6) Seth Walker, Adobe Research ([email protected]);
(7) Lydia Chilton, Columbia University ([email protected]).
This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.