Table of Links
Abstract and 1 Introduction
1.1 The twincode platform
1.2 Pilot Studies
1.3 Other Gender Identities and 1.4 Structure of the Paper
2 Related Work
3 Original Study (Seville Dec, 2021) and 3.1 Participants
3.2 Experiment Execution
3.3 Factors (Independent Variables)
3.4 Response Variables (Dependent Variables)
3.5 Confounding Variables
3.6 Data Analysis
4 First Replication (Berkeley May, 2022)
4.1 Participants
4.2 Experiment Execution
4.3 Data Analysis
5 Discussion and Threats to Validity and 5.1 Operationalization of the Cause Construct — Treatment
5.2 Operationalization of the Effect Construct — Metrics
5.3 Sampling the Population — Participants
6 Conclusions and Future Work
6.1 Replication in Different Cultural Background
6.2 Using Chatbots as Partners and AI-based Utterance Coding
Datasets, Compliance with Ethical Standards, Acknowledgements, and References
A. Questionnaire #1 and #2 response items
B. Evolution of the twincode User Interface
C. User Interface of tag-a-chat
4 First Replication (Berkeley May, 2022)
This section reports the first replication, carried out at the University of California, Berkeley, in May 2022. It focuses mainly on the changes in the participants and the experiment execution with respect to the original experiment, since the research questions and variables were the same in both studies. For each change, an estimate of its impact on the four types of experimental validity described by [61] is included, following the recommendations by [11] on reporting the impact of changes in replications using a 7-point discrete scale from −3 to +3. Table 5 summarizes the impact of those changes; its legend gives the labels of that scale.
4.1 Participants
In the replication carried out at the University of California, Berkeley, the participants were mainly first-year students enrolled in the CS61A (The Structure and Interpretation of Computer Programs) and CS88 (Computational Structures in Data Science) courses. Applying the same criteria as for the original experiment, the final number of valid subjects was 46, arranged in 23 pairs. Only 6 students, i.e. 3 pairs, were excluded from the initial 52 participants. One pair was dropped because their identities were disclosed during the pair programming tasks; another because one of the partners did not actively participate in the experimental tasks; and the third because they repeatedly lost their connection to the twincode platform and their metrics could not be properly collected. Among the remaining 46 valid subjects, 26 identified as women (56.52%) and the rest as men (43.48%) during the registration process[10].
Note that, contrary to the original experiment, the percentage of women is higher than that of men because the CS61A and CS88 introductory courses are also taken by students from other majors, which usually have a higher presence of women than Computer Science majors, where it is around 25% [58]. Note also that, despite the 6 dropped subjects, the percentages of women in the control (12 women, 52.17%) and experimental (14 women, 60.87%) groups were close to each other.
From our point of view, the change in the sampled population from third-year Spanish students to first-year U.S. students, together with the higher percentage of women, increased external validity, whereas the 50% reduction in the number of pairs (from 46 to 23) reduced conclusion validity.
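The participant figures reported in this section can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study's analysis scripts; all variable and function names are illustrative, and the group size of 23 subjects is an assumption inferred from the reported percentages (12/23 and 14/23) rather than a value stated explicitly in the text.

```python
# Figures as reported in Section 4.1; names are illustrative only.
INITIAL_PARTICIPANTS = 52
EXCLUDED_PAIRS = 3
WOMEN_TOTAL = 26
WOMEN_CONTROL = 12
WOMEN_EXPERIMENTAL = 14
ORIGINAL_PAIRS = 46  # pairs in the original study, as stated in the text


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


valid_subjects = INITIAL_PARTICIPANTS - 2 * EXCLUDED_PAIRS
valid_pairs = valid_subjects // 2

# Assumption: control and experimental groups each hold half of the
# valid subjects; this is inferred from the reported percentages.
group_size = valid_subjects // 2

summary = {
    "valid_subjects": valid_subjects,
    "valid_pairs": valid_pairs,
    "women_pct": pct(WOMEN_TOTAL, valid_subjects),
    "men_pct": pct(valid_subjects - WOMEN_TOTAL, valid_subjects),
    "women_control_pct": pct(WOMEN_CONTROL, group_size),
    "women_experimental_pct": pct(WOMEN_EXPERIMENTAL, group_size),
    "pair_reduction_pct": pct(ORIGINAL_PAIRS - valid_pairs, ORIGINAL_PAIRS),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions, the computed values agree with the counts and percentages given in the text.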
Authors:
(1) Amador Duran, I3US Institute, Universidad de Sevilla, Sevilla, Spain and SCORE Lab, Universidad de Sevilla, Sevilla, Spain;
(2) Pablo Fernandez, I3US Institute, Universidad de Sevilla, Sevilla, Spain and SCORE Lab, Universidad de Sevilla, Sevilla, Spain;
(3) Beatriz Bernardez, I3US Institute, Universidad de Sevilla, Sevilla, Spain and SCORE Lab, Universidad de Sevilla, Sevilla, Spain;
(4) Nathaniel Weinman, Computer Science Division, University of California, Berkeley, Berkeley, USA;
(5) Aslıhan Akalın, Computer Science Division, University of California, Berkeley, Berkeley, USA;
(6) Armando Fox, Computer Science Division, University of California, Berkeley, Berkeley, USA.