Table of Links
Abstract and 1 Introduction
2.1 Software Testing
2.2 Gamification of Software Testing
3 Gamifying Continuous Integration and 3.1 Challenges in Teaching Software Testing
3.2 Gamification Elements of Gamekins
3.3 Gamified Elements and the Testing Curriculum
4 Experiment Setup and 4.1 Software Testing Course
4.2 Integration of Gamekins and 4.3 Participants
4.4 Data Analysis
4.5 Threats to Validity
5.1 RQ1: How did the students use Gamekins during the course?
5.2 RQ2: What testing behavior did the students exhibit?
5.3 RQ3: How did the students perceive the integration of Gamekins into their projects?
6 Related Work
7 Conclusions, Acknowledgments, and References
5.2 RQ2: What testing behavior did the students exhibit?
On average, the students wrote 14.8 tests for the Analyzer project and 21.8 for the Fuzzer (Fig. 7c). One of the goals of their assignments was to achieve 100 % line coverage (Fig. 7a), which they came close to achieving in both projects, with mean coverage of 91.9 % and 95.5 %, respectively. In terms of mutation scores (Fig. 7b), the students obtained a higher score for the Analyzer (mean 78.2 %) than for the Fuzzer (mean 65.2 %), which may suggest that implementing and testing the Fuzzer was more challenging, for example because it required mocking techniques and special handling of edge cases. Despite this, the students received good grades overall for both projects (Fig. 7e), with mean grades of 87.4 % and 89.6 %, respectively. The students executed their projects in the CI more than 1300 times, averaging almost 26 runs per student, and the majority of these runs (93 %) were successful. The high number of runs can be attributed to the number of commits in both projects, as depicted in Fig. 7d (15 for the Analyzer and 30 for the Fuzzer).
For both projects combined, there are significant correlations between the points accumulated by participants using Gamekins and various factors, including coverage (Fig. 8a and Fig. 8b), mutation score (Fig. 8c), number of tests (Fig. 8d), commits (Fig. 8f), and grade (Fig. 8e). These positive correlations demonstrate that engaging with Gamekins and positive testing behavior are indeed related, which is important because (1) it confirms that the gamification elements are meaningful and do not distract students, and (2) it suggests that challenges and quests are effective metrics for comparing and grading students in a software testing course.
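As an illustration of how such correlations can be computed, the sketch below relates per-student Gamekins points to each metric using a rank-based Spearman correlation from SciPy. The choice of coefficient, the library, and the data values are assumptions for illustration only and are not taken from the paper.

```python
# Illustrative sketch (not the authors' analysis script): relating Gamekins
# points to testing metrics with a rank-based Spearman correlation.
from scipy.stats import spearmanr

# Hypothetical per-student records:
# (points, coverage %, mutation score %, tests, commits, grade %)
students = [
    (320, 95.0, 80.1, 22, 18, 92.0),
    (150, 88.5, 61.3, 12,  9, 81.5),
    (410, 97.2, 84.0, 28, 25, 95.0),
    # ... one row per student
]

points = [s[0] for s in students]
metrics = {
    "coverage":       [s[1] for s in students],
    "mutation score": [s[2] for s in students],
    "tests":          [s[3] for s in students],
    "commits":        [s[4] for s in students],
    "grade":          [s[5] for s in students],
}

# Report the correlation of points with each metric across students.
for name, values in metrics.items():
    rho, p = spearmanr(points, values)
    print(f"points vs {name}: rho = {rho:.2f}, p = {p:.3f}")
```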
Compared to the class of 2019, which did not use Gamekins, the class of 2022 performed similarly in terms of the number of tests (Fig. 7c) and line coverage (Fig. 7a). The mutation scores were higher in 2022 (Fig. 7b), although the difference is not statistically significant (p = 0.11). The slight increase may be because students from 2019 stopped testing their projects once they achieved 100 % coverage, while the students from 2022 continued to solve Mutation Challenges even after reaching high coverage. There is a significant difference (p < 0.001) regarding the correctness of the outputs of the Analyzer (Fig. 7f): in 2022, almost all students produced correct outputs (mean 10.6), whereas the students from 2019 had a lower mean of 8.3. This suggests that using Gamekins helped to identify bugs that led to incorrect output. This conjecture is supported by the responses from the survey (G15), where two-thirds of the students stated that they discovered bugs while using Gamekins.
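The section does not state which statistical test produced the reported p-values; a cohort comparison of this kind could, for example, be run with a non-parametric Mann-Whitney U test. The sketch below uses that test on hypothetical per-student counts of correct Analyzer outputs; both the test choice and the data are assumptions, not the authors' analysis.

```python
# Illustrative sketch (assumed test and data, not the authors' analysis):
# comparing the 2019 and 2022 cohorts with a Mann-Whitney U test.
from scipy.stats import mannwhitneyu

# Hypothetical per-student counts of correct Analyzer outputs per cohort.
correct_outputs_2019 = [8, 7, 9, 8, 10, 6, 9, 8]
correct_outputs_2022 = [11, 10, 11, 10, 11, 10, 11, 11]

stat, p = mannwhitneyu(correct_outputs_2019, correct_outputs_2022,
                       alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.4f}")
```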
The commit histories also provide evidence of bugs being found, with individual commit messages referring to bug fixes. More generally, in the 2022 course the students committed their code in a more fine-grained manner (Fig. 7d), which reflects their interactions with Gamekins. The mean number of commits in 2022 was 15, compared to 7 in 2019; this difference is statistically significant (p < 0.001). Figure 10 shows an example commit history (git log) of a student from 2022 committing tests individually, whereas Fig. 9 exemplifies the common behavior from 2019 without Gamekins, when students committed their code in only a few commits. Overall, these findings suggest that the students in 2022 were more diligent in committing their code and addressing bugs.
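To give a concrete sense of how commit granularity can be inspected, the snippet below counts the commits in a student repository via git. It is an assumed helper for illustration, not part of the study's tooling, and the repository path is hypothetical.

```python
# Illustrative sketch (assumed helper): count the commits in a repository,
# e.g. to compare commit granularity between cohorts.
import subprocess

def commit_count(repo_path: str) -> int:
    # `git rev-list --count HEAD` prints the number of commits reachable
    # from HEAD; `-C` runs git inside the given repository.
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--count", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())

print(commit_count("./student-analyzer-repo"))  # hypothetical path
```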
Authors:
(1) Philipp Straubinger, University of Passau, Germany;
(2) Gordon Fraser, University of Passau, Germany.