
Measuring AI Creativity: Study Methods for Comedians & LLMs | HackerNoon

News Room · Published 7 March 2026 (last updated 2026/03/07 at 8:27 AM)

Table of Links

Abstract and 1. Introduction

  1. Methods
  2. Quantitative Results and Creativity Support Index
  3. Qualitative Results from Focus Group Discussions
  4. Discussion
  5. Mitigations and Conclusion and Acknowledgments
  6. Ethical Guidance References

A. Related Work on Computational Humour, AI and Comedy

B. Participant Questionnaire

C. Focus

2 METHODS

Our study was designed to address a challenging problem: on the one hand, the limitations of LLMs (stereotypes, an inability to distinguish comedic offensiveness from harmful speech, cultural erasure and homogenisation of content); on the other, the use of LLMs for a creative writing task. For this reason, we turned to a group of experts, professional comedians and performers, who are used both to thinking about thorny questions of identity, offensiveness and censorship in their work, and to employing language in a highly creative way. We chose artists who already use AI in their work and expected them to be somewhat knowledgeable about, and open to, using AI: this likely biased our results[3].

We ran workshops with 20 comedians who use AI creatively. The first workshop, with 10 participants, was run in person at the Edinburgh Festival Fringe 2023; the following 3 workshops, with 3, 4 and 3 participants, were run online. We reached out to comedians performing in Edinburgh during the Fringe, or in our network, and attempted to recruit as diverse a pool of comedians as possible (along linguistic, cultural, gender, sexual, national and racial dimensions) given the constraints of the study[4]. Participants had contrasting views on AI for comedy writing, from “AI is very bad at this, and I don’t want to live in a world where it gets better” (p15) to “I liked the details that I got. I think those details sparked my imagination, and I think I could use them to write something” (p20). Participants were asked to register on the Prolific platform[5] and were invited to join a specific study via an allowlist. The study was approved by the research ethics committee of our institution. The information sheet and consent forms were shared with the participants, their active consent was obtained at the beginning of the workshop, and they had the right to withdraw without prejudice at any time. The Prolific platform handled payment of the participation fee, set at £300 for 3 hours.

We started each 3-hour session by describing the agenda and goals of the workshop, sharing the information sheet and consent forms with the participants, and asking them to start filling out a short anonymous survey about their background in comedy, previous exposure to AI and usage of AI in performance (full questionnaire in Appendix B.1).

2.1 Writing exercise

We then proceeded with a comedy-writing exercise, in which participants spent around 45 minutes on their own using an LLM. We encouraged participants to use the LLM in a way that would generate useful material “that they would be comfortable presenting in a comedy context”, but emphasised that we did not require a fully finished product by the end of the writing exercise. We invited them to use the language(s) they felt most comfortable with[6]. We also suggested they could use the tool to 1) generate, rate/detect or explain jokes, 2) co-write jokes via iterative prompting, step-by-step or using examples, and 3) analyse, re-write or complete some of their previous material. In the first workshop (in person), we provided participants with access to ChatGPT-3.5 [79] served via a plain-text interface similar to ChatGPT. In the following 3 workshops, we invited participants to use their preferred model via their personal account: participants used ChatGPT-3.5, ChatGPT-4 [77] and Google Bard powered by Gemini Pro [98] (December 2023 version). Note that the choice of such instruction-tuned models was motivated by their popularity and ease of access for comedians; more complex prompting strategies, such as those used in Dramatron [71], could have produced higher-quality outputs.
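The three suggested usage modes can be sketched as prompt templates. The templates and the `call_llm` stand-in below are illustrative assumptions, not the prompts participants actually wrote:

```python
# Illustrative prompt templates for the three suggested usage modes; these are
# hypothetical examples, not the prompts used by participants in the study.

def generate_prompt(topic: str) -> str:
    # Mode 1: generate (or rate/detect/explain) jokes on a topic.
    return f"Write three short stand-up jokes about {topic}, then explain each one."

def cowrite_prompt(draft: str, direction: str) -> str:
    # Mode 2: co-write via iterative prompting, refining a draft step by step.
    return (f"Here is a draft joke:\n{draft}\n"
            f"Rewrite it so that {direction}, keeping my voice.")

def analyse_prompt(material: str) -> str:
    # Mode 3: analyse, re-write or complete previous material.
    return f"Analyse this bit line by line and suggest a stronger closer:\n{material}"

def call_llm(prompt: str) -> str:
    # Stand-in for whichever chat model a participant used
    # (ChatGPT-3.5/4 or Bard); any chat-completion API would fit here.
    raise NotImplementedError

prompt = cowrite_prompt("My cat ignores me.", "the punchline lands on the last word")
print(prompt)
```

In the iterative mode, a participant would feed the model's answer back through `cowrite_prompt` with a new direction, refining the draft over several turns.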

2.2 Creativity Support Tools evaluation

Following the writing exercise, we asked participants to fill out three surveys. The first survey was about their experience with the AI system for writing comedy material, and contained nine questions from previous studies [53, 71, 114] that assessed LLMs for creative writing on a 5-level Likert scale (see Appendix B.2). The second survey was used to calculate the Creativity Support Index (CSI) [25] of the writing tool, a metric itself adapted from the NASA Task Load Index [43]. The CSI is estimated via a psychometric survey that measures six dimensions of creativity support: Exploration, Expressiveness, Immersion, Enjoyment, Results Worth Effort, and Collaboration (see specific questions in Appendix B.3). It is a number between 0 and 100, where 90 is considered excellent and 50 mediocre. The third survey contained free-form questions on one thing that the “AI system” (the LLM writing tool) did well, one improvement, and open-ended comments on the writing session and on the survey.
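As a concrete illustration, one common formulation of the CSI [25] sums, over the six factors, each factor's two-item agreement total weighted by how often that factor was chosen in the 15 pairwise factor comparisons, then divides by 3 to yield a score between 0 and 100. The ratings below are hypothetical, not data from the study:

```python
# Hedged sketch of a common CSI formulation [25]: each factor gets two 10-point
# agreement items (factor total 2..20) and a count of wins across the 15
# pairwise factor comparisons (0..5); CSI = sum(total * count) / 3.

FACTORS = ["Exploration", "Expressiveness", "Immersion",
           "Enjoyment", "Results Worth Effort", "Collaboration"]

def csi(agreement: dict, pair_counts: dict) -> float:
    assert sum(pair_counts.values()) == 15  # 15 comparisons among 6 factors
    return sum(agreement[f] * pair_counts[f] for f in FACTORS) / 3.0

# Hypothetical responses from a single participant:
agreement = {"Exploration": 18, "Expressiveness": 16, "Immersion": 12,
             "Enjoyment": 17, "Results Worth Effort": 15, "Collaboration": 10}
pair_counts = {"Exploration": 4, "Expressiveness": 3, "Immersion": 1,
               "Enjoyment": 4, "Results Worth Effort": 2, "Collaboration": 1}
print(csi(agreement, pair_counts))
```

The pairwise counts weight each factor by how much it matters to the participant, so the same agreement ratings can yield different scores for tools supporting different kinds of creative work.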

2.3 Focus Group Questions

In order to guide the discussion, we prepared two sets of questions[7] (see Appendix B.4 for the full list of questions). The first set of questions pertained to the usefulness of the outputs generated by the LLM tool for personal writing, differences between using an LLM or searching for inspiration using Wikipedia or a search engine, the types of comedy that can be produced by an LLM, and concerns about the ownership of LLM-generated outputs.

The second set of questions addressed the comedy writing process of the participants, as well as the topics introduced in Section 1.2, namely the various biases and stereotypes of LLMs, problems with the moderation strategies employed by LLMs, the importance of context and delivery, and whether some forms of cultural appropriation or homogenisation could happen. We invited discussions about the use of other comedians’ work, and also challenged the participants with questions on whether the AI has a “voice” and whether humour can be quantified.

2.4 Focus Group Analysis

In our workshops, we followed the focus group methodology described in [74, 76] (engaging a group of participants in an informal one-hour discussion focused on a particular topic, activity, or stimulus material, with a team of two moderators). Focus groups were captured as audio recordings, automatically transcribed using the speech recognition tools in Google Meet, then manually verified and compared against notes taken by the moderators. After transcription, audio and video recordings were destroyed. As in the surveys, participants were anonymised: authors independently reviewed the transcripts to remove any personally identifiable information. We then performed constant comparison analysis on the transcripts of the focus groups [76]. We first identified initial codes using sentence-by-sentence open coding. We then grouped those codes into themes, and identified themes that were coherent across focus groups. Data from four focus groups allowed us to achieve data saturation [17, 63].
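The cross-group coherence step can be caricatured as a set intersection over the codes observed in each focus group. The code labels below are invented for illustration, and the real analysis involved iterative human judgement rather than a mechanical filter:

```python
# Toy sketch of the cross-group coherence check: keep only candidate themes
# (flattened here to open codes) that surfaced in every focus group.
# Code labels are hypothetical.

focus_group_codes = [
    {"ownership", "voice", "stereotypes", "context"},    # focus group 1
    {"ownership", "voice", "blandness", "context"},      # focus group 2
    {"ownership", "stereotypes", "voice", "moderation"}, # focus group 3
    {"voice", "ownership", "context", "moderation"},     # focus group 4
]

coherent_themes = set.intersection(*focus_group_codes)
print(sorted(coherent_themes))
```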

Section 3 summarises the quantitative results[8] derived from the Creativity Support Tool evaluation (Sect. 2.2), while Section 4 details the observations made by the participants during the focus groups (Sect. 2.4). Please note that this paper is an exploration of external perspectives rather than an endorsement of any one of them; in particular, this paper does not seek to undertake any legal evaluation.

:::info
Authors:

(1) Piotr W. Mirowski∗, Google DeepMind London, UK ([email protected]);

(2) Juliette Love∗, Google DeepMind London, UK ([email protected]);

(3) Kory Mathewson, Google DeepMind Montréal, QC, Canada ([email protected]);

(4) Shakir Mohamed, Google DeepMind London, UK ([email protected]).

:::


:::info
This paper is available on arxiv under CC BY 4.0 license.

:::

[3] Our biased selection criteria of participants might, and likely do, lead to biased opinions as compared to the much more broad population of comedians and performers, which might be reflected in more favourable judgment of the Creativity Support Index of LLM writing tools. Future research might explore the diversity of opinions in creative communities across a greater range of familiarity with AI tools and openness to using them in their own creative practices. Exploring those opinions would significantly increase the scope of the paper and would make a compelling follow-up study.

[4] A demographic analysis of opinions might be a possible avenue for future investigations, but it would require a different study design and participant recruiting process.

[5] https://prolific.com

[6] Languages included German, Dutch, English, French, Hindi, Swedish and Tamil.

[7] Question-led focus groups are useful to start discussions, but we acknowledge the limitation that questions can bias the participants’ responses.

[8] Full outputs of the writing sessions, all individual survey results and raw transcripts from the focus groups will be shared in anonymised form as supplementary material, once our work is published.
