Authors:
(1) Evan Shieh, Young Data Scientists League ([email protected]);
(2) Faye-Marie Vassel, Stanford University;
(3) Cassidy Sugimoto, School of Public Policy, Georgia Institute of Technology;
(4) Thema Monroe-White, Schar School of Policy and Government & Department of Computer Science, George Mason University ([email protected]).
Table of Links
Abstract and 1 Introduction
1.1 Related Work and Contributions
2 Methods and Data Collection
2.1 Textual Identity Proxies and Socio-Psychological Harms
2.2 Modeling Gender, Sexual Orientation, and Race
3 Analysis
3.1 Harms of Omission
3.2 Harms of Subordination
3.3 Harms of Stereotyping
4 Discussion, Acknowledgements, and References
SUPPLEMENTAL MATERIALS
A OPERATIONALIZING POWER AND INTERSECTIONALITY
B EXTENDED TECHNICAL DETAILS
B.1 Modeling Gender and Sexual Orientation
B.2 Modeling Race
B.3 Automated Data Mining of Textual Cues
B.4 Representation Ratio
B.5 Subordination Ratio
B.6 Median Racialized Subordination Ratio
B.7 Extended Cues for Stereotype Analysis
B.8 Statistical Methods
C ADDITIONAL EXAMPLES
C.1 Most Common Names Generated by LM per Race
C.2 Additional Selected Examples of Full Synthetic Texts
D DATASHEET AND PUBLIC USE DISCLOSURES
D.1 Datasheet for Laissez-Faire Prompts Dataset
D DATASHEET AND PUBLIC USE DISCLOSURES
D.1 Datasheet for Laissez-Faire Prompts Dataset
Following guidance from Gebru et al. [79], we document our Laissez-Faire Prompts Dataset (construction details are described above) using a Datasheet.
D.1.1 Motivation
1. For what purpose was the dataset created?
We created this dataset to study biases in model responses to open-ended prompts that describe everyday usage, including students interfacing with language-model-based writing assistants and screenwriters or authors using generative language models to assist in fictional writing.
2. Who created the dataset (for example, which team, research group) and on behalf of which entity (for example, company, institution, organization)?
Evan Shieh created the dataset for the sole purpose of this research project.
3. Who funded the creation of the dataset?
The creation of the dataset was personally funded by the authors.
4. Any other comments?
This dataset primarily studies the context of life in the United States, although we believe that many of the principles used in its construction can be adapted to other nations and societies globally. This dataset provides a starting point for the analysis of generative language models. We use the term generative language model over the popularized alternative "large language model" (or "LLM") for multiple reasons. First, we believe that "large" is a subjective term with no clear scientific standard, used in much the same way as "big" in "big data". An example highlighting this is Microsoft's marketing material describing their model Phi as a "small language model" despite its 2.7 billion parameters [80], a number that other developers might have described as "large" just five years ago [81]. Second, we prefer to describe the models we study as "generative" to highlight the feature that this dataset assesses, namely the capability of such models to produce synthetic text. This contrasts with non-generative uses of language models such as "text embedding", the mapping of written expressions (characters, words, and/or sentences) to mathematical vector representations through algorithms such as word2vec [82].
D.1.2 Composition
5. What do the instances that comprise the dataset represent (for example, documents, photos, people, countries)?
The instances comprising the dataset represent (1) synthetic texts generated by five generative language models (ChatGPT 3.5, ChatGPT 4, Claude 2.0, Llama 2 (7B chat), and PaLM 2) in response to open-ended prompts listed in Tables S3, S4, and S5 in addition to (2) co-reference labels for gender references and names of the fictional characters represented in each synthetic text, extracted directly from the synthetic text.
6. How many instances are there in total (of each type, if appropriate)?
There are 500,000 instances in total, or 100K per model. Each model's 100K instances can be further subdivided into 50K responses to power-neutral prompts and 50K responses to power-laden prompts, each of which in turn contains 15K Learning prompts, 15K Labor prompts, and 20K Love prompts.
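The total decomposes as: 5 models × 2 power dynamics × (15K Learning + 15K Labor + 20K Love prompts) = 5 × 2 × 50K = 500K instances.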
7. Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set?
The dataset contains all instances we collected from the generative language models used in this study; it is not a sample from a larger set.
8. What data does each instance consist of?
Each instance consists of the following fields (an illustrative schema sketch in code follows the list):
Model: Which language model generated the text
Time: Time of text generation
Domain: Domain for the prompt (Learning, Labor, or Love)
Power Dynamic: Power-Neutral or Power-Laden
Subject: Character described in prompt (e.g. actor, star student)
Object: Secondary character, if applicable (e.g. loyal fan, struggling student)
Query: Prompt given to language model
Response: Synthetic text in response to Query from the generative language model
Label Query: Prompt used for autolabeling the Response
Label Response: Synthetic text in response to Label Query from the fine-tuned labeling model
Subject References: Extracted gender references to the Subject character
Object References: Extracted gender references to the Object character, if applicable
Subject Name: Extracted name of the Subject character (“Unspecified” or blank means no name found)
Object Name: Extracted name of the Object character, if applicable (“Unspecified” or blank means no name found)
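For illustration only, here is a minimal sketch of how one instance could be represented in code, assuming Python and using field names that mirror the list above; the actual on-disk format is not specified in this datasheet.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LaissezFairePromptInstance:
    """One dataset instance; field names mirror the schema listed above."""
    model: str                # e.g. "ChatGPT 3.5", "Claude 2.0"
    time: str                 # time of text generation
    domain: str               # "Learning", "Labor", or "Love"
    power_dynamic: str        # "Power-Neutral" or "Power-Laden"
    subject: str              # character described in the prompt, e.g. "star student"
    object: Optional[str]     # secondary character, if applicable
    query: str                # prompt given to the language model
    response: str             # synthetic text returned by the model
    label_query: str          # prompt used for autolabeling the response
    label_response: str       # output of the fine-tuned labeling model
    subject_references: str   # extracted gender references to the Subject character
    object_references: Optional[str]  # extracted gender references to the Object character
    subject_name: Optional[str]       # extracted name ("Unspecified" or blank = none found)
    object_name: Optional[str]        # extracted name, if applicable
```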
9. Is there a label or target associated with each instance?
None except for the extracted gender references and extracted names, which are hand-labeled in 4,600 evaluation examples.
10. Is any information missing from individual instances?
Yes, when LMs return responses containing only whitespace, which we observe in some Llama 2 instances.
11. Are relationships between individual instances made explicit (for example, users’ movie ratings, social network links)?
No, each individual instance is self-contained.
12. Are there recommended data splits (for example, training, development/validation, testing)?
No.
13. Are there any errors, sources of noise, or redundancies in the dataset?
In extracted gender references / names, we estimate a precision error of < 2% and recall error of < 3%.
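As a rough sketch of how such precision and recall error estimates could be computed against the 4,600 hand-labeled evaluation examples (the function and example values below are illustrative assumptions, not the authors' exact evaluation code):

```python
def precision_recall_error(extracted: set, gold: set):
    """Return (precision error, recall error) for one instance's extracted references."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return 1.0 - precision, 1.0 - recall

# Hypothetical example: one spurious extraction and one missed gold reference.
p_err, r_err = precision_recall_error({"she", "her", "Maria"}, {"she", "her", "Elena"})
print(p_err, r_err)  # ~0.33, ~0.33
```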
14. Is the dataset self-contained, or does it link to or otherwise rely on external resources (for example, websites, tweets, other datasets)?
The dataset is self-contained, but for our study we rely on external resources, including datasets containing real-world individuals with self-identified race by first name, which we use for modeling racial associations to names. We do not release linkages to these datasets in the interest of preserving privacy.
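For context, one way a name-to-race association could be derived from such an external resource is sketched below. This is an illustration with hypothetical column names and made-up counts, not the authors' actual procedure (which is described in Supplement B.2):

```python
import pandas as pd

# Hypothetical external dataset: counts of self-identified race per first name.
names = pd.DataFrame({
    "first_name": ["maria", "maria", "darnell"],
    "race":       ["Race A", "Race B", "Race C"],
    "count":      [900, 100, 800],
})

# Convert raw counts into conditional probabilities P(race | first name).
totals = names.groupby("first_name")["count"].transform("sum")
names["p_race_given_name"] = names["count"] / totals
print(names)
```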
15. Does the dataset contain data that might be considered confidential (for example, data that is protected by legal privilege or by doctor–patient confidentiality, data that includes the content of individuals’ non-public communications)?
No.
16. Does the dataset contain data that, if viewed directly, might be offensive, insulting, threatening, or might otherwise cause anxiety?
Yes, including the stereotyping harms we describe in this paper. While we are releasing our dataset for audit transparency and in the hopes of furthering responsible AI research, we disclose that reading our dataset may be triggering and upsetting due to the adverse content it contains. Furthermore, some studies suggest that warning readers that LMs may generate biased outputs may increase anticipatory anxiety while having mixed results on actually dissuading readers from engaging [77]. We hope that this risk will be outweighed by the benefits of protecting susceptible consumers from otherwise subliminal harms.
17. Does the dataset identify any subpopulations (for example, by age, gender)?
No subpopulations of real-world individuals are identified in this dataset.
18. Is it possible to identify individuals (that is, one or more natural persons), either directly or indirectly (that is, in combination with other data) from the dataset?
Not that we are aware of, as all data included is synthetic text generated from language models. However, since the public is not fully aware of what data or annotations are used in the training processes for the models we study, we cannot guarantee against the possibility of leaked personally identifiable information.
19. Does the dataset contain data that might be considered sensitive in any way (for example, data that reveals race or ethnic origins, sexual orientations, religious beliefs, political opinions or union memberships, or locations; financial or health data; biometric or genetic data; forms of government identification, such as social security numbers; criminal history)?
Not for real individuals. Our dataset extracts gender references and names for synthetically generated characters.
20. Any other comments?
Researchers interested in reproducing our study who require access to the data mentioned in question 14 should follow the instructions listed in the papers by the authors we cite.
D.1.3 Collection Process
21. How was the data associated with each instance acquired? Was the data directly observable (for example, raw text, movie ratings), reported by subjects (for example, survey responses), or indirectly inferred/ derived from other data (for example, part-of-speech tags, model-based guesses for age or language)?
The data in each instance was acquired through prompting generative language models for audit purposes.
22. What mechanisms or procedures were used to collect the data (for example, hardware apparatuses or sensors, manual human curation, software programs, software APIs)?
For ChatGPT 3.5, ChatGPT 4, Claude 2.0, and PaLM 2, we used software APIs in combination with texts pulled directly from the online user interface (specifically, 10K of the 100K instances for Claude 2.0). For Llama 2 (7B), we deployed the model on Google Colaboratory instances using HuggingFace software libraries.
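For readers interested in the Llama 2 (7B chat) setup, a minimal sketch of the kind of HuggingFace-based generation call that can be run on a Google Colaboratory instance is shown below; the model identifier, prompt, and generation parameters are illustrative assumptions, not the exact settings used during collection:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated model; requires HuggingFace access approval
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # needs the accelerate library

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "Write a story about a star student, 100 words or less."  # illustrative prompt only
output = generator(prompt, max_new_tokens=256, do_sample=True, return_full_text=False)
print(output[0]["generated_text"])
```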
23. If the dataset is a sample from a larger set, what was the sampling strategy (for example, deterministic, probabilistic with specific sampling probabilities)?
N/A.
24. Who was involved in the data collection process (for example, students, crowdworkers, contractors) and how were they compensated (for example, how much were crowdworkers paid)?
Only the authors of the study were involved in the data labeling process. For data collection, we paid a student intern a total of $16,000 at a rate of $45 per hour (this total also covered other duties unrelated to the paper).
25. Over what timeframe was the data collected?
Data collection was conducted from August 16th to November 7th, 2023.
26. Were any ethical review processes conducted (for example, by an institutional review board)?
No, as no human subjects were involved.
27. Did you collect the data from the individuals in question directly, or obtain it via third parties or other sources (for example, websites)?
N/A – no human subjects involved.
28. Were the individuals in question notified about the data collection?
N/A – no human subjects involved.
29. Did the individuals in question consent to the collection and use of their data?
N/A – no human subjects involved.
30. If consent was obtained, were the consenting individuals provided with a mechanism to revoke their consent in the future or for certain uses?
N/A – no human subjects involved.
31. Has an analysis of the potential impact of the dataset and its use on data subjects (for example, a data protection impact analysis) been conducted?
N/A – no human subjects involved.
32. Any other comments?
No.
D.1.4 Preprocessing / Cleaning / Labeling
33. Was any preprocessing/cleaning/labeling of the data done (for example, discretization or bucketing, tokenization, part-of-speech tagging, SIFT feature extraction, removal of instances, processing of missing values)?
Yes, we trimmed whitespace from the synthetic text generations.
34. Was the “raw” data saved in addition to the preprocessed/cleaned/labeled data (for example, to support unanticipated future uses)?
Yes – this can be made available upon request to the corresponding authors.
35. Is the software that was used to preprocess/clean/label the data available?
Yes – we are open sourcing this as part of our data as well.
36. Any other comments?
No.
D.1.5 Uses
37. Has the dataset been used for any tasks already?
Only for this study so far.
38. Is there a repository that links to any or all papers or systems that use the dataset?
Not currently, although we request that any researchers who want to access this dataset provide such information.
39. What (other) tasks could the dataset be used for?
This dataset can be used for (1) additional auditing studies, and (2) training co-reference resolution models that perform well specifically on text similar to what we study in our paper (i.e., English text of 100 words or less, generated from similar prompts).
40. Is there anything about the composition of the dataset or the way it was collected and preprocessed/ cleaned/labeled that might impact future uses?
Yes, the labeled gender references are built from the word lists we provide in Table S6, which we acknowledge is not a complete schema. This will need to be extended or modified to account for additional gender identities of interest.
41. Are there tasks for which the dataset should not be used?
We condemn the use of our dataset in any system that targets, harasses, harms, or otherwise discriminates against real-world individuals inhabiting minoritized gender, race, and sexual orientation identities, including through the harms we study in this paper. One disturbing recent abuse of automated models is illuminated by the 2020 civil lawsuit National Coalition on Black Civic Participation v. Wohl [78], which describes how a group of defendants used automated robocalls to target and attempt to intimidate tens of thousands of Black voters ahead of the November 2020 US election. To mitigate the risk of our models being used in such a system, we do not release our trained models for co-reference resolution, and we will ensure that any open-source access to our dataset is mediated by repositories that require researchers to document their use cases before receiving access.
42. Any other comments?
No.
D.1.6 Distribution
43. Will the dataset be distributed to third parties outside of the entity (for example, company, institution, organization) on behalf of which the dataset was created?
Yes, the dataset will be made publicly available.
44. How will the dataset be distributed (for example, tarball on website, API, GitHub)? Does the dataset have a digital object identifier (DOI)?
The dataset will be distributed through a website provider with functionality that requires accessing users to contact the authors and state the purpose of usage before access is granted. No DOI has been assigned at the time of this writing.
45. When will the dataset be distributed?
Upon publication.
46. Will the dataset be distributed under a copyright or other intellectual property (IP) license, and/or under applicable terms of use (ToU)?
Yes, we will provide a ToU in addition to linking to the ToU of the developers of the five language models we study.
47. Have any third parties imposed IP-based or other restrictions on the data associated with the instances?
Yes, the developers of the language models we study.
48. Do any export controls or other regulatory restrictions apply to the dataset or to individual instances?
No.
49. Any other comments?
No.
D.1.7 Maintenance
50. Who will be supporting/hosting/maintaining the dataset?
The first corresponding author will be maintaining the dataset.
51. How can the owner/curator/ manager of the dataset be contacted (for example, email address)?
Please contact us directly through Harvard Dataverse: https://doi.org/10.7910/DVN/WF8PJD.
52. Is there an erratum?
One will be started and maintained as part of our distribution process.
53. Will the dataset be updated (for example, to correct labeling errors, add new instances, delete instances)?
Yes, to correct labeling errors.
54. If the dataset relates to people, are there applicable limits on the retention of the data associated with the instances (for example, were the individuals in question told that their data would be retained for a fixed period of time and then deleted)?
N/A – no human subjects or relationships involved.
55. Will older versions of the dataset continue to be supported/hosted/ maintained?
Yes, the dataset will be versioned.
56. If others want to extend/augment/build on/contribute to the dataset, is there a mechanism for them to do so?
There is no formal mechanism; we ask any interested individuals to contact us on a case-by-case basis.
57. Any other comments?
No.