Authors:
(1) Anh V. Vu, University of Cambridge, Cambridge Cybercrime Centre;
(2) Alice Hutchings, University of Cambridge, Cambridge Cybercrime Centre;
(3) Ross Anderson, University of Cambridge, and University of Edinburgh.
Table of Links
Abstract and 1 Introduction
2. Deplatforming and the Impacts
2.1. Related Work
2.2. The Kiwi Farms Disruption
3. Methods, Datasets, and Ethics, and 3.1. Forum and Imageboard Discussions
3.2. Telegram Chats and 3.3. Web Traffic and Search Trends Analytics
3.4. Tweets Made by the Online Community and 3.5. Data Licensing
3.6. Ethical Considerations
4. The Impact on Forum Activity and Traffic, and 4.1. The Impact of Major Disruptions
4.2. Platform Displacement
4.3. Traffic Fragmentation
5. The Impacts on Relevant Stakeholders and 5.1. The Community that Started the Campaign
5.2. The Industry Responses
5.3. The Forum Operators
5.4. The Forum Members
6. Tensions, Challenges, and Implications and 6.1. The Efficacy of the Disruption
6.2. Censorship versus Free Speech
6.3. The Role of Industry in Content Moderation
6.4. Policy Implications
6.5. Limitations and Future Work
7. Conclusion, Acknowledgments, and References
Appendix A.
3.4. Tweets Made by the Online Community
The disruption campaign started on Twitter on 22 August 2022 with tweets posted under the hashtag #dropkiwifarms. We gathered the main tweets plus associated metadata, such as posting time and reactions (e.g., replies, retweets, likes, and quotes), using SNSCRAPE, an open-source Python framework for social network scrapers.[11] As SNSCRAPE uses Twitter APIs as its underlying method, the data are likely to be complete. We collected 11 076 tweets made by 3 886 users, spanning the entire campaign period. These data help us understand the community reaction throughout the campaign, when the industry took action, and when the forum recovered. There might be more related tweets without the hashtag #dropkiwifarms of which we are unaware, but scanning the whole Twitter space is infeasible. The trend measured by our collection is nevertheless likely to be representative, as the campaign congregated around this hashtag.
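As an illustration of how such a collection can be summarised over time, the following is a minimal sketch that counts tweets and distinct users per day for a hashtag. It is not the authors' pipeline: the record fields (`user`, `posted_at`, `text`), the sample data, and the function name `daily_activity` are all assumptions made for the example, and the records are taken to be already collected.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical records standing in for collected tweets. The field names
# and values are illustrative assumptions, not the authors' schema or data.
SAMPLE_TWEETS = [
    {"user": "a", "posted_at": "2022-08-22T10:15:00+00:00", "text": "Join in #dropkiwifarms"},
    {"user": "b", "posted_at": "2022-08-22T23:50:00+00:00", "text": "#dropkiwifarms now"},
    {"user": "a", "posted_at": "2022-08-23T01:00:00+00:00", "text": "Day two #dropkiwifarms"},
    {"user": "c", "posted_at": "2022-08-23T02:00:00+00:00", "text": "An unrelated tweet"},
    {"user": "a", "posted_at": "2022-08-23T05:00:00+00:00", "text": "Still going #DropKiwiFarms"},
]


def daily_activity(tweets, hashtag="#dropkiwifarms"):
    """Return {date: (tweet_count, unique_user_count)} for tweets containing the hashtag."""
    counts = Counter()
    users = {}
    for tweet in tweets:
        # Hashtag matching is case-insensitive; tweets without it are ignored.
        if hashtag.lower() not in tweet["text"].lower():
            continue
        # Normalise timestamps to UTC before bucketing by calendar day.
        day = datetime.fromisoformat(tweet["posted_at"]).astimezone(timezone.utc).date()
        counts[day] += 1
        users.setdefault(day, set()).add(tweet["user"])
    return {day: (counts[day], len(users[day])) for day in sorted(counts)}


if __name__ == "__main__":
    for day, (n_tweets, n_users) in daily_activity(SAMPLE_TWEETS).items():
        print(day, n_tweets, n_users)
```

Bucketing by UTC day is a design choice of this sketch; a different time zone would shift tweets posted near midnight into adjacent days.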
3.5. Data Licensing
Our datasets and scripts for data collection and analysis are available to academics, along with an interactive web portal that helps those who lack technical skills to access our data [67]. However, as both researchers and actors such as forum members might be exposed to risk and harm [68], we decline to make our data publicly accessible. It is our standard practice at the Cambridge Cybercrime Centre to require our licensees to sign an agreement to prevent misuse, to ensure the data will be handled appropriately, and to keep us informed about research outcomes [69]. We have a long history of sharing such sensitive data, and robust procedures carefully crafted in conjunction with legal academics, university lawyers and specialist external counsel to enable data sharing across multiple jurisdictions.
3.6. Ethical Considerations
Our work was formally approved by our institutional Ethics Review Board (ERB) for data collection and analysis. Our datasets are collected from publicly available forums and channels, which are accessible to all. We collected the forum when it was hosted in the US; according to a 2022 US court case, scraping public data is legal [70]. Our scraping method does not violate any regulations and does not cause negative consequences to the targeted websites, e.g., bandwidth congestion or denial of service. It would be impractical to send thousands of messages to gain consent from all forum and Telegram members; we assume they are aware that their activity in public online places will be widely accessible.
In contrast to some previous work on online forums, we name the investigated forums in this paper. Pseudonymising the forum name is pointless because of the high-profile campaign being studied. Thus, we avoid the pretence that the forum is not identifiable and shift the focus to accounting for the potential harms to both researchers and involved actors associated with our research. We designed our analysis to operate ethically and collectively, presenting only aggregated behaviours so that private and sensitive information about individuals cannot be inferred. This is in accordance with the British Society of Criminology Statement on Ethics [71].
Researchers may be at risk and may experience various elevated digital threats when doing work on sensitive resources [68], [72]. Studying extremist forums may introduce a higher risk of retaliation than other forums, resulting in mental or physical harm. We have taken measures to minimise potential harm to researchers and involved actors when doing studies with human subjects and at-risk populations [73], [74]. For example, we consider options to anonymise authors’ names or use pseudonyms for any publication related to the project, including this paper, if necessary. We also refrain from directly looking at media, which may cause emotional harm; our scrapers thus only collect text while discarding images and videos. Although all datasets are widely accessible and can be gathered by the public, we refrain from scraping private and protected posts behind the login wall due to safety and legality concerns.
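The text-only collection described above can be illustrated with a minimal sketch that keeps the visible text of a post and discards embedded media. The paper does not describe the scrapers' internals, so the class name `TextOnlyExtractor`, the helper `extract_text`, and the set of skipped tags are assumptions made for this example; it uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class TextOnlyExtractor(HTMLParser):
    """Collect visible text from HTML, ignoring media and non-content elements."""

    # Elements whose inner content is dropped entirely (an assumed list).
    SKIPPED = {"script", "style", "video", "audio", "picture"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Void tags such as <img> carry no text, so they vanish without
        # special handling; their URLs are never read or fetched.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the text content of an HTML fragment with media removed."""
    parser = TextOnlyExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    post = '<div><p>Hello <b>world</b></p><img src="a.png"><p>Bye</p></div>'
    print(extract_text(post))
```

Because the sketch works on markup that has already been downloaded and never follows `src` attributes, no image or video file is retrieved or stored.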