Table of Links
Abstract
1 Introduction
2 Related Work
2.1 Fairness and Bias in Recommendations
2.2 Quantifying Gender Associations in Natural Language Processing Representations
3 Problem Statement
4 Methodology
4.1 Scope
4.2 Implementation
4.3 Flag
5 Case Study
5.1 Scope
5.2 Implementation
5.3 Flag
6 Results
6.1 Latent Space Visualizations
6.2 Bias Directions
6.3 Bias Amplification Metrics
6.4 Classification Scenarios
7 Discussion
8 Limitations & Future Work
9 Conclusion and References
8 Limitations & Future Work
Our most notable limitation when implementing these techniques was the lack of distinct, well-labeled user pairings for metric calculation, a common occurrence in recommendation settings. We demonstrated methods to overcome this limitation, but future work could refine them further to avoid introducing additional bias into the evaluation through practitioner-defined entity pairing techniques. For this case study specifically, we plan to explore counterfactual user vectors, varied by gender, to create distinct pairs. Counterfactual user pairings would isolate the gender feature within the latent space and reduce the risk of attributing spurious relationships between users solely to gender differences. It is important to note that this workaround is only available for models trained with entity attributes; the limitation remains when evaluating the implicit or systematic bias of an LFR algorithm.
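As a rough illustration of this future direction, the sketch below pairs each user's latent vector with that of a gender-flipped counterfactual copy of the same user. The `encode_user` function, the feature dictionary, and the binary flip are hypothetical stand-ins, not the paper's implementation; the idea only applies to models trained with entity attributes.

```python
import numpy as np

def counterfactual_gender_pairs(user_features, encode_user, gender_key="gender"):
    """Pair each user's latent vector with that of a gender-flipped copy.

    `encode_user` is a stand-in for whatever maps raw user features into the
    LFR latent space; it is assumed to take a feature dict and return a 1-D
    numpy array. Only applicable when gender was a training feature.
    """
    pairs = []
    for features in user_features:
        original = encode_user(features)
        flipped = dict(features)
        # Binary flip for this sketch; a non-binary treatment needs more care.
        flipped[gender_key] = "female" if features[gender_key] == "male" else "male"
        pairs.append((original, encode_user(flipped)))
    return pairs

def pairwise_bias_direction(pairs):
    """Average difference across counterfactual pairs: a candidate gender
    direction that isolates the attribute within the latent space."""
    return np.mean([a - b for a, b in pairs], axis=0)
```

The resulting pairs could then feed the same bias-direction and amplification metrics used for observed user pairs, without relying on practitioner-defined pairings of distinct users.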
Another limitation that could be addressed in future work is exploring gender bias in a non-binary manner. Algorithmic bias research is generally conducted on binary groups because it is inherently easier to measure biased relationships between two groups than among multiple groups. Methods for measuring bias across multiple groups could be further developed to capture multi-group relationships. Instead of creating novel metrics, researchers could also expand the analysis to compare more than two groups; however, the number of comparisons quickly becomes overwhelming as the number of groups grows (k groups require k(k-1)/2 pairwise comparisons). Finally, this evaluation was performed on a proprietary industry system for a type of media whose listening patterns are known to be highly related to listener gender. We would like to apply these evaluation techniques to public data sets (such as Last.fm) and other recommendation algorithms to understand whether our methodology performs well when levels of attribute association bias are not as distinct as those found in our podcast recommendations case study.
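To make the multi-group comparison point above concrete, here is a minimal illustration of how pairwise evaluations scale with the number of groups; the group labels are purely illustrative and not drawn from our case study.

```python
from itertools import combinations

def group_pairs(groups):
    """Enumerate every two-group comparison required for a pairwise
    multi-group bias analysis; the count grows as k * (k - 1) / 2."""
    return list(combinations(groups, 2))

# Four gender groups already require six separate bias evaluations.
print(group_pairs(["woman", "man", "non-binary", "undisclosed"]))
# [('woman', 'man'), ('woman', 'non-binary'), ('woman', 'undisclosed'),
#  ('man', 'non-binary'), ('man', 'undisclosed'), ('non-binary', 'undisclosed')]
```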
9 Conclusion and References
Our framework provides a clear path toward uncovering potentially harmful stereotyped relationships that result from attribute association bias in an LFR model. In showcasing our techniques in an industry case study, we found that our proposed methodologies successfully measured and flagged attribute association bias. Additionally, we uncovered clear advantages and disadvantages of our proposed methods to help practitioners choose the appropriate techniques for their scoped evaluations. The success of our methodologies in uncovering attribute association bias highlights the importance of understanding how stereotypical relationships can become embedded in trained recommendation latent spaces. For example, our ability to predict user gender from podcast vectors demonstrates how leveraging these vectors as attributes in downstream models can introduce implicit user gender bias into subsequent outputs, even if owners of downstream models intentionally remove user gender as a training feature. Because listening history alone can predict user gender, user gender bias is embedded within the podcast vectors, and their use can inherently introduce gender bias into other modeling systems. Understanding this type of representation bias becomes increasingly crucial in industry recommendation systems where embeddings are shared across models owned by different teams. If attribute association bias is left unchecked in hybrid recommendation scenarios, teams risk amplifying systematic representational harms by serving stereotyped recommendations to stakeholders.
We also demonstrate that classifiers can serve as valuable tools for uncovering bias behaviors in representation learning outside of NLP, opening the door to future work that leverages innovative evaluation techniques across representation learning disciplines. Based on our results, training classification models to highlight potential attribute association bias surfaced nuanced behavior that may be lost when relying solely on other metrics. The ability to design specific classification scenarios enabled us to observe different ways bias could be captured and how item embeddings assumed to be associated with specific attributes can display varying behavior. For example, we observed increased prediction accuracy for podcast embeddings associated with high levels of female listening when user gender was not a feature; highlighting such differences in entity behavior under attribute association bias would otherwise require more complex metrics and methodologies.
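As an illustration of this style of classification scenario, the sketch below probes item embeddings for an attribute signal with a cross-validated logistic regression. The input shapes, the scikit-learn probe, and the majority-class baseline are our illustrative choices and are not prescribed by the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_attribute_association(item_embeddings, attribute_labels, folds=5):
    """Train a simple probe to predict an attribute (e.g., whether a podcast
    skews toward female listeners) from item embeddings alone.

    Cross-validated accuracy well above the majority-class baseline suggests
    the attribute is encoded in the latent space, flagging potential attribute
    association bias. Assumed inputs for this sketch: `item_embeddings` is an
    (n_items, dim) array and `attribute_labels` is an aligned binary vector.
    """
    probe = LogisticRegression(max_iter=1000)
    scores = cross_val_score(probe, item_embeddings, attribute_labels,
                             cv=folds, scoring="accuracy")
    baseline = max(np.bincount(attribute_labels)) / len(attribute_labels)
    return scores.mean(), baseline
```

Comparing probe accuracy across scenarios (e.g., with and without user gender as a training feature, or across item groups with different listening skews) is one lightweight way to surface the nuanced behavior described above.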
Similar to the findings of Basta et al. [8], our results support the notion that capturing and understanding the behavior of gender bias in more implicitly biased recommendation vector embeddings is a complicated and nuanced task, requiring further analysis beyond the results showcased in this paper. We hope our evaluation framework serves as a building block for future research addressing representational harms and attribute association bias in recommendation systems.
References
[1] [n. d.]. The right pitch: A look into the popularity of podcast hosts by gender. https://www.attexperts.com/podcast-host-gender-vs-genre
[2] Himan Abdollahpouri and Robin Burke. 2019. Reducing Popularity Bias in Recommendation Over Time. arXiv:1906.11711 [cs]. https://doi.org/10.48550/arXiv.1906.11711
[3] Himan Abdollahpouri, Robin Burke, and Bamshad Mobasher. 2018. Value-Aware Item Weighting for Long-Tail Recommendation. CoRR abs/1802.05382 (2018). arXiv:1802.05382 http://arxiv.org/abs/1802.05382
[4] Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher, and Edward Malthouse. 2021. User-Centered Evaluation of Popularity Bias in Recommender Systems. In Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization (Utrecht, Netherlands) (UMAP ’21). Association for Computing Machinery, New York, NY, USA, 119–129. https://doi.org/10.1145/3450613.3456821
[5] Larry Alexander. 1992. What makes wrongful discrimination wrong? Biases, preferences, stereotypes, and proxies. University of Pennsylvania Law Review 141, 1 (1992), 149–219.
[6] Xavier Amatriain and Justin Basilico. 2015. Recommender Systems in Industry: A Netflix Case Study. Springer US, Boston, MA, 385–419. https://doi.org/10.1007/978-1-4899-7637-6_11
[7] Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, W. Duncan Wadsworth, and Hanna Wallach. 2021. Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (Virtual Event, USA) (AIES ’21). Association for Computing Machinery, New York, NY, USA, 368–378. https://doi.org/10.1145/3461702.3462610
[8] Christine Basta, Marta R. Costa-jussà, and Noe Casas. 2019. Evaluating the Underlying Gender Bias in Contextualized Word Embeddings. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing. Association for Computational Linguistics, Florence, Italy, 33–39. https://doi.org/10.18653/v1/W19-3805
[9] Lex Beattie, Dan Taber, and Henriette Cramer. 2022. Challenges in Translating Research to Practice for Evaluating Fairness and Bias in Recommendation Systems. In Proceedings of the 16th ACM Conference on Recommender Systems (Seattle, WA, USA) (RecSys ’22). Association for Computing Machinery, New York, NY, USA, 528–530. https://doi.org/10.1145/3523227.3547403
[10] Pauwke Berkers. 2012. Gendered scrobbling: Listening behaviour of young adults on Last.fm. Interactions: Studies in Communication & Culture 2, 3 (2012), 279–296.
[11] Rishabh Bhardwaj, Navonil Majumder, and Soujanya Poria. 2021. Investigating gender bias in BERT. Cognitive Computation 13, 4 (2021), 1008–1018.
[12] Kelli S Boling and Kevin Hull. 2018. Undisclosed information—Serial is my favorite murder: Examining motivations in the true crime podcast audience. Journal of Radio & Audio Media 25, 1 (2018), 92–108.
[13] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett (Eds.), Vol. 29. Curran Associates, Inc., Barcelona, Spain. https://proceedings.neurips.cc/paper_files/paper/2016/file/a486cd07e4ac3d270571622f4f316ec5-Paper.pdf
[14] Robin Burke. 2007. Hybrid Web Recommender Systems. Springer Berlin Heidelberg, Berlin, Heidelberg, 377–408. https://doi.org/10.1007/978-3-540-72079-9_12
[15] Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356, 6334 (2017), 183–186. https://doi.org/10.1126/science.aal4230 arXiv:https://www.science.org/doi/pdf/10.1126/science.aal4230
[16] Le Chen, Ruijun Ma, Anikó Hannák, and Christo Wilson. 2018. Investigating the Impact of Gender on Rank in Resume Search Engines. Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3173574.3174225
[17] Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems (Boston, Massachusetts, USA) (RecSys ’16). Association for Computing Machinery, New York, NY, USA, 191–198. https://doi.org/10.1145/2959100.2959190
[18] Clay Martin Craig, Mary Elizabeth Brooks, and Shannon Bichard. 2023. Podcasting on purpose: Exploring motivations for podcast use among young adults. International Journal of Listening 37, 1 (2023), 39–48. https://doi.org/10.1080/10904018.2021.1913063 arXiv:https://doi.org/10.1080/10904018.2021.1913063
[19] Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, and Krishna P. Gummadi. 2021. When the Umpire is Also a Player: Bias in Private Label Product Recommendations on E-Commerce Marketplaces. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event, Canada) (FAccT ’21). Association for Computing Machinery, New York, NY, USA, 873–884. https://doi.org/10.1145/3442188.3445944
[20] Lavinia De Divitiis, Federico Becattini, Claudio Baecchi, and Alberto Del Bimbo. 2023. Disentangling Features for Fashion Recommendation. ACM Trans. Multimedia Comput. Commun. Appl. 19, 1s, Article 39 (jan 2023), 21 pages. https://doi.org/10.1145/3531017
[21] Yupei Du, Qixiang Fang, and Dong Nguyen. 2021. Assessing the Reliability of Word Embedding Gender Bias Measures. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 10012–10034. https://doi.org/10.18653/v1/2021.emnlp-main.785
[22] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. Fairness through Awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (Cambridge, Massachusetts) (ITCS ’12). Association for Computing Machinery, New York, NY, USA, 214–226. https://doi.org/10.1145/2090236.2090255
[23] Michael D. Ekstrand, Anubrata Das, Robin Burke, and Fernando Diaz. 2021. Fairness and Discrimination in Information Access Systems. arXiv:2105.05779 https://arxiv.org/abs/2105.05779
[24] Michael D Ekstrand, Anubrata Das, Robin Burke, Fernando Diaz, et al. 2022. Fairness in Information Access Systems. Foundations and Trends® in Information Retrieval 16, 1-2 (2022), 1–177.
[25] Michael D. Ekstrand, Mucun Tian, Mohammed R. Imran Kazi, Hoda Mehrpouyan, and Daniel Kluver. 2018. Exploring Author Gender in Book Rating and Recommendation. In Proceedings of the 12th ACM Conference on Recommender Systems (Vancouver, British Columbia, Canada) (RecSys ’18). Association for Computing Machinery, New York, NY, USA, 242–250. https://doi.org/10.1145/3240323.3240373
[26] Avriel Epps-Darling, Romain Takeo Bouyer, and Henriette Cramer. 2020. Artist gender representation in music streaming. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020). ISMIR. Montréal, Canada, 248–254.
[27] Kawin Ethayarajh, David Duvenaud, and Graeme Hirst. 2019. Understanding Undesirable Word Embedding Associations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 1696–1705. https://doi.org/10.18653/v1/P19-1166
[28] Andres Ferraro, Xavier Serra, and Christine Bauer. 2021. Break the Loop: Gender Imbalance in Music Recommenders. In Proceedings of the 2021 Conference on Human Information Interaction and Retrieval (Canberra ACT, Australia) (CHIIR ’21). Association for Computing Machinery, New York, NY, USA, 249–254. https://doi.org/10.1145/3406522.3446033
[29] Sahin Cem Geyik, Stuart Ambler, and Krishnaram Kenthapadi. 2019. Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Anchorage, AK, USA) (KDD ’19). Association for Computing Machinery, New York, NY, USA, 2221–2231. https://doi.org/10.1145/3292500.3330691
[30] Hila Gonen and Yoav Goldberg. 2019. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 609–614. https://doi.org/10.18653/v1/N19-1061
[31] Marie Charlotte Götting. 2022. Podcasts in the UK – statistics & facts. https://www.statista.com/topics/6908/podcasts-in-the-uk/#topicOverview
[32] Benjamin P Lange, Peter Wühr, and Sascha Schwarz. 2021. Of Time Gals and Mega Men: Empirical findings on gender differences in digital game genre preferences and the accuracy of respective gender stereotypes. Frontiers in Psychology 12 (2021), 657430.
[33] Megan Lazovick. 2022. Women podcast listeners: closing the listening gender gap.
[34] Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, and Louis-Philippe Morency. 2020. Towards Debiasing Sentence Representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5502–5515. https://doi.org/10.18653/v1/2020.acl-main.488
[35] Yang Liu, Eunice Jun, Qisheng Li, and Jeffrey Heer. 2019. Latent Space Cartography: Visual Analysis of Vector Space Embeddings. Computer Graphics Forum 38, 3 (2019), 67–78. https://doi.org/10.1111/cgf.13672 arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.13672
[36] Michael Madaio, Lisa Egede, Hariharan Subramonyam, Jennifer Wortman Vaughan, and Hanna Wallach. 2022. Assessing the Fairness of AI Systems: AI Practitioners’ Processes, Challenges, and Needs for Support. Proceedings of the ACM on Human-Computer Interaction 6, CSCW1 (2022), 1–26.
[37] Michael A. Madaio, Luke Stark, Jennifer Wortman Vaughan, and Hanna Wallach. 2020. Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.
[38] Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger. 2019. On Measuring Social Biases in Sentence Encoders. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 622–628. https://doi.org/10.18653/v1/N19-1063
[39] Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54, 6 (2021), 1–35.
[40] Rishabh Mehrotra, James McInerney, Hugues Bouchard, Mounia Lalmas, and Fernando Diaz. 2018. Towards a Fair Marketplace: Counterfactual Evaluation of the Trade-off between Relevance, Fairness & Satisfaction in Recommendation Systems. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (Torino, Italy) (CIKM ’18). Association for Computing Machinery, New York, NY, USA, 2243–2251. https://doi.org/10.1145/3269206.3272027
[41] Alessandro B Melchiorre, Navid Rekabsaz, Emilia Parada-Cabaleiro, Stefan Brandl, Oleg Lesota, and Markus Schedl. 2021. Investigating gender fairness of recommendation algorithms in the music domain. Information Processing & Management 58, 5 (2021), 102666.
[42] Brett Millar. 2008. Selective hearing: Gender bias in the music preferences of young adults. Psychology of Music 36, 4 (2008), 429–445.
[43] Zahra Nazari, Christophe Charbuillet, Johan Pages, Martin Laurent, Denis Charrier, Briana Vecchione, and Ben Carterette. 2020. Recommending Podcasts for Cold-Start Users Based on Music Listening and Taste. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (Virtual Event, China) (SIGIR ’20). Association for Computing Machinery, New York, NY, USA, 1041–1050. https://doi.org/10.1145/3397271.3401101
[44] Preksha Nema, Alexandros Karatzoglou, and Filip Radlinski. 2021. Disentangling Preference Representations for Recommendation Critiquing with β-VAE. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (Virtual Event, Queensland, Australia) (CIKM ’21). Association for Computing Machinery, New York, NY, USA, 1356–1365. https://doi.org/10.1145/3459637.3482425
[45] Shiva Omrani Sabbaghi and Aylin Caliskan. 2022. Measuring Gender Bias in Word Embeddings of Gendered Languages Requires Disentangling Grammatical Gender Signals. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (Oxford, United Kingdom) (AIES ’22). Association for Computing Machinery, New York, NY, USA, 518–531. https://doi.org/10.1145/3514094.3534176
[46] Hadas Orgad, Seraphina Goldfarb-Tarrant, and Yonatan Belinkov. 2022. How Gender Debiasing Affects Internal Model Representations, and Why It Matters. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, United States, 2602–2628. https://doi.org/10.18653/v1/2022.naacl-main.188
[47] Inioluwa Deborah Raji, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. 2020. Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 33–44. https://doi.org/10.1145/3351095.3372873
[48] Brianna Richardson, Jean Garcia-Gathright, Samuel F. Way, Jennifer Thom, and Henriette Cramer. 2021. Towards Fairness in Practice: A Practitioner-Oriented Rubric for Evaluating Fair ML Toolkits. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 236, 13 pages. https://doi.org/10.1145/3411764.3445604
[49] Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, and Marco Turchi. 2021. Gender Bias in Machine Translation. Transactions of the Association for Computational Linguistics 9 (2021), 845–874. https://doi.org/10.1162/tacl_a_00401
[50] Dougal Shakespeare, Lorenzo Porcaro, Emilia Gómez, and Carlos Castillo. 2020. Exploring artist gender bias in music recommendation. In Proceedings of the ImpactRS Workshop at ACM RecSys ’20. Virtual Event, Brazil.
[51] Guy Shani and Asela Gunawardana. 2011. Evaluating Recommendation Systems. Springer US, Boston, MA, 257–297. https://doi.org/10.1007/ 978-0-387-85820-3_8
[52] Arthur D Soto-Vásquez, M Olguta Vilceanu, and Kristine C Johnson. 2022. “Just hanging with my friends”: US Latina/o/x perspectives on parasocial relationships in podcast listening during COVID-19. Popular Communication 20, 4 (2022), 324–337.
[53] Yi Chern Tan and L. Elisa Celis. 2019. Assessing Social and Intersectional Biases in Contextualized Word Representations. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. Curran Associates, Inc., Vancouver, Canada. https://proceedings.neurips.cc/paper_files/paper/2019/file/201d546992726352471cfea6b0df0a48-Paper.pdf
[54] The Artificial Intelligence Channel. 2017. The Trouble with Bias – NIPS 2017 Keynote – Kate Crawford #NIPS2017. https://www.youtube.com/watch?v=fMym_BKWQzk
[55] Mike Thelwall. 2019. Reader and author gender and genre in Goodreads. Journal of Librarianship and Information Science 51, 2 (2019), 403–430.
[56] Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 11 (2008), 2579–2605.
[57] Sahil Verma, Ruoyuan Gao, and Chirag Shah. 2020. Facets of Fairness in Search and Recommendation. In Bias and Social Aspects in Search and Recommendation, Ludovico Boratto, Stefano Faralli, Mirko Marras, and Giovanni Stilo (Eds.). Springer International Publishing, Cham, 1–11.
[58] X. Wang, Q. Li, D. Yu, P. Cui, Z. Wang, and G. Xu. 2022. Causal Disentanglement for Semantics-Aware Intent Learning in Recommendation. IEEE Transactions on Knowledge and Data Engineering (2022), 1–1. https://doi.org/10.1109/TKDE.2022.3159802
[59] Xuezhi Wang, Nithum Thain, Anu Sinha, Flavien Prost, Ed H. Chi, Jilin Chen, and Alex Beutel. 2021. Practical Compositional Fairness: Understanding Fairness in Multi-Component Recommender Systems. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining (Virtual Event, Israel) (WSDM ’21). Association for Computing Machinery, New York, NY, USA, 436–444. https://doi.org/10.1145/3437963.3441732
[60] Peter Wühr, Benjamin P Lange, and Sascha Schwarz. 2017. Tears or fears? Comparing gender stereotypes about movie preferences to actual preferences. Frontiers in Psychology 8 (2017), 428.
[61] Haiyang Zhang, Alison Sneyd, and Mark Stevenson. 2020. Robustness and Reliability of Gender Bias Assessment in Word Embeddings: The Role of Base Pairs. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. Association for Computational Linguistics, Suzhou, China, 759–769. https://aclanthology.org/2020.aacl-main.76
[62] Yin Zhang, Ziwei Zhu, Yun He, and James Caverlee. 2020. Content-Collaborative Disentanglement Representation Learning for Enhanced Recommendation. In Proceedings of the 14th ACM Conference on Recommender Systems (Virtual Event, Brazil) (RecSys ’20). Association for Computing Machinery, New York, NY, USA, 43–52. https://doi.org/10.1145/3383313.3412239
[63] Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang. 2019. Gender Bias in Contextualized Word Embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 629–634. https://doi.org/10.18653/v1/N19-1064
[64] Zihao Zhao, Jiawei Chen, Sheng Zhou, Xiangnan He, Xuezhi Cao, Fuzheng Zhang, and Wei Wu. 2022. Popularity Bias Is Not Always Evil: Disentangling Benign and Harmful Bias for Recommendation. IEEE Transactions on Knowledge and Data Engineering 01 (2022), 1–13. https://doi.org/10.1109/TKDE.2022.3218994
[65] Yu Zheng, Chen Gao, Xiang Li, Xiangnan He, Yong Li, and Depeng Jin. 2021. Disentangling User Interest and Conformity for Recommendation with Causal Embedding. In Proceedings of the Web Conference 2021 (Ljubljana, Slovenia) (WWW ’21). Association for Computing Machinery, New York, NY, USA, 2980–2991. https://doi.org/10.1145/3442381.3449788
[66] Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, and Kai-Wei Chang. 2019. Examining Gender Bias in Languages with Grammatical Gender. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 5276–5284. https://doi.org/10.18653/v1/D19-1531
:::info
Authors:
- Lex Beattie
- Isabel Corpus
- Lucy H. Lin
- Praveen Ravichandran
:::
:::info
This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
:::
