
OpenAlex is an openly accessible bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
Detoxifying Language Models Risks Marginalizing Minority Voices
Albert Xu, Eshaan Pathak, Eric Wallace, et al.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
Open Access | Times Cited: 52
Showing 1-25 of 52 citing articles:
On the Opportunities and Risks of Foundation Models
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 1575
Red Teaming Language Models with Language Models
Ethan Perez, Saffron Huang, Francis Song, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022)
Open Access | Times Cited: 133
Leveraging Generative AI and Large Language Models: A Comprehensive Roadmap for Healthcare Integration
Ping Yu, Hua Xu, Xia Hu, et al.
Healthcare (2023) Vol. 11, Iss. 20, pp. 2776-2776
Open Access | Times Cited: 127
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, et al.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2022)
Open Access | Times Cited: 115
Bias and Fairness in Large Language Models: A Survey
Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, et al.
Computational Linguistics (2024) Vol. 50, Iss. 3, pp. 1097-1179
Open Access | Times Cited: 108
Visual Adversarial Examples Jailbreak Aligned Large Language Models
Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 19, pp. 21527-21536
Open Access | Times Cited: 26
Challenges in Detoxifying Language Models
Johannes Welbl, Amelia Glaese, Jonathan Uesato, et al.
(2021)
Open Access | Times Cited: 66
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, et al.
(2023)
Open Access | Times Cited: 26
A large-scale audit of dataset licensing and attribution in AI
Shayne Longpre, Robert Mahari, Anthony Chen, et al.
Nature Machine Intelligence (2024) Vol. 6, Iss. 8, pp. 975-987
Open Access | Times Cited: 12
Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
Fatemehsadat Mireshghallah, Kartik Goyal, Taylor Berg-Kirkpatrick
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2022), pp. 401-415
Open Access | Times Cited: 34
Time Waits for No One! Analysis and Challenges of Temporal Misalignment
Kelvin Luu, Daniel Khashabi, Suchin Gururangan, et al.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2022)
Open Access | Times Cited: 30
SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems
Emily Dinan, Gavin Abercrombie, A. Bergman, et al.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2022)
Open Access | Times Cited: 28
Multilingualism and AI: The Regimentation of Language in the Age of Digital Capitalism
Britta Schneider
Signs and Society (2022) Vol. 10, Iss. 3, pp. 362-387
Open Access | Times Cited: 20
Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts
Skyler Hallinan, Alisa Liu, Yejin Choi, et al.
(2023), pp. 228-242
Open Access | Times Cited: 12
NLPositionality: Characterizing Design Biases of Datasets and Models
Sebastin Santy, Jenny T. Liang, Ronan Le Bras, et al.
(2023), pp. 9080-9102
Open Access | Times Cited: 12
Towards a Conversational Ethics of Large Language Models
Hendrik Kempt, Alon Lavie, Saskia K. Nagel
American Philosophical Quarterly (2024) Vol. 61, Iss. 4, pp. 339-354
Closed Access | Times Cited: 4
Utilizing subjectivity level to mitigate identity term bias in toxic comments classification
Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner
Online Social Networks and Media (2022) Vol. 29, pp. 100205-100205
Open Access | Times Cited: 18
Getting AI Right: Introductory Notes on AI & Society
James Manyika
Daedalus (2022) Vol. 151, Iss. 2, pp. 5-27
Open Access | Times Cited: 17
The case of romantic relationships: analysis of the use of metaphorical frames with ‘traditional family’ and related terms in political Telegram posts in three countries and three languages
Sviatlana Höhn
Lodz Papers in Pragmatics (2025)
Open Access
Metaethical perspectives on ‘benchmarking’ AI ethics
Travis LaCroix, Alexandra Sasha Luccioni
AI and Ethics (2025)
Open Access
Reward modeling for mitigating toxicity in transformer-based language models
Farshid Faal, Ketra Schmitt, Jia Yuan Yu
Applied Intelligence (2022) Vol. 53, Iss. 7, pp. 8421-8435
Closed Access | Times Cited: 16
A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models
Erik Derner, Kristina Batistič, Jan Zahálka, et al.
IEEE Access (2024) Vol. 12, pp. 126176-126187
Open Access | Times Cited: 3
KoSBI: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Applications
Hwaran Lee, Seok‐Hee Hong, Joonsuk Park, et al.
(2023)
Open Access | Times Cited: 8
Self-Detoxifying Language Models via Toxification Reversal
Chak Tou Leong, Yi Cheng, Jiashuo Wang, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2023), pp. 4433-4449
Open Access | Times Cited: 6
Leashing the Inner Demons: Self-Detoxification for Language Models
Canwen Xu, Zexue He, Zhankui He, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2022) Vol. 36, Iss. 10, pp. 11530-11537
Open Access | Times Cited: 10