
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
If you click an article title, you'll navigate to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location", and clicking a citation count opens this same listing for that article. Basic pagination options appear at the bottom of the page.
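A listing like this can also be retrieved programmatically from the OpenAlex API, whose `cites:` filter on the `/works` endpoint returns the works citing a given article. A minimal sketch of building such a query; the work ID used below is a placeholder, not the actual OpenAlex ID of the requested article:

```python
import urllib.parse

OPENALEX_WORKS = "https://api.openalex.org/works"

def citing_works_url(work_id: str, page: int = 1, per_page: int = 25) -> str:
    """Build the OpenAlex query URL listing works that cite `work_id`.

    per_page=25 mirrors the "Showing 1-25" pagination of this listing.
    """
    params = {
        "filter": f"cites:{work_id}",  # works whose reference lists include work_id
        "page": str(page),
        "per-page": str(per_page),
    }
    return OPENALEX_WORKS + "?" + urllib.parse.urlencode(params)

# Hypothetical work ID -- substitute the real OpenAlex ID of the article.
url = citing_works_url("W0000000000")
```

Fetching each resulting URL (e.g. with `urllib.request` or `requests`) yields a JSON page of citing works, sorted and paginated much like this page.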
Requested Article:
Deduplicating Training Data Makes Language Models Better
Katherine Lee, Daphne Ippolito, Andrew Nystrom, et al.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2022)
Open Access | Times Cited: 161
Showing 1-25 of 161 citing articles:
On the Opportunities and Risks of Foundation Models
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 1565
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh, Albert Webson, Colin Raffel, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 466
A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly
Yifan Yao, Jinhao Duan, Kaidi Xu, et al.
High-Confidence Computing (2024) Vol. 4, Iss. 2, Article 100211
Open Access | Times Cited: 243
TimeLMs: Diachronic Language Models from Twitter
Daniel Loureiro, Francesco Barbieri, Leonardo Neves, et al.
(2022)
Open Access | Times Cited: 155
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting
Su Wang, Chitwan Saharia, Ceslee Montgomery, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18359-18369
Open Access | Times Cited: 74
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann, Elizabeth A. Clark, Thibault Sellam
Journal of Artificial Intelligence Research (2023) Vol. 77, pp. 103-166
Open Access | Times Cited: 71
How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy
Natalia Ponomareva, Hussein Hazimeh, Alexey Kurakin, et al.
Journal of Artificial Intelligence Research (2023) Vol. 77, pp. 1113-1201
Open Access | Times Cited: 62
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Hugo Laurençon, Lucile Saulnier, Thomas J. Wang, et al.
arXiv (Cornell University) (2023)
Open Access | Times Cited: 50
Foundation Models and Fair Use
Peter Henderson, Xuechen Li, Dan Jurafsky, et al.
SSRN Electronic Journal (2023)
Open Access | Times Cited: 50
Efficient Methods for Natural Language Processing: A Survey
Marcos Treviso, Ji-Ung Lee, Tianchu Ji, et al.
Transactions of the Association for Computational Linguistics (2023) Vol. 11, pp. 826-860
Open Access | Times Cited: 48
Protein language models are biased by unequal sequence sampling across the tree of life
Frances Ding, Jacob Steinhardt
bioRxiv (Cold Spring Harbor Laboratory) (2024)
Open Access | Times Cited: 24
Towards trustworthy LLMs: a review on debiasing and dehallucinating in large language models
Zichao Lin, Shuyan Guan, Wending Zhang, et al.
Artificial Intelligence Review (2024) Vol. 57, Iss. 9
Open Access | Times Cited: 19
Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-Based Retrofitting
Xinyan Guan, Yanjiang Liu, Hongyu Lin, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 16, pp. 18126-18134
Open Access | Times Cited: 18
Are Large Pre-Trained Language Models Leaking Your Personal Information?
Jie Huang, Hanyin Shao, Kevin Chen-Chuan Chang
(2022)
Open Access | Times Cited: 47
Deep lexical hypothesis: Identifying personality structure in natural language.
Andrew D. Cutler, David Condon
Journal of Personality and Social Psychology (2022) Vol. 125, Iss. 1, pp. 173-197
Open Access | Times Cited: 44
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Kent K. Chang, Mackenzie Cramer, Sandeep Soni, et al.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (2023), pp. 7312-7327
Open Access | Times Cited: 40
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus
Julien Abadji, Pedro Ortiz Suarez, Laurent Romary, et al.
arXiv (Cornell University) (2022)
Open Access | Times Cited: 39
Data Contamination: From Memorization to Exploitation
Inbal Magar, Roy Schwartz
(2022), pp. 157-165
Open Access | Times Cited: 39
Language Model Behavior: A Comprehensive Survey
Tyler A. Chang, Benjamin Bergen
Computational Linguistics (2023) Vol. 50, Iss. 1, pp. 293-350
Open Access | Times Cited: 39
Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation
Ricardo Rei, Elena Voita, André F. T. Martins
(2023)
Open Access | Times Cited: 38
Copyright Safety for Generative AI
Matthew Sag
SSRN Electronic Journal (2023)
Closed Access | Times Cited: 29
medBERT.de: A comprehensive German BERT model for the medical domain
Keno K. Bressem, Jens-Michalis Papaioannou, Paul Grundmann, et al.
Expert Systems with Applications (2023) Vol. 237, Article 121598
Open Access | Times Cited: 28
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, et al.
(2023)
Open Access | Times Cited: 26
Invited Paper: VerilogEval: Evaluating Large Language Models for Verilog Code Generation
Mingjie Liu, Nathaniel Pinckney, Brucek Khailany, et al.
2023 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (2023), pp. 1-8
Closed Access | Times Cited: 26
How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN
R. Thomas McCoy, Paul Smolensky, Tal Linzen, et al.
Transactions of the Association for Computational Linguistics (2023) Vol. 11, pp. 652-670
Open Access | Times Cited: 25