Can AI-Generated Text be Reliably Detected?

March 17, 2023

Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi

Computer Science

Computation and Language

Artificial Intelligence

Machine Learning

In this paper, both empirically and theoretically, we show that several AI-text detectors are not reliable in practical scenarios. Empirically, we show that paraphrasing attacks, where a light paraphraser is applied on top of a large language model (LLM), can break a whole range of detectors, including ones using watermarking schemes as well as neural network-based detectors and zero-shot classifiers. Our experiments demonstrate that retrieval-based detectors, designed to withstand paraphrasing attacks, are still vulnerable to recursive paraphrasing. We then provide a theoretical impossibility result indicating that as language models become more sophisticated and better at emulating human text, the performance of even the best-possible detector decreases. For a sufficiently advanced language model seeking to imitate human text, even the best-possible detector may only perform marginally better than a random classifier. Our result is general enough to capture specific scenarios such as particular writing styles, clever prompt design, or text paraphrasing. We also extend the impossibility result to include the case where pseudorandom number generators are used for AI-text generation instead of true randomness. We show that the same result holds with a negligible correction term for all polynomial-time computable detectors. Finally, we show that even LLMs protected by watermarking schemes can be vulnerable to spoofing attacks, in which adversarial humans infer hidden LLM text signatures and add them to human-generated text so that it is detected as LLM-generated, potentially causing reputational damage to the LLMs' developers. We believe these results can open an honest conversation in the community regarding the ethical and reliable use of AI-generated text.
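
The impossibility result can be made concrete with a small numerical sketch. The abstract does not state the bound itself; the form used here, AUROC <= 1/2 + TV - TV^2/2 with TV the total variation distance between the model's and humans' text distributions, is taken from the full paper and should be checked against it. The function names and the toy two-outcome distributions are assumptions made for illustration only.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions
    given as equal-length sequences of probabilities."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def auroc_upper_bound(tv):
    """Upper bound on any detector's AUROC for a given total variation
    distance tv in [0, 1] (bound assumed from the full paper)."""
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must lie in [0, 1]")
    return 0.5 + tv - tv ** 2 / 2


if __name__ == "__main__":
    # Toy stand-ins for the human and model text distributions.
    human = [0.5, 0.5]
    model = [0.6, 0.4]
    tv = total_variation(human, model)
    print(f"TV = {tv:.3f}, AUROC bound = {auroc_upper_bound(tv):.3f}")
```

As the two distributions approach each other, the distance goes to zero and the bound falls to 0.5, the AUROC of a random classifier; at a distance of 1 the bound is 1 and places no restriction on the detector.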

Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey

October 23, 2023

96% Match

Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, ... , Amrit Singh Bedi

Computation and Language

Artificial Intelligence

Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses. However, despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs such as spreading misinformation, generating fake news, plagiarism in academia, and contaminating the web. To address these concerns, a consensus among the research commun...


Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack

April 2, 2024

95% Match

Ying Zhou, Ben He, Le Sun

Computation and Language

Cryptography and Security

Machine Learning

With the development of large language models (LLMs), detecting whether text is generated by a machine becomes increasingly challenging in the face of malicious use cases like the spread of false information, protection of intellectual property, and prevention of academic plagiarism. While well-trained text detectors have demonstrated promising performance on unseen test data, recent research suggests that these detectors have vulnerabilities when dealing with adversarial att...


Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods

June 21, 2024

95% Match

Kathleen C. Fraser, Hillary Dawkins, Svetlana Kiritchenko

Computation and Language

Computers and Society

Large language models (LLMs) have advanced to a point that even humans have difficulty discerning whether a text was generated by another human, or by a computer. However, knowing whether a text was produced by human or artificial intelligence (AI) is important to determining its trustworthiness, and has applications in many domains including detecting fraud and academic dishonesty, as well as combating the spread of misinformation and political propaganda. The task of AI-gen...


A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions

October 23, 2023

94% Match

Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, ... , Lidia S. Chao

Computation and Language

Artificial Intelligence

The powerful ability to understand, follow, and generate complex language emerging from large language models (LLMs) makes LLM-generated text flood many areas of our daily lives at an incredible speed and is widely accepted by humans. As LLMs continue to expand, there is an imperative need to develop detectors that can detect LLM-generated text. This is crucial to mitigate potential misuse of LLMs and safeguard realms like artistic expression and social networks from harmful ...


Decoding the AI Pen: Techniques and Challenges in Detecting AI-Generated Text

March 9, 2024

94% Match

Sara Abdali, Richard Anarfi, ... , Jia He

Computation and Language

Artificial Intelligence

Machine Learning

Large Language Models (LLMs) have revolutionized the field of Natural Language Generation (NLG) by demonstrating an impressive ability to generate human-like text. However, their widespread usage introduces challenges that necessitate thoughtful examination, ethical scrutiny, and responsible practices. In this study, we delve into these challenges, explore existing strategies for mitigating them, with a particular emphasis on identifying AI-generated text as the ultimate solu...


The Science of Detecting LLM-Generated Texts

February 4, 2023

94% Match

Ruixiang Tang, Yu-Neng Chuang, Xia Hu

Computation and Language

Artificial Intelligence

The emergence of large language models (LLMs) has resulted in the production of LLM-generated texts that are highly sophisticated and almost indistinguishable from texts written by humans. However, this has also sparked concerns about the potential misuse of such texts, such as spreading misinformation and causing disruptions in the education system. Although many detection approaches have been proposed, a comprehensive understanding of the achievements and challenges is still...


Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing

February 21, 2025

94% Match

Shoumik Saha, Soheil Feizi

Computation and Language

Artificial Intelligence

Human-Computer Interaction

Machine Learning

The growing use of large language models (LLMs) for text generation has led to widespread concerns about AI-generated content detection. However, an overlooked challenge is AI-polished text, where human-written content undergoes subtle refinements using AI tools. This raises a critical question: should minimally polished text be classified as AI-generated? Misclassification can lead to false plagiarism accusations and misleading claims about AI prevalence in online content. I...


Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

March 23, 2023

94% Match

Kalpesh Krishna, Yixiao Song, Marzena Karpinska, ... , Mohit Iyyer

Computation and Language

Cryptography and Security

Machine Learning

The rise in malicious usage of large language models, such as fake content creation and academic plagiarism, has motivated the development of approaches that identify AI-generated text, including those based on watermarking or outlier detection. However, the robustness of these detection algorithms to paraphrases of AI-generated text remains unclear. To stress test these detectors, we build an 11B parameter paraphrase generation model (DIPPER) that can paraphrase paragraphs, c...


On the Reliability of Watermarks for Large Language Models

June 7, 2023

94% Match

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, ... , Tom Goldstein

Machine Learning

Computation and Language

Cryptography and Security

As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is a simple and effective strategy for mitigating such harms by enabling the detection and documentation of LLM-generated text. Yet a crucial question remains: How reliable is watermarking in realistic settings in the wild? There, watermarked text may be modified to suit a user's needs, or entirely rewritten to avoid dete...


RADAR: Robust AI-Text Detection via Adversarial Learning

July 7, 2023

93% Match

Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho

Computation and Language

Artificial Intelligence

Machine Learning

Recent advances in large language models (LLMs) and the intensifying popularity of ChatGPT-like applications have blurred the boundary of high-quality text generation between humans and machines. However, in addition to the anticipated revolutionary changes to our technology and society, the difficulty of distinguishing LLM-generated texts (AI-text) from human-generated texts poses new challenges of misuse and fairness, such as fake content generation, plagiarism, and false a...
