StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors
📰 ArXiv cs.AI
StealthRL uses reinforcement learning to generate paraphrases that evade AI-text detectors while preserving semantics
Action Steps
- Train a paraphrase policy using Group Relative Policy Optimization (GRPO) with LoRA adapters
- Optimize the policy against a multi-detector ensemble to evade detection
- Use a large language model like Qwen3-4B as the base model for paraphrasing
- Evaluate the robustness of AI-text detectors under realistic adversarial conditions
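The core of the training loop above can be sketched in a few lines. This is a hedged, illustrative sketch only: the detector scores, the reward shape (detector evasion weighted by semantic similarity), and the group-normalization step are assumptions about how a GRPO-style setup would look, not the paper's exact formulation.

```python
# Illustrative sketch of GRPO's group-relative advantage computation with a
# toy multi-detector evasion reward. All numbers and the reward shape are
# assumptions for demonstration, not StealthRL's actual implementation.
from statistics import mean, pstdev

def evasion_reward(detector_scores, semantic_sim):
    """Reward a paraphrase for evading ALL detectors in the ensemble while
    preserving meaning. detector_scores: per-detector P(AI-written) in [0, 1]
    (the worst-case detector counts); semantic_sim: similarity in [0, 1]."""
    return (1.0 - max(detector_scores)) * semantic_sim

def group_relative_advantages(rewards):
    """GRPO normalizes rewards within a group of sampled completions,
    using the group mean/std in place of a learned value baseline."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Four paraphrase candidates scored by a 3-detector ensemble (toy numbers).
candidates = [
    ([0.9, 0.8, 0.7], 0.95),  # fluent but easily detected
    ([0.2, 0.3, 0.1], 0.90),  # evades well, meaning preserved
    ([0.1, 0.1, 0.1], 0.40),  # evades but drifts semantically
    ([0.5, 0.4, 0.6], 0.85),  # middling on both axes
]
rewards = [evasion_reward(d, s) for d, s in candidates]
advantages = group_relative_advantages(rewards)
```

In a full pipeline these advantages would weight the policy-gradient update of the LoRA adapter on the base paraphraser (e.g. a Qwen3-4B model); only the adapter weights are trained, keeping the update cheap.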
Who Needs to Know This
AI engineers and researchers can use StealthRL to stress-test detector robustness, while product managers and security teams can use its findings to gauge how much to trust AI-text detectors and to harden them against paraphrase attacks
Key Insight
💡 Reinforcement learning can be used to generate paraphrases that preserve semantics while evading detection by AI-text detectors
Share This
🤖 StealthRL: a reinforcement learning framework for generating paraphrases that evade AI-text detectors #AI #AdversarialAttacks
DeepCamp AI