Weight space Detection of Backdoors in LoRA Adapters
📰 ArXiv cs.AI
arXiv:2602.15195v3 Abstract: LoRA adapters let users fine-tune large language models (LLMs) efficiently. However, LoRA adapters are shared through open repositories like Hugging Face Hub \citep{huggingface_hub_docs}, making them vulnerable to backdoor attacks. Current detection methods require running the model with test input data -- making them impractical for screening thousands of adapters where the trigger for backdoor behavior is unknown. We detect poisoned ada