FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

📰 ArXiv cs.AI

arXiv:2606.09551v1 (announce type: cross)

Abstract: Two-server secure inference allows a client to query a hosted large language model (LLM) without revealing prompts or embeddings. Recent GPU systems based on function secret sharing (FSS) make linear layers efficient, but fixed-point nonlinearities and helper operations remain a bottleneck because each operator is typically implemented as a bespoke protocol with its own comparisons, wrap-around corrections, and preprocessing material. We present

Published 9 Jun 2026
