Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits

📰 arXiv cs.AI

Learn about provable anytime ensemble sampling algorithms for nonlinear contextual bandits, with regret bounds for the generalized linear and neural bandit settings.

advanced Published 12 May 2026

Action Steps

Implement Generalized Linear Ensemble Sampling (GLM-ES) for generalized linear bandits, maintaining multiple maximum likelihood estimators of the reward model parameters
Develop Neural Ensemble Sampling (Neural-ES) for neural contextual bandits, using an ensemble of neural network reward models
Analyze regret bounds for both GLM-ES and Neural-ES algorithms to evaluate their performance
Compare the performance of GLM-ES and Neural-ES with existing ensemble sampling methods
Apply the proposed algorithms to real-world problems in nonlinear contextual bandits to test their efficacy
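
The first step above can be sketched in a few lines of NumPy. This is a minimal illustration of ensemble sampling on a logistic bandit, not the paper's GLM-ES: the function names (`fit_member`, `run_glm_es`), the Gaussian reward perturbation, the random prior means, and all hyperparameters are assumptions chosen for the example. Each round, one ensemble member is drawn uniformly at random and the arm that looks best under that member is played; every member is then refit by regularized maximum likelihood on its own perturbed copy of the rewards.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_member(X, y, theta0, prior_mean, lam=1.0, steps=25, lr=0.5):
    """Regularized logistic MLE by gradient descent, warm-started at theta0.

    The regularizer pulls theta toward this member's own prior_mean
    (an assumed source of ensemble diversity, not taken from the paper).
    """
    theta = theta0.copy()
    n = max(len(y), 1)
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ theta) - y) + lam * (theta - prior_mean)
        theta -= lr * grad / n
    return theta


def run_glm_es(T=200, d=5, K=10, m=8, noise=0.5, seed=0):
    """Run a toy ensemble sampling loop; returns cumulative regret and the ensemble."""
    rng = np.random.default_rng(seed)
    theta_star = rng.normal(size=d)
    theta_star /= np.linalg.norm(theta_star)        # hidden true parameter

    prior_means = rng.normal(scale=0.5, size=(m, d))  # one random prior per member
    thetas = prior_means.copy()
    X_hist = np.zeros((0, d))                       # played contexts
    y_hist = np.zeros((m, 0))                       # per-member perturbed rewards
    regret = np.zeros(T)

    for t in range(T):
        arms = rng.normal(size=(K, d))
        arms /= np.linalg.norm(arms, axis=1, keepdims=True)  # unit-norm contexts

        j = rng.integers(m)                         # sample one ensemble member
        a = int(np.argmax(arms @ thetas[j]))        # act greedily under that member

        p = sigmoid(arms @ theta_star)              # true success probabilities
        r = float(rng.random() < p[a])              # Bernoulli reward
        regret[t] = p.max() - p[a]

        X_hist = np.vstack([X_hist, arms[a]])
        # Assumption: each member sees the reward plus its own Gaussian noise.
        y_hist = np.hstack([y_hist, r + noise * rng.normal(size=(m, 1))])

        for i in range(m):
            thetas[i] = fit_member(X_hist, y_hist[i], thetas[i], prior_means[i])

    return np.cumsum(regret), thetas


if __name__ == "__main__":
    cum_regret, _ = run_glm_es()
    print("cumulative regret after", len(cum_regret), "rounds:", cum_regret[-1])
```

Refitting every member on the full history each round keeps the sketch short but costs time that grows with the number of rounds; an incremental update would be the natural next refinement. The returned cumulative regret curve is what the later steps (analysis and comparison against other methods) would plot.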

Who Needs to Know This

Researchers and engineers working on contextual bandits and ensemble sampling can use this paper's unified framework and regret analysis to inform their own algorithms.

Key Insight

💡 A unified, anytime ensemble sampling framework comes with provable regret bounds for both generalized linear and neural contextual bandits

Full Article


Abstract:
arXiv:2510.10730v2

We provide a unified algorithmic framework for ensemble sampling in nonlinear contextual bandits and develop corresponding regret bounds for two most common nonlinear contextual bandit settings: Generalized Linear Ensemble Sampling (GLM-ES) for generalized linear bandits and Neural Ensemble Sampling (Neural-ES) for neural contextual bandits. Both methods maintain multiple estimators for the reward model parameters via maximum likelihood e
