Contrast-Enhanced Gating in GRUs for Robust Low-Data Sequence Learning

ArXiv cs.AI

arXiv:2402.09034v3 Announce Type: replace-cross

Abstract: Activation functions govern how recurrent networks regulate and transmit information across temporal dependencies. Despite advances in sequence modelling, gated recurrent units (GRUs) still depend on the standard sigmoid and tanh nonlinearities, which can produce weak gate separation and unstable learning, particularly when training data are limited. We introduce squared sigmoid-tanh (SST), a parameter-free activation that squares the gating nonlinearities.
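The feed cuts the abstract off before SST is fully specified, so the following is only a minimal sketch of what a contrast-enhanced GRU gate might look like, assuming that SST squares the sigmoid gate outputs and applies a sign-preserving square to the tanh candidate. `SSTGRUCell`, `sq_sigmoid`, and `sq_tanh` are illustrative names, not the paper's API; consult the full paper for the exact formulation.

```python
# Sketch of an SST-gated GRU cell under the stated assumptions.
import torch
import torch.nn as nn


def sq_sigmoid(x: torch.Tensor) -> torch.Tensor:
    """Squared sigmoid: suppresses mid-range gate values, sharpening gate contrast."""
    s = torch.sigmoid(x)
    return s * s


def sq_tanh(x: torch.Tensor) -> torch.Tensor:
    """Sign-preserving squared tanh: tanh(x) * |tanh(x)| stays in (-1, 1)."""
    t = torch.tanh(x)
    return t * t.abs()


class SSTGRUCell(nn.Module):
    """GRU cell with squared gate nonlinearities; parameter count is unchanged."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.x2h = nn.Linear(input_size, 3 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 3 * hidden_size)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        gx_r, gx_z, gx_n = self.x2h(x).chunk(3, dim=-1)
        gh_r, gh_z, gh_n = self.h2h(h).chunk(3, dim=-1)
        r = sq_sigmoid(gx_r + gh_r)       # reset gate
        z = sq_sigmoid(gx_z + gh_z)       # update gate
        n = sq_tanh(gx_n + r * gh_n)      # candidate state
        return (1.0 - z) * n + z * h      # blended new hidden state


# Usage: one recurrent step on a toy batch.
cell = SSTGRUCell(input_size=8, hidden_size=16)
h = torch.zeros(4, 16)
x = torch.randn(4, 8)
h = cell(x, h)
```

Because squaring is applied to the existing nonlinearities rather than introducing new weights, the cell remains parameter-free relative to a standard GRU, consistent with the abstract's description of SST.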

Published 29 Apr 2026