This is part 3 of my new multi-part series 🐍 Towards Mamba State Space Models for Images, Videos and Time Series.
Mamba, the model said to replace the mighty Transformer, has come a long way from the initial idea of using state space models (SSMs) in deep learning.
Mamba adds selectivity to state space models, which results in Transformer-like performance while maintaining the sub-quadratic complexity of SSMs. Its efficient selective scan is 40x faster than a standard implementation, and Mamba can achieve 5x higher throughput than a Transformer.
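To give a first intuition of what "selectivity" means before the deep dive, here is a minimal, purely illustrative sketch: in a classic SSM the matrices B and C are fixed for every time step, whereas a selective SSM derives them from the current input, so the model can decide per step what to write into and read from its hidden state. The names `W_B` and `W_C` and the per-channel scalar-input setup are simplifying assumptions for illustration; Mamba also makes the discretization step input-dependent and runs the recurrence with a hardware-aware parallel scan rather than a Python loop.

```python
import numpy as np

def selective_scan(x, A, W_B, W_C):
    """Sequential scan over a 1-D input sequence for a single channel.

    x   : (T,)  input sequence
    A   : (N,)  diagonal state transition (kept fixed here for simplicity)
    W_B : (N,)  weights producing the input-dependent B_t = W_B * x_t
    W_C : (N,)  weights producing the input-dependent C_t = W_C * x_t
    """
    h = np.zeros_like(A)
    ys = []
    for x_t in x:
        B_t = W_B * x_t        # selectivity: B depends on the current input
        C_t = W_C * x_t        # selectivity: C depends on the current input
        h = A * h + B_t * x_t  # h_t = A * h_{t-1} + B_t * x_t
        ys.append(C_t @ h)     # y_t = C_t . h_t
    return np.array(ys)

# Example: run the scan on a short random sequence with state size N = 8
y = selective_scan(np.random.randn(16), A=np.full(8, 0.9),
                   W_B=np.random.randn(8), W_C=np.random.randn(8))
```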
Join me on this deep dive into Mamba, where we will discover how selectivity addresses the limitations of previous SSMs, how Mamba overcomes the new obstacles those changes introduce, and how we can incorporate Mamba into a modern deep learning architecture.