Optimizers are an essential tool for anyone working in machine learning.
We all know that an optimizer determines how a model's weights are updated during gradient descent, and therefore how quickly the loss converges. Choosing the right optimizer can boost both the performance and the efficiency of model training.
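To make this concrete, here is a minimal sketch of where the optimizer plugs into training in Keras (the tiny model is purely illustrative); plain SGD, for instance, applies the textbook update w ← w − learning_rate · gradient at every step:

```python
import keras

# A tiny illustrative model; the architecture itself doesn't matter here.
model = keras.Sequential([keras.layers.Dense(10, activation="softmax")])

# The optimizer passed to compile() controls how each gradient step
# updates the weights. Plain SGD applies: w <- w - learning_rate * grad.
model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.01),
    loss="sparse_categorical_crossentropy",
)
```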
Besides classic papers, many books explain the principles behind optimizers in simple terms.
However, I recently found that the behavior of Keras 3 optimizers doesn't quite match the mathematical algorithms described in these books, which made me a bit anxious: had I misunderstood something, or had recent updates to Keras changed how the optimizers work?
So I reviewed the source code of several common optimizers in Keras 3 and revisited their use cases. In this article I'll share what I learned, to save you time and help you master Keras 3 optimizers more quickly.
If you're not familiar with the latest changes in Keras 3, here's a quick rundown: Keras 3 supports TensorFlow, PyTorch, and JAX as interchangeable backends, so you can work with these cutting-edge deep learning frameworks through a single Keras API.
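For example, you choose the backend through the `KERAS_BACKEND` environment variable, which must be set before Keras is imported (TensorFlow is the default):

```python
import os

# Must be set before keras is imported; "tensorflow" is the default,
# and "torch" works the same way.
os.environ["KERAS_BACKEND"] = "jax"

import keras

print(keras.backend.backend())  # prints "jax"
```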