Batch Normalization

How does one find the best learning rate? It is well known that the learning rate is the single most important hyperparameter one must tune in order to get a good final result. Leslie Smith pioneered the Learning Rate Range Finder (the learning rate range test), and after fastai adapted it, it has become extremely useful!

 

https://arxiv.org/pdf/1506.01186.pdf [Cyclical Learning Rates for Training Neural Networks (2015)]

https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html [How Do You Find A Good Learning Rate (2018)]

 

https://www.youtube.com/watch?v=bR7z2MA0p-o

YouTube video of Leslie Smith presenting his findings.

 

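Essentially, start from a very small learning rate (1e-7), then increase it exponentially up to a large one (10 or 100) over about 200 iterations. Each iteration consumes one mini-batch and takes one training step at the current learning rate. Record the smoothed loss at every step, then plot it against the learning rate on a log scale and inspect the graph! A good choice is a learning rate where the loss is still falling steeply, somewhat below the point where it starts to blow up.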
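
The procedure above can be sketched in a few lines of NumPy. This is only a minimal illustration on a toy linear regression trained with plain SGD, not fastai's or Leslie Smith's actual code. The function names (lr_schedule, lr_range_test), the smoothing factor beta=0.98, the bias-corrected moving average, and the "stop once the smoothed loss exceeds 4x the best seen so far" rule are all assumptions chosen for the example.

```python
import numpy as np

def lr_schedule(lr_min=1e-7, lr_max=10.0, num_iters=200):
    """Learning rates spaced evenly on a log scale (exponential increase)."""
    steps = np.arange(num_iters) / (num_iters - 1)
    return lr_min * (lr_max / lr_min) ** steps

def lr_range_test(X, y, lr_min=1e-7, lr_max=10.0, num_iters=200,
                  batch_size=32, beta=0.98, seed=0):
    """Run a learning rate range test on a toy linear model (hypothetical helper).

    Returns the learning rates tried and the smoothed loss at each of them.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])          # model weights, started at zero
    avg_loss, best = 0.0, np.inf
    used_lrs, smoothed = [], []

    for i, lr in enumerate(lr_schedule(lr_min, lr_max, num_iters), start=1):
        # Each iteration consumes exactly one mini-batch.
        idx = rng.integers(0, len(X), size=batch_size)
        xb, yb = X[idx], y[idx]
        err = xb @ w - yb
        loss = float(np.mean(err ** 2))

        # Exponential moving average of the loss, with bias correction.
        avg_loss = beta * avg_loss + (1 - beta) * loss
        smooth = avg_loss / (1 - beta ** i)

        # Assumed stopping rule: quit once the loss has clearly diverged.
        if not np.isfinite(smooth) or smooth > 4 * best:
            break
        best = min(best, smooth)
        used_lrs.append(lr)
        smoothed.append(smooth)

        # One SGD step on the mean squared error at the current learning rate.
        grad = 2 * xb.T @ err / batch_size
        w -= lr * grad

    return np.array(used_lrs), np.array(smoothed)

# Synthetic data, made up purely for the demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, -1.0]) + 0.1 * rng.normal(size=1024)

lrs, losses = lr_range_test(X, y)
```

To inspect the graph, plot losses against lrs with a logarithmic x-axis, for example with matplotlib's plt.semilogx(lrs, losses). For a real network the same loop applies: swap the toy model and the manual gradient step for your own model and optimizer.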

Copyright Daniel Han 2024. Check out Unsloth!