How does one find the best learning rate? The learning rate is often considered the single most important hyperparameter
to tune in order to get a good final result. Leslie Smith pioneered the Learning Rate Range Finder,
and after FastAI adapted it, it has become extremely useful!
https://arxiv.org/pdf/1506.01186.pdf [Cyclical Learning Rates for Training Neural Networks (2015)]
https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html [How Do You Find A Good Learning Rate (2018)]
https://www.youtube.com/watch?v=bR7z2MA0p-o [YouTube video of Leslie Smith presenting his findings]
Essentially, start from a small learning rate (1e-7), then increase it exponentially up to a large one (10 or 100)
over 200 iterations. Each iteration consumes one mini-batch. Record the smoothed loss at every iteration,
then inspect the graph of smoothed loss against learning rate!
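
The procedure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Leslie Smith's or FastAI's actual implementation: the names `exponential_lrs`, `lr_range_test`, `make_toy_train_step` and the `train_step(lr)` callback are hypothetical, the smoothing is assumed to be a bias-corrected exponential moving average with `beta=0.9`, and the early stop once the loss exceeds 4x its best value is an added convenience. A one-weight toy regression stands in for a real model.

```python
import math
import random


def exponential_lrs(lr_min=1e-7, lr_max=10.0, num_iters=200):
    """Learning rates spaced evenly on a log scale (needs num_iters >= 2)."""
    ratio = lr_max / lr_min
    return [lr_min * ratio ** (i / (num_iters - 1)) for i in range(num_iters)]


def lr_range_test(train_step, lr_min=1e-7, lr_max=10.0, num_iters=200,
                  beta=0.9, diverge_factor=4.0):
    """Consume one mini-batch per learning rate and record the smoothed loss.

    train_step(lr) is a hypothetical callback: it should take one optimizer
    step on one mini-batch at the given learning rate and return that loss.
    """
    lrs, losses = [], []
    avg, best = 0.0, math.inf
    schedule = exponential_lrs(lr_min, lr_max, num_iters)
    for i, lr in enumerate(schedule, start=1):
        loss = train_step(lr)
        # Assumed smoothing: exponential moving average with bias correction.
        avg = beta * avg + (1.0 - beta) * loss
        smoothed = avg / (1.0 - beta ** i)
        # Assumed stopping rule: quit once the loss has clearly blown up.
        if not math.isfinite(smoothed) or smoothed > diverge_factor * best:
            break
        best = min(best, smoothed)
        lrs.append(lr)
        losses.append(smoothed)
    return lrs, losses


def make_toy_train_step(seed=0, batch_size=64):
    """Toy stand-in for a real model: fit y = 3x with a single weight w."""
    rng = random.Random(seed)
    state = {"w": 0.0}

    def train_step(lr):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(batch_size)]
        errs = [state["w"] * x - 3.0 * x for x in xs]
        # Mean squared error on this mini-batch and its gradient w.r.t. w.
        loss = sum(e * e for e in errs) / batch_size
        grad = sum(2.0 * e * x for e, x in zip(errs, xs)) / batch_size
        state["w"] -= lr * grad  # plain SGD update
        return loss

    return train_step


lrs, losses = lr_range_test(make_toy_train_step())
```

To inspect the graph, plot `losses` against `lrs` with a logarithmic x-axis using whichever plotting library you prefer. With a real model, `train_step` would wrap one forward pass, backward pass, and optimizer step, and the model weights should be restored afterwards since the test itself changes them.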
Copyright Daniel Han 2024. Check out Unsloth!