What Does Hyperparameter Tuning Do?

Before we discuss the various tuning methods, let's quickly revisit the purpose of splitting our data into training, validation, and test sets. The ultimate goal of any machine learning model is to learn from examples in such a way that it can generalize to new instances it has not yet seen. To evaluate that generalization ability, we train on only a portion of the dataset and hold the rest out for evaluation. [1]
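The split described above can be sketched in a few lines of plain Python. The function name and the 70/15/15 proportions are illustrative assumptions, not part of the original article:

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the examples, then carve off held-out validation and test sets.

    The model is fit only on the training portion; the validation set guides
    hyperparameter choices, and the test set measures generalization once,
    at the very end.
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# With 100 examples this yields 70 train / 15 validation / 15 test.
train, val, test = train_val_test_split(list(range(100)))
```

In practice a library helper (e.g. scikit-learn's `train_test_split`) does the same job; the point is only that every example lands in exactly one of the three sets.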
Let’s imagine we create a population of N machine learning models, each with some predefined hyperparameters. Once we have calculated the accuracy of each model, we keep the best half and discard the rest. To bring the population back up to N, we create offspring with hyperparameters similar to those of the survivors. We then calculate the accuracy of each model again and repeat the cycle for an arbitrary number of generations, so that only the very best models survive to the end of the process. [2]
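The keep-the-best-half-and-breed loop above can be sketched as a tiny evolutionary search. Everything here is a hedged illustration: `fitness(params)` stands in for "train a model with these hyperparameters and return its validation accuracy", and the multiplicative mutation rule is one arbitrary way to make offspring "similar" to their parents:

```python
import random

def mutate(rng, params):
    # Perturb one hyperparameter slightly to create a similar offspring.
    child = dict(params)
    key = rng.choice(list(child))
    child[key] *= rng.uniform(0.8, 1.25)
    return child

def evolve(fitness, sample_params, population_size=10, generations=5, seed=0):
    """Evolutionary hyperparameter search sketch.

    `fitness(params)` is assumed to score a configuration (e.g. validation
    accuracy) and `sample_params(rng)` draws a random initial configuration.
    Each generation keeps the best half and refills the population with
    mutated copies ("offspring") of the survivors.
    """
    rng = random.Random(seed)
    population = [sample_params(rng) for _ in range(population_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        survivors = scored[:population_size // 2]
        offspring = [mutate(rng, rng.choice(survivors))
                     for _ in range(population_size - len(survivors))]
        population = survivors + offspring
    return max(population, key=fitness)

# Toy objective (assumption for the demo): pretend the best learning rate is 0.1.
toy_fitness = lambda p: -(p["lr"] - 0.1) ** 2
best = evolve(toy_fitness, lambda rng: {"lr": rng.uniform(0.001, 1.0)})
```

Because the best survivor is always carried forward, the top score can only improve (or stay flat) from one generation to the next.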
Manual search is iterative: each time you pick a value for a hyperparameter, you train the model and check its performance, which quickly becomes exhausting. Instead of repeating this by hand, you can give the model multiple candidate values for each hyperparameter at once and let the search decide which combination is best; trying every combination exhaustively is grid search. With random search, by contrast, each value is drawn at random from the candidate ranges and the search looks for a good fit to the dataset. There is always a chance it will miss the single most optimal combination, but random search is faster and more efficient and usually finds near-optimal results, so in practice it is often a win/win. [3]
The O’Reilly chapter also mentions that there are smarter tuning options. Unlike the “dumb” alternatives of grid search and random search, smart hyperparameter tuning is much less parallelizable. Instead of generating all candidate points up front and evaluating them in parallel, smart tuning evaluates a small number of settings, uses those results to decide which hyperparameter settings look promising, and then chooses where to sample next. The process is sequential and inherently iterative, so it cannot be parallelized easily, but the payoff is that it needs fewer total evaluations to find good settings, saving computation. If wall-clock time is your primary concern and you have the funds to run multiple machines, sticking with random search is recommended. [4]
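The sequential flavor of smart tuning can be illustrated with a deliberately crude stand-in (this is not real Bayesian optimization, just a sketch of the evaluate-then-decide-where-to-sample loop for a single hyperparameter; all names and constants are assumptions):

```python
import random

def sequential_search(score, low, high, n_init=5, n_steps=15, seed=0):
    """Crude sequential tuner for one hyperparameter in [low, high].

    Stand-in for smarter methods: evaluate a few random points first, then
    keep sampling near the best value found so far, shrinking the search
    radius. Each new point depends on all previous results, which is exactly
    why this loop cannot be parallelized the way grid/random search can.
    """
    rng = random.Random(seed)
    history = [(x, score(x))
               for x in (rng.uniform(low, high) for _ in range(n_init))]
    radius = (high - low) / 4
    for _ in range(n_steps):
        best_x, _ = max(history, key=lambda t: t[1])
        # Propose the next point near the current best, clipped to the range.
        x = min(high, max(low, best_x + rng.uniform(-radius, radius)))
        history.append((x, score(x)))
        radius *= 0.8  # gradually narrow in on promising regions
    return max(history, key=lambda t: t[1])

# Toy objective (assumption): the "accuracy" peaks at 0.3.
best_x, best_score = sequential_search(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

Total cost here is only `n_init + n_steps` evaluations, which is the trade the text describes: fewer model trainings in exchange for a loop that must run one step at a time.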

Article references

  1. https://www.jeremyjordan.me/hyperparameter-tuning/
  2. https://towardsdatascience.com/hyperparameters-optimization-526348bb8e2d
  3. https://www.mygreatlearning.com/blog/hyperparameter-tuning-explained/
  4. https://www.oreilly.com/library/view/evaluating-machine-learning/9781492048756/ch04.html
Written by Mehreen Alberts

I'm a creative writer who has found the love of writing once more. I've been writing since I was five years old and it's what I want to do for the rest of my life. From topics that are close to my heart to everything else imaginable!
