Model Generalization in Machine Learning!

Continuing from the post below, the topic of this post is "Model Generalization in Machine Learning".

What is the meaning of Model Generalization?

Once you have selected the model that is statistically the best during the training phase, the next step is to apply it to the validation data set and the test data set. Model generalization is how well the model performs on this unseen data.

We often encounter 2 types of error:
1. Bias
2. Variance

As you increase the size of the training data set, the gradient descent algorithm has to find parameters that fit all of the examples on average, so the training error can increase as the training set grows.



Meanwhile, the cross-validation error decreases as the training data set grows.
This is because, with a larger data set, the fitted model generalizes better to data it has not seen.
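
The two trends above can be sketched with a small simulation. This is only an illustration under stated assumptions: the data is synthetic (a noisy straight line), the fit uses closed-form least squares instead of gradient descent, a single held-out validation set stands in for cross-validation, and names such as fit_line and make_data are made up for this sketch.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model on the given data."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(rng, n, noise_std=0.5):
    """Synthetic data (an assumption of this sketch): y = 2x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = make_data(rng, 200)
val_x, val_y = make_data(rng, 200)

# Fit on growing prefixes of the training data and record both errors.
sizes = [2, 5, 20, 50, 200]
train_errors, cv_errors = [], []
for m in sizes:
    model = fit_line(train_x[:m], train_y[:m])
    train_errors.append(mse(model, train_x[:m], train_y[:m]))
    cv_errors.append(mse(model, val_x, val_y))

for m, tr, cv in zip(sizes, train_errors, cv_errors):
    print(f"m={m:4d}  training error={tr:.4f}  validation error={cv:.4f}")
```

With only 2 training points the line passes through both of them, so the training error is essentially zero. As more points are added the line can no longer fit every point, so the training error rises towards the noise level, while the validation error settles near that same level.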

At this point the model may be mathematically correct, yet still not be applicable to the business in real time.

High Bias Issue: Reject the Model
1. With a small training data set, the gap between training error and cross-validation (CV) error is high.
2. When you increase the training sample, the gap between CV error and training error decreases, but both errors remain high, so the model stays short of the desired performance = Reject the Model






Alternatively, you can encounter a High Variance issue when:
1. With a small training data set, the gap between training error and cross-validation (CV) error is high.
2. When you increase the training sample, the gap between CV error and training error decreases and the error comes close to the desired performance, so adding training data helps = Accept the Model
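
The two cases above can be turned into a rough rule of thumb. The sketch below is hypothetical: the function name diagnose, the tolerance value, and the example error curves are all assumptions made for illustration, not a standard API or measured results.

```python
def diagnose(train_errors, cv_errors, desired_error, tolerance=0.05):
    """Classify a learning curve from its errors at the largest training size.

    train_errors and cv_errors are ordered by increasing training size.
    The tolerance is an arbitrary choice for this sketch.
    """
    final_train = train_errors[-1]
    final_cv = cv_errors[-1]
    if final_train > desired_error + tolerance:
        # Even the training error misses the target, so more data will not help.
        return "high bias: reject the model"
    if final_cv > desired_error + tolerance:
        # Training error is fine, but the gap to the CV error is still open.
        return "high variance: add training data"
    return "acceptable: accept the model"

# Made-up error curves for three situations (illustrative numbers only).
high_bias = diagnose([0.30, 0.38, 0.40], [0.70, 0.48, 0.42], desired_error=0.10)
high_variance = diagnose([0.01, 0.04, 0.08], [0.60, 0.35, 0.20], desired_error=0.10)
accepted = diagnose([0.01, 0.05, 0.09], [0.60, 0.30, 0.12], desired_error=0.10)

print(high_bias)
print(high_variance)
print(accepted)
```

The check looks at the training error first because a high training error cannot be cured by more data, whereas a CV error that is still falling towards the target usually can.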








Model Generalization Error in Regression = Bias² + Variance + Irreducible Error, where the bias and variance parts can be fixed in the following ways
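
The decomposition itself can be illustrated with a small simulation. This is a sketch under assumptions: the true function, noise level, and the two models (a constant "mean" model and a straight-line model) are invented for the example, and the error is measured against the noise-free true value, so the irreducible error term does not appear.

```python
import random
from statistics import fmean, pvariance

def true_f(x):
    """Assumed ground truth for this sketch."""
    return 2.0 * x + 1.0

def sample_training_set(rng, n=20, noise_std=1.0):
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys

def predict_mean_model(xs, ys, x0):
    # Too simple on purpose: ignores x and always predicts the average y.
    return fmean(ys)

def predict_line_model(xs, ys, x0):
    # Closed-form least-squares line, evaluated at x0.
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_y + (sxy / sxx) * (x0 - mean_x)

def bias_variance(predict, x0=9.0, trials=200, seed=0):
    """Refit on many fresh training sets and decompose the error at x0."""
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(trials)]
    bias_sq = (fmean(preds) - true_f(x0)) ** 2
    variance = pvariance(preds)
    mse_to_truth = fmean([(p - true_f(x0)) ** 2 for p in preds])
    return bias_sq, variance, mse_to_truth

b_mean, v_mean, m_mean = bias_variance(predict_mean_model)
b_line, v_line, m_line = bias_variance(predict_line_model)
print(f"mean model: bias^2={b_mean:.3f} variance={v_mean:.3f} error={m_mean:.3f}")
print(f"line model: bias^2={b_line:.3f} variance={v_line:.3f} error={m_line:.3f}")
```

For each model the reported error equals bias² plus variance, and the mean model's error is dominated by its bias term because it cannot represent the slope in the data.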







