Continuing from the post below, the topic of this post is "Model Generalization in Machine Learning".
What is the meaning of Model Generalization?
Once you have selected the model that is statistically the best during the training phase, the next step is to apply it to the validation data set and the test data set.
We often encounter two types of error:
1. Bias
2. Variance
As you increase the size of the training data set, gradient descent has to find parameters that fit all of the points on average, so a larger training set can increase the training error.
Meanwhile, the cross-validation error decreases as the training data set grows.
This is because the model has now seen a large data set, which makes it easier to generalize mathematically.
The model may now be mathematically correct, yet still not be usable for the business in real time if its error stays above the desired performance.
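The behaviour described above can be seen in a small learning-curve experiment. This is only a minimal sketch under stated assumptions: the data is synthetic, the model is a straight line fitted with closed-form least squares (`np.linalg.lstsq`) rather than gradient descent, and the helper names `fit_line`, `mse` and `learning_curve` are hypothetical, chosen for illustration.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y ~ w*x + b; returns the array [w, b]."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def mse(coef, x, y):
    """Mean squared error of the line described by coef on (x, y)."""
    w, b = coef
    return float(np.mean((w * x + b - y) ** 2))


def learning_curve(x_train, y_train, x_val, y_val, sizes):
    """Fit on the first m training points for each m in sizes and
    record the training error and the cross-validation error."""
    train_err, val_err = [], []
    for m in sizes:
        coef = fit_line(x_train[:m], y_train[:m])
        train_err.append(mse(coef, x_train[:m], y_train[:m]))
        val_err.append(mse(coef, x_val, y_val))
    return train_err, val_err


# Synthetic data (an assumption for this sketch): a curved target with noise,
# so a straight line cannot fit it perfectly.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=400)
y = 0.5 * x**2 + x + rng.normal(0, 1, size=400)

x_train, y_train = x[:200], y[:200]
x_val, y_val = x[200:], y[200:]

sizes = [2, 5, 10, 20, 50, 100, 200]
train_err, val_err = learning_curve(x_train, y_train, x_val, y_val, sizes)

for m, tr, va in zip(sizes, train_err, val_err):
    print(f"m={m:4d}  training error={tr:8.3f}  CV error={va:8.3f}")
```

With only two training points a line passes through both, so the training error starts at essentially zero and can only rise as more points are added. The cross-validation error is measured on data the model never saw during fitting.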
High Bias Issue: Reject the Model
1. With a small training data set, the gap between the training error and the cross-validation error is large.
2. When you increase the training sample, the gap between the CV error and the training error decreases, but both errors remain high, so performance stays below the desired level = Reject the Model
Alternatively, you can encounter high variance when:
1. With a small training data set, the gap between the training error and the cross-validation error is large.
2. When you increase the training sample, the gap between the CV error and the training error decreases and the error moves close to the desired performance = Accept the Model
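The two cases above can be turned into a rough decision rule applied to the errors measured at the largest training size. This is a sketch of a common heuristic, not a definitive test: the function name `diagnose` and the single threshold `desired_err` are assumptions made for illustration.

```python
def diagnose(train_err, cv_err, desired_err):
    """Rough bias/variance diagnosis from errors at the largest training size.

    train_err   -- training error
    cv_err      -- cross-validation error
    desired_err -- highest error that still meets the desired performance
    """
    if train_err > desired_err:
        # Even the training data is not fitted well enough: more data
        # will not bring the error down to the desired level.
        return "high bias"
    if cv_err > desired_err:
        # Training error is fine but the model does not generalize yet:
        # more training data may close the gap.
        return "high variance"
    return "acceptable"


print(diagnose(train_err=2.8, cv_err=3.0, desired_err=1.0))
print(diagnose(train_err=0.2, cv_err=1.5, desired_err=1.0))
print(diagnose(train_err=0.8, cv_err=0.9, desired_err=1.0))
```

In practice the whole learning curve is worth looking at, not just its last point, since the trend shows whether the gap is still closing.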
Model Generalization Error in Regression = Bias² + Variance (plus irreducible noise), which can be fixed in the following ways