Introducing Gradient Boosting
- Gradient boosting is a machine learning technique for regression and classification problems
- Gradient boosting typically produces a prediction model in the form of an ensemble of weak prediction models
- Typically, these weak models are shallow decision trees
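As a quick illustration, here is a minimal sketch using scikit-learn's `GradientBoostingRegressor`; the synthetic dataset and hyperparameter values are illustrative assumptions, not part of the notes above.

```python
# A minimal sketch of gradient boosting for regression with scikit-learn.
# The synthetic data and hyperparameter values are illustrative only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

# An ensemble of 100 shallow regression trees, combined additively
model = GradientBoostingRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))
```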
Describing Gradient Boosting for Regression
1. Choose a loss function
    - Loss values represent how far off our model's prediction is for an observation
    - For regression, a common choice is the squared error $L(y_i, \hat{y}_i) = \frac{1}{2}(y_i - \hat{y}_i)^2$
2. Compute an initial prediction $\hat{y}_i$ that is the average of the response: $\hat{y}_i = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
3. Compute the residuals $r_i = y_i - \hat{y}_i$ between each observation $y_i$ and its current prediction $\hat{y}_i$
4. Fit a regression tree to the $r_i$ values
    - Usually, these trees are shallow, but larger than a stump
5. Send each observation through the new tree
    - Then, each observation is associated with a leaf
6. Compute a value $\gamma_j$ that is an average for each leaf $j$
    - Each $\gamma_j$ is the average of the residuals of all the observations associated with leaf $j$
7. Create a new prediction that is $\hat{y}_i \leftarrow \hat{y}_i + \alpha \gamma_{j(i)}$
    - Here, $\alpha$ is the learning rate used for regularization
    - Here, $\gamma_{j(i)}$ is the leaf average associated with the leaf that observation $i$ lands in
8. Repeat steps 3-7 until we build $M$ different shallow trees (see the from-scratch sketch after this list)
    - In practice, $M$ is typically around 100
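To make the steps concrete, here is a minimal from-scratch sketch of the procedure above, assuming squared-error loss; the function names are my own, and scikit-learn's `DecisionTreeRegressor` stands in for the shallow tree grown in step 4.

```python
# A from-scratch sketch of the gradient boosting steps described above,
# assuming squared-error loss. Hyperparameter values are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    # Step 2: the initial prediction is the average of the response
    f0 = y.mean()
    pred = np.full(y.shape, f0)
    trees, leaf_values = [], []
    for _ in range(n_trees):
        # Step 3: residuals between each observation and its current prediction
        residuals = y - pred
        # Step 4: fit a shallow regression tree to the residuals
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        # Steps 5-6: send each observation through the tree and compute
        # the average residual gamma_j of the observations in each leaf j
        leaf_ids = tree.apply(X)
        gamma = {j: residuals[leaf_ids == j].mean() for j in np.unique(leaf_ids)}
        # Step 7: new prediction = old prediction + learning_rate * leaf average
        pred = pred + learning_rate * np.array([gamma[j] for j in leaf_ids])
        trees.append(tree)
        leaf_values.append(gamma)
    # Step 8: after M = n_trees rounds, return the full ensemble
    return f0, trees, leaf_values

def predict_gradient_boosting(X, f0, trees, leaf_values, learning_rate=0.1):
    pred = np.full(X.shape[0], f0, dtype=float)
    for tree, gamma in zip(trees, leaf_values):
        leaf_ids = tree.apply(X)
        pred += learning_rate * np.array([gamma[j] for j in leaf_ids])
    return pred
```

Note that for squared-error loss the leaf average of steps 5-6 coincides with the tree's own leaf prediction, so the explicit `tree.apply` bookkeeping is there only to mirror the steps.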
Iteratively Building New Trees
- Each tree that is built in step 4 is a shallow tree grown to minimize the loss on the residuals
- Splits are determined using a greedy split-finding algorithm
- Specifically, it iterates over all the possible splits on all the features
- Then, it selects the best split, i.e., the one with the highest information gain (see the sketch below)
- The depth of the tree is determined using a hyperparameter
- The maximum number of leaves is determined using a hyperparameter too
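The greedy split search can be sketched as follows; this is an illustration under my own assumptions, where split quality is scored by the reduction in sum of squared errors (the regression analogue of information gain) and `best_split` is a hypothetical helper name.

```python
# A sketch of greedy split-finding for one node of a regression tree.
# Split quality is scored as the reduction in sum of squared errors,
# the regression analogue of information gain; `best_split` is my own name.
import numpy as np

def best_split(X, r):
    def sse(v):
        # Sum of squared errors around the mean (0.0 for an empty node)
        return ((v - v.mean()) ** 2).sum() if v.size else 0.0

    parent_sse = sse(r)
    best = (None, None, 0.0)  # (feature index, threshold, gain)
    for j in range(X.shape[1]):            # iterate over all features
        for t in np.unique(X[:, j])[:-1]:  # and all possible thresholds
            left = X[:, j] <= t
            gain = parent_sse - sse(r[left]) - sse(r[~left])
            if gain > best[2]:             # keep the highest-gain split
                best = (j, t, gain)
    return best
```

Growing a full tree would apply this search recursively until the depth or maximum-leaves hyperparameter is reached.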