Introducing Gradient Boosting
- Gradient boosting is a machine learning technique for regression and classification problems
- Gradient boosting typically produces a prediction model in the form of an ensemble of weak prediction models
- Typically, these weak prediction models are decision trees (see the brief example below)
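
As a concrete starting point, here is a minimal sketch using scikit-learn's GradientBoostingClassifier; the synthetic dataset and hyperparameter values are illustrative choices, not prescribed by these notes.

```python
# A minimal sketch: gradient boosting for classification with scikit-learn.
# The synthetic dataset and hyperparameter values are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of 100 shallow regression trees combined with a 0.1 learning rate
model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # mean accuracy on held-out data
```
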
Describing Gradient Boosting for Classification
1. Choose a loss function
- Loss values represent how far off our model is when making predictions for an observation
- For binary classification, a common choice is the log loss
2. Compute an initial prediction that is the probability of the response: $p_{0} = \frac{1}{n}\sum_{i=1}^{n} y_{i}$
- Equivalently, on the log-odds scale, the initial prediction is $F_{0} = \log\frac{p_{0}}{1 - p_{0}}$
3. Compute the residuals $r_{i} = y_{i} - p_{i}$ between each observation $y_{i}$ and its current predicted probability $p_{i}$
4. Fit a regression tree to the residual values $r_{i}$
- Usually, these trees are shallow, but larger than a stump
5. Send each observation through the new tree
- Then, each observation is associated with a leaf $j$
6. Compute $\gamma_{j}$, which is the following transformation for each leaf $j$: $\gamma_{j} = \frac{\sum_{i \in j} r_{i}}{\sum_{i \in j} p_{i}(1 - p_{i})}$
- Here, the numerator $\sum_{i \in j} r_{i}$ is the sum of all the residuals in leaf $j$
- Here, the denominator $\sum_{i \in j} p_{i}(1 - p_{i})$ is the sum of $p_{i}(1 - p_{i})$ over the previous predicted probabilities in leaf $j$
7. Create a new prediction, on the log-odds scale, that is: $F_{m}(x_{i}) = F_{m-1}(x_{i}) + \eta \, \gamma_{j}$
- Here, $\eta$ is the learning rate used for regularization
- Here, $\gamma_{j}$ is the tree output associated with the leaf for the observation $x_{i}$
- The updated probability is recovered with the sigmoid: $p_{i} = \frac{1}{1 + e^{-F_{m}(x_{i})}}$
8. Repeat steps 3-7 until we build $B$ different shallow trees
- In practice, typically $B = 100$ (a minimal from-scratch sketch of these steps follows this list)
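
To make the steps concrete, here is a minimal from-scratch sketch under a few assumptions: binary labels $y_{i} \in \{0, 1\}$ held in NumPy arrays, scikit-learn's DecisionTreeRegressor standing in for step 4, and no numerical safeguards; `train_gradient_boosting` is a hypothetical name, not a library function.

```python
# A from-scratch sketch of steps 2-8 above for binary labels y in {0, 1}.
# Assumes X and y are NumPy arrays; no safeguards against p reaching 0 or 1.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def train_gradient_boosting(X, y, B=100, learning_rate=0.1, max_depth=3):
    p0 = y.mean()
    F0 = np.log(p0 / (1 - p0))               # step 2: initial log odds
    F = np.full(len(y), F0)
    trees, leaf_outputs = [], []
    for _ in range(B):                        # step 8: build B shallow trees
        p = 1.0 / (1.0 + np.exp(-F))          # current predicted probabilities
        residuals = y - p                     # step 3: residuals
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)  # step 4
        leaves = tree.apply(X)                # step 5: leaf index per observation
        gamma = {}                            # step 6: per-leaf transformation
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            gamma[leaf] = residuals[mask].sum() / (p[mask] * (1 - p[mask])).sum()
        F = F + learning_rate * np.array([gamma[leaf] for leaf in leaves])  # step 7
        trees.append(tree)
        leaf_outputs.append(gamma)
    return F0, trees, leaf_outputs
```

At prediction time, a new observation starts from the same initial log odds, accumulates $\eta \, \gamma_{j}$ for the leaf it lands in within each tree, and the final log odds are converted to a probability with the sigmoid.
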
Iteratively Building New Trees
- Each tree that is built in step 4 is a shallow tree that minimizes the cost function
- Splits are determined using a greedy split-finding algorithm
- Specifically, it iterates over all the possible splits on all the features
- Then, it selects the split with the highest information gain (a toy sketch follows this list)
- The depth of the tree is determined using a hyperparameter
- The maximum number of leaves is determined using a hyperparameter too
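
For illustration, the toy function below mimics that exhaustive search at a single node; it scores candidate splits by squared-error (variance) reduction, which plays the role the notes call information gain for regression trees fit to residuals, and `best_split` is a hypothetical helper, not a library API.

```python
# A toy sketch of greedy split finding at a single node, scoring every
# threshold on every feature by squared-error (variance) reduction.
import numpy as np

def best_split(X, residuals):
    n, n_features = X.shape
    parent_impurity = residuals.var() * n          # total squared error at the node
    best = (None, None, 0.0)                       # (feature, threshold, gain)
    for j in range(n_features):                    # iterate over all features
        for threshold in np.unique(X[:, j])[:-1]:  # and all candidate thresholds
            left = X[:, j] <= threshold
            right = ~left
            child_impurity = (residuals[left].var() * left.sum()
                              + residuals[right].var() * right.sum())
            gain = parent_impurity - child_impurity  # impurity reduction ("gain")
            if gain > best[2]:
                best = (j, threshold, gain)
    return best
```

Real implementations avoid this brute-force rescan by sorting feature values once and sweeping thresholds, but the exhaustive loop above matches the greedy search described in this section.
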