Introducing Gradient Boosting
- Gradient boosting is a machine learning technique for regression and classification problems
- Gradient boosting typically produces a prediction model in the form of an ensemble of weak prediction models
- Typically, these weak prediction models are shallow decision trees
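Before walking through the steps, here is a minimal usage sketch with scikit-learn's GradientBoostingClassifier; the dataset and hyperparameter values are illustrative, not from the notes.

```python
# A minimal sketch of gradient boosting for binary classification using
# scikit-learn. The dataset and hyperparameter values are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=100,   # M: the number of shallow trees
    learning_rate=0.1,  # nu: shrinks each tree's contribution
    max_depth=3,        # keeps each tree shallow, but larger than a stump
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```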
Describing Gradient Boosting for Classification
1. Choose a loss function
    - Loss values represent how far off our model is when making a prediction for an observation
    - For binary classification, the standard choice is the log loss $L(y_i, \hat{p}_i) = -[y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i)]$
2. Compute an initial prediction $\hat{p}$ that is the probability of the response:
    - $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} y_i$, the overall proportion of positive observations
    - The model tracks this prediction as log-odds, $F_0 = \log \frac{\hat{p}}{1 - \hat{p}}$, so that tree outputs can be added to it
3. Compute the residuals $r_i = y_i - \hat{p}_i$ between each observed response $y_i$ and its predicted probability $\hat{p}_i$
4. Fit a regression tree to the residual values $r_i$
    - Usually, these trees are shallow, but larger than a stump
5. Send each observation through the new tree
    - Then, each observation is associated with a leaf
6. Compute $\gamma_j$, which is the following transformation for each leaf $j$:
    - $\gamma_j = \dfrac{\sum_{i \in \text{leaf } j} r_i}{\sum_{i \in \text{leaf } j} \hat{p}_i (1 - \hat{p}_i)}$
    - Here, the numerator is the sum of all the residuals in leaf $j$
    - Here, the denominator is computed from the previous predicted probabilities in leaf $j$
7. Create a new prediction that is $F_m(x_i) = F_{m-1}(x_i) + \nu \gamma_j$:
    - Here, $\nu$ is the learning rate used for regularization
    - Here, $\gamma_j$ is the tree output associated with the leaf for the observation
    - The updated log-odds is converted back to a probability $\hat{p}_i = 1 / (1 + e^{-F_m(x_i)})$ before computing the next residuals
8. Repeat steps 3-7 until we build $M$ different shallow trees
    - In practice, typically $M \approx 100$
    - A from-scratch sketch of these steps appears below

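To make the eight steps concrete, here is a from-scratch sketch assuming the log loss, with scikit-learn's DecisionTreeRegressor as the shallow tree. Function names like fit_gradient_boosting and predict_proba are illustrative, not from the notes.

```python
# A from-scratch sketch of steps 1-8 for binary classification.
# Assumes X is a 2D NumPy array and y is a 0/1 NumPy array.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, M=100, learning_rate=0.1, max_depth=3):
    # Step 2: initial prediction is the proportion of positive responses,
    # tracked as log-odds so tree outputs can be added to it.
    p_hat = np.full(len(y), y.mean())
    F = np.log(p_hat / (1 - p_hat))
    f0 = F[0]
    trees, leaf_gammas = [], []

    for _ in range(M):
        # Step 3: residuals between responses and predicted probabilities.
        residuals = y - p_hat

        # Step 4: fit a shallow regression tree to the residuals.
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)

        # Step 5: find the leaf each observation lands in.
        leaves = tree.apply(X)

        # Step 6: gamma_j = (sum of residuals) / (sum of p*(1-p)) per leaf.
        gammas = {}
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            gammas[leaf] = residuals[mask].sum() / (
                (p_hat[mask] * (1 - p_hat[mask])).sum()
            )

        # Step 7: update the log-odds with the scaled leaf outputs, then
        # convert back to probabilities for the next round of residuals.
        F = F + learning_rate * np.array([gammas[l] for l in leaves])
        p_hat = 1 / (1 + np.exp(-F))

        trees.append(tree)
        leaf_gammas.append(gammas)

    return trees, leaf_gammas, f0

def predict_proba(X, trees, leaf_gammas, f0, learning_rate=0.1):
    # Replay every tree: accumulate scaled leaf outputs in log-odds space.
    F = np.full(X.shape[0], f0)
    for tree, gammas in zip(trees, leaf_gammas):
        leaves = tree.apply(X)
        F += learning_rate * np.array([gammas[l] for l in leaves])
    return 1 / (1 + np.exp(-F))
```

Note that each round fits a new tree only to the current residuals; earlier trees are never revisited, which is what makes the model additive.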
Iteratively Building New Trees
- Each tree that is built in step 4 is a shallow tree that minimizes the cost function
- Splits are determined using a greedy split-finding algorithm
- Specifically, it iterates over all the possible splits on all the features
- Then, it selects the split with the highest information gain
- The depth of the tree is determined using a hyperparameter
- The maximum number of leaves is determined using a hyperparameter too
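The greedy search can be sketched as follows, measuring gain as the reduction in squared error, which is the natural measure for regression trees fit to residuals (the notes refer to it as information gain). The helper name best_split is hypothetical.

```python
# A sketch of the greedy split-finding loop: try every threshold on
# every feature and keep the split with the highest gain, measured here
# as the reduction in squared error of the residuals r.
import numpy as np

def best_split(X, r):
    def sse(v):  # squared error around the mean
        return ((v - v.mean()) ** 2).sum() if len(v) else 0.0

    best = (None, None, 0.0)  # (feature, threshold, gain)
    parent = sse(r)
    for j in range(X.shape[1]):               # all features
        for t in np.unique(X[:, j])[:-1]:     # all candidate thresholds
            left = r[X[:, j] <= t]
            right = r[X[:, j] > t]
            gain = parent - sse(left) - sse(right)
            if gain > best[2]:
                best = (j, t, gain)
    return best
```

In practice, libraries expose the hyperparameters mentioned above directly (e.g., max_depth and max_leaf_nodes in scikit-learn), which cap how far this greedy search can grow each tree.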