I have a question about the calculation of mu in tree nodes in the Boosted Trees method, on the second step.
I want to understand how to compute it manually. I did the first step successfully, but the algorithm I used on step 1 does not give the same result as Statistica when applied to step 2.
In a regression problem, the mu of a node is always equal to the average of the residuals from the previous step, but in a classification problem this approach does not give the expected result.
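To illustrate the regression case: for least-squares boosting, the terminal-node value is simply the mean of the pseudo-residuals that fall in the node. A minimal sketch with hypothetical residual values (not taken from the attached data set):

```python
# Regression (least-squares) boosting: the node value mu is just the
# mean of the pseudo-residuals of the points landing in that node.
residuals_in_node = [0.3, -0.1, 0.4]  # hypothetical residuals for one node
mu = sum(residuals_in_node) / len(residuals_in_node)
print(round(mu, 6))  # mean residual, here 0.2
```

As described below, this simple averaging is exactly what fails to reproduce Statistica's numbers in the classification case.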
For example, on the 2nd step I have a dependent variable Y and an independent variable X.
After the tree split, the left node contains only one point, and its dependent variable (the residual) is 0.570946. Why is mu in this node equal to 1.165356? Should I apply some coefficient, or something else?
Statistica project and data set are in attachment.
Attached is a macro (BTreesScript.svb), provided by our developer Danny Scott, that performs the computations of mu and var for each node of your example data. The computations are based on Algorithm 6 of "Greedy Function Approximation: A Gradient Boosting Machine" (Jerome H. Friedman, 1999). The mu for each node is based on the formula for gamma in Algorithm 6. See the macro subroutine "ComputeMuAndVar" for more details.
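For the two-class case, the gamma formula in Algorithm 6 is a single Newton step per terminal node: gamma = ((K-1)/K) * sum(r_i) / sum(|r_i| * (1 - |r_i|)), where the r_i are the pseudo-residuals falling in the node. A minimal sketch (this is the published formula, not the macro's actual code) that reproduces the value from the question:

```python
# Per-node gamma (mu) from Algorithm 6 of Friedman's
# "Greedy Function Approximation: A Gradient Boosting Machine":
#   gamma = ((K - 1) / K) * sum(r_i) / sum(|r_i| * (1 - |r_i|))
# where r_i are the pseudo-residuals of the points in the terminal node.

def node_gamma(residuals, n_classes=2):
    """Newton-step node value for K-class logistic gradient boosting."""
    num = sum(residuals)
    den = sum(abs(r) * (1 - abs(r)) for r in residuals)
    return (n_classes - 1) / n_classes * num / den

# The single-point left node from the question: residual 0.570946.
# With one point the sum(r) cancels, leaving 0.5 / (1 - 0.570946).
mu = node_gamma([0.570946])
print(round(mu, 5))  # ~1.16535, matching Statistica's 1.165356 up to
                     # rounding of the displayed residual
```

This is why a single residual of 0.570946 maps to a node mu of about 1.165356: the denominator |r|(1 − |r|) shrinks the sum, and the Newton step scales it back up.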
Thank you very much. Your answer was very helpful.
You are very welcome. I am glad that I could help.
Can you help me with the Summary of Boosted Trees graph? The manual says: "Summary button to display a graph of the average squared prediction error over the successive boosting steps". In a regression problem I understand how to calculate this term for every point, but in a classification problem I cannot work out how to calculate it. Could you give me the formula?