Mu Tree Node in Classification Boosted tree method - Statistica General Discussion - Statistica - Dell Community


This question is answered
ex stat.rar

Hello

I have a question about how mu is calculated for tree nodes in the Boosted Trees method, at the second boosting step.

I want to understand how to compute it manually. I reproduced the first step successfully, but applying the same algorithm at step 2 does not give the same result as Statistica.

In a regression problem, the mu of a node always equals the average of the residuals from the previous step, but in a classification problem this approach does not give the expected result.

For example, at the 2nd step I have the dependent variable Y:

0.570947
-0.42905
0.570947
-0.42905
-0.42905
-0.42905
0.570947
0.119203
0.119203
0.119203

and independent X:

98.655
98.701
98.399
98.032
97.962
98.981
94.024
99.41
100.327
99.014

After the tree split, the left node contains only 1 point, and its dependent variable value is 0.570946. Why is mu in this node equal to 1.165356? Maybe I should apply some coefficient? Or something else?

The Statistica project and data set are attached.

Verified Answer
  • BTreesScript.zip

    Attached is the macro (BTreesScript.svb) provided by our developer Danny Scott; it performs the computations of mu and var for each node for your example data. The computations are based on Algorithm 6 of “Greedy Function Approximation: A Gradient Boosting Machine” (Jerome H. Friedman, 1999). The mu for each node is based on the formula for gamma in Algorithm 6. See the macro subroutine “ComputeMuAndVar” for more details.
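
For readers who want to check the numbers by hand, here is a minimal Python sketch of the gamma formula from Algorithm 6 of Friedman's paper: gamma = (K - 1) / K * sum(r) / sum(|r| * (1 - |r|)), where r are the residuals of the points that fall in the node. The function name node_gamma and the choice K = 2 are assumptions made for illustration and are not taken from the macro. With K = 2, the single-point left node from the question comes out at about 1.1654, which is in line with the mu reported by Statistica.

```python
def node_gamma(residuals, n_classes):
    """Terminal-node update (gamma) from Algorithm 6 of Friedman's paper.

    residuals : residuals of the training points that fall in the node
    n_classes : number of classes K (assumed, not read from Statistica)
    """
    numerator = sum(residuals)
    denominator = sum(abs(r) * (1.0 - abs(r)) for r in residuals)
    return (n_classes - 1) / n_classes * numerator / denominator


# Left node from the question: one point with residual 0.570946.
# K = 2 is an assumption; it is the value that matches the reported mu.
mu_left = node_gamma([0.570946], n_classes=2)
print(round(mu_left, 4))
```

Note that for a single-point node the residual cancels in the numerator and denominator, so the result reduces to (K - 1) / K divided by (1 - |r|). That is why the node value is not simply the residual itself, as it would be in the regression case.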

All Replies
  • Thank you very much. Your answer was very helpful.

  • You are very welcome. I am glad I could help.

  • Hello again.

    Can you help me with the Summary of Boosted Trees graph? The manual says: "Summary button to display a graph of the average squared prediction error over the successive boosting steps". For a regression problem I understand how this term is calculated for every point, and I can compute it myself, but for a classification problem I can't see how each point on the graph is calculated. Could you give me the formula?