SPSS Statistics

  • 1.  decision tree generation conditions when resulting depth = 0

    Posted Thu November 19, 2020 12:54 PM

    I am using the decision tree procedure in IBM SPSS Statistics (25) for a research study, with the EXHAUSTIVE CHAID method. I have a question about the conditions under which no tree is generated at all (i.e., tree depth is 0). For example, with the same dataset, one run may produce a tree, while another run (with a different random seed) includes no independent variables and has a tree depth of 0. In that case, the predictions for the testing sample seem to be either a) the category with the most cases in the training set, or b) random.

    Could I get answers to the following:

    1) What conditions result in a tree with a depth of 0, and thus no independent variables included?

    2) When no independent variables are used in the tree, what method is used to predict the testing sample? Is it random, based on the category with the most cases, or something else?

    I could not find answers in the PDF manual published by IBM for decision trees.

    Many thanks!






    #SPSSStatistics
    #Support
    #SupportMigration


  • 2.  RE: decision tree generation conditions when resulting depth = 0

    Posted Thu November 19, 2020 02:33 PM

    Splitting criteria are specified in the Criteria subdialog box (or the equivalent syntax). In addition, the CHAID tab specifies the required significance level for splitting and whether to apply a Bonferroni correction. So if no potential split meets the significance level, or a split would violate the minimum group size criteria, there will not be a split at that level.
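To make the stopping logic above concrete, here is a minimal Python sketch of the kind of check that decides whether a node can be split. It is an illustration under stated assumptions, not the TREE procedure's actual implementation: the function names (`chi2_sf`, `pearson_chi2`, `can_split`) and the example tables are hypothetical, the Bonferroni multiplier is passed in as a plain number (real CHAID derives it from the predictor type and the number of category merges), and the category-merging step is skipped entirely.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, half).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half / (a + n)
        total += term
        n += 1
    p_lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return max(0.0, 1.0 - p_lower)

def pearson_chi2(table):
    """Pearson chi-square statistic and df for a table of counts.

    Rows are candidate child groups, columns are dependent-variable categories.
    Assumes no row or column is entirely zero.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, df

def can_split(table, alpha=0.05, bonferroni_multiplier=1,
              min_parent=5, min_child=2):
    """Return True if this candidate split passes the size and significance checks."""
    row_sums = [sum(row) for row in table]
    if sum(row_sums) < min_parent:          # parent node too small to split
        return False
    if any(r < min_child for r in row_sums):  # a child node would be too small
        return False
    stat, df = pearson_chi2(table)
    # Simplified Bonferroni adjustment (assumption: multiplier supplied by caller).
    adjusted_p = min(1.0, chi2_sf(stat, df) * bonferroni_multiplier)
    return adjusted_p <= alpha

# Hypothetical counts: 2 predictor groups x 4 dependent categories (0..3).
weak = [[10, 12, 9, 11], [11, 10, 12, 10]]   # groups look alike
strong = [[20, 5, 3, 2], [3, 4, 8, 25]]      # groups differ sharply

print(can_split(weak, bonferroni_multiplier=3))
print(can_split(strong, bonferroni_multiplier=3))
```

If every candidate predictor fails a check like this at the root node, the tree has depth 0. Because a split-sample validation draws a different training sample for each seed, a predictor that is only marginally significant can pass in one run and fail in another.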








  • 3.  RE: decision tree generation conditions when resulting depth = 0

    Posted Thu November 19, 2020 03:04 PM

    Dear Jon, thanks for the quick reply. This clears up my first question (I have also attached my syntax below).


    Can you answer my second question? When there is no split at the root (depth 0), the model still tries to predict the cases in the testing sample based on the training sample. What method is used for that prediction?


    ......................................................

    SET SEED=RANDOM.
    SHOW SEED.

    TREE Q59new [n] BY ......(removed for simplicity here)
     /TREE DISPLAY=TOPDOWN NODES=STATISTICS BRANCHSTATISTICS=YES NODEDEFS=YES SCALE=AUTO
     /DEPCATEGORIES USEVALUES=[0 1 2 3]
     /PRINT MODELSUMMARY CLASSIFICATION RISK
     /RULES NODES=ALL SYNTAX=INTERNAL TYPE=SCORING
     /METHOD TYPE=EXHAUSTIVECHAID
     /GROWTHLIMIT MAXDEPTH=5 MINPARENTSIZE=5 MINCHILDSIZE=2
     /VALIDATION TYPE=SPLITSAMPLE(80) OUTPUT=BOTHSAMPLES
     /CHAID ALPHASPLIT=0.05 SPLITMERGED=YES CHISQUARE=PEARSON CONVERGE=0.001 MAXITERATIONS=100
      ADJUST=BONFERRONI INTERVALS=5
     /COSTS EQUAL
     /MISSING NOMINALMISSING=MISSING.








  • 4.  RE: decision tree generation conditions when resulting depth = 0

    Posted Fri November 20, 2020 03:46 AM

    The CHAID tab of the dialog and the Criteria subdialog define the conditions for a split. If no split possibilities meet those requirements, splitting stops. That could happen even at the root node, in which case you have a null model that assigns the same constant prediction to every case: the mean for a scale dependent variable, or, for a categorical one, a single category (with equal costs, typically the most frequent category in the training sample).
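As an illustration of what such a null model does for a categorical dependent variable, here is a minimal Python sketch. It assumes equal misclassification costs and that the predicted category is simply the most frequent one in the training sample; the function names and the counts are hypothetical, and the tie-breaking rule (smallest category value) is an assumption of this sketch, not documented TREE behavior.

```python
from collections import Counter

def fit_null_model(train_labels):
    """Return the modal category of the training sample (root-node prediction)."""
    counts = Counter(train_labels)
    highest = max(counts.values())
    # Assumption: ties are broken by taking the smallest category value.
    return min(cat for cat, n in counts.items() if n == highest)

def predict(modal_category, n_cases):
    """A depth-0 tree predicts the same category for every case."""
    return [modal_category] * n_cases

def misclassification_rate(actual, predicted):
    """Share of cases whose predicted category differs from the actual one."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Hypothetical training sample for a dependent variable with categories 0..3.
train = [0] * 12 + [1] * 30 + [2] * 25 + [3] * 13
test = [1, 2, 1, 0, 3]

modal = fit_null_model(train)
predictions = predict(modal, len(test))
print(modal, predictions, misclassification_rate(test, predictions))
```

Under these assumptions the predictions are not random: every test case receives the same category. Results can still differ between seeds, because each seed draws a different training sample and the modal category of that sample may change when two categories have similar counts.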





