SPSS Statistics

  • 1.  P for linear trend: Covariates or factors?

    Posted Tue February 16, 2021 07:30 AM

    Hello everyone,

    I would like to understand how the software calculates the p-value in the "Tests of Model Effects" table in Poisson regression when a categorical variable is entered under "covariates" versus "factors". I'm asking because a paper using Poisson regression (table below) reported a "p for trend". I can see that the "p for trend" is the p-value found in "Tests of Model Effects", while the other p-values are the ones reported for each factor level in "Parameter Estimates". That is OK, so far so good!

    What I am trying to figure out is whether those p for trend values were taken from "Tests of Model Effects" with the variable entered in the "factors" window (which is how it is supposed to be entered), or with the variable entered in the "covariates" window. I am not worried about whether that is right or wrong for now. What I want to understand is whether the p-value reported in "Tests of Model Effects" when the categorical variable (3 categories, as in the picture) is entered in the "covariates" window (one degree of freedom) is expected to be equal, or very similar, to the p-value reported in "Parameter Estimates" for the last category compared to the first one when the variable is entered in the "factors" window. That is, whether this follows from the algorithms or is just a coincidence.

    It seems that when the variable is treated as continuous (covariates), the calculation matches the comparison between the first and last category (factors) in "Parameter Estimates". Does that make sense, or was it just a coincidence that all the "p for trend" values taken from "Tests of Model Effects" matched the p-values reported in "Parameter Estimates" when the variable was a "factor", as the article's table shows? If the p is taken from "Tests of Model Effects" when the variable is entered in the "factors" window, this never happens: the p there is always different from the p for any category in "Parameter Estimates". I have tested this with my personal data set and it behaves the same way. I am attaching one example (SPSS outputs), but this has happened with all my categorical ordinal variables.
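    In syntax terms, this is a minimal sketch of the two ways I am entering the variable (hypothetical names: a count outcome y and a three-category variable group coded 1, 2, 3):

        * Entered as a covariate (WITH): a single slope, 1 df in Tests of Model Effects.
        GENLIN y WITH group
          /MODEL group DISTRIBUTION=POISSON LINK=LOG
          /PRINT SOLUTION.

        * Entered as a factor (BY): 2 df in Tests of Model Effects.
        GENLIN y BY group
          /MODEL group DISTRIBUTION=POISSON LINK=LOG
          /PRINT SOLUTION.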


    Any help would be amazing. I'm struggling to report the p for linear trend for some categorical variables while still comparing the last category with the first one, as some epidemiological studies have done. But those studies don't report where they get their p-values from.

    Thank you in advance!

    Rafael



    ------------------------------
    Rafael Mello
    Federal University of Technology - Parana, Brazil.
    Post-Graduate Program in Physical Education.
    ------------------------------

    #SPSSStatistics


  • 2.  RE: P for linear trend: Covariates or factors?

    IBM Champion
    Posted Tue February 16, 2021 09:39 AM
    If you have a variable with only two levels, it would not matter whether it was treated as a covariate or as a factor.  But with more than two levels, the results for model effects would be different, since as a covariate the effect would be assumed to be linear, which would generally be wrong.  (I can't read the images as on my computer they are too small, and blowing them up makes it too blurry.)

    ------------------------------
    Jon Peck
    ------------------------------



  • 3.  RE: P for linear trend: Covariates or factors?

    Posted Tue February 16, 2021 10:36 AM
    Hi Jon, 
    Thank you for answering. As you said, "with more than two levels, the results for model effects would be different, since as a covariate the effect would be assumed to be linear". I think this linearity is the "trick" used in some papers (as in the attached table) to report the p for linear trend.
    It seems that when a categorical variable with three levels is treated as a "covariate", it shows a p-value that is the same as the one reported in "Parameter Estimates" between the first and last category when this same variable is treated as a "factor".
    I have attached the images again just in case, but for me, the previous images become full sized when I click on them. Did that not work for you?

    Thank you!



    ------------------------------
    Rafael Mello
    Federal University of Technology - Parana, Brazil.
    Post-Graduate Program in Physical Education.
    ------------------------------



  • 4.  RE: P for linear trend: Covariates or factors?

    Posted Thu February 18, 2021 06:05 PM
    Hi Rafael,

    To flesh out what Jon was saying a bit, if you have a predictor with only two levels, then there can be only two distinct values for that predictor, however it's coded in your data. If it's treated as a covariate, changing those values can reverse the sign of the regression coefficient for the covariate, and can stretch or compress the scaling so that a single unit change in it is associated with a smaller or larger predicted change in the linear predictor in your model. However, in a model without other predictors, or with all the same other predictors, the square of the ratio of the estimated coefficient to its standard error, which is the Wald chi-square test statistic, will not change. That test statistic will be the same in the Parameter Estimates table and the Tests of Model Effects table with Type III tests in the GENLIN procedure in SPSS Statistics.
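    As a quick sketch of that invariance (hypothetical names, and assuming no other predictors), rescaling a two-level 0/1 covariate changes the coefficient and its standard error by the same factor, leaving the Wald chi-square alone:

        * Original 0/1 coding.
        GENLIN y WITH x
          /MODEL x DISTRIBUTION=POISSON LINK=LOG
          /PRINT SOLUTION.

        * Rescaled coding: B and its standard error both shrink by a
          factor of 5, so (B/SE)**2 and its p value are unchanged.
        COMPUTE x5 = 5 * x.
        EXECUTE.
        GENLIN y WITH x5
          /MODEL x5 DISTRIBUTION=POISSON LINK=LOG
          /PRINT SOLUTION.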

    If you enter that predictor as a categorical factor, there will be two indicator variables used to represent the two levels of the factor. Assuming an intercept (or another factor) is already in the model, the second of these indicators will be redundant or linearly dependent, and the estimate will be aliased to 0. If the values are sorted in descending order for the factor, then the variable used to estimate the single available parameter in the model will be that for comparing the higher level to the lower level. The parameter estimates will be the same numerically as if you had entered a 0-1 coded variable as a covariate (assuming no containing effects in the model). Note that the model produces the same predicted values whether the two-level predictor is entered as a covariate or as a factor.
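    In GENLIN syntax, that descending sort order can be requested explicitly; a minimal sketch with hypothetical names y and x:

        * With ORDER=DESCENDING the lowest level is listed last, so its
          indicator is aliased to 0 and the single estimated parameter
          compares the higher level to the lower one.
        GENLIN y BY x (ORDER=DESCENDING)
          /MODEL x DISTRIBUTION=POISSON LINK=LOG
          /PRINT SOLUTION.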

    When you have more than two levels of a predictor, it is no longer the case that there are the same kind of equivalences between using it as a covariate and as a factor. If it's used as a covariate, then as with the two-level predictor, there's only a single parameter estimated and a single degree of freedom effect is being tested. If you enter it as a factor, then (assuming data in all levels) for a factor with k levels, there will be k-1 degrees of freedom in the effect, based on k-1 linearly independent coded variables representing that k-level factor. The test in the Tests of Model Effects table is an omnibus test that tests all k-1 degrees of freedom, or that all levels of the k-level factor are the same in the relevant population. It thus does not correspond to any test for an individual parameter, and you should not expect the p value from that table to match the p value for any parameter estimate.
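    For a three-level factor, a sketch of the corresponding GENLIN specification (hypothetical names; ANALYSISTYPE=3(WALD) requests the default Type III Wald tests):

        * Tests of Model Effects gives a 2 df omnibus Wald chi-square
          for group; Parameter Estimates gives two separate 1 df
          comparisons against the reference level.
        GENLIN y BY group (ORDER=DESCENDING)
          /MODEL group DISTRIBUTION=POISSON LINK=LOG
          /CRITERIA ANALYSISTYPE=3(WALD)
          /PRINT SOLUTION.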

    It's not clear to me that the table you're showing with the highlighted p values was produced based on an analysis with SPSS Statistics, but whether or not it was, none of the p values shown appear to be from the equivalent of a Tests of Model Effects table, because those tests would have two degrees of freedom, not one. It looks like the three-level factor was represented with indicator coding where the lowest level was last, resulting in it being a de facto reference category, with the non-zero parameter estimates comparing the second and third levels, respectively, to the first. If you print out the estimable functions for the factor effect, the first one, labeled L2, will have coefficients -1, 0, 1 for the three levels in order from low to high (reading bottom up in the table column). Aside from a scaling factor, this is the coding used in polynomial contrasts for three-level factors for the linear effect. I suspect that this is the basis of the use of the p value for that contrast (highest level vs. lowest) as a test of linear trend. As long as the other function is linearly independent so that the combination spans the same column space, the p value for that parameter estimate will match what you would get for the linear term if you used a program that offered polynomial contrasts (GENLIN doesn't offer contrasts per se, but GLM and UNIANOVA do, and you can confirm this with linear models in those procedures; the principles apply to all generalized linear models).
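    To see this in one of the procedures that does offer polynomial contrasts, here is a sketch with UNIANOVA (hypothetical names; GEF prints the estimable functions mentioned above):

        * With three levels, the linear POLYNOMIAL contrast has
          coefficients proportional to -1, 0, 1, so its p value matches
          the highest-vs-lowest comparison in Parameter Estimates.
        UNIANOVA y BY group
          /CONTRAST(group)=POLYNOMIAL
          /PRINT=PARAMETER GEF TEST(LMATRIX)
          /DESIGN=group.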

    Note that this equivalence no longer applies once the factor has more than three levels. No simple contrast between levels of the factor will be the same as a linear trend across levels.
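    For example, with a hypothetical four-level factor group4, the linear polynomial contrast has coefficients proportional to -3, -1, 1, 3, which no single pairwise comparison reproduces:

        * The L matrix output shows the linear contrast spreading weight
          over all four levels rather than comparing just two of them.
        UNIANOVA y BY group4
          /CONTRAST(group4)=POLYNOMIAL
          /PRINT=TEST(LMATRIX)
          /DESIGN=group4.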

    ------------------------------
    David Nichols
    ------------------------------