Hello everyone,
I am struggling to fit a model to overdispersed, positively skewed data and would like to ask for your opinion.
I measured how long a specific behavior (B, in seconds) lasted in each tested subject during a fixed observation period.
There are two independent variables (predictors): the subject's sex (S: male or female) and genotype (G: 1 or 2).
My research question is whether the subject's sex or genotype affects the duration of behavior B, and whether genotype modulates the effect of sex.
EXP: B ~ S + G + S*G
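To make the intended model concrete, here is a minimal sketch of how the three terms could be dummy-coded. It is only an illustration: the reference levels (male, genotype 1) and the helper name `design_row` are assumptions, not part of my actual analysis. In R-style formula notation, `S*G` on its own already expands to both main effects plus the interaction.

```python
# Hypothetical dummy coding for B ~ S + G + S*G.
# Assumed reference levels: sex = "male", genotype = 1.
def design_row(sex, genotype):
    """Return [intercept, S, G, S:G] for one subject."""
    s = 1 if sex == "female" else 0   # main effect of sex
    g = 1 if genotype == 2 else 0     # main effect of genotype
    return [1, s, g, s * g]           # interaction column is the product s * g

# One row per cell of the 2 x 2 design.
for sex in ("male", "female"):
    for genotype in (1, 2):
        print(sex, genotype, design_row(sex, genotype))
```

The interaction column is nonzero only for female subjects with genotype 2, so its coefficient measures how much the sex effect differs between the two genotypes.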
My data do not meet the assumptions of the general linear model, so I decided to use generalized linear models (GLMs)
(as far as I know, standard non-parametric tests cannot estimate the interaction between factors, which is what I am interested in).
I cannot use a GLM with a gamma distribution, which requires strictly positive values, because behavior B did not occur in many subjects (B = 0 s), and these cases are relevant to my experiment.
I tried a Poisson GLM and then a zero-inflated Poisson (ZIP) model, but neither fit the data adequately. Negative binomial regression (NBR) gave the best fit, and here is my question:
B is a continuous variable (time measured in seconds). For the purposes of my experiment, I could use integer values instead (for example, replace 30.35 s with 31 s),
but is this the only way for me to use NBR, and is it legitimate in your opinion?
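The substitution I have in mind could be sketched as follows. The durations are made-up values for illustration only, not my data. Note that 30.35 s becomes 31 s only when values are rounded up; rounding to the nearest integer would give 30 s instead, so the choice of rule would have to be stated.

```python
import math

# Hypothetical durations in seconds; 0.0 means the behavior never occurred.
durations = [0.0, 30.35, 12.8, 0.0, 7.2]

# Rounding up (ceiling): 30.35 -> 31, matching the example above.
rounded_up = [math.ceil(d) for d in durations]

# Rounding to the nearest integer: 30.35 -> 30.
rounded_nearest = [round(d) for d in durations]

print(rounded_up)
print(rounded_nearest)
```

Either way, subjects with B = 0 s stay at 0, so the zero cases are preserved in the integer version.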
Do you have any other ideas on how I could handle this design and data to estimate the S*G interaction?
I would genuinely appreciate your feedback.
------------------------------
Natalia
------------------------------
#SPSSStatistics