useR! 2008: Count data and Model comparison


Today’s first focus session was built around modeling. Two presentations stood out for me: the one by Christian Kleiber on generalized regression for count data, and the one by Gianmarco Altoè on bootstrapped model comparison.

Christian Kleiber presented a very interesting package for regression models on count data. A classical count data model is Poisson regression, which several R packages already offer. Reusing much of the code already available in R, Kleiber wrote several functions, for instance for efficiently estimating zero-inflated models and so-called hurdle models. Although it was apparently developed for use in econometrics, I can easily see uses for this package elsewhere, especially for the zero-inflated models.
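To make the difference between these model types concrete for myself, here is a minimal sketch of the three probability functions involved. It is plain Python rather than R, and it is not the code from the package. The function names and the parameters `lam`, `pi` and `p_zero` are my own labels, chosen for illustration. The sketch only shows the distributions, not how a regression would estimate them from covariates.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (illustrative).

    With probability pi the observation is a 'structural' zero;
    otherwise it is drawn from an ordinary Poisson(lam), which can
    itself produce zeros.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, lam, p_zero):
    """Hurdle model (illustrative).

    All zeros come from one process with probability p_zero; positive
    counts follow a Poisson(lam) truncated so that it cannot be zero.
    """
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - poisson_pmf(0, lam))


if __name__ == "__main__":
    # Compare the share of zeros implied by each model for the same lam.
    print(poisson_pmf(0, 2.0), zip_pmf(0, 2.0, 0.3), hurdle_pmf(0, 2.0, 0.5))
```

The point of the comparison: in the zero-inflated version zeros come from two sources, while in the hurdle version the zeros and the positive counts are modeled by two separate parts.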

I think that the presentation given by Gianmarco Altoè, and especially the DeltaR package he developed, can be very valuable for some types of research. As a statistician, he was asked whether it is possible to compare the proportion of variance explained by different regression models that were estimated on different samples. I don’t see myself using this, since as a sociologist I try to get samples that cover the whole population as well as possible anyway. However, in disciplines such as psychology, management studies, or perhaps even development studies, I can see the use of such model comparisons.
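As I understood the general idea, a bootstrapped comparison of explained variance could look roughly like the sketch below. This is my own illustration in plain Python, not the DeltaR code, and I am assuming the simplest possible setup: one predictor per model, two independent samples given as lists of `(x, y)` pairs, and a percentile interval for the difference in R². The function names and the `n_boot` and `seed` parameters are hypothetical.

```python
import random


def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs.

    With a single predictor this equals the squared correlation.
    Returns 0.0 when either variable has no variance.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def bootstrap_delta_r2(sample_a, sample_b, n_boot=2000, seed=1):
    """95% percentile interval for R2(sample_a) - R2(sample_b).

    Each sample is resampled with replacement, independently of the
    other, and the difference in R-squared is recorded each time.
    """
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_boot):
        res_a = [rng.choice(sample_a) for _ in sample_a]
        res_b = [rng.choice(sample_b) for _ in sample_b]
        xa, ya = zip(*res_a)
        xb, yb = zip(*res_b)
        deltas.append(r_squared(xa, ya) - r_squared(xb, yb))
    deltas.sort()
    lower = deltas[int(0.025 * n_boot)]
    upper = deltas[int(0.975 * n_boot) - 1]
    return lower, upper
```

If the resulting interval excludes zero, that would suggest the two models explain different proportions of variance in their respective samples. The actual package may well use a different resampling scheme or interval.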

I do wonder, though, whether by comparing models based on different samples we are implicitly assuming that the two samples are subsets of a single sample. If that is the case, we don’t need this type of comparison: we could instead merge the data and perform a single analysis focused on the comparison between the two groups.
