Public repository for spaMM

What is spaMM?

spaMM is an R package originally designed for fitting spatial generalized linear mixed models, particularly the so-called geostatistical model, which allows prediction in continuous space. It is now a more general-purpose package for fitting mixed models, spatial or not, with efficient methods for both geostatistical and autoregressive models. Several non-GLM response families are now handled. It can fit multivariate-response models, including some of interest in quantitative genetics or species-distribution modeling. It can also fit models with non-Gaussian random effects (e.g., Beta- or Gamma-distributed) and structured dispersion models (including residual dispersion models with random effects), and it implements several variants of Laplace and PQL approximations, including (but not limited to) those discussed in the h-likelihood literature (see References).

What to look for (or not) here?

This repository provides whatever information I do not try to put into the R package, such as its vignette-like gentle introduction (latest version: 2023/08/30) and the slides from the presentation of spaMM at the useR2021 conference.

It might also be used to distribute development versions of spaMM. For a standard installation of the package, however, use a CRAN repository, and see the (unofficial) CRAN GitHub repository for an archive of the sources of all versions of spaMM previously published on CRAN.
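As a minimal sketch of the standard installation route, the following installs spaMM from CRAN only if it is not already present. The mirror URL is an assumption chosen for illustration; any CRAN mirror should work.

```r
# Install spaMM from CRAN if it is missing.
# The 'repos' URL is an illustrative choice of mirror, not a requirement.
if (!requireNamespace("spaMM", quietly = TRUE)) {
  install.packages("spaMM", repos = "https://cloud.r-project.org")
}

library(spaMM)
packageVersion("spaMM")  # reports the installed version
```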

General features of spaMM

The spaMM package was first developed to fit mixed-effect models with spatial correlations, which commonly occur in ecology, but it has since grown into a more general package for inference under models with or without spatially correlated random effects, including multivariate-response models. To remain competitive on large data sets, spaMM uses distinct algorithms for three cases (sparse precision, sparse correlation, and dense correlation matrices), which makes it efficient for fitting geostatistical, autoregressive, and other mixed models. Notable features include:

  • Fitting spatial and non-spatial correlation models: geostatistical models with random-effect terms following the Matérn correlation model or the much less widely known Cauchy correlation model, autoregressive models described by an adjacency matrix, AR(p) and ARMA(p,q) time-series models (ARp and ARMA), or an arbitrary given precision or correlation matrix (corrMatrix). Conditional spatial effects can be fitted, as in (say) Matern(female|...) + Matern(male|...) to fit distinct random effects for females and males (e.g., Tonnabel et al., 2021). Brave users can even define their own parametric correlation models, to be fitted like any other random effect (the corrFamily feature).
  • A further class of spatial correlation models, "Interpolated Markov Random Fields" (IMRF) covers widely publicized approximations of Matérn models (Lindgren et al. 2011) and the multiresolution model of Nychka et al. 2015.
  • Symmetric and antisymmetric dyadic interaction effects (such as those considered in so-called Bradley-Terry models or in diallel experiments) can be fitted as fixed or as random effects (see e.g. the X.antisym, diallel or antisym documentation).
  • Allowed response families include the beta response, the beta-binomial, the Conway-Maxwell-Poisson (COMPoisson), and two negative-binomial families. Zero-truncated variants of the Poisson and negative-binomial families are handled;
  • All the above features combined in multivariate-response models;
  • A replacement function for glm, useful when the latter (or even glm2) fails to fit a model;
  • A syntax close to that of glm or [g]lmer.
  • Many extractor methods similar to those in stats or nlme/lmer, and functions for inference beyond the fits, such as confint() for confidence intervals of fixed-effect parameters, predict() and related functions for point prediction and prediction variances, and compatibility with functions from other packages such as multcomp::glht() and the lmerTest procedures providing F tests using Satterthwaite's method (see the `post-fit` and anova documentation items);
  • Simple facilities for quickly drawing maps from model fits, using only base graphics functions. See here for more elaborate examples of producing maps. The animated graphic on this page is from an application using the IsoriX package.
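To illustrate a few of the features listed above, here is a minimal sketch of a geostatistical fit followed by some post-fit extractors. It assumes that spaMM is installed and uses the blackcap example data set distributed with the package. The particular model is chosen for illustration only and is not a recommended analysis.

```r
library(spaMM)

# 'blackcap' is an example data set shipped with spaMM
data("blackcap")

# Linear mixed model with a Matern-correlated random effect
# defined over the geographic coordinates; the formula syntax
# is close to that of lmer, with Matern(1 | ...) in place of (1 | ...).
fit <- fitme(migStatus ~ means + Matern(1 | longitude + latitude),
             data = blackcap)

summary(fit)  # fixed effects, variance and correlation parameters

# Extractors similar to those of stats or lme4
fixef(fit)    # fixed-effect estimates
logLik(fit)   # likelihood of the fitted model

# Confidence interval for one fixed-effect coefficient
ci <- confint(fit, parm = "means")

# Point predictions for new data (here, the first three rows
# of the original data, reused purely for illustration)
pred <- predict(fit, newdata = blackcap[1:3, ])
```

Other correlation models and response families should follow the same pattern, by changing the random-effect term in the formula or by passing a `family` argument to `fitme()`.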

References

The performance of likelihood ratio tests based on spaMM fits, and the impact of some likelihood approximations, were assessed for spatial GLMMs in: Rousset F., Ferdy J.-B. (2014) Testing environmental and genetic effects in the presence of spatial autocorrelation. Ecography, 37: 781-790. Also available here is the Supplementary Appendix G from that paper, including comparisons with a trick that has been uncritically used to constrain the functions lmer and glmmPQL to analyse spatial models.

For some substantial use of various features of spaMM, see e.g. the IsoriX project, or a story about social dominance in hyaenas, or yet another depressing story about climate change, or the life-history of mothers of twins, or a comparison of prediction by LMMs and by random-forest methods (in supplementary material of a paper on protected area personnel), or analyses of dyadic interactions in mandrills.

Initial development drew inspiration from work by Lee and Nelder on h-likelihood (e.g. Lee, Nelder & Pawitan, 2006; Lee & Lee, 2012; see also Molas and Lesaffre, 2010), and spaMM retains from that work several distinctive features, such as specific methods to fit models with non-Gaussian random effects, structured dispersion models with random effects, and implementations of several variants of Laplace and PQL approximations. However, later versions have increasingly relied on additional insights. Notably, the default likelihood approximation now goes beyond those discussed in these works: it is the same Laplace approximation as in TMB (Kristensen et al., 2016) and packages based on TMB, in particular where this departs from what is discussed in the h-likelihood literature (i.e., for GLM families with a non-canonical link, or for response families not of the GLM class).

Credits

Initial development was supported by a PEPS grant from the CNRS and University of Montpellier.