Exploring longitudinal data with historical random forests

Ecological data sets are often observational rather than experimental, and they have characteristics that make analysis difficult in the traditional statistical framework. The number of observations may be small while the number of predictors is large (i.e., the small n, large p problem). The relationships between the response variable and the predictor variables may be unknown and non-linear. Predictor variables may be collinear, or may interact with one another in their effects on the response. The response variable may be “distributionally challenged”. Observations may not be independent, instead being clustered or measured repeatedly over time. And yet, the researcher wants to know which predictor variables are “important” and how those predictors are related to the response. The random forest model is an appealing tool that addresses some (but not all) of these challenges and has seen applications in ecology over the last decade. However, the standard software assumes independent observations and so is of limited value for longitudinal or otherwise clustered data.

I introduce the R package htree (Sexton 2018), which produces a nonparametric estimate of how a response variable depends on its own previous history and on the histories of time-varying predictor variables; it also provides measures of variable importance. The htree package fits random forest and gradient boosted ensemble models. My focus here is on the historical random forest as fitted by the hrf function. I illustrate the advantages and disadvantages of this model with an analysis of longitudinal vegetation data collected sporadically over 20 years on 96 plots at Camp W.G. Williams, a training site located south of Salt Lake City and operated by the Utah National Guard. The goals of the analysis are to document shifts in species abundance over time and to identify and characterize the predictor variables that may drive these changes.

Susan Durham
Utah State University