Beyond bivariate correlations: three-block partial least squares illustrated with vegetation, soil, and topography
Abstract:
Ecologists, particularly those engaged in biogeomorphic studies, often seek to connect data from three or more domains. Using three-block partial least squares regression, we present a procedure to quantify and define bi-variance and tri-variance of data blocks related to plant communities, their soil parameters, and topography. Bi-variance indicates the total amount of covariation among these three domains taken in pairs, whereas tri-variance refers to the common variance shared by all domains. We characterized relationships among three domains (plant communities, soil properties, topography) for a salt marsh, four coastal dunes, and two temperate forests spanning several regions of the world. We defined the specific bi- and tri-variances for the ecological systems we included in this study and addressed broader questions about how these variances scale with each other by looking for generalities across systems. We show that a system tends to exhibit high bi-variance and tri-variance (tight coupling among domains) when subjected to the effects of frequent and widespread (i.e., broadly acting) hydrogeomorphic disturbance. When major disturbance events are uncommon, bi-variance and tri-variance decrease, because the formation of vegetation, soil, and topographic patterns is primarily localized, and the couplings of these properties diverge over space, contingent upon site-specific disturbance history and/or fine-grained environmental heterogeneity. We also demonstrate that the bi-variance and tri-variance of a whole system are not consistently either greater or smaller than those of the associated sub-zones. This point implies that the overall correlation structure among vegetation, soil, and topography is conserved across spatial scales. This paper addresses a critical aspect of ecology: the conceptual and analytical integration of data across multiple domains.
By example, we show that bi-variances and tri-variances provide useful insight into how the strength of couplings among vegetation, soil, and topography data blocks varies across scales and disturbance regimes. Though we describe the simplest case of multi-variance beyond the usual two-block linear statistical model, this approach can be extended to any number of data domains, making integration tractable and more supportive of holistic inferences.