Section 14 Generalized linear mixed modeling (species proportions, vegetation data and planting year)

In this script, we run generalized linear mixed models to test the association between bird species proportions (rainforest and open-country species) and restoration type. In addition, we run generalized linear mixed models to test associations between species proportions and habitat (vegetation structure) using site-pair name (actively restored and naturally regenerating were specified to be paired) and repeat visits as random effects. Lastly, we assess associations between year since restoration began and bird species proportions.

14.1 Install required libraries

library(tidyverse)
library(dplyr)
library(stringr)
library(vegan)
library(ggplot2)
library(scico)
library(psych)
library(rcompanion)
library(multcomp)
library(lme4)
library(sjPlot)

# Source any custom/other internal functions necessary for analysis
source("code/01_internal-functions.R")

14.2 Load the necessary data for statistical modeling

# Load data from previous scripts
vegData <-  read.csv("results/summaryVeg.csv") %>%
  filter(!str_detect(Site_ID, 'OLCAP5B'))
vegPcaScores <- read.csv("results/pcaVeg.csv") %>%
  filter(!str_detect(Site_ID, 'OLCAP5B'))
propVisit <- read.csv("results/acoustic-detections-across-visits.csv")

14.3 Getting data ready in a format for generalized linear mixed modeling

# prep veg data
prepVegData <- vegPcaScores %>%  
  rename(Site = Site_ID) %>%
  mutate(siteCode = str_extract(Site, pattern = "\\w+\\d+")) %>%
  mutate(siteCode = factor(siteCode)) %>%
  rename(Restoration.type = Site_type) %>%
  mutate(across(Restoration.type, factor))  

# prep bird species proportions
prepBirdData <- propVisit %>%
  group_by(Site, Date, Restoration.type) %>%
  rowwise() %>%
  group_by(Site) %>% 
  mutate(visit = row_number()) %>%
  mutate(siteCode = str_extract(Site, pattern = "\\w+\\d+")) %>%
  mutate(siteCode = factor(siteCode))

# join the above two dataframes
modelData <- prepBirdData %>%
  full_join(prepVegData, by=c("Site","siteCode","Restoration.type"))

14.4 Running the generalized linear mixed models

We now run generalized linear mixed models (GLMM) assuming gaussian errors to examine the effects of restoration type (benchmark, actively restored and passively restored) on the proportion of bird species detections (for rainforest and open-country species), followed by TukeyHSD multiple comparisons tests of means.

14.5 Examining effect of restoration type on proportion of rainforest and open-country species detections

Rainforest bird species

glmm_rainForestProp <- glmer(propRF ~ Restoration.type + +(1|siteCode) + (1|visit), data = modelData, family = gaussian(link="identity"))
summary(glmm_rainForestProp)

tukey_glmmRainForestProp <- summary(glht(glmm_rainForestProp, linfct=mcp(Restoration.type ="Tukey")))
cld(tukey_glmmRainForestProp)
report::report(glmm_rainForestProp)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propRF with Restoration.type (formula: propRF ~ Restoration.type). The model included siteCode and visit as random effects (formula: list(~1 | siteCode, ~1 | visit)). The model's total explanatory power is substantial (conditional R2 = 0.52) and the part related to the fixed effects alone (marginal R2) is of 0.37. The model's intercept, corresponding to Restoration.type = Active, is at 0.77 (95% CI [0.73, 0.82], t(251) = 35.87, p < .001). Within this model:

#  - The effect of Restoration type [Benchmark] is statistically significant and positive (beta = 0.14, 95% CI [0.10, 0.17], t(251) = 8.51, p < .001; Std. beta = 1.06, 95% CI [0.82, 1.31])
#  - The effect of Restoration type [Passive] is statistically significant and negative (beta = -0.05, 95% CI [-0.07, -0.02], t(251) = -3.19, p = 0.002; Std. beta = -0.36, 95% CI [-0.58, -0.14])

Open-country species

glmm_openCountryProp <- glmer(propOC ~ Restoration.type + +(1|siteCode) + (1|visit), data = modelData, family = gaussian(link="identity"))
summary(glmm_openCountryProp)

tukey_glmmOpenCountryProp <- summary(glht(glmm_openCountryProp, linfct=mcp(Restoration.type ="Tukey")))
cld(tukey_glmmOpenCountryProp)
report::report(glmm_openCountryProp)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propOC with Restoration.type (formula: propOC ~ Restoration.type). The model included siteCode and visit as random effects (formula: list(~1 | siteCode, ~1 | visit)). The model's total explanatory power is substantial (conditional R2 = 0.52) and the part related to the fixed effects alone (marginal R2) is of 0.37. The model's intercept, corresponding to Restoration.type = Active, is at 0.23 (95% CI [0.18, 0.27], t(251) = 10.44, p < .001). Within this model:

#  - The effect of Restoration type [Benchmark] is statistically significant and negative (beta = -0.14, 95% CI [-0.17, -0.10], t(251) = -8.51, p < .001; Std. beta = -1.06, 95% CI [-1.31, -0.82])
#  - The effect of Restoration type [Passive] is statistically significant and positive (beta = 0.05, 95% CI [0.02, 0.07], t(251) = 3.19, p = 0.002; Std. beta = 0.36, 95% CI [0.14, 0.58])

14.6 Testing the role of habitat

Testing the role of habitat (vegetation structure) and on species proportions within a generalized linear modeling framework

# rainforest birds
glmm_Rainforest <- glmer(propRF ~ PC1 + PC2 + (1|siteCode) + (1|visit), data = modelData, family = gaussian(link="identity"))
summary(glmm_Rainforest)

plot_model(glmm_Rainforest, type="pred", terms=c("PC1","PC2"))
report::report(glmm_Rainforest)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propRF with PC1 and PC2 (formula: propRF ~ PC1 + PC2). The model included siteCode and visit as random effects (formula: list(~1 | siteCode, ~1 | visit)). The model's total explanatory power is substantial (conditional R2 = 0.48) and the part related to the fixed effects alone (marginal R2) is of 0.20. The model's intercept, corresponding to PC1 = 0 and PC2 = 0, is at 0.82 (95% CI [0.78, 0.86], t(251) = 38.90, p < .001). Within this model:

#  - The effect of PC1 is statistically significant and negative (beta = -0.03, 95% CI [-0.04, -0.02], t(251) = -6.51, p < .001; Std. beta = -0.43, 95% CI [-0.55, -0.30])
#  - The effect of PC2 is statistically non-significant and positive (beta = 0.01, 95% CI [-2.25e-03, 0.03], t(251) = 1.67, p = 0.096; Std. beta = 0.11, 95% CI [-0.02, 0.24])

# open-country birds
glmm_openCountry <- glmer(propOC ~ PC1 + PC2 + (1|siteCode) + (1|visit), data = modelData, family = gaussian(link="identity"))
summary(glmm_openCountry)

plot_model(glmm_openCountry, type="pred", terms=c("PC1","PC2"))
report::report(glmm_openCountry)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propOC with PC1 and PC2 (formula: propOC ~ PC1 + PC2). The model included siteCode and visit as random effects (formula: list(~1 | siteCode, ~1 | visit)). The model's total explanatory power is substantial (conditional R2 = 0.48) and the part related to the fixed effects alone (marginal R2) is of 0.20. The model's intercept, corresponding to PC1 = 0 and PC2 = 0, is at 0.18 (95% CI [0.14, 0.22], t(251) = 8.57, p < .001). Within this model:

#  - The effect of PC1 is statistically significant and positive (beta = 0.03, 95% CI [0.02, 0.04], t(251) = 6.51, p < .001; Std. beta = 0.43, 95% CI [0.30, 0.55])
#  - The effect of PC2 is statistically non-significant and negative (beta = -0.01, 95% CI [-0.03, 2.25e-03], t(251) = -1.67, p = 0.096; Std. beta = -0.11, 95% CI [-0.24, 0.02])

14.7 Effect of planting year

Lastly, running a generalized linear model to test the effect of year since restoration on bird species proportions

# add planting Year
allRestored <- modelData %>%
  left_join(vegData[,-c(2:9)], by=c("Site"="Site_ID"))

# filter only data for restored sites
allRestored <- allRestored %>%
  filter(Restoration.type=="Active") %>%
  mutate(yearSinceRestoration = (2022-plantingYear))

# rainforest birds
glmmRain <- glmer(propRF ~ yearSinceRestoration + (1|visit), data = allRestored, family = gaussian(link="identity"))
summary(glmmRain)

plot_model(glmmRain, type="pred")
report::report(glmmRain)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propRF with yearSinceRestoration (formula: propRF ~ yearSinceRestoration). The model included visit as random effect (formula: ~1 | visit). The model's total explanatory power is weak (conditional R2 = 0.09) and the part related to the fixed effects alone (marginal R2) is of 2.65e-03. The model's intercept, corresponding to yearSinceRestoration = 0, is at 0.74 (95% CI [0.60, 0.88], t(80) = 10.79, p < .001). Within this model:

#  - The effect of yearSinceRestoration is statistically non-significant and positive (beta = 2.11e-03, 95% CI [-6.42e-03, 0.01], t(80) = 0.49, p = 0.624; Std. beta = 0.05, 95% CI [-0.16, 0.26])

# open-country birds
glmmOpen <- glmer(propOC ~ yearSinceRestoration + (1|visit), data = allRestored, family = gaussian(link="identity"))
summary(glmmOpen)

plot_model(glmmOpen, type="pred")
report::report(glmmOpen)

# We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict propOC with yearSinceRestoration (formula: propOC ~ yearSinceRestoration). The model included visit as random effect (formula: ~1 | visit). The model's total explanatory power is weak (conditional R2 = 0.09) and the part related to the fixed effects alone (marginal R2) is of 2.65e-03. The model's intercept, corresponding to yearSinceRestoration = 0, is at 0.26 (95% CI [0.12, 0.40], t(80) = 3.79, p < .001). Within this model:

#  - The effect of yearSinceRestoration is statistically non-significant and negative (beta = -2.11e-03, 95% CI [-0.01, 6.42e-03], t(80) = -0.49, p = 0.624; Std. beta = -0.05, 95% CI [-0.26, 0.16])