Section 5 Indicator species analysis
This analysis aims to identify what species are “indicators” of groups of samples or treatment type and asks if this varies between point count estimates and acoustic surveys.
If the aim is to determine which species can be used as indicators of certain site group, an approach commonly used in ecology is the Indicator Value (Dufrene and Legendre, 1997). These authors defined an Indicator Value (IndVal) index to measure the association between a species and a site group. The method of Dufrene and Legendre (1997) calculates the IndVal index between the species and each site group and then looks for the group corresponding to the highest association value. Finally, the statistical significance of this relationship is tested using a permutation test.
Diagnostic (or indicator) species are an important tool in vegetation science, because these species can be used to characterize and indicate specific plant community types. A statistic commonly used to determine the association (also known as fidelity, not to be confounded with the indicator value component) between species and vegetation types is Pearson’s phi coefficient of association (Chytry et al., 2002). This coefficient is a measure of the correlation between two binary vectors. The abundance-based counterpart of the phi coefficient is called the point biserial correlation coefficient (which is defined as “r.g”).
See: https://cran.r-project.org/web/packages/indicspecies/vignettes/indicspeciesTutorial.pdf
5.1 Load necessary libraries
library(tidyverse)
library(dplyr)
library(stringr)
library(vegan)
library(ggplot2)
library(scico)
library(psych)
library(ecodist)
library(RColorBrewer)
library(ggforce)
library(ggalt)
library(patchwork)
library(sjPlot)
library(indicspecies)
# Source any custom/other internal functions necessary for analysis
source("code/01_internal-functions.R")
5.4 Estimate detections across visits for point count data and acoustic data
These detections are calculate at the site level (for a total of five/six visits for acoustic & point count data, respectively). If the across_visit_detections = 5, that means that a species was detected every single time across each of the five visits to that site. This value ranges from 1 to 5 (or 1 to 6 for point count data). These detections are estimated for the indicator species analysis since the matrix required for the same cannot be a presence/absence matrix.
## we will estimate abundance across point counts by site-date (essentially corresponding to visit)
abundance <- datSubset %>%
filter(data_type == "point_count") %>%
group_by(date, site_id, restoration_type, scientific_name,
common_name, eBird_codes, habitat, foraging_habit) %>% summarise(totAbundance = sum(number)) %>%
ungroup()
# estimate across visit detections for point count data
pc_visit_detections <- abundance %>%
mutate(forDetections = case_when(totAbundance > 0 ~ 1)) %>%
group_by(scientific_name, site_id,restoration_type) %>%
summarise(across_visit_detections = sum(forDetections)) %>%
mutate(data_type = "point_count") %>%
ungroup()
# estimate total number of detections across the acoustic data by site-date (essentially corresponds to a visit)
# note: we cannot call this abundance as it refers to the total number of vocalizations across a 16-min period across all sites
detections <- datSubset %>%
filter(data_type == "acoustic_data") %>%
group_by(date, site_id, restoration_type, scientific_name,
common_name, eBird_codes, habitat, foraging_habit) %>% summarise(totDetections = sum(number)) %>%
ungroup()
# estimate across visit detections for acoustic data
aru_visit_detections <- detections %>%
mutate(forDetections = case_when(totDetections > 0 ~ 1)) %>%
group_by(scientific_name, site_id,restoration_type) %>%
summarise(across_visit_detections = sum(forDetections)) %>%
mutate(data_type = "acoustic_data") %>%
ungroup()
5.5 Indicator species analysis of the point-count dataset
The detections across visits is converted to a wide format for the sake of analysis.
## preparing the point count data for the indicator species analysis
pc_indicator <- pc_visit_detections %>%
group_by(scientific_name, site_id, restoration_type) %>%
pivot_wider(names_from = scientific_name,
values_from = across_visit_detections,
values_fill = list(across_visit_detections=0))
## indicator species analysis
indic_pc <- multipatt(pc_indicator[,4:ncol(pc_indicator)], pc_indicator$restoration_type, func = "r.g", control = how(nperm=999))
# analyze/summarize the results
summary(indic_pc)
Multilevel pattern analysis for point count data
Association function: r.g
Significance level (alpha): 0.05
Total number of species: 83
Selected number of species: 25
Number of species associated to 1 group: 16
Number of species associated to 2 groups: 9
List of species associated to each combination:
Group BM #sps. 13
stat p.value
Alcippe poioicephala 0.812 0.001 ***
Culicicapa ceylonensis 0.739 0.001 ***
Hypothymis azurea 0.665 0.001 ***
Pellorneum ruficeps 0.599 0.001 ***
Ducula badia 0.596 0.001 ***
Harpactes fasciatus 0.565 0.001 ***
Cyornis pallidipes 0.561 0.001 ***
Leptocoma minima 0.531 0.002 **
Phylloscopus magnirostris 0.505 0.004 **
Irena puella 0.457 0.005 **
Dicrurus paradiseus 0.439 0.010 **
Chalcophaps indica 0.380 0.032 *
Dicrurus aeneus 0.376 0.029 *
Group NR #sps. 3
stat p.value
Cinnyris asiaticus 0.702 0.001 ***
Acrocephalus dumetorum 0.507 0.003 **
Machlolophus aplonotus 0.396 0.021 *
Group AR+BM #sps. 2
stat p.value
Muscicapa muttui 0.509 0.003 **
Iole indica 0.498 0.003 **
Group AR+NR #sps. 7
stat p.value
Streptopelia chinensis 0.744 0.001 ***
Pycnonotus jocosus 0.744 0.001 ***
Dicaeum concolor 0.474 0.002 **
Psittacula columboides 0.403 0.020 *
Zosterops palpebrosus 0.394 0.025 *
Orthotomus sutorius 0.389 0.026 *
Psittacula cyanocephala 0.376 0.023 *
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 .
5.6 Indicator species analysis for acoustic data
The detections across visits is converted to a wide format for the sake of analysis.
## preparing the acoustic data for the indicator species analysis
aru_indicator <- aru_visit_detections %>%
group_by(scientific_name, site_id, restoration_type) %>%
pivot_wider(names_from = scientific_name,
values_from = across_visit_detections,
values_fill = list(across_visit_detections=0))
## indicator species analysis
indic_aru <- multipatt(aru_indicator[,4:ncol(aru_indicator)], aru_indicator$restoration_type, func = "r.g", control = how(nperm=999))
# analyze/summarize the results
summary(indic_aru)
Multilevel pattern analysis for the acoustic dataset
Association function: r.g
Significance level (alpha): 0.05
Total number of species: 113
Selected number of species: 29
Number of species associated to 1 group: 19
Number of species associated to 2 groups: 10
List of species associated to each combination:
Group AR #sps. 1
stat p.value
Chloropsis aurifrons 0.396 0.044 *
Group BM #sps. 12
stat p.value
Alcippe poioicephala 0.666 0.001 ***
Hypothymis azurea 0.612 0.001 ***
Phylloscopus magnirostris 0.602 0.001 ***
Cyornis pallidipes 0.601 0.001 ***
Ducula badia 0.468 0.009 **
Culicicapa ceylonensis 0.460 0.005 **
Sitta frontalis 0.459 0.007 **
Harpactes fasciatus 0.454 0.012 *
Dryocopus javensis 0.430 0.007 **
Muscicapa muttui 0.412 0.009 **
Leptocoma minima 0.395 0.030 *
Picumnus innominatus 0.346 0.042 *
Group NR #sps. 6
stat p.value
Streptopelia chinensis 0.554 0.002 **
Orthotomus sutorius 0.503 0.002 **
Tephrodornis sylvicola 0.467 0.002 **
Carpodacus erythrinus 0.421 0.015 *
Phylloscopus affinis 0.408 0.020 *
Lonchura kelaarti 0.367 0.021 *
Group AR+NR #sps. 10
stat p.value
Pycnonotus jocosus 0.830 0.001 ***
Acrocephalus dumetorum 0.543 0.002 **
Corvus splendens 0.507 0.003 **
Psittacula cyanocephala 0.447 0.007 **
Merops leschenaulti 0.441 0.009 **
Dicrurus leucophaeus 0.438 0.014 *
Copsychus saularis 0.430 0.016 *
Pavo cristatus 0.426 0.014 *
Halcyon smyrnensis 0.403 0.015 *
Dinopium benghalense 0.386 0.028 *
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 .