Case studies for: Modelling approaches for meta-analyses with dependent effect sizes in ecology and evolution

Author

Coralie Williams

Published

September 4, 2025

This webpage is supporting information for the case studies in the following manuscript:

Williams, C., Yang, Y., Warton, D. I., & Nakagawa, S. (2025). Modelling approaches for meta-analyses with dependent effect sizes in ecology and evolution: A simulation study. Methods in Ecology and Evolution. DOI: 10.1111/2041-210x.70156

Overview

Here we provide code to reproduce the results from the case studies in our manuscript “Modelling approaches for meta-analyses with dependent effect sizes in ecology and evolution: a simulation study”. This code can be used to produce similar analyses to find an appropriate sampling variance covariance matrix assuming a constant within-study correlation.

We note three important and cautionary points.

  1. For the purpose of illustrating the application of modelling approaches to handle dependent sampling errors, we reanalysed two published meta-analyses. The models have been simplified compared with those in the original studies, so the results are for illustration only and should not be used to draw substantive conclusions.

  2. If the primary studies provided sufficient information to compute covariances and correlations among effect sizes, then the sampling \(\mathbf{V^*}\) structure can be more complex (i.e. non-constant correlation within-studies) and specified using a combination of known and arbitrary chosen values depending how much information is available.

  3. The usual model specification and model selection should be undertaken before comparing model fits with different sampling \(\mathbf{V^*}\) matrix specification; adding a random effect at the level of the individual estimate (very important and is often overlooked), accounting for potential heterogeneity at each level of hierarchical or correlated clustering (this can be difficult as it can be unclear what consists of a `level’ in certain datasets), and finally consider if a level should be modelled as a random effect or fixed effect (read more in section 6 in Gelman (2025)).

For any comments or questions please contact: Coralie Williams (coraliewilliams@outlook.com)


Before starting let’s load and install the necessary R packages.

# load and install packages
library(pacman)
p_load(readxl, tidyverse, metafor, ape, dplyr,
       here, clubSandwich, corpcor, xtable,
       kableExtra)

1. Case study: Meta-analysis on plant coexistence

Crawford et al. (2019) used a large meta-analysis dataset of pairwise plant-soil feedback measures to investigate whether these feedbacks contribute to plant species coexistence. First we want to load the meta-analytical dataset provided in Crawford et al. (2019).

# link to download data file: "https://figshare.com/ndownloader/files/14874749"
dat <- read_excel(here("data/case_studies/Supplementary Table 1.xlsx"), sheet="Data")
dat <- dat |> filter(`Mycorrhizal status`=="different")

Now, we want to set up a vector of within-study correlations \(\rho\) to test (including \(\rho=0\)). Then we specify the sampling error variances (the diagonal of the assumed sampling error \(\mathbf{V^*}\)) and an empty dataframe to store the results from all models.

# corr values to test
cor_values <- seq(0, 0.9, 0.1)

# set up sampling error variances and effect size id
vi <- dat$`Variance(rrIs)`
es.id <- dat$`Pairwise comparison`

# empty dataframe to store results
results <- data.frame(rho = numeric(), beta = numeric(),
                      se = numeric(), se.cr1 = numeric(), 
                      pval = numeric(), pval.cr1 = numeric(),
                      s2.study = numeric(), s2.es = numeric(),
                      s2.phylogeny = numeric(), s2.species = numeric(),
                      loglik = numeric())

Using a for loop we generate a model per each within study constant \(\rho\) value. First, we derive the assumed sampling error \(\mathbf{V^*}\). Second, we fit a multilevel model with a t-distribution and using the adjusted degrees of freedom as recommended by (see more recommended practices here). Third, we obtain the cluster robust CR2 model estimates. Finally, we store the results within the specified dataframe.

for (rho in cor_values) {
  
  # get VCV matrix 
  V <- vcalc(vi,
             cluster=Study,
             obs=es.id,
             data=dat,
             rho=rho)
  
  # fit the multilevel meta-analysis model with this VCV
  model <- rma.mv(yi = rrIs, 
                  V = V,
                  random = list(~1 | Study,
                                ~1 | es.id),
                  data = dat,
                  test = "t",
                  dfs = "contain",
                  method = "REML")
  
  # get CR2 estimates
  mod.cr2 <- coef_test(model, vcov="CR2", cluster = dat$Study) ## default test is Satterthwaite
  
  # store model output
  results <- rbind(results, data.frame(rho = rho,
                                       beta = round(model$beta[1], 2),
                                       se = round(model$se[1], 3),
                                       se.cr2 = round(mod.cr2$SE, 3),
                                       pval = round(model$pval[1], 4),
                                       pval.cr2 = round(mod.cr2$p_Satt, 4),
                                       s2.es = round(model$sigma2[2], 3),
                                       s2.study = round(model$sigma2[1], 3),
                                       loglik = round(logLik(model), 3)))
  
}

kable(results)
rho beta se se.cr2 pval pval.cr2 s2.es s2.study loglik
0.0 -0.04 0.155 0.154 0.7857 0.7852 0.190 0.229 -56.290
0.1 -0.03 0.152 0.152 0.8413 0.8409 0.193 0.211 -55.794
0.2 -0.02 0.151 0.150 0.8895 0.8891 0.198 0.198 -55.384
0.3 -0.01 0.150 0.149 0.9299 0.9296 0.204 0.187 -55.043
0.4 -0.01 0.150 0.149 0.9628 0.9626 0.211 0.178 -54.757
0.5 0.00 0.150 0.149 0.9887 0.9886 0.219 0.170 -54.517
0.6 0.00 0.150 0.148 0.9918 0.9917 0.229 0.163 -54.316
0.7 0.00 0.150 0.148 0.9783 0.9781 0.239 0.156 -54.151
0.8 0.01 0.150 0.149 0.9705 0.9703 0.251 0.150 -54.020
0.9 0.01 0.150 0.149 0.9682 0.9679 0.264 0.143 -53.923

2. Case study: Phylogenetic multilevel meta-analysis on behavioural predictability

Horváth et al. (2023) investigated whether behavioural type (mean behaviour) and behavioural predictability (within-individual variation) evolve independently or under system-specific constraints across multiple species. Here we load and prepare the meta-analytical dataset and phylogenetic tree provided in Horváth et al. (2023).

### code from supplementary HTML file of Horváth et al. (2023)

library(ape); library(Matrix);library(rotl); library(rncl);
library(Hmisc); library(lattice); library(HDInterval)
# Set up meta-analysis data
xdata <- read_excel(here("data/case_studies/Metadata_ALL_NEWER.xlsx"))
xdata$N_OBS<-as.numeric(xdata$N_OBS)
xdata$N_ID<-as.numeric(xdata$N_ID)
xdata$N_rep_mean<-as.numeric(xdata$N_rep_mean)
xdata$ES.ID<-as.factor(xdata$ES.ID)
xdata$scientific.name <- tolower(as.factor(xdata$scientific.name))
xdata$species<-as.factor(xdata$species)
xdata$taxa <- as.factor(xdata$taxa)
xdata$study <- as.factor(xdata$study)
xdata$behaviour <-as.factor(xdata$behaviour)
xdata$environment <-as.factor(xdata$environment)
xdata$measures.use<-as.factor(xdata$measures.use)
xdata$age <- as.factor(xdata$age)
xdata$sex <- as.factor(xdata$sex)
xdata$temporal.context <- as.factor(xdata$temporal.context)

## transform correlation coefficients to Fisher’s normalized correlation coefficients (Zr)
Zr.corr<-function(r){
  Zr<-0.5*log((1+r)/(1-r))
}
# get Zr
xdata$yi.z <- Zr.corr(xdata$Corr)
# multiply these Zr values by -1 to make positive correlations equate risk-prone animals being more variable
xdata$yi.z[xdata$measures.use == 'latency']<-xdata$yi.z[xdata$measures.use == 'latency']*-1
# get sampling variances
vi<-xdata$N_rep_mean/(2*(xdata$N_ID-2)*(xdata$N_rep_mean-1))
xdata<-data.frame(xdata,vi)


##### Phylogenetic tree
species <- unique(xdata$scientific.name)
species <- as.data.frame(species)
species[,1] <- as.character(species[,1])

taxa <- tnrs_match_names(unique(xdata$scientific.name), context="Animals")
taxon_map <- structure(taxa$search_string, names=taxa$unique_name)
tr <- tol_induced_subtree(ott_id(taxa)[is_in_tree(ott_id(taxa))])
tr <- readRDS(here("data/case_studies/phylotree.rds"))

# clean up tip labels
otl_tips <- strip_ott_ids(tr$tip.label, remove_underscores=TRUE)
tr$tip.label <- taxon_map[ otl_tips ]
any(duplicated(tr$node.label))
tr$node.label <- NULL
xdata <- xdata[xdata$scientific.name %in% tr$tip.label, ]

# create VCV phylogenetic matrix
xdata$animal <- xdata$scientific.name
phylo_matrix <- vcv(compute.brlen(tr, corr = T))

Now, let’s set up the vector of within-study correlations \(\rho\) that we want to test (including \(\rho=0\)) and an empty dataframe to store the results from all models.

# corr values to test
cor_values <- seq(0, 0.9, 0.1)

# empty dataframe to store results
results <- data.frame(rho = numeric(), beta = numeric(),
                      se = numeric(), se.cr1 = numeric(), 
                      pval = numeric(), pval.cr1 = numeric(),
                      s2.study = numeric(), s2.es = numeric(),
                      s2.phylogeny = numeric(), s2.species = numeric(),
                      loglik = numeric())

For each \(\rho\) value we derive the assumed sampling error \(\mathbf{V^*}\) and fit the phylogenetic meta-analytical model with a t-distribution and using the adjusted degrees of freedom as recommended by (see more recommended for metafor practices here). Additionally, we derive the cluster robust CR1 model estimates.

for (rho in cor_values) {
  
  # get VCV matrix 
  V <- vcalc(vi,
             cluster=study,
             obs=ES.ID,
             data=xdata,
             rho=rho)
  
  # fit the multilevel meta-analysis model with this VCV
  model <- rma.mv(yi = yi.z, 
                  V = V,
                  random = list(~1 | study,
                                ~1 | ES.ID,
                                ~1 | animal,   #phylogenetic component
                                ~1 | species), #non phylogenetic component
                  data = xdata,
                  R = list(animal = phylo_matrix),
                  test = "t",
                  dfs = "contain",
                  method = "REML")
  
  # get CR1 estimates
  mod.cr1 <- robust(model, cluster = study, adjust=TRUE)
  
  # store model output
  results <- rbind(results, data.frame(rho = rho,
                                       beta = round(model$beta[1], 2),
                                       se = round(model$se[1], 3),
                                       se.cr1 = round(mod.cr1$se, 3),
                                       pval = round(model$pval[1], 4),
                                       pval.cr1 = round(mod.cr1$pval, 4),
                                       s2.es = round(model$sigma2[2], 3),
                                       s2.study = round(model$sigma2[1], 3),
                                       s2.phylogeny = round(model$sigma2[3], 3),
                                       s2.species = round(model$sigma2[4], 3),
                                       loglik = round(logLik(model), 3)))
  
}

kable(results)
rho beta se se.cr1 pval pval.cr1 s2.es s2.study s2.phylogeny s2.species loglik
0.0 -0.05 0.207 0.083 0.7953 0.5199 0.381 0.133 0.115 0 -102.696
0.1 -0.05 0.207 0.083 0.7946 0.5183 0.382 0.132 0.115 0 -102.703
0.2 -0.05 0.207 0.083 0.7940 0.5168 0.384 0.130 0.115 0 -102.710
0.3 -0.05 0.207 0.083 0.7933 0.5153 0.385 0.129 0.116 0 -102.718
0.4 -0.05 0.207 0.083 0.7927 0.5138 0.386 0.128 0.116 0 -102.725
0.5 -0.06 0.208 0.084 0.7920 0.5122 0.388 0.127 0.116 0 -102.733
0.6 -0.06 0.208 0.084 0.7914 0.5107 0.389 0.126 0.117 0 -102.741
0.7 -0.06 0.208 0.084 0.7908 0.5093 0.390 0.125 0.117 0 -102.749
0.8 -0.06 0.208 0.084 0.7901 0.5078 0.391 0.124 0.117 0 -102.757
0.9 -0.06 0.208 0.084 0.7895 0.5063 0.393 0.123 0.117 0 -102.765

References

Crawford, K. M., Bauer, J. T., Comita, L. S., Eppinga, M. B., Johnson, D. J., Mangan, S. A., Queenborough, S. A., Strand, A. E., Suding, K. N., Umbanhowar, J., & Bever, J. D. (2019). When and where plant-soil feedback may promote plant coexistence: a meta-analysis. Ecology letters, 22(8), 1274–1284. https://doi.org/10.1111/ele.13278

Gelman, A. (2005). Analysis of variance: Why it is more important than ever. Annals of Statistics, 33(1), 1–53. https://doi.org/10.1214/009053604000001048

Horváth, G., Garamszegi, L. Z., & Herczeg, G. (2023). Phylogenetic meta-analysis reveals system-specific behavioural type–behavioural predictability correlations. Royal Society Open Science, 10(9). https://doi.org/10.1098/rsos.230303

Session information

library(sessioninfo)
library(details)

si <- session_info()
si$packages <- si$packages %>% 
  filter(package %in% c("metafor", "ape", "clubSandwich", "Matrix", "corpcor", "dplyr", "kableExtra", "xtable", "rotl", "Hmisc", "lattice"))

details(si, summary = 'Current session info', open = FALSE)
Current session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31 ucrt)
 os       Windows 11 x64 (build 26100)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       America/Denver
 date     2025-09-04
 pandoc   3.4 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package      * version date (UTC) lib source
 ape          * 5.8-1   2024-12-16 [1] CRAN (R 4.4.2)
 clubSandwich * 0.5.11  2024-06-20 [1] CRAN (R 4.4.1)
 corpcor      * 1.6.10  2021-09-16 [1] CRAN (R 4.4.0)
 dplyr        * 1.1.4   2023-11-17 [1] CRAN (R 4.4.1)
 Hmisc        * 5.2-2   2025-01-10 [1] CRAN (R 4.4.2)
 kableExtra   * 1.4.0   2024-01-24 [1] CRAN (R 4.4.3)
 lattice      * 0.22-6  2024-03-20 [2] CRAN (R 4.4.2)
 Matrix       * 1.7-1   2024-10-18 [1] CRAN (R 4.4.2)
 metafor      * 4.8-0   2025-01-28 [1] CRAN (R 4.4.3)
 rotl         * 3.1.0   2023-06-15 [1] CRAN (R 4.4.2)
 xtable       * 1.8-4   2019-04-21 [1] CRAN (R 4.4.2)

 [1] C:/Users/z5394590/AppData/Local/R/win-library/4.4
 [2] C:/Program Files/R/R-4.4.2/library

──────────────────────────────────────────────────────────────────────────────